Reference no: EM133129811
Question: This project is an opportunity to explore ideas from the lectures, assignments, and other resources. You can think of your project as the first step towards doing research in machine learning. You will analyze the problem, design a machine learning solution, implement learning algorithms, and evaluate them on two data sets (one for classification and one for regression). Please review the full list of evaluation metrics. In this project, you are asked to (1) develop models that attain reasonable accuracy and (2) explain the performance of the trained models. For example, let's say you designed a classifier with 99% accuracy. An important task is then to investigate whether the data set is too simple or whether the evaluation metric fails to capture the classifier's performance (e.g., we discussed precision, recall, and ROC curves for classification problems). Note that we should always use training, validation, and test data sets (or cross-validation) to properly evaluate the performance of machine learning algorithms and to detect underfitting and overfitting.
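To illustrate why a 99% accuracy figure deserves a second look, here is a minimal sketch in plain Python. The labels are hypothetical (990 negatives and 10 positives), and the "classifier" is assumed to predict the negative class for every sample. The helper names `confusion_counts` and `summarize` are made up for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return accuracy, precision, and recall; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical, heavily imbalanced labels (an assumption for illustration).
y_true = [0] * 990 + [1] * 10
# A trivial model that always predicts the majority (negative) class.
y_pred = [0] * 1000

metrics = summarize(y_true, y_pred)
print(metrics)
```

Under these assumptions the accuracy is high while precision and recall for the positive class are both zero, which is the kind of mismatch the report is expected to investigate.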
For each data set, you should use three distinct machine learning algorithms. For example, for solving a regression problem, you may use polynomial regression, regularized regression, and support vector regression. In this case, you should discuss how you tuned the hyperparameters and provide evidence for your choices. For developing classifiers, we covered a variety of classification algorithms, such as logistic regression, support vector machines, decision trees, random forests, and neural network models. Also, you should consider using preprocessing techniques, such as centering and scaling, to improve the performance of the learning algorithms (e.g., you can use a Pipeline in scikit-learn).
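One possible way to combine scaling and hyperparameter tuning is sketched below with scikit-learn. This is an illustration under assumptions, not a required approach: the data come from `make_regression` rather than the project data sets, support vector regression stands in for any of the three algorithms, and the grid values are arbitrary starting points.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption); replace with the project data set.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Putting the scaler inside the pipeline means it is re-fit on each
# cross-validation training fold, so no statistics leak from held-out data.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svr", SVR(kernel="rbf")),
])

# Illustrative grid (assumption); keys use the "<step>__<parameter>" convention.
param_grid = {
    "svr__C": [0.1, 1.0, 10.0, 100.0],
    "svr__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("best cross-validated R^2:", search.best_score_)

# The test set is touched only once, after tuning is finished.
test_r2 = search.score(X_test, y_test)
print("test R^2:", test_r2)
```

The per-candidate scores in `search.cv_results_["mean_test_score"]` can be plotted or tabulated as evidence of how the hyperparameters were chosen.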
The final report should be five pages long and structured like a small research paper. Broadly speaking, it should address the following:
- What are the important ideas/methods you explored?
- Which preprocessing techniques did you use?
- How do you report the results (use easy-to-read figures)?
- Do the results make sense? Underfitting? Overfitting?
- Explain the behavior of models (e.g., does the model outperform a random classifier?)
- Please include the complete execution code to produce the reported results at the end of your report. (No page limit)
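The questions above about underfitting, overfitting, and beating a random classifier can be checked with a comparison like the following sketch. It assumes synthetic data from `make_classification`, uses scikit-learn's `DummyClassifier` as the random baseline, and picks decision trees of two depths purely as examples of a very simple and a very flexible model.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption); replace with the project data set.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    # Guesses labels at random according to the training class frequencies.
    "random baseline": DummyClassifier(strategy="stratified", random_state=0),
    # A depth-1 tree is very constrained and may underfit.
    "tree (depth 1)": DecisionTreeClassifier(max_depth=1, random_state=0),
    # An unconstrained tree can memorize the training set and may overfit.
    "tree (unlimited depth)": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))

for name, (train_acc, test_acc) in scores.items():
    print(f"{name:24s} train={train_acc:.3f} test={test_acc:.3f}")
```

A large gap between training and test accuracy suggests overfitting, low accuracy on both suggests underfitting, and a test accuracy close to the baseline suggests the model has learned little from the features.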
You will be assessed on the effort level, the clarity of explanations, the evidence that you present to support your claims, and the performance of machine learning methods.
Attachment: Machine learning project.rar