Reference no: EM132408316
Machine Learning Project
Introduction
Your project topic must include data analysis using a machine learning model in one or more of the following categories: classification, regression, association rule mining, clustering, and recommender systems. We did not cover association rule mining or recommender systems in class, but you are allowed to choose a project in these categories if you like. The dataset you choose for your project could be any data you are using at work, data that is involved in your everyday life, or any publicly available dataset. You must perform a complete data analysis lifecycle, including data visualization and exploratory data analysis, variable selection and feature extraction, running multiple machine learning models, and comparing their performance.
Please note that the scale of your project should be bigger than any of the assignments you did in the class, and it may require you to research and learn beyond the materials covered in the class.
What you need to turn in
Here are the items that you need to submit:
1. A project report: This is an extended report of your project. It should be prepared in a word processor and should contain the figures and tables that are necessary to make the report complete. Be concise in your writing and consult technical writing references as needed. Your project report should roughly follow this structure:
a. Abstract: One paragraph summarizing the problem, the method you used to do the analysis, and the results of your experiment.
b. Problem definition and project goals: In this section, you explain the purpose of your project and the problem you are trying to solve as well as the dataset that you used for your project. How did you obtain this dataset? What features are included in the dataset and what are you planning to do with this data?
c. Related Work: Has any other paper or work addressed the same problem? If so, include a very brief description of its dataset, method, and results, and cite it in your report.
d. Data Exploration and preprocessing: In this section, you should explain how you explored the data and the relationships between variables in your dataset. Use scatter plots, boxplots, or histograms to visualize your data and detect possible correlations between different features and the outcome variable. How did you clean and preprocess the data? Did you do any feature engineering? What variables did you use and why? Are there any missing values in your data, and how did you deal with them? You also need to explain whether you have done any feature scaling, normalization, categorical feature encoding, etc.
e. Data analysis and experimental Results: In this section, you explain the machine learning models you used to solve the problem, present the results of your data processing, and explain how you achieved the project goal through these results. You must use multiple machine learning models and evaluate them to see which one works best for your problem. Explain how you tuned the hyper-parameters of each model and evaluate the results via standard evaluation measures such as AUC, RMSE, precision, and recall (this is covered in the week 11 lectures).
f. Conclusion: In this section, explain any new knowledge or interesting findings you obtained from processing your data. You can also briefly mention any further research directions.
g. References: This includes the bibliography. Please list any external source (including websites, books, and articles) that you used and make sure to cite them within your text.
2. Source code: You must run your source code in an R notebook and turn in both the notebook file with the ".Rmd" extension and its HTML file. When you preview your notebook in RStudio, it automatically creates the HTML file with the ".nb.html" extension. You must submit this file as well. It is very important that you submit both of these files, or your submission will not be graded.
3. Presentation: You need to prepare a set of slides to present your work. The first slide should include the title of your project and your name. The next slides should include a quick overview of the presentation. The rest of the slides should explain your work and the results.
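As an illustration of the preprocessing steps named in section d and the evaluation measures named in section e of the report, here is a minimal base-R sketch. Everything in it is an assumption made for illustration: the toy data frame, its column names, the 0.5 decision threshold, and the helper functions rmse, precision, recall, and auc are all hypothetical and not prescribed by the course. Your own project will likely use its own data and may rely on packages instead of hand-written helpers.

```r
# Toy data frame; all column names and values are made up for illustration.
df <- data.frame(
  age    = c(25, NA, 40, 31),
  income = c(30000, 52000, 61000, 45000),
  city   = factor(c("A", "B", "A", "C"))
)

# Missing values: impute the numeric column with its median (one simple option).
df$age[is.na(df$age)] <- median(df$age, na.rm = TRUE)

# Feature scaling: standardize numeric columns to mean 0 and standard deviation 1.
num_cols <- c("age", "income")
df[num_cols] <- lapply(df[num_cols], function(x) (x - mean(x)) / sd(x))

# Categorical encoding: one indicator (one-hot) column per level of city.
city_dummies <- model.matrix(~ city - 1, data = df)

# RMSE for regression predictions.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Precision and recall for binary labels coded 0/1, with 1 as the positive class.
precision <- function(actual, predicted) {
  tp <- sum(predicted == 1 & actual == 1)
  fp <- sum(predicted == 1 & actual == 0)
  tp / (tp + fp)
}
recall <- function(actual, predicted) {
  tp <- sum(predicted == 1 & actual == 1)
  fn <- sum(predicted == 0 & actual == 1)
  tp / (tp + fn)
}

# AUC via the rank-sum (Mann-Whitney) identity; tied scores get average ranks.
auc <- function(actual, score) {
  r <- rank(score)
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up labels and classifier scores, thresholded at an assumed cutoff of 0.5.
actual    <- c(1, 0, 1, 1, 0)
score     <- c(0.9, 0.4, 0.35, 0.8, 0.1)
predicted <- as.integer(score >= 0.5)

print(c(precision = precision(actual, predicted),
        recall    = recall(actual, predicted),
        auc       = auc(actual, score)))
print(rmse(c(3, 5, 2), c(2, 5, 4)))
```

Note that in a real analysis the imputation value and the scaling parameters would normally be computed on the training split only and then applied to the test split, so that the evaluation measures are not inflated by information leaking from the test data.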