Reference no: EM132397530
Data Analyst Capstone course project
The project has two parts:
1) Build a machine learning model, test it, and use it to make predictions.
2) Write a report of 1500 to 3000 words.
Part 1 (Predicting Mortgage Rates From Government Data)
Challenge Instructions
Build a machine learning model and test it against the test set values (test_values.csv).
Problem Description
• About the Data
• Target Variable
o Submission Format
o Performance Metric
• Features
o Example Row
About the Data
Your goal is to predict the rate spread of mortgage applications from the given dataset, which is adapted from data published by the Federal Financial Institutions Examination Council (FFIEC).
Target Variable
We're trying to predict the variable rate_spread for each row of the test data set. Your job is to:
1. Train a model using the inputs in train_values.csv and the labels in train_labels.csv.
2. Predict a value for each row in test_values.csv, for which the true value of rate_spread is not known.
3. Output your predictions in a format that matches submission_format.csv exactly.
4. Upload your predictions to this competition in order to get a score.
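The first three steps can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the expected solution: it assumes every file shares an identifier column called row_id, it assumes submission_format.csv has the columns row_id and rate_spread in that order, and the model is a deliberately simple stand-in (the mean rate_spread per value of one categorical feature) that a real submission would replace with a proper learner. The feature name loan_type in the usage comments is hypothetical.

```python
import csv
from collections import defaultdict
from statistics import mean

ID_COL = "row_id"        # assumption: identifier column shared by all files
TARGET = "rate_spread"   # target variable named in the brief


def read_rows(path):
    """Read a CSV file into a list of dicts, one per row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def fit_group_means(train_values, train_labels, feature):
    """Learn the mean rate_spread for each value of one categorical feature."""
    labels = {row[ID_COL]: float(row[TARGET]) for row in train_labels}
    groups = defaultdict(list)
    for row in train_values:
        groups[row[feature]].append(labels[row[ID_COL]])
    return {
        "feature": feature,
        "group_means": {key: mean(vals) for key, vals in groups.items()},
        "overall_mean": mean(list(labels.values())),
    }


def predict(model, test_values):
    """Predict each test row from its group mean, else the overall mean."""
    feature = model["feature"]
    return [
        {
            ID_COL: row[ID_COL],
            TARGET: model["group_means"].get(row[feature], model["overall_mean"]),
        }
        for row in test_values
    ]


def write_submission(predictions, handle):
    """Write predictions as CSV with the assumed columns row_id, rate_spread."""
    writer = csv.DictWriter(handle, fieldnames=[ID_COL, TARGET])
    writer.writeheader()
    writer.writerows(predictions)


# Usage with the files named in the brief ("loan_type" is a hypothetical feature):
# model = fit_group_means(read_rows("train_values.csv"),
#                         read_rows("train_labels.csv"), "loan_type")
# with open("submission.csv", "w", newline="") as out:
#     write_submission(predict(model, read_rows("test_values.csv")), out)
```

Before uploading, compare the header and row order of the output file with submission_format.csv, since step 3 requires an exact match.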
Part 2 (Report)
In this challenge, you will create and submit a report that documents the analysis you have performed on the competition data and presents your findings and conclusions, with supporting statistics and data visualizations.
Specifically, your report must:
• Be written in English and submitted as a PDF document with a maximum file size of 5 MB and a guideline length of 1500 to 3000 words (slightly fewer or more words are acceptable provided all other requirements are met). You can save Microsoft Word 2016 documents in PDF format, and there are numerous online tools for converting documents to PDF.
• Start with an executive summary or overview section that concisely summarizes the analysis you performed during this project, and the conclusions you reached.
• Continue by describing the data, the process used to explore and analyze it, and the key findings, conclusions, and recommendations you reached.
• Support your conclusions by presenting statistics and visualizations.
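As one possible starting point for the supporting statistics, the sketch below computes basic descriptive statistics for a numeric column such as rate_spread. The helper name summarize is hypothetical, and which columns to summarize is left to the analyst; the brief does not prescribe any particular statistics.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    values = [float(v) for v in values]
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        # Sample standard deviation; needs at least two values.
        "stdev": stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }
```

A table of these values for the target and the main numeric features, alongside a histogram of rate_spread, is a common way to open the data description section of such a report.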
Attachment: Data Analysis.rar