Reference no: EM133408828
Assignment:
Read and understand the data carefully. What issues (e.g., missing values or noise) did you notice in the dataset? Apply any cleaning method you see fit and justify your decisions. Your data cleaning should be comprehensive.
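As one illustration of a cleaning step that is easy to justify, the sketch below fills missing entries in a numeric column with the median of the observed values. The median is less sensitive to extreme values than the mean. The function name `impute_median` and the `ages` column are hypothetical and are not part of the assignment data.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing readings (not from the assignment dataset).
ages = [23, None, 31, 27, None, 45]
cleaned = impute_median(ages)
print(cleaned)  # [23, 29.0, 31, 27, 29.0, 45]
```

Whether imputing is appropriate, rather than dropping the affected rows, depends on how much data is missing and why, and that choice is part of the justification the brief asks for.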
Task 1: Exploratory Data Analysis
1. Provide summary statistics for all variables. Identify potential outliers, if any, for each variable.
2. Create a heat map of the correlation matrix that shows the correlation coefficients among all the variables in the dataset. What are your observations?
3. Derive at least two statistical results from the dataset and discuss each in detail.
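The first two items of Task 1 can be sketched with the Python standard library alone. The example assumes the common 1.5 × IQR rule for flagging potential outliers and Pearson correlation for the matrix. Neither is prescribed by the brief. The column names and values in `data` are hypothetical.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev


def summarize(values):
    """Basic summary statistics for one numeric column."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values), "median": median(values), "stdev": stdev(values),
        "min": min(values), "max": max(values), "q1": q1, "q3": q3,
    }


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length columns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)


def correlation_matrix(columns):
    """Nested dict of pairwise Pearson coefficients, keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical two-column dataset (not the assignment data).
data = {"hours": [1, 2, 3, 4, 5, 6], "score": [52, 55, 61, 64, 70, 120]}

for name, column in data.items():
    print(name, summarize(column), "outliers:", iqr_outliers(column))

matrix = correlation_matrix(data)
print(matrix)
```

The nested dictionary returned by `correlation_matrix` holds the values a heat map would display. Drawing it requires a plotting library of your choice.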
Task 2: Classification/Clustering/Regression Model Development
1. Perform a normality test on the data and present the results graphically. Transform the data if it is not normally distributed.
2. Develop any two classification, clustering, or regression models, as appropriate for your dataset type. Briefly interpret each model.
3. Select one of the developed models and perform hyper-parameter tuning to find the best combination of model parameters. Compare the optimized model with the initial model and indicate whether the difference in results is statistically significant.
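Two of the statistical checks in Task 2 can be sketched without third-party packages. The brief names neither a specific normality test nor a specific significance test, so both choices here are assumptions: a Jarque-Bera test for item 1 and an exact paired sign-flip permutation test on per-fold scores for item 3. The sample values and fold scores are hypothetical placeholders, not results.

```python
from itertools import product
from math import exp, log
from statistics import mean


def jarque_bera(values):
    """Jarque-Bera statistic and its asymptotic p-value.

    Under normality the statistic is approximately chi-square with 2 degrees
    of freedom, whose survival function is exp(-x / 2). The approximation is
    rough for small samples.
    """
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurtosis ** 2 / 4)
    return jb, exp(-jb / 2)


def paired_sign_flip_test(baseline, tuned):
    """Two-sided exact p-value for the mean paired difference.

    Enumerates every sign assignment of the paired differences, so it is
    only practical for a small number of pairs (e.g. cross-validation folds).
    """
    diffs = [t - b for b, t in zip(baseline, tuned)]
    observed = abs(mean(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = [s * d for s, d in zip(signs, diffs)]
        if abs(mean(flipped)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical right-skewed sample, before and after a log transform.
raw = [1, 2, 2, 3, 3, 3, 4, 50]
jb_raw, p_raw = jarque_bera(raw)
jb_log, p_log = jarque_bera([log(v) for v in raw])
print(f"raw: JB={jb_raw:.2f}, p={p_raw:.4f}; log: JB={jb_log:.2f}, p={p_log:.4f}")

# Hypothetical per-fold accuracies for the initial and tuned model.
baseline_scores = [0.80, 0.82, 0.79, 0.81, 0.83]
tuned_scores = [0.84, 0.85, 0.83, 0.84, 0.86]
p_value = paired_sign_flip_test(baseline_scores, tuned_scores)
print(p_value)  # 0.0625
```

With five folds the smallest p-value this test can produce is 2/32 = 0.0625, so a comparison at the 0.05 level needs more folds or repeated cross-validation. Fold scores also share training data and are not fully independent, which is worth acknowledging when reporting significance.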