Reference no: EM133593950
Assignment
Case Study: PCA applied to Linear Regression
Objective:
Build a linear regression model on the Boston Housing dataset ("Boston.csv") to predict the median house value, "MEDV", from the other variables. Then build a second model after carrying out Principal Component Analysis (PCA), and compare the performance of the two models to draw conclusions about the efficacy of PCA.
Tasks:
Exploratory and preparatory
1.	Explore the dataset and carry out exploratory data analysis (EDA)
2.	Separate the target and predictor variables and split the data into 70% train and 30% test (use random_state=42)
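A minimal sketch of the separation and split step, assuming pandas and scikit-learn are available. The helper name `split_target_and_predictors` is hypothetical, and the small synthetic frame (with three made-up columns named after Boston variables) only stands in for "Boston.csv" so that the snippet runs without the file.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def split_target_and_predictors(df, target="MEDV", test_size=0.30, seed=42):
    """Separate the target column from the predictors and split 70/30."""
    X = df.drop(columns=[target])
    y = df[target]
    return train_test_split(X, y, test_size=test_size, random_state=seed)


# Synthetic stand-in (an assumption, not the real data). With the actual file,
# replace these three lines with: df = pd.read_csv("Boston.csv")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["RM", "LSTAT", "PTRATIO"])
df["MEDV"] = 5.0 * df["RM"] - 3.0 * df["LSTAT"] + rng.normal(size=100)

X_train, X_test, y_train, y_test = split_target_and_predictors(df)
print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible, so both models are later evaluated on the same test rows.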
Build Linear regression model on original data and evaluate
1.	Build a linear regression model on the train data
2.	Calculate the evaluation metrics R-squared, RMSE, MAE and MAPE on both the train and test data
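The four metrics can be sketched directly from their definitions, which may help when interpreting the numbers. The function name `regression_metrics` and the toy values are illustrative assumptions; in practice `sklearn.metrics` provides `r2_score`, `mean_squared_error`, `mean_absolute_error` and `mean_absolute_percentage_error` for the same purpose.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, RMSE, MAE and MAPE (in percent) for two equal-length sequences."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(r) for r in residuals) / n,
        # MAPE is undefined when any true value is zero
        "MAPE": 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
    }


# Toy values for illustration only, not taken from the Boston data
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
print(metrics)
```

Calling the function once with the train targets and predictions and once with the test ones gives the two sets of numbers the task asks for.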
Carrying out PCA
1.	Normalize (scale) the predictor variables of the original data; do not scale the target
2.	Carry out PCA and examine the cumulative variance explained by the principal components (PCs)
3.	Select the smallest number of PCs that explain at least 85% of the variance
4.	Fit PCA with the chosen number of PCs on the scaled data and extract the components (use random_state=42)
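The four PCA steps above could look roughly like the following, assuming scikit-learn's `StandardScaler` is an acceptable choice of scaling. The synthetic matrix `X` is a stand-in for the Boston predictors, and the variable names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic predictors standing in for the Boston predictor columns (assumption)
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 8)) + 0.3 * rng.normal(size=(200, 8))

# Step 1: scale the predictors to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Step 2: fit PCA with all components and inspect the cumulative variance
pca_full = PCA().fit(X_scaled)
cum_var = np.cumsum(pca_full.explained_variance_ratio_)

# Step 3: smallest number of components whose cumulative variance reaches 85%
n_pcs = int(np.argmax(cum_var >= 0.85)) + 1

# Step 4: refit with the chosen number of components and transform the data
pca = PCA(n_components=n_pcs, random_state=42)
X_pca = pca.fit_transform(X_scaled)
print(n_pcs, cum_var.round(3))
```

In scikit-learn's `PCA`, random_state only has an effect with the "arpack" and "randomized" solvers, so on a dataset this small it mainly serves to keep the run reproducible.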
Linear regression on PCA-transformed data and comparison
1.	Build another linear regression model on the PCA-transformed data
2.	Evaluate its performance and compare both models
3.	Draw a conclusion about the efficacy of PCA
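One possible way to set up the comparison is sketched below on synthetic data, so the printed numbers say nothing about the Boston results. It assumes a scikit-learn pipeline is acceptable. Passing `n_components=0.85` with `svd_solver="full"` lets `PCA` pick the number of components needed to explain that share of the variance. Here the scaler and PCA are fitted on the training split only, which is a design choice of this sketch rather than something the tasks prescribe.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with correlated predictors (assumption, not the Boston data)
rng = np.random.default_rng(42)
base = rng.normal(size=(300, 3))
X = np.hstack([base, base + 0.2 * rng.normal(size=(300, 3))])
y = 3.0 * X[:, 0] - 2.0 * X[:, 4] + rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

models = {
    "original": LinearRegression(),
    "pca": make_pipeline(
        StandardScaler(),
        PCA(n_components=0.85, svd_solver="full"),
        LinearRegression(),
    ),
}
splits = {"train": (X_train, y_train), "test": (X_test, y_test)}

# Fit each model on the training data and score it on both splits
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    for split, (X_s, y_s) in splits.items():
        pred = model.predict(X_s)
        results[(name, split)] = {
            "R2": r2_score(y_s, pred),
            "RMSE": float(np.sqrt(mean_squared_error(y_s, pred))),
        }

for key, vals in results.items():
    print(key, {k: round(v, 3) for k, v in vals.items()})
```

On the training data, the model on the original predictors cannot have a lower R-squared than the PCA model, because the PCs are linear combinations of the same predictors. The test metrics, together with the reduction in the number of inputs, are therefore the more informative basis for the conclusion.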