Reference no: EM133286160
Complete this exercise using either SAS or SPSS only.
After finishing your MBA at the University of Louisville, you decide to relocate from Louisville to San Francisco, CA to work for a technology startup. Your first task for the relocation: buying a new house.
You decide to leverage the knowledge gained in your UofL MBA program to make a data-driven decision. Most buyers in Louisville look at conventional measures, such as size of the house, age of the house, school district, location, subdivision, number of parks, and crime rate, to predict the price of a house. You work with a realtor and obtain historical sales data. Unfortunately, you realize that none of the conventional measures that are great predictors of house prices in Louisville work for predicting house prices in San Francisco. This is when you realize that San Francisco, being a major city, does not have many parks, schools, or subdivisions. Moreover, almost all the houses are exactly the same size.
Because of the challenges of living in a major city, residents favor unconventional factors such as access to convenience stores or access to train stations. You theorize that, in addition to these factors, the age of the house would still matter because several old houses have been torn down and rebuilt. Such houses would command a higher price. You painstakingly collected data for over 400 houses. The dataset is presented in realestate.xlsx and contains the following variables:
- HouseAge: Age of the house in years
- TrainStationDistance: Distance of the train station from the house in miles
- NumberOfConvenienceStores: Number of convenience or grocery stores within a 2 sq. mile area around the house
- HousePrice: Price of the house in thousands of dollars
Your objectives are as follows:
- Build a regression model predicting house price. Schematically (leaving out the intercept, coefficients, and error term), this model would look like HousePrice = HouseAge + TrainStationDistance + NumberOfConvenienceStores
- Comment on the fit of the model. Specifically, does the regression model you built show a good fit? Why/why not?
- Explain how much variance is explained by the regression model. Why is this number not 100%? What other factors do you think could be missing?
- What is the interpretation of the coefficients for HouseAge, TrainStationDistance, and NumberOfConvenienceStores? One line each.
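The regression in the first objective and the "variance explained" in the third can be written out as follows. This is a sketch in standard ordinary-least-squares notation; the symbols $\beta_0, \dots, \beta_3$, $\varepsilon_i$, $y_i$, $\hat{y}_i$, and $\bar{y}$ are assumptions introduced here for illustration and do not appear in the assignment.

```latex
\begin{align}
\text{HousePrice}_i &= \beta_0
  + \beta_1\,\text{HouseAge}_i
  + \beta_2\,\text{TrainStationDistance}_i
  + \beta_3\,\text{NumberOfConvenienceStores}_i
  + \varepsilon_i \\
R^2 &= 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\end{align}
```

Here $y_i$ is the observed HousePrice of house $i$, $\hat{y}_i$ is the price predicted by the fitted model, and $\bar{y}$ is the mean observed price. Under this formulation, each slope $\beta_j$ is the expected change in HousePrice (in thousands of dollars) for a one-unit increase in its predictor with the other two held fixed, and $R^2$ is the share of the variance in HousePrice that the model accounts for.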
Attachment: RealEstate.rar