Reference no: EM131404756
The data in the table for the Exercise below give the square footage and asking price for nine homes for sale in Orange County, California in February 2010. The house with a square footage of 5500 is an obvious outlier. The value of r² for the relationship between y = asking price and x = square footage for all nine homes is 82.6%. If the outlier were removed, do you think the value of r² would increase or decrease? Explain your reasoning, using the interpretation of r² as "the proportion of variability in y explained by knowing x" as a guide.
(Hint: A scatterplot may help.)
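One way to check your reasoning after you have answered is to compute r² directly, once for all nine homes and once with the 5500-square-foot home left out. The following is a minimal sketch in plain Python, not part of the original exercise; the function name `r_squared` and the variable names are illustrative assumptions, and the data are copied from the table in the Exercise.

```python
# Data copied from the table in the Exercise (asking prices in $1000s).
square_feet = [2336, 2485, 1800, 1300, 2700, 1881, 2100, 2200, 5500]
asking_price = [448.0, 500.0, 325.0, 499.0, 589.0, 745.0, 574.0, 569.0, 1600.0]


def r_squared(x, y):
    """Return the squared Pearson correlation between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# r^2 for all nine homes.
r2_all = r_squared(square_feet, asking_price)

# r^2 after dropping the home identified as the outlier (5500 square feet).
kept = [(x, y) for x, y in zip(square_feet, asking_price) if x != 5500]
r2_without = r_squared([x for x, _ in kept], [y for _, y in kept])

print(f"r^2, all nine homes:  {r2_all:.1%}")
print(f"r^2, outlier removed: {r2_without:.1%}")
```

The first printed value should agree with the 82.6% stated above; comparing it with the second shows how much of the explained variability depends on the single outlying home.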
Exercise
The data in the following table show the square footage and asking price (in thousands of dollars) for nine homes for sale in Orange County, California in February 2010. Orange County has a mixture of residential areas, including suburban neighborhoods and exclusive beachfront properties.
a. In the relationship between square footage and asking price, which is the response variable (y) and which is the explanatory (x) variable?
b. Draw a scatterplot of the data.
c. There is an obvious outlier in the data. Refer to the reasons for outliers described, and explain which one of the reasons is the most likely cause of the outlier in this situation.
d. If you wanted to establish a regression equation to predict asking price based on square footage for suburban residences in Orange County in February 2010, would it be legitimate to discard the outlier you identified in part (c)? Explain.
Square Footage and Asking Price for Homes in Orange County, California
| Square Footage | Asking Price ($1000s) |
|---|---|
| 2336 | 448.0 |
| 2485 | 500.0 |
| 1800 | 325.0 |
| 1300 | 499.0 |
| 2700 | 589.0 |
| 1881 | 745.0 |
| 2100 | 574.0 |
| 2200 | 569.0 |
| 5500 | 1600.0 |
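For part (d), it can help to see how much the fitted regression line depends on the outlier. The sketch below is an illustration rather than the exercise's prescribed method: it fits a least-squares line by hand, once to all nine homes and once without the 5500-square-foot home. The helper name `least_squares_line` is hypothetical, and the data are copied from the table above.

```python
# (square footage, asking price in $1000s), copied from the table above.
homes = [
    (2336, 448.0), (2485, 500.0), (1800, 325.0),
    (1300, 499.0), (2700, 589.0), (1881, 745.0),
    (2100, 574.0), (2200, 569.0), (5500, 1600.0),
]


def least_squares_line(points):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Fit using every home in the table.
slope_all, intercept_all = least_squares_line(homes)

# Fit again with the 5500-square-foot home excluded.
without_outlier = [(x, y) for x, y in homes if x != 5500]
slope_wo, intercept_wo = least_squares_line(without_outlier)

print(f"All nine homes:  price = {intercept_all:.1f} + {slope_all:.4f} * sqft")
print(f"Outlier removed: price = {intercept_wo:.1f} + {slope_wo:.4f} * sqft")
```

Comparing the two slopes shows how strongly one home can pull the line, which bears on whether a line fitted to all nine homes is a sensible description of suburban residences.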