The following are actual data showing the latitudes of a sample of major cities in the Northern Hemisphere and their mean annual high temperatures.
    City          Latitude (X)    Mean high temp (Y), degrees F
1.  Acapulco          17              88
2.  Algiers           37              76
3.  Berlin            53              55
4.  Bogota             5              66
5.  Montreal          46              50
6.  Oslo              60              50
7.  Rome              42              71
8.  Saigon*           11              90
* Now known as Ho Chi Minh City.
(a) Compute and write out the regression prediction equation (note: do not just refer to the SPSS output; write out the full equations):
(i) In unstandardized form (For every one-unit increase in ___, Y changes by ___)
(ii) In standardized form (units)
(b) Calculate the predicted mean temperature (degrees F) for each city.
(c) Test the b1 coefficient; calculate and test R². Report the results of these tests in a short APA results-style paragraph.
(d) Plot Y (vertical axis) vs. X (horizontal axis), showing the best-fitting straight line.
(e) Report standard measures of leverage (X-space), studentized deleted residuals (Y-space), and SDFFITS (influence) for each city.
(f) Examine your plots. Consider your expectations, the obtained values of r and b1, the plots, and your outlier statistics. Do any of the cities appear to be having a particularly strong influence on the results? If so, which one(s) and why?
(g) Drop the point(s) identified in (f) from the data analysis and, using SPSS, recompute r and b1. Test the new b1 coefficient; calculate and test the new R². Report the results of these tests in a short APA results-style paragraph, and describe how the results change.
(h) In general, are you justified in dropping data? When is it appropriate vs. inappropriate? Comment on why this procedure of dropping one or more cities may or may not be appropriate in the present case.
(Hint: consider the altitudes, not the latitudes, of the cities.)
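
The quantities asked for in parts (a), (b), (c), and (e) can be cross-checked outside SPSS. The following is a minimal sketch in Python with NumPy, not a replacement for the SPSS output the assignment requires. The function name fit_simple_regression and the dictionary keys are hypothetical, chosen only for this illustration. It assumes the usual closed-form definitions: leverage as the diagonal of the hat matrix, the studentized deleted residual computed from the ordinary residual and leverage, and DFFITS as that residual scaled by sqrt(h / (1 - h)). It also assumes that the "SDFFITS" named in part (e) corresponds to this textbook DFFITS. No p-values are computed; the t and F statistics would be referred to distributions with n - 2 and (1, n - 2) degrees of freedom.

```python
import numpy as np

# Data transcribed from the table in the problem statement.
cities = ["Acapulco", "Algiers", "Berlin", "Bogota",
          "Montreal", "Oslo", "Rome", "Saigon"]
x = np.array([17, 37, 53, 5, 46, 60, 42, 11], dtype=float)   # latitude (X)
y = np.array([88, 76, 55, 66, 50, 50, 71, 90], dtype=float)  # mean high temp (Y)


def fit_simple_regression(x, y):
    """Least-squares fit of y on x plus common case diagnostics.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    p = 2  # number of estimated parameters: intercept and slope
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)

    b = XtX_inv @ X.T @ y              # b[0] = intercept, b[1] = slope
    fitted = X @ b                     # predicted Y for each case
    resid = y - fitted

    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    mse = sse / (n - p)
    r2 = 1.0 - sse / sst

    # t test of the slope and overall F test (F equals t squared here).
    se_b1 = np.sqrt(mse * XtX_inv[1, 1])
    t_b1 = b[1] / se_b1
    f_stat = (sst - sse) / mse

    # Standardized slope: b1 rescaled by the ratio of standard deviations.
    beta = b[1] * x.std(ddof=1) / y.std(ddof=1)

    # Leverage h_i = x_i' (X'X)^-1 x_i, the diagonal of the hat matrix.
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    # Studentized deleted residuals and DFFITS, closed-form versions.
    t_del = resid * np.sqrt((n - p - 1) / (sse * (1 - leverage) - resid ** 2))
    dffits = t_del * np.sqrt(leverage / (1 - leverage))

    return {"b": b, "beta": beta, "fitted": fitted, "resid": resid,
            "r2": r2, "t_b1": t_b1, "f_stat": f_stat,
            "leverage": leverage, "t_del": t_del, "dffits": dffits}


result = fit_simple_regression(x, y)
print("intercept, slope:", result["b"])
print("standardized slope:", result["beta"])
print("R^2:", result["r2"], " t(b1):", result["t_b1"], " F:", result["f_stat"])
for name, yhat, h, t, d in zip(cities, result["fitted"], result["leverage"],
                               result["t_del"], result["dffits"]):
    print(f"{name:10s} pred={yhat:6.2f} lev={h:6.3f} sdresid={t:7.3f} dffits={d:7.3f}")
```

To approximate part (g), the same function could be called on x and y with the suspect case(s) removed, for example via np.delete. SPSS is understood to save centered leverage values (h minus 1/n), so its leverage column may differ from the uncentered values printed here by 1/8.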