Summarize the key differences in the conclusions

Assignment Help Basic Statistics
Reference no: EM132856967

This is a subset of a data set originally published by Atkinson (1986) and republished by Hand et al's A Handbook of Small Data Sets. The original data set gives record times in 1984 for 35 hill races (data from Hand et al's A Handbook of Small Data Sets). Three quantitative variables are recorded: Distance (in miles), HeightGain (in feet; this is a measure of total vertical climb over the course of the race), and Time (in minutes; seconds have been converted to decimal minutes). The subset you will be working with has had five observations deleted from the original data set.

(a) Fit two multiple regression models for predicting Time from the two predictor variables, Distance and HeightGain, that differ only in the order the predictor variables enter the model. Obtain key results with the summary() and anova() commands. What conclusions can you make as a result of these analyses?(You do not need to formally state the null and alternative hypotheses for each test as long as you are clear in your conclusions what you have tested. Be sure to include the appropriate test statistics and their associated p-values.)

(b) The results in (a) are distorted by the presence of two extreme outliers. Obtain a box plot of the residuals from one of your models fit in (a). Identify the races that produced the extreme outliers. Why are these outliers so extreme?

(c) In (b), what are the values of the "outer fences" that would classify an outlier as an extreme outlier? Do you think it is a coincidence that both extreme outliers are for positive residuals? Briefly explain your reasoning.

(d) Repeat the analyses in (a) after removing the two extreme outliers identified in (b). How does a boxplot of the residuals look now? What conclusions can you make now as a result of these analyses?

(e) Summarize the key differences in the conclusions you reached in (a) and (d). Include a comment on the effect of including vs excluding the two extreme outliers on the MSE and the coefficient of determination.

(f) It makes sense that longer races will take longer to run, and that the amount of vertical climb over the course of the race might influence the time it takes to finish race. But what are the relative effects of the two predictor variables on the pace the race is run, where Pace = Time/Distance? Repeat your analyses in part (d) but with Pace as the response variable. Summarize your conclusions as concisely yet completely as you can.

(g) According to the website https://www.bennevisrace.co.uk/, the record time for the Ben Nevis race, which was set in 1984 by the Scottish hill racing legend KennyStuart, still stands. This record is included in the data set we have been analysing (but there is a one second discrepancy between our data set and the website time). What is the predicted record Time and Pace for the Ben Nevis race, based on the multiple regression analyses you have run (extreme outliers excluded)?

(h) What additional factors (other than Distance and HeightGain) could contribute to variability in the Time and Pace response variables? You should be able to come up with a few good ones, give us exactly three factors as your solution to this question. Factors which can be measured readily are worth more than those that would be very hard to measure.

(mikedewin)

Reference no: EM132856967

Questions Cloud

What will be the effect on wall profit next year : If the Consumer Division is eliminated, $1,700,000 of the above fixed expenses could be avoided. What will be the effect on Wall's profit next year
What is the probability that more than 4 students : What is the probability that more than 4 students will have their automobiles stolen during the current semester? Round your answer to four decimal places.
What is the effective interest rate : What is the effective interest rate if the firm needs $163,000 to finance some expenses? The company plans on repaying the loan in a lump sum
What is the net book value of property : What is the net book value of property, plant and equipment at the end of fiscal year 2013 - Coca-Cola sold some property, plant and equipment during fiscal
Summarize the key differences in the conclusions : This is a subset of a data set originally published by Atkinson (1986) and republished by Hand et al's A Handbook of Small Data Sets.
What is the probability that a sum of 8 will occur : 1. What is the probability that a sum of 8 will occur on the fifth trial if a pair of dice is rolled?
National symposium on catch and release fishing : High in the Rocky Mountains, a biology research team has drained a lake to get rid of all fish. After the lake was refilled, they stocked it with an endangered
What is the probability that the tpms will trigger a warning : If the car's average tire pressure is on target, what is the probability that the TPMS will trigger a warning? (Round your answer to 4 decimal places.)
Promote professional success and satisfaction : Additionally, answer the question: how can being agile promote professional success and satisfaction?

Reviews

Write a Review

Basic Statistics Questions & Answers

  Effect of two breathing frequencies on performance times

Researchers from the United Kingdom studied the effect of two breathing frequencies on performance times and on several physiological parameters

  Provide a correct interpretation of test

We used an alpha of 0.05, and found a p-value of 0.479. Provide a correct interpretation of this test.

  Different methods of teaching reading

In the READING data set, described in the Data Appendix, the response variable Post 3 is to be compared for three methods of teaching reading.

  Investors use certainty equivalent coefficient

Investors use certainty equivalent coefficient in order to reduce the risk of uncertainties.

  How large a sample is required to produce an interval

Suppose you have a confidence interval based on a sample of size n.- Using the same level of confidence, how large a sample is required to produce an interval of one-half the width?

  Convergent-correlated with the self-concept

Content-based on a three-level hierarchical model of self-esteem. Convergent-correlated with the Self-Concept and Motivation Inventory r = .25

  Researcher divides her sampling frame

In a survey of Athens county voters, a researcher divides her sampling frame into age brackets.

  Convert the formula to a formula without using previous term

Write the sequence as a recursively-defined sequence. Convert the formula to a formula without using previous terms.

  Data to test the difference between two weighing methods

data to test the difference between two weighing methods for wheat deliveries gave a mean difference of 10 kg with a

  What are the null hypothesis and the alternative hypothesis

What are the null hypothesis and the alternative hypothesis? Test your Hypothesis using a Z-test. Show your calculations.

  Notice of the problems within the family

Engaging compassionately means that, the parent must take notice of the problems within the family and pay attention to the distresses of each member while show

  What is the probability of producing

Daily output of Marathon's Garyville, Lousiana, refinery is normally distributed with a mean of 232,000 barrels of crude oil per day with a standard deviation.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd