Assignment -
Readings -
a. Comparing Groups - Chapters 6 (especially section 6.4), 7, 8, and 9 (skip 9.7 and 9.9).
b. Science Isn't Broken.
c. ASA's 2016 statement on Statistical Significance and P-Values.
Tasks -
Task 1: Read the two web pages listed above (readings b and c). Explain what a p-value is and how a researcher should use it (150 words or fewer).
Task 2: Randomization tests
"When waiting to get someone's parking space, have you ever thought that the driver you are waiting for is taking longer than necessary? Ruback and Juieng (1997) ran a simple experiment to examine that question. They observed behavior in parking lots and recorded the time that it took for a car to leave a parking place. They broke the data down on the basis of whether or not someone in another car was waiting for the space.
The data are positively skewed, because a driver can safely leave a space only so quickly, but, as we all know, they can sometimes take a very long time. But because the data are skewed, we might feel distinctly uncomfortable using a parametric t-test. So we will adopt a randomization test." -- David C. Howell
Miles: Technically this is an observational study, but the assumption of exchangeability still holds. Whether or not someone is waiting for the space can be assumed to be random and unrelated to the driver. Thus, all permutations of the condition labels are equally likely, and it is valid to use a randomization test.
Conduct a randomization test to test the hypothesis that there is no difference in average time for drivers who have a person waiting vs those who do not have a person waiting, against the alternative that drivers who have a person waiting will take *longer* than if they did not.
Be sure to calculate an empirical p-value and make the appropriate conclusion.
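The one-sided test described above could be sketched along the following lines. This is only an illustration under stated assumptions: the function name `randomization_test_greater`, its arguments, and the demo numbers are hypothetical, and the demo numbers are placeholders rather than the Ruback and Juieng data.

```r
# One-sided randomization test for a difference in means (a sketch).
# `waiting` and `no_waiting` are hypothetical names for the two samples of times.
randomization_test_greater <- function(waiting, no_waiting, n_perm = 10000) {
  pooled <- c(waiting, no_waiting)
  n_wait <- length(waiting)
  observed <- mean(waiting) - mean(no_waiting)

  perm_diffs <- replicate(n_perm, {
    shuffled <- sample(pooled)  # reshuffle the times across the two labels
    mean(shuffled[seq_len(n_wait)]) - mean(shuffled[-seq_len(n_wait)])
  })

  # Empirical one-sided p-value; the +1 counts the observed arrangement itself
  p_value <- (sum(perm_diffs >= observed) + 1) / (n_perm + 1)
  list(observed = observed, p_value = p_value, perm_diffs = perm_diffs)
}

# Placeholder numbers only, NOT the real parking-lot data
set.seed(1)
demo_waiting    <- c(41, 55, 38, 62, 47, 90, 52, 44)
demo_no_waiting <- c(30, 36, 42, 28, 51, 33, 39, 35)
result <- randomization_test_greater(demo_waiting, demo_no_waiting)
result$observed
result$p_value
```

Only the upper tail is counted because the alternative is directional (drivers with someone waiting take longer). A histogram of `result$perm_diffs` with the observed difference marked is a convenient way to show the result.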
Task 3: Randomization test for numeric data
Comparing Groups, Chapter 6, Exercise 6.1
6.1 Use the data in the AfterSchool.csv data set to examine whether there are treatment effects of the after-school program on victimization measures.
- Carry out an exploratory analysis to initially examine whether there are treatment effects of the after-school program on victimization measures.
- Use the randomization test to evaluate whether there is convincing evidence that the after-school program has an effect on victimization.
Write up the results from both sets of analyses as if you were writing a manuscript for publication in a journal in your substantive area.
Make sure you use the **Comparing Groups** textbook. Make sure you read the description of the experiment that generated the data on page 118 of the textbook.
Be sure to calculate an empirical p-value and make the appropriate conclusion.
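Because the exercise asks whether the program has an effect in either direction, a two-sided version working directly on a data frame may be more convenient here. The sketch below is one possible approach, not the textbook's own code. The function name is hypothetical, and the column names `Victim` and `Treatment` in the commented usage are assumptions that should be checked against `names()` of the actual file.

```r
# Two-sided randomization test on a data frame with a two-level grouping column (a sketch).
randomization_test_two_sided <- function(data, outcome, group, n_perm = 10000) {
  y <- data[[outcome]]
  g <- factor(data[[group]])
  stopifnot(nlevels(g) == 2)

  # Difference in group means (second level minus first) for a given labelling
  mean_diff <- function(labels) {
    m <- tapply(y, labels, mean)
    unname(m[2] - m[1])
  }

  observed <- mean_diff(g)
  perm_diffs <- replicate(n_perm, mean_diff(sample(g)))  # permute the group labels

  # Two-sided empirical p-value: differences at least as large in absolute value
  p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
  list(observed = observed, p_value = p_value)
}

# Hypothetical usage; the column names are assumptions:
# after_school <- read.csv("AfterSchool.csv")
# boxplot(Victim ~ Treatment, data = after_school)  # exploratory look first
# randomization_test_two_sided(after_school, "Victim", "Treatment")
```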
Task 4: Comparing Groups, Chapter 7, Exercise 7.1
7.1 The National Center for Education Statistics (NCES) is mandated to collect and disseminate statistics and other data related to education in the United States. To that end, NCES began collecting data on specific areas of interest including educational, vocational, and personal development of school-aged children, following them from primary and secondary school into adulthood. The first of these studies was the National Longitudinal Study of the High School Class of 1972 (NLS-72). High School and Beyond (HS&B) was designed to build upon NLS-72 by studying high school seniors, using many of the same survey items as the 1972 study. A sample of N = 200 students from the 1980 senior class cohort from HS&B was obtained and is located in the HSB.csv data set.
Use a nonparametric bootstrap to test if there is a difference in the variances of science scores between public and private school students. That is, test the null hypothesis $H_0: \sigma^2_{\text{public}} = \sigma^2_{\text{private}}$. Write up the results from the analysis as if you were writing the results section for a manuscript for publication in a journal in your substantive area.
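One way to set up such a test is sketched below. It is an assumption-laden illustration rather than the textbook's procedure: each group is centered on its own mean before pooling, so that resampling imposes equal spread without also forcing equal means, and the test statistic is the difference in sample variances. The function name and the column names `Sci` and `Schtyp` in the commented usage are hypothetical.

```r
# Nonparametric bootstrap test of H0: equal variances in two groups (a sketch).
bootstrap_variance_test <- function(x, y, n_boot = 10000) {
  observed <- var(x) - var(y)

  # Center each group on its own mean, then pool. Resampling from this pool
  # mimics a world where both groups share the same spread.
  pooled <- c(x - mean(x), y - mean(y))
  n_x <- length(x)
  n_y <- length(y)

  boot_diffs <- replicate(n_boot, {
    boot_x <- sample(pooled, n_x, replace = TRUE)
    boot_y <- sample(pooled, n_y, replace = TRUE)
    var(boot_x) - var(boot_y)
  })

  # Two-sided empirical p-value
  p_value <- (sum(abs(boot_diffs) >= abs(observed)) + 1) / (n_boot + 1)
  list(observed = observed, p_value = p_value)
}

# Hypothetical usage; the column names and level labels are assumptions:
# hsb <- read.csv("HSB.csv")
# groups <- split(hsb$Sci, hsb$Schtyp)
# bootstrap_variance_test(groups[[1]], groups[[2]])
```

A ratio of variances would work equally well as the statistic; the difference is used here only because its null value of zero makes the two-sided comparison straightforward.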
Task 5: Bootstrap hypothesis test for the speed of light.
In 1882 Simon Newcomb tried to measure the speed of light. He measured the time it took for light to travel from Fort Myer on the west bank of the Potomac River to a fixed mirror at the foot of the Washington Monument, approximately 3720 meters away, and back.
The values are recorded as deviations from 24800 nanoseconds. Thus a value of 28 means that Newcomb measured the time to travel to the mirror and back as 24828 nanoseconds.
He made 66 measurements, and the deviations are as follows:
```{r}
light <- c(28, 26, 33, 24, 34, -44, 27, 16, 40, -2, 29, 22, 24, 21, 25, 30, 23,
           29, 31, 19, 24, 20, 36, 32, 36, 28, 25, 21, 28, 29, 37, 25, 28, 26,
           30, 32, 36, 26, 30, 22, 36, 23, 27, 27, 28, 27, 31, 27, 26, 33, 26,
           32, 32, 24, 39, 28, 24, 25, 32, 25, 29, 27, 28, 29, 16, 23)
```
The modern accepted value for the time light travels the distance used in the experiment is 24833 nanoseconds, or a deviation of 33.
Perform a bootstrap test to see if Newcomb's measurements are significantly different from the modern accepted value of 33.
- H0: mean = 33, Ha: mean != 33
- The mean of this sample of data is 26.212.
- Recenter the vector light, so it is centered at 33. This simulates a population of data that is centered at the true mean, but exhibits the same variation around the mean as Newcomb's measurements.
- Perform bootstrap resampling (you might need 10^5 repetitions) to see how often a random sample drawn from this population could produce a mean as extreme as the sample observed in our data.
Perform the bootstrap test again after removing the two negative outliers (-2 and -44) to see if Newcomb's measurements are significantly different from the modern accepted value of 33. Don't forget to recenter the data after removing the outliers.
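The recenter-and-resample steps above might be sketched as follows. The helper `bootstrap_mean_test` is a hypothetical name introduced for illustration; the `light` vector is repeated so the block runs on its own. The same helper is applied to the full data and to the data with the two negative values dropped, recentering separately each time.

```r
light <- c(28, 26, 33, 24, 34, -44, 27, 16, 40, -2, 29, 22, 24, 21, 25, 30, 23,
           29, 31, 19, 24, 20, 36, 32, 36, 28, 25, 21, 28, 29, 37, 25, 28, 26,
           30, 32, 36, 26, 30, 22, 36, 23, 27, 27, 28, 27, 31, 27, 26, 33, 26,
           32, 32, 24, 39, 28, 24, 25, 32, 25, 29, 27, 28, 29, 16, 23)
null_value <- 33

# Two-sided bootstrap test of H0: mean = mu0 (a sketch).
bootstrap_mean_test <- function(x, mu0, n_boot = 1e5) {
  observed_gap <- abs(mean(x) - mu0)

  # Shift the data so its mean equals mu0 while keeping its spread unchanged
  recentered <- x - mean(x) + mu0

  boot_means <- replicate(n_boot, mean(sample(recentered, replace = TRUE)))

  # How often is a bootstrap mean at least as far from mu0 as the observed mean?
  p_value <- (sum(abs(boot_means - mu0) >= observed_gap) + 1) / (n_boot + 1)
  list(sample_mean = mean(x), p_value = p_value)
}

set.seed(2024)
all_data <- bootstrap_mean_test(light, null_value)
trimmed  <- bootstrap_mean_test(light[light > 0], null_value)  # drops -44 and -2

all_data$p_value
trimmed$p_value
```

With the `+ 1` correction the reported p-value can never be exactly zero; its smallest possible value is 1 / (n_boot + 1), which is one reason a large number of repetitions is suggested.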
Textbook - Comparing Groups: Randomization and Bootstrap Methods Using R, by Andrew S. Zieffler, Jeffrey R. Harring, and Jeffrey D. Long. ISBN 978-0-470-62169-1.