Roadmap: Two Sample T-test and Paired T-test
Learning Objectives
Distinguish between dependent and independent samples.
Be able to design an experiment that is suited to a paired t-test.
Be able to design an experiment suited to an independent-samples t-test (also called a two sample t-test).
Recognise the pros and cons of designing an experiment suited to a paired t-test versus a two-sample t-test.
Be able to conduct a paired t-test and a two sample t-test by pen-and-paper and in R.
Be able to recognise when an effect size is small, medium or large.
Recognise that power and sample size calculations can be performed for both paired t-tests and two-sample t-tests (see Further Useful Information for more details).
Exercise 1
For each scenario below, decide which t-test is most appropriate: a one-sample t-test, a paired t-test or a two-sample t-test. State whether the test is one-tailed or two-tailed, and formulate the null and alternative hypotheses in words and in symbols.
A researcher wants to investigate whether a fuel additive improves gas mileage. To test this claim, the researcher measures the gas mileage of 16 cars both before and after combining the additive with a full tank of gas. Assume all assumptions for the relevant T procedure are satisfied.
A statistics lecturer wanted to see whether exam scores differed between male and female students. Simple random samples (SRS) of 20 male and 20 female students were selected. Assume all assumptions for the relevant T procedure are satisfied.
The amount of lead in a type of soil can be measured by two methods, the standard method and a new method, but researchers are worried that the new method overestimates lead levels. To test these concerns, a research team obtains 20 random soil samples. The team mixes each soil sample and then splits it in half so that each sample can be measured by both the new method and the standard method. Assume all assumptions for the relevant T procedure are satisfied.
A pharmaceutical manufacturer does a chemical analysis to check the potency of products. The standard release potency for cephalothin crystals is 910. An assay of 16 lots gives the following potency data:
897 914 913 906 916 918 905 921 918 906 895 893 908 906 907 901
Is there significant evidence at the 5% level that the mean potency is not equal to the standard release potency? Assume all assumptions for the relevant T procedure are satisfied.
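For the cephalothin potency data above, a two-sided one-sample t-test against the standard release potency of 910 could be run in R roughly as follows. This is only a sketch: the variable names `potency`, `mu0` and `result` are illustrative choices, not part of the exercise.

```r
# Potency measurements for the 16 lots, copied from the exercise
potency <- c(897, 914, 913, 906, 916, 918, 905, 921,
             918, 906, 895, 893, 908, 906, 907, 901)

mu0 <- 910  # standard release potency (value under the null hypothesis)

# Two-sided one-sample t-test: H0: mu = 910 versus Ha: mu != 910
result <- t.test(potency, mu = mu0, alternative = "two.sided",
                 conf.level = 0.95)
print(result)

# The same test statistic computed by hand, as a cross-check
t_by_hand <- (mean(potency) - mu0) / (sd(potency) / sqrt(length(potency)))
cat("t by hand:", t_by_hand, " df:", length(potency) - 1, "\n")
```

Comparing the printed p-value with 0.05 then gives the decision at the 5% level.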
Exercise 2 (pen-and-paper)
A researcher investigated whether heavy marijuana use affects memory recall in university students. A memory test was given to light marijuana users and heavy marijuana users. A higher score on the memory test indicates better memory recall. The following statistics were obtained from the experiment:
Light marijuana users: $n_1 = 20$, $\bar{y}_1 = 53$, $s_1 = 3.6$
Heavy marijuana users: $n_2 = 15$, $\bar{y}_2 = 49$, $s_2 = 4.5$
Perform a test of significance to investigate whether memory recall is lower for heavy users than for light users. State your null and alternative hypotheses in words and in symbols. Use a significance level of 0.01. (Use the conservative method to estimate the degrees of freedom.)
State any assumptions required to support the validity of your conclusions.
Mathematically and with a sketch show how you would compute the P-value for this problem.
Comment on the effect size.
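The pen-and-paper arithmetic for Exercise 2 can be checked in R with a short sketch like the one below. It rests on several assumptions: 4.5 is taken to be the standard deviation of the heavy users, the unpooled (Welch-type) standard error is used together with the conservative degrees of freedom min(n1, n2) - 1, and the effect size is measured by Cohen's d with a pooled standard deviation, which may differ from the definition used in your course notes. All variable names are illustrative.

```r
# Summary statistics from the exercise (group 1 = light users, group 2 = heavy users)
n1 <- 20; ybar1 <- 53; s1 <- 3.6
n2 <- 15; ybar2 <- 49; s2 <- 4.5   # assumption: 4.5 is the heavy users' SD
alpha <- 0.01

# Unpooled standard error of the difference in sample means
se <- sqrt(s1^2 / n1 + s2^2 / n2)

# Test statistic for H0: mu2 = mu1 versus Ha: mu2 < mu1 (heavy lower than light)
t_stat <- (ybar2 - ybar1) / se

# Conservative degrees of freedom
df_cons <- min(n1, n2) - 1

# One-sided P-value: area to the LEFT of the observed t under the t(df) curve
p_value <- pt(t_stat, df_cons)

# Critical value for a left-tailed test at level alpha
t_crit <- qt(alpha, df_cons)

# Effect size: Cohen's d with pooled SD (an assumed convention)
s_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
cohen_d <- (ybar1 - ybar2) / s_pooled

cat("t =", round(t_stat, 3), " df =", df_cons,
    " critical t =", round(t_crit, 3), "\n")
cat("P-value =", signif(p_value, 3), " Cohen's d =", round(cohen_d, 2), "\n")
```

The `pt()` call corresponds to the shaded left-tail area you would draw in the sketch of the P-value.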
Assignment
Problem 1
Dataset: cabbage.csv
This dataset has 60 observations and 4 variables:
Cult - Factor giving the cultivar of the cabbage, two levels: c39 and c52.
Date - Factor specifying one of three planting dates: d16, d20 or d21.
HeadWt - Weight of the cabbage head, presumably in kg.
VitC - Ascorbic acid content, in undefined units.
Perform a hypothesis test of whether the mean vitamin C content of the c52 cultivar is greater than that of the c39 cultivar. Consider all planting dates. Use the 7(9) step procedure and set the significance level to 0.05. (Hint: in the data step you will need to 'pull out' the vitamin C values for the c39 cultivar and the c52 cultivar separately. If all else fails, manually type these numbers into R.)
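One way to 'pull out' the two groups is logical subsetting on the Cult column. The sketch below uses a small made-up data frame with the same column names as cabbage.csv so that it runs on its own; the toy values are hypothetical and are not the real measurements. The commented `read.csv()` line assumes the file sits in your working directory.

```r
# Hypothetical toy data with the same column names as cabbage.csv.
# These values are invented for illustration only.
cabbage <- data.frame(
  Cult = factor(c("c39", "c39", "c39", "c39", "c52", "c52", "c52", "c52")),
  VitC = c(51, 55, 49, 58, 62, 66, 59, 70)
)
# With the real file you would instead use something like:
# cabbage <- read.csv("cabbage.csv", stringsAsFactors = TRUE)

# Pull out the vitamin C values for each cultivar (all planting dates together)
vitc_c39 <- cabbage$VitC[cabbage$Cult == "c39"]
vitc_c52 <- cabbage$VitC[cabbage$Cult == "c52"]

# One-sided two-sample t-test (Welch by default): Ha is mean(c52) > mean(c39)
result <- t.test(vitc_c52, vitc_c39, alternative = "greater",
                 conf.level = 0.95)
print(result)
```

The order of the first two arguments matters here: with `alternative = "greater"`, R tests whether the mean of the first sample exceeds the mean of the second.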
Problem 2 (pen-and-paper)
Beta-endorphins are morphine-like substances produced by the body that create a sense of well-being. It has been proposed that beta-endorphin levels increase with exercise. Sport scientists conducted an experiment to investigate this claim. They used R to perform a hypothesis test and adopted a significance level of 0.05. The differences (dif) were computed as follows:
dif = Beta-endorphins post-test minus Beta-endorphins pre-test
The following R output was generated. Answer each question below, assuming the assumptions associated with this procedure are valid.
##
## One Sample t-test
##
## data: z$dif
## t = 6.8419, df = 8, p-value = 6.604e-05
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
## 11.51382 Inf
## sample estimates:
## mean of x
## 15.81111
(2a) How many athletes took part in both the pre- and post-measurements?
(2b) Explain how the number 15.81111 was computed.
(2c) Formulate the null and alternative hypotheses in symbols and in words.
(2d) What is the value of the test statistic?
(2e) Draw a picture that shows the test-statistic and critical t-value(s).
(2f) State the conclusions of the test for an expert and non-expert audience.
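The numbers in the R output for Problem 2 are tied together by the one-sample t formulas, and this can be checked with a short sketch that uses only values printed in that output. The variable names are illustrative, and the check assumes the test is a right-tailed one-sample t-test on the differences with a null value of 0, as the output indicates.

```r
# Values copied from the R output in Problem 2
t_stat   <- 6.8419
df       <- 8
mean_dif <- 15.81111
alpha    <- 0.05

# Standard error implied by t = (mean - 0) / SE
se <- mean_dif / t_stat

# Critical value for a right-tailed test at level alpha
t_crit <- qt(1 - alpha, df)

# Lower bound of the one-sided 95% confidence interval
lower <- mean_dif - t_crit * se

# One-sided P-value: area to the RIGHT of the observed t under the t(df) curve
p_value <- pt(t_stat, df, lower.tail = FALSE)

cat("critical t =", round(t_crit, 3), "\n")
cat("lower confidence bound =", round(lower, 4), "\n")
cat("P-value =", signif(p_value, 4), "\n")
```

The recomputed lower bound and P-value should agree with the printed output up to rounding, and `t_crit` is the critical value to mark on the picture for (2e).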