Question 1.
The manager of a cosmetics company was interested in New Zealanders' personal hygiene. A survey was conducted by randomly selecting 5 shopping malls from around the country. At each mall a booth was set up and two interviewers (one male and one female) were stationed there. During the day, the interviewers were instructed to approach every 10th adult that passed the booth and ask them to be interviewed. About 28% of the people approached agreed to be interviewed. The interview lasted about 5 minutes and included questions such as "How often do you shower each week?" and "Do you use deodorants?". In total, 586 people were interviewed.
(a) Describe the population of interest for the survey.
(b) Give two reasons why selection bias may be a potential problem with the survey.
(c) Explain why self-selection bias is not a potential problem with the survey.
(d) Is nonresponse bias a potential problem with the survey? Briefly justify your answer.
(e) What two other nonsampling errors (apart from selection bias and nonresponse bias) are likely to have the greatest effect on the results from this survey?
Question 2.
A wildlife biologist was interested in whether raising deer in captivity has an effect on the size of the deer. She took a random sample of one-year old deer that had been raised in the wild and another sample of one-year old deer of the same breed that had been raised on a deer farm and obtained the animals' weights.
The data are stored in the file "Deer.csv" which can be downloaded from Cecil. The data contains 2 variables:
Weight: The weight of the deer (in kilograms)
Environment: The environment the deer was raised in (Wild or Farm)
Run the iNZightVIT software and load the file Deer.csv into it.
(a) (i) Generate a bootstrap confidence interval for the mean weight of one-year old deer.
(ii) What is the parameter we are estimating using this bootstrap confidence interval?
(iii) Do we know the true value of this parameter?
(iv) Interpret the bootstrap confidence interval.
(v) Briefly explain why students doing this assignment will not all get the same bootstrap confidence interval.
(b) (i) Generate a bootstrap confidence interval for the difference between the mean weight of one-year old deer that have been raised in the wild and one-year old deer that have been raised on a farm. Include the output in your assignment answers.
(ii) What is the parameter we are estimating using this bootstrap confidence interval?
(iii) Interpret the bootstrap confidence interval.
(iv) Based on the bootstrap confidence interval, is it believable that the mean weight of one-year old deer that have been raised in the wild is the same as that of one-year old deer that have been raised on a deer farm?
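For readers who want to see what a bootstrap confidence interval involves, the following is a minimal Python sketch of the percentile bootstrap for a single mean and for a difference between two group means. It is not the iNZightVIT implementation and may differ from it in detail. The function names, the default of 1000 resamples, and the weights in the example are all made up for illustration; they are not taken from Deer.csv.

```python
import random
import statistics


def percentile_limits(sorted_estimates, level):
    """Pick the lower and upper percentile limits from sorted bootstrap estimates."""
    tail = (1 - level) / 2
    last = len(sorted_estimates) - 1
    lower = sorted_estimates[round(tail * last)]
    upper = sorted_estimates[round((1 - tail) * last)]
    return lower, upper


def bootstrap_ci(values, n_resamples=1000, level=0.95, seed=None):
    """Percentile bootstrap interval for the mean of one sample."""
    rng = random.Random(seed)
    n = len(values)
    # Resample n values with replacement and record the mean each time.
    means = sorted(
        statistics.mean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    return percentile_limits(means, level)


def bootstrap_diff_ci(group_a, group_b, n_resamples=1000, level=0.95, seed=None):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    # Each group is resampled separately, keeping its original size.
    diffs = sorted(
        statistics.mean(rng.choices(group_a, k=len(group_a)))
        - statistics.mean(rng.choices(group_b, k=len(group_b)))
        for _ in range(n_resamples)
    )
    return percentile_limits(diffs, level)


# Hypothetical weights in kilograms, invented for this sketch only.
wild = [48.2, 51.7, 46.9, 53.4, 50.1, 47.8, 52.6, 49.3]
farm = [55.1, 58.4, 53.9, 57.2, 60.3, 54.6, 56.8, 59.0]

print("Mean weight, all deer:", bootstrap_ci(wild + farm))
print("Wild minus Farm:", bootstrap_diff_ci(wild, farm))
```

Because no seed is fixed in the two example calls, each run draws different resamples and so prints slightly different limits. Passing the same seed value reproduces the same interval.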
Question 3.
Study 1: A study reported in the December 1998 British Medical Journal investigated the validity of Benjamin Franklin's maxim "early to bed and early to rise makes a man healthy, wealthy and wise". The study analysed data on men and women aged 65 and over for whom data on sleeping patterns, health, socioeconomic circumstances and cognitive abilities had been collected as part of a survey funded by the U.K. Department of Health and Social Security. 356 people were defined as "larks" (going to bed before 11pm and rising before 8am) and 318 people were defined as "owls" (going to bed after 11pm and rising after 8am).
Study 2: A study was conducted using a sample of 200 people in full time employment. Each person was given an IQ test and, based on this, they were classified as high, medium or low IQ. The annual income of each person was also recorded. The study found that people in higher IQ groups generally earn more money.
Study 3: The manager of a plant that produces frozen meals wanted to compare two different recipes for making lasagne. 200 volunteers were recruited for a study. As it was known that there could be moderate differences in taste preferences for different racial groups, the volunteers were split into three groups: Maori/Polynesian, Asian and European. Each group was then randomly split into two, with half getting a lasagne made using the first recipe and the rest getting a lasagne made with the second recipe. After eating the lasagne, each volunteer was then interviewed and a taste score was recorded based on his or her answers to a series of questions.
(a) Answer the following questions for each study:
(i) Identify the groups that are being compared. (I.e. what treatments or factors of interest are being compared?) Do NOT also say what is being used to measure the comparisons - you do this in (ii).
(ii) What is being measured to compare these groups?
(b) Which of the studies would be described as experiments and which would be described as observational studies? (You do not need to justify these answers.)
(c) For the studies that are observational, could an experiment have been easily carried out instead? If so, briefly explain how. If not, briefly explain why not.
(d) One concern with study 1 was that "the direction of cause and effect between wealth and sleeping pattern is not certain".
(i) Explain what is meant by "the direction of cause and effect".
(ii) Why can't the study establish the direction of cause and effect?
(e) Ignoring any concerns from (d), can we extend conclusions from study 1 to the population in general? Briefly justify your answer.
(f) Which of the studies, if any, used blocking? For any studies in which blocking was used, describe what was blocked.
(g) In which of the studies, if any, should blinding have been used? For any of these studies, describe to what extent blinding should be used.
Question 4.
A group of students was interested in testing the claim that you will get more juice from citrus fruit if you roll the fruit on a hard surface for 10 seconds before juicing it by hand. They had access to a large number of the same type of lemons from several different trees. They randomly selected 40 of these lemons then randomly divided them into two groups - one group was juiced by hand, while the other group was rolled on a hard surface before being juiced by hand.
The data for this comparison are stored in the file "Lemon Roll.csv" which can be downloaded from Cecil. The data contains 2 variables:
Volume: The volume of juice obtained from the lemon (in ml)
Method: The method used to juice the lemon (either Hand or Rolled). Rolled means the lemons were rolled and then hand squeezed.
(a) Briefly explain why this study is an experiment.
(b) (i) Run the iNZightVIT software and load the file Lemon Roll.csv into it.
Run a randomisation test to compare the median volume of juice obtained by the two methods. Include the output from this in your assignment answers.
(ii) When chance is acting alone, estimate the percentage of time we get a difference between the two sample medians at least as big as the observed difference.
(iii) Is it plausible that the observed difference can be explained by "chance acting alone"?
(iv) Can we conclude that rolling lemons on a hard surface before juicing them results in an increase in the median volume of juice produced? If so, justify why with two reasons. If not, what can we conclude?
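The idea behind a randomisation test can be sketched in a few lines of Python. This is an illustration only, not the iNZightVIT procedure: the function name, the default of 1000 re-randomisations, and the juice volumes are invented for the example and are not taken from Lemon Roll.csv. The sketch counts differences at least as large as the observed one in either direction, which is an assumption; a one-tailed count would only consider differences in the observed direction.

```python
import random
import statistics


def randomisation_test(group_a, group_b, n_shuffles=1000, seed=None):
    """Estimate how often chance alone gives a median difference
    at least as big (in either direction) as the observed one."""
    rng = random.Random(seed)
    observed = statistics.median(group_a) - statistics.median(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_big = 0
    for _ in range(n_shuffles):
        # Re-allocate the pooled values to two groups at random,
        # as if the method labels had no effect on volume.
        rng.shuffle(pooled)
        diff = statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_big += 1
    return observed, at_least_as_big / n_shuffles


# Hypothetical juice volumes in ml, invented for this sketch only.
rolled = [52, 48, 55, 50, 47, 53, 49, 51]
hand = [45, 47, 43, 49, 44, 46, 48, 42]

observed_diff, proportion = randomisation_test(rolled, hand)
print("Observed difference in medians (Rolled - Hand):", observed_diff)
print("Proportion of re-randomisations at least as big:", proportion)
```

A small proportion suggests that the observed difference would rarely arise if chance were acting alone. A large proportion means chance alone remains a plausible explanation.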