STAT3175 Linear Models - Macquarie University
Instructions
- Complete Assignment 2, using the data sleepdata assn2 2021.csv.
- For all hypothesis tests you should state the null and alternative hypotheses, test statistic, distribution of the test statistic under H0, p-value and conclusion.
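As an illustration of the reporting elements listed above, the sketch below tests whether the slope in a simple linear regression is zero. The data are made-up toy numbers, the function names `t_two_sided_p` and `slope_test` are hypothetical, and the 5% significance level is an assumption rather than something the assignment specifies. Only the Python standard library is used, so the p-value is obtained by numerically integrating the t density.

```python
import math

def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for a t statistic via Simpson's rule on the t density."""
    b = abs(t_stat)
    if b == 0.0:
        return 1.0
    # Log of the normalising constant of the t density with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)

def slope_test(x, y, alpha=0.05):
    """Test H0: beta1 = 0 against H1: beta1 != 0 in y = beta0 + beta1 * x + error."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se = math.sqrt(rss / df / sxx)  # standard error of the slope estimate
    t = b1 / se
    p = t_two_sided_p(t, df)
    return {"b1": b1, "se": se, "t": t, "df": df, "p": p, "reject": p < alpha}

# Made-up toy data, for illustration only (not the assignment data).
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
result = slope_test(x, y)

print("H0: beta1 = 0   versus   H1: beta1 != 0")
print(f"Test statistic: t = {result['t']:.2f}")
print(f"Distribution under H0: t with {result['df']} degrees of freedom")
print(f"p-value: {result['p']:.4g}")
print("Conclusion:", "reject H0" if result["reject"] else "do not reject H0",
      "at the assumed 5% level")
```

The same five-line report structure carries over to F tests for groups of coefficients; only the statistic and its reference distribution change.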
Question 1
This question concerns a study conducted on n = 133 patients thought to have the condition Obstructive Sleep Apnea (OSA). These patients have undergone a sleep study at a Canadian sleep clinic.
"Obstructive sleep apnea ... is a breathing disorder characterized by brief interruptions of breathing during sleep. OSA occurs when air cannot flow into or out of the person's nose or mouth although efforts to breathe continue. In OSA, the throat collapses during sleep causing the individual to snort and gasp for breath. Hundreds of these episodes can occur every night."
The purpose of this analysis is to identify the risk factors for OSA. OSA is characterised by the patient frequently waking during the night, measured by the arousal index (number of arousals from sleep per hour of sleep). Potential risk factors for the disease are thought to be obesity, male gender, older age, hypertension (high blood pressure) and consumption of stimulants such as alcohol and caffeine. The variables available for analysis are:
Variable        Description
age             in years
gender          1=male; 2=female
alcohol usage   0=no; 1=yes
BMI             body mass index
neck size       in cm
SBP             systolic blood pressure (mm Hg)
AI              arousal index (arousals per hour of sleep)
The data are in the file sleepdata assn2 2021.csv. We will be constructing a regression model with AI (or some transformation thereof) as response variable.
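To illustrate why a transformation of the response might be considered, the sketch below compares the shape of a right-skewed variable before and after taking natural logs. The values are made-up toy numbers, not the sleep study data, and the helper names `skewness` and `text_histogram` are hypothetical. The log is only one candidate transformation, and it requires strictly positive values.

```python
import math

def skewness(values):
    """Moment-based sample skewness: about 0 for symmetric data, > 0 for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def text_histogram(values, bins=5):
    """Counts of observations in equal-width bins spanning min(values) to max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin; constant data go in the first bin.
        idx = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Made-up, right-skewed toy values standing in for an arousal-index-like variable.
ai = [4, 6, 7, 9, 11, 14, 18, 25, 40, 90]
log_ai = [math.log(v) for v in ai]

for label, data in (("raw", ai), ("log", log_ai)):
    counts = text_histogram(data)
    print(f"{label:>3}: skewness = {skewness(data):.2f}")
    for k, c in enumerate(counts):
        print(f"     bin {k + 1}: {'*' * c}")
```

A skewness closer to zero after the transformation suggests a more symmetric distribution, which is usually what the histograms in SPSS or ARC would show as well.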
a. Construct histograms of each of the continuous variables, and comment on their shapes.
b. Using either SPSS or ARC, determine which transformation(s), if any, are needed in order to perform the regression. Also decide what to do about any obviously incorrect observations.
c. Construct a scatterplot matrix of the continuous variables (transformed if necessary), and comment on the relationships between the variables.
d. For each of the categorical covariates,
i. examine frequency tables;
ii. examine boxplots of the response against the categorical covariate, and comment on the relationships observed.
e. Construct a statistical model for AI (or a transformed version of it). Write down your final model equation.
f. Perform diagnostic checking, and go back to (e) if necessary.
g. Interpret each of the model coefficients.
h. Write up your results based on this model, in a paragraph suitable to be presented to a clinician. Explain how AI is associated with subjects' characteristics. (This is in addition to your write-up of (a) to (g) above.) Statistical terms and mathematical symbols should be avoided. This paragraph should be at most 100 words.
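If the final model in part (e) happens to use log(AI) as the response, the coefficients in part (g) are easiest to interpret on the multiplicative scale. The sketch below fits a least-squares model by solving the normal equations and converts each slope to a percentage change in AI. It assumes a natural-log response; the covariate names `bmi` and `male`, the toy data, and the helper names are all hypothetical, and the response is generated without noise purely so that the known coefficients are recovered.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_ols(rows, response):
    """Least-squares coefficients (intercept first) from the normal equations X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def percent_change(coef):
    """Percentage change in the untransformed response per unit increase in a covariate."""
    return 100.0 * (math.exp(coef) - 1.0)

# Made-up toy covariates: (bmi, male indicator). Not the assignment data.
rows = [(22, 0), (25, 1), (28, 0), (31, 1), (34, 0), (37, 1), (40, 1), (27, 0)]
# Noise-free log response from a hypothetical model: 1.0 + 0.02 * bmi + 0.3 * male.
log_ai = [1.0 + 0.02 * bmi + 0.3 * male for bmi, male in rows]

b0, b_bmi, b_male = fit_ols(rows, log_ai)
print(f"log(AI) = {b0:.3f} + {b_bmi:.3f} * bmi + {b_male:.3f} * male")
print(f"Each extra BMI unit: AI changes by about {percent_change(b_bmi):.1f}%")
print(f"Males versus females: AI differs by about {percent_change(b_male):.1f}%")
```

The percentage form is often the more natural wording for the clinician paragraph in part (h), since it avoids mentioning logarithms.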
Question 2
Consider again the data set in Assignment 1. Two other variables given are LeagueIndex and APM (actions per minute). LeagueIndex gives the level of the game coded in increasing order of difficulty as
LeagueIndex   Level
1             Bronze
2             Silver
3             Gold
4             Platinum
5             Diamond
6             Master
7             GrandMaster
8             Professional leagues
a. Examine the frequencies of LeagueIndex, and if necessary perform a suitable recoding for inclusion as a covariate. Give the frequencies of your recoded variable.
b. Examine the distribution of APM, and decide on a suitable transformation if necessary.
c. Consider the inclusion of LeagueIndex and APM into the model for log(WorkersMade). You should also consider inclusion of the covariates included in Assignment 1, and interaction terms. (It is not necessary to perform diagnostic checking.) Write down your final model.
d. Write down the fitted model equations for (i) League Index 2 (Silver), and (ii) League Index 5 (Diamond). Interpret the model coefficients.
e. Compare your final model with the model arrived at in Assignment 1. As a consequence of new information brought into the regression, what information has become redundant?
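Parts (a) and (d) can be sketched together: sparse LeagueIndex levels are merged before fitting, and the fitted equation for a given level is read off by adding that level's dummy-variable coefficients to the baseline intercept and slope. Everything specific here is an assumption made for illustration: the merge of levels 7 and 8 into level 6, the toy level counts, the coefficient values, the use of level 1 as the reference category, and the helper names `recode` and `level_equation`. The actual recoding and coefficients depend on the data and the model chosen.

```python
from collections import Counter

# Hypothetical merge: fold the two highest levels into level 6.
MERGE = {7: 6, 8: 6}

def recode(levels, merge=MERGE):
    """Map each LeagueIndex value to its recoded level (unlisted levels are unchanged)."""
    return [merge.get(lv, lv) for lv in levels]

def level_equation(base_intercept, base_slope, level_shift, slope_shift, level):
    """Intercept and slope of the fitted line for one level.

    level_shift holds the dummy-variable coefficients; slope_shift holds the
    level-by-covariate interaction coefficients. The reference level has no
    entry in either dictionary, so it keeps the baseline values.
    """
    intercept = base_intercept + level_shift.get(level, 0.0)
    slope = base_slope + slope_shift.get(level, 0.0)
    return intercept, slope

# Made-up toy LeagueIndex values, for illustration only.
levels = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7, 8]
print("Original frequencies:", sorted(Counter(levels).items()))
print("Recoded frequencies: ", sorted(Counter(recode(levels)).items()))

# Made-up coefficients for a model of log(WorkersMade) on a transformed APM term.
level_shift = {2: 0.1, 5: 0.4}
slope_shift = {2: 0.0, 5: -0.05}
for level, name in ((2, "Silver"), (5, "Diamond")):
    a, b = level_equation(2.0, 0.5, level_shift, slope_shift, level)
    print(f"{name}: log(WorkersMade) = {a:.2f} + {b:.2f} * (APM term)")
```

Any further covariates carried over from Assignment 1 would add the same terms to every level's equation unless they also interact with LeagueIndex.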