Reference no: EM132266706
Assignment -
Some data management things to think about
- Missing data
- Do you have any?
- What are you going to do with them?
- How are they labelled (numbers, text)?
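How missing values are labelled matters because a numeric code would otherwise be analysed as if it were a real measurement. The following is a minimal sketch in Python rather than Rcmdr (the same idea applies in either). The missing-value labels `-9` and `"Missing"` are hypothetical placeholders; the real labels must be taken from the codebook.

```python
from typing import Iterable, Optional, Union

Raw = Union[int, float, str, None]

# Hypothetical missing-value labels; replace with those listed in the codebook.
MISSING_CODES = {-9, "Missing"}


def to_number_or_missing(value: Raw) -> Optional[float]:
    """Return the value as a float, or None if it is labelled as missing."""
    if value is None or value in MISSING_CODES:
        return None
    return float(value)


def count_missing(values: Iterable[Raw]) -> int:
    """Count how many entries are missing after recoding."""
    return sum(1 for v in values if to_number_or_missing(v) is None)


# Made-up example values, not taken from the data set.
raw_values = [22.4, -9, "Missing", 31.0, None]
clean_values = [to_number_or_missing(v) for v in raw_values]
n_missing = count_missing(raw_values)
```

Counting the missing values before and after recoding is a quick way to answer "Do you have any?".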
Always create new variables with a new name. Then you can delete them if they don't work.
Always give a subset data file a new name. Then you can delete it and start over if you screwed up.
When subsetting, make sure you are starting from the correct data file (original file -> subset file).
ALWAYS, ALWAYS, ALWAYS check that new variable creation, recoded variables, computed variables, subsets, etc. have worked.
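One way to check a recode is to tabulate, for each new category, how many cases it contains and the smallest and largest original value that ended up in it. The sketch below is in Python rather than Rcmdr and uses made-up values with a hypothetical "low"/"high" split at 10. It assumes missing values have already been removed.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence, Tuple


def range_by_category(
    original: Sequence[float], recoded: Sequence[Hashable]
) -> Dict[Hashable, Tuple[int, float, float]]:
    """For each new category, return (n, min, max) of the original values."""
    if len(original) != len(recoded):
        raise ValueError("original and recoded must have the same length")
    groups = defaultdict(list)
    for value, label in zip(original, recoded):
        groups[label].append(value)
    return {
        label: (len(vals), min(vals), max(vals))
        for label, vals in groups.items()
    }


# Made-up example: values below 10 were recoded "low", the rest "high".
original = [1, 9, 10, 15, 4]
recoded = ["low", "low", "high", "high", "low"]
check = range_by_category(original, recoded)
```

If the minimum or maximum in any category falls outside the intended cut-points, the recode has not worked.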
Not seeing your new variable in Rcmdr?
- Did the variable creation work?
- Try 'refreshing' Rcmdr.
How are the data best presented?
- If doing an ANOVA and you have a lot of groups, it may be easier to produce a table of the means/SDs and other descriptive statistics
- Show a contingency table for Chi square
- Show a scatterplot for correlation/regression
- Show means/SDs/Ns/medians/whatever else is relevant in text or in a table for t-test/ANOVA (we don't need to see boxplots and histograms)
How do you delete variables in Rcmdr?
The answers to questions 1-3 must include
- what data you used (inclusion / exclusion) and selection of appropriate statistical measures and why,
- an appropriate presentation of the data in a graph and / or table,
- statements about assumption testing, and
- an appropriate conclusion based on a statistical test, with justification in text.
Report the p-value to 2 or 3 decimal places (p=0.02, p<0.001, p=0.008 for example) and other values to 2 decimal places. There will be penalties for going over specified word and page limits.
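The reporting rule above can be applied consistently with a small helper. This is a sketch of one possible convention, not a required one: it assumes 3 decimal places for p-values below 0.01, 2 decimal places otherwise, and `p<0.001` for anything smaller than 0.001. The function names are illustrative.

```python
def format_p(p: float) -> str:
    """Format a p-value as p<0.001, or to 3 or 2 decimal places."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "p<0.001"
    if p < 0.01:
        # Assumed convention: small p-values get 3 decimal places.
        return f"p={p:.3f}"
    return f"p={p:.2f}"


def format_value(x: float) -> str:
    """Format any other statistic to 2 decimal places."""
    return f"{x:.2f}"


examples = [format_p(0.02), format_p(0.0004), format_p(0.008)]
```

With these inputs the helper reproduces the three examples given in the text: `p=0.02`, `p<0.001` and `p=0.008`.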
Analysis questions using HINTS data
These analyses require using several variables from the HINTS 5 Cycle 1 data set and assessing some relationships between these variables.
In particular, you will be testing the relationships between BMI and variables measuring moderate exercise, treating them both as continuous and as categorical variables. Each variable of interest is provided as either continuous or categorical, so the other variable type will need to be constructed.
Variable names are in brackets and their details may be found in the codebook.
Questions -
1. Consider body mass index, BMI (BMI), as a continuous variable. We wish to relate BMI to the number of days of any physical activity or exercise of at least moderate intensity (TimesModerateExercise), treated as a categorical variable. Does average BMI (kg/m²) differ between the categories of number of days exercised?
2. For all adults, the BMI categories are: underweight, BMI < 18.5; healthy weight, BMI >= 18.5 to < 25; overweight, BMI >= 25 to < 30; obese, BMI >= 30. Construct a categorical variable (BMIcat) with these categories from the continuous variable BMI (BMI). Analyse the relationship between the BMI categorical variable and the number of days of moderate physical activity categorical variable (TimesModerateExercise) as specified in Question 1.
3. We wish to consider the total number of minutes of moderate exercise per week as a predictor of BMI. We decide that the best way to estimate this is by multiplying the number of minutes of moderate exercise per day (HowLongModerateExerciseMn) by the number of days of moderate exercise per week (TimesModerateExercise). Create a new continuous variable and relate it to BMI (BMI), where both variables are regarded as continuous and BMI is the outcome variable. (Hint: when calculating the new variable, the total number of minutes of moderate exercise, be very careful with missing values and also with 0 minutes of exercise, to ensure these are kept in the analysis.)
Finally, consider a similar analysis to that with the newly created variable but employ TimesModerateExercise as a continuous variable rather than a categorical variable. Is this analysis clearer than the one carried out in Question 1?
4. Write an analysis plan to examine the relationships between several variables and the participants' agreement with the statement that alcohol increases your risk of cancer (AlcoholIncreaseCancer) from the HINTS 5 Cycle 1 survey. These variables include age, both as a continuous variable (SelfAge) and in five categories (AgeGrpB), sex (GenderC), and several others, namely education ( ), using the internet to look up cancer information (InternetCancerInfoSelf), BMI category as defined in Question 2, and continuous BMI (bmi). Use the same PDF codebook as for Questions 1 to 3.
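For the variable construction in Questions 2 and 3, the logic can be sketched as follows, in Python rather than Rcmdr. The sketch rests on two assumptions that should be checked against the codebook: missing values have already been recoded to `None`, and a respondent reporting 0 days of exercise has 0 total minutes even when minutes per day is missing. The function names are illustrative.

```python
from typing import Optional


def bmi_category(bmi: Optional[float]) -> Optional[str]:
    """Assign the BMI category using the cut-points stated in Question 2."""
    if bmi is None or bmi != bmi:  # None or NaN: keep as missing
        return None
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    return "obese"


def weekly_minutes(
    days: Optional[float], minutes_per_day: Optional[float]
) -> Optional[float]:
    """Total minutes of moderate exercise per week (days x minutes per day)."""
    if days is None:
        return None  # days unknown, so the total is unknown
    if days == 0:
        # Assumption: no exercise days means 0 minutes, even if the
        # minutes-per-day item is missing.
        return 0
    if minutes_per_day is None:
        return None  # exercised, but duration unknown
    return days * minutes_per_day


# Boundary values fall into the higher category (>= 18.5, >= 25, >= 30).
boundary_check = [bmi_category(b) for b in (18.4, 18.5, 25, 30)]
totals = [weekly_minutes(3, 30), weekly_minutes(0, None), weekly_minutes(2, None)]
```

Checking the boundary values (18.5, 25 and 30) and the 0-day case explicitly is a quick way to confirm the new variables behave as intended.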
Maximum 5 pages to answer Questions 1-3.
Summary for Questions 1-3: at the end, write a 200-word summary describing your overall conclusions.