For this assignment you will use the NHANES17 dataset (you may have renamed this dataset).
You will use the following new variables for this homework assignment: hsq500, hsq510, and hsq520, in addition to the variables described below. All categorical variables should have value labels. Exclude refused, not ascertained, and don't know responses for all variables.
Please refer to the documentation: Demographics and Health conditions.
Question 1) Create a macro (called condition) with these 3 variables in the health conditions dataset.
HSQ500 - SP have head cold or chest cold
HSQ510 - SP have stomach or intestinal illness?
HSQ520 - SP have flu, pneumonia, ear infection?
Use this macro to replace the 7 (refused) and 9 (don't know) values with missing (.) values. Also change the No values from 2 to 0. Add value labels for 1 (yes) and 0 (no). Create frequency tables for the 3 variables (1 variable per table).
Paste your code and frequency tables.
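The recode in Question 1 can be sketched in Python to make the logic concrete. This is only an illustration under stated assumptions: the three rows are made-up stand-ins for the real data, the list CONDITION plays the role of the "condition" macro, and missing (.) is represented by None. Your own statistical package will have its own syntax for macros and value labels.

```python
from collections import Counter

# Plays the role of the "condition" macro: one name for the 3 variables.
CONDITION = ["hsq500", "hsq510", "hsq520"]

RECODE = {1: 1, 2: 0}          # yes stays 1, no changes from 2 to 0
LABELS = {1: "yes", 0: "no"}   # value labels

def recode(value):
    """Map 1 -> 1 and 2 -> 0; anything else (7, 9, blank) becomes None."""
    return RECODE.get(value)

def recode_rows(rows, variables=CONDITION):
    """Return new rows with every listed variable recoded."""
    return [{**row, **{v: recode(row.get(v)) for v in variables}} for row in rows]

def frequency_table(rows, variable):
    """Count non-missing values of one variable, keyed by value label."""
    counts = Counter(row[variable] for row in rows if row[variable] is not None)
    return {LABELS[k]: counts.get(k, 0) for k in (0, 1)}

# Made-up rows for illustration only; not NHANES data.
rows = [
    {"hsq500": 1, "hsq510": 2, "hsq520": 2},
    {"hsq500": 2, "hsq510": 2, "hsq520": 9},
    {"hsq500": 7, "hsq510": 1, "hsq520": 1},
]

recoded = recode_rows(rows)
for var in CONDITION:          # one table per variable
    print(var, frequency_table(recoded, var))
```

Driving the loop from a single list means the recode and the tables stay in sync if the variable list changes.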
Question 2) Use the variables that you recoded in #1 to create a new variable (condition_count) with the categories below. Add value labels and a variable label. (HINT: if #1 is correct, you should be able to sum the values for the 3 variables.)
0: has none of these conditions
1: has 1 of these conditions
2: has 2 of these conditions
3: has 3 of these conditions
Create a frequency table for the new variable
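The hint in Question 2 (summing the three 0/1 variables) can be sketched as follows. The rows are made up, None stands for missing, and the choice to leave condition_count missing whenever any of the three variables is missing is an assumption of this sketch, not something the assignment states. Some packages instead treat missing as 0 when summing across a row, so check which behavior your code produces.

```python
from collections import Counter

VARIABLES = ("hsq500", "hsq510", "hsq520")

# Value labels for the new variable, as listed in the assignment.
COUNT_LABELS = {
    0: "has none of these conditions",
    1: "has 1 of these conditions",
    2: "has 2 of these conditions",
    3: "has 3 of these conditions",
}

def condition_count(row, variables=VARIABLES):
    """Sum the recoded 0/1 values; None if any of them is missing (assumption)."""
    values = [row[v] for v in variables]
    if any(v is None for v in values):
        return None
    return sum(values)

# Made-up, already-recoded rows for illustration only.
rows = [
    {"hsq500": 1, "hsq510": 0, "hsq520": 0},
    {"hsq500": 1, "hsq510": 1, "hsq520": 1},
    {"hsq500": 0, "hsq510": 0, "hsq520": None},
    {"hsq500": 0, "hsq510": 0, "hsq520": 0},
]

counts = [condition_count(r) for r in rows]
table = Counter(c for c in counts if c is not None)
for value in sorted(COUNT_LABELS):
    print(value, COUNT_LABELS[value], table.get(value, 0))
```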
Question 3) Run a chi-square test and Fisher's exact test with gender in the rows and condition_count in the columns. Limit the analysis to people between 45 and 65 years of age.
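To show what the chi-square test in Question 3 computes, here is a small sketch of the Pearson chi-square statistic for a 2 x 4 table, along with the age restriction. The counts and ages are made up, and treating "between 45 and 65" as inclusive of both ends is an assumption. The p-value and Fisher's exact test need a statistics package and are not sketched here.

```python
def in_age_range(age, low=45, high=65):
    """Age restriction; inclusive bounds are an assumption of this sketch."""
    return low <= age <= high

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Made-up ages: only those inside the range would enter the table.
ages = [30, 45, 52, 65, 70]
kept = [a for a in ages if in_age_range(a)]

# Made-up counts: rows are the two gender categories,
# columns are condition_count = 0, 1, 2, 3.
table = [
    [8, 6, 4, 2],
    [4, 6, 8, 2],
]
stat, df = chi_square(table)
print(kept, round(stat, 4), df)
```

Each expected count is (row total x column total) / n, so a column with no observations would make the statistic undefined.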
Question 4) Conduct a t-test comparing the mean age between people who had zero conditions (condition_count=0) and people who had three conditions (condition_count=3).
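The comparison in Question 4 can be sketched as a two-sample t statistic. The sketch assumes equal variances (a pooled-variance test), which the assignment does not specify, and the ages are made up. The p-value would come from a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance, and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df

# Made-up rows for illustration only.
rows = [
    {"ridageyr": 41, "condition_count": 0},
    {"ridageyr": 42, "condition_count": 0},
    {"ridageyr": 43, "condition_count": 0},
    {"ridageyr": 50, "condition_count": 1},   # dropped: not 0 or 3
    {"ridageyr": 42, "condition_count": 3},
    {"ridageyr": 43, "condition_count": 3},
    {"ridageyr": 44, "condition_count": 3},
]

# Keep only the two groups being compared.
none = [r["ridageyr"] for r in rows if r["condition_count"] == 0]
three = [r["ridageyr"] for r in rows if r["condition_count"] == 3]

t_stat, df = pooled_t(none, three)
print(round(t_stat, 4), df)
```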
Question 5) Run an ANOVA with age (ridageyr) as the continuous variable and the variable that you created in #2 (condition_count) as the categorical variable. Include the means for each group and Scheffe's post hoc tests in your analysis.
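The one-way ANOVA in Question 5 compares between-group and within-group variation in age. A minimal sketch of the group means and the F statistic follows, using made-up ages. Scheffe's post hoc comparisons and the p-value depend on the F distribution and are left to a statistics package.

```python
from statistics import mean

def one_way_anova(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    all_values = [x for g in groups.values() for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up ages keyed by condition_count; for illustration only.
ages_by_count = {
    0: [30, 40, 50],
    1: [35, 45, 55],
    2: [40, 50, 60],
    3: [45, 55, 65],
}

group_means = {k: mean(v) for k, v in ages_by_count.items()}
f_stat, df_between, df_within = one_way_anova(ages_by_count)
print(group_means, f_stat, df_between, df_within)
```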
Question 6) Run both a Pearson's correlation analysis and a Spearman's rank correlation analysis (with correlation coefficients and p-values). Include age (ridageyr), ratio of family income to poverty (indfmpir), and number of people in the household (dmdhhsiz). Limit the analysis to married females.
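The two coefficients in Question 6 differ only in what they correlate: Pearson uses the raw values, Spearman uses their ranks. The sketch below shows this for one pair of variables with made-up numbers. The filter variable names riagendr and dmdmartl and their codes (2 = female, 1 = married) are assumptions that should be checked against the Demographics documentation. P-values are left to a statistics package.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up rows; variable names and codes for the filter are assumptions.
rows = [
    {"riagendr": 2, "dmdmartl": 1, "ridageyr": 1, "dmdhhsiz": 1},
    {"riagendr": 2, "dmdmartl": 1, "ridageyr": 2, "dmdhhsiz": 4},
    {"riagendr": 1, "dmdmartl": 1, "ridageyr": 9, "dmdhhsiz": 2},  # dropped: not female
    {"riagendr": 2, "dmdmartl": 1, "ridageyr": 3, "dmdhhsiz": 9},
    {"riagendr": 2, "dmdmartl": 5, "ridageyr": 8, "dmdhhsiz": 3},  # dropped: not married
    {"riagendr": 2, "dmdmartl": 1, "ridageyr": 4, "dmdhhsiz": 16},
]

subset = [r for r in rows if r["riagendr"] == 2 and r["dmdmartl"] == 1]
age = [r["ridageyr"] for r in subset]
size = [r["dmdhhsiz"] for r in subset]
print(pearson(age, size), spearman(age, size))
```

The same two functions would be applied to each pair among ridageyr, indfmpir, and dmdhhsiz.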