What percentage of data is missing for the variable smoke

Reference no: EM131458079

Biostatistics POPH90013 Assignment

Question 1 -

Cardiovascular disease (CVD) is a major cause of death in Australia. Risk factors for cardiovascular disease include high blood pressure, high cholesterol, smoking and alcohol, among others. Figure 1 below summarises data from a study of cardiovascular disease in 852 Australian men. In Figure 1, the vertical axis shows total cholesterol level and the horizontal axis shows the risk group. Risk groups were defined by the number of CVD risk factors of each individual: (i) low risk (0-1 risk factors), (ii) medium risk (2-3 risk factors) and (iii) high risk (4+ risk factors).

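[Figure 1: total cholesterol level (vertical axis) by CVD risk group (horizontal axis). Image 1464_Figure.png not reproduced.]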

a) Describe how the distribution of total cholesterol differs between risk groups in Figure 1. Limit comments to comparing the location, spread, shape of distribution, and maximum/minimum of the Total Cholesterol. Note, there is no need to report the numerical value of the summary statistics you use, just refer to the name of the summary statistic you are comparing (e.g., median increases/decreases).

b) Desirable total cholesterol level is thought to be below 5.2 mmol/L. High cholesterol is defined as total cholesterol above 6.2 mmol/L. Using Figure 1, roughly estimate the proportion of Australian men in each risk group with (i) desirable cholesterol and (ii) high cholesterol.

Question 2 -

In order to answer Question 2, you will need to use the Stata dataset assignment1.dta which can be downloaded from the folder "Assignment 1" in the Assessment area on the LMS.

The Wound Healing Society defines a chronic wound as one that has failed to proceed through an orderly and timely reparative process to produce anatomic and functional integrity within an expected period. Chronic wounds represent a significant annual burden on the Australian health care system, with direct health care costs reaching US$2.85 billion. Several factors can interfere with one or more phases of the wound healing process, thus causing improper or impaired wound healing. Such factors include infection, age, stress, diabetes, obesity, medications, alcoholism, smoking, and nutrition. A better understanding of the influence of these factors on repair may lead to therapeutics that improve wound healing and resolve impaired wounds.

This question concerns a new hypothetical study investigating the association between wound healing and alcoholism. As part of the study, wound patients were sampled from a randomly selected hospital and given uniform treatment. For each patient, the wound circumference was measured at baseline and 12 weeks after baseline to determine the reduction in wound circumference in week 12 (i.e., 12 weeks after baseline).  The investigators measured the following variables:

Variable   Description
wndcir     Relative difference between baseline and week 12 wound circumferences, i.e. (baseline circ. - week 12 circ.) / baseline circ.
age        Age (years)
sex        Gender (male/female)
stress     Stress score (possible scores 0-10; 0 = no stress, 10 = maximum stress)
bmi        Body mass index (kg/m²)
diab       Type II diabetes (yes/no)
smoke      Smoking (ever/never)
alc        Alcohol consumption per week in millilitres
infect     Was the wound infected at any time in twelve weeks? (yes/no)

The main outcome variable of the study was wndcir, which measures the healing progress of the wound. The progress of healing (wndcir) can be interpreted as follows:

  • Large positive values: the wound is healing quickly (i.e., good healing progress).
  • Positive values close to zero: the wound is healing slowly (i.e., slow healing progress).
  • Zero: No change in the wound circumference.
  • Negative values: the wound is getting worse; the circumference is increasing.

In order to answer the questions below, you will need to create two new categorical variables: alccat and bmicat. The new categorical variable alccat will be derived from the existing numerical variable alc. It indicates whether or not an individual drinks more than the average amount of alcohol per week, which is estimated to be 186 mL/week. The new categorical variable bmicat will be derived from the existing numerical variable bmi. The new variables alccat and bmicat should consist of the following categories:

alc (mL/week)   alccat
< 186           average
>= 186          above average

bmi (kg/m²)     bmicat
< 19            underweight
19 - 24         normal weight
25 - 29         overweight
>= 30           obese
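
The cut-off logic in the two tables above can be sketched as follows. The assignment expects this step to be done in Stata; the Python below only illustrates the logic, and the function names alc_category and bmi_category are hypothetical. The table lists the BMI bands as whole numbers (19 - 24, 25 - 29), so the sketch assumes that a non-integer value falls into the band below the next cut-off (for example, 24.6 is treated as normal weight). Missing values are assumed to stay missing rather than being assigned a category.

```python
def alc_category(alc_ml_per_week):
    """Map weekly alcohol consumption (mL) to the alccat label."""
    if alc_ml_per_week is None:
        return None  # keep missing values missing
    return "average" if alc_ml_per_week < 186 else "above average"


def bmi_category(bmi):
    """Map BMI (kg/m^2) to the bmicat label.

    Assumption: the bands are read as [19, 25) and [25, 30), so that
    non-integer values such as 24.6 are still classified.
    """
    if bmi is None:
        return None  # keep missing values missing
    if bmi < 19:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


print(alc_category(186), "/", bmi_category(24.6))
```

When the same logic is written in Stata, note that numeric missing values compare as larger than any number, so a condition such as alc >= 186 on its own would place missing observations in the top category unless they are excluded explicitly.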

a) Identify and list the names of all variables in the data set that have missing observations.

b) What percentage of data is missing for the variable smoke? What percentage of data is missing for the variable smoke for each BMI category (i.e., of all individuals that are underweight/normal/overweight/obese, what is the percentage of data missing for the variable smoke)?  Of all individuals that consume below the average amount of alcohol per week (<186 mL/week), what percentage are female and what percentage are male?
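
The percentages asked for in part (b) are all of the same form: a count divided by the number of individuals in the group of interest. A minimal sketch of that calculation, in Python rather than Stata, is given here. The records are made up for illustration and are not taken from assignment1.dta; the helper names percent_missing and percent_missing_by_group are hypothetical.

```python
from collections import defaultdict


def percent_missing(values):
    """Percentage of entries that are missing (represented as None)."""
    return 100 * sum(v is None for v in values) / len(values)


def percent_missing_by_group(rows, group_key, value_key):
    """Percentage missing in value_key, computed within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: percent_missing(vals) for group, vals in groups.items()}


# Made-up records for illustration only (not from the assignment data set).
rows = [
    {"bmicat": "normal weight", "smoke": "ever"},
    {"bmicat": "normal weight", "smoke": None},
    {"bmicat": "overweight", "smoke": "never"},
    {"bmicat": "overweight", "smoke": "ever"},
    {"bmicat": "obese", "smoke": None},
]

overall = percent_missing([row["smoke"] for row in rows])
by_bmi = percent_missing_by_group(rows, "bmicat", "smoke")
print(overall, by_bmi)
```

The point of the sketch is the denominator: the per-category figure divides by the number of individuals in that BMI category, not by the total sample size.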

c) Use Stata to produce a frequency histogram of alcohol consumption per week in millilitres (variable alc) for each of the four categories of BMI. Look up the help file for the histogram function for the relevant options to make the following changes to the graph:  

  • use the by() option to display the four histograms in a single plot;
  • display the percentage, not the density, on the vertical axis;
  • plot 25 bars (or bins) per histogram.

Copy the graph directly into your assignment document by clicking on edit/copy in the Stata graph window. You may also use the "File -> Save As" feature in Stata to save the graph as an image that you can later import into Microsoft Word.

From the histogram, what shape is the distribution of the variable alc for individuals with "normal weight"?

d) Provide a table that summarises the distribution (sample size, mean, standard deviation, minimum, 25th/50th/75th percentiles, maximum) of the wound circumference (wndcir) separately for smokers and non-smokers. Ensure that the table is formatted properly (please do not copy and paste directly from Stata output). Using this table, briefly describe the differences in the outcome variable wndcir between smokers and non-smokers. Do smokers or non-smokers heal faster on average?

e) Use Stata to produce an appropriate graph to display the relationship between the outcome variable wndcir and the exposure variable wound infection history (infect). Based only on this graph, do individuals with a history of infection heal faster or slower, on average?

The investigators of the study are not sure how alcohol relates to wound healing but they suspect that drinking alcohol above the weekly average can slow down the healing process.

f) Using Stata, calculate and interpret the difference in the mean wound circumference (outcome variable wndcir) between individuals who drink below (<186 mL/week) and those who drink above the average amount of alcohol per week (>=186 mL/week).

Using Stata, calculate and interpret a 95% confidence interval for the difference in the population mean wound circumference (outcome variable wndcir) between individuals who drink below (<186 mL/week) and those who drink above the average amount of alcohol per week (>=186 mL/week).
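
The structure of the calculation in part (f) is: difference in sample means, plus or minus a critical value times the standard error of that difference. The sketch below illustrates this in Python with made-up wndcir values. It assumes a large-sample normal critical value and unpooled variances, so it will not reproduce Stata output exactly: Stata's ttest command uses the t distribution and, by default, a pooled variance. The function name mean_difference_ci is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_difference_ci(group_a, group_b, level=0.95):
    """Difference in means (a - b) with a normal-approximation CI."""
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference, using each group's own variance.
    se = math.sqrt(stdev(group_a) ** 2 / len(group_a)
                   + stdev(group_b) ** 2 / len(group_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return diff, diff - z * se, diff + z * se


# Made-up values for illustration only (not from the assignment data set).
below_average = [0.30, 0.25, 0.40, 0.35, 0.20]
above_average = [0.10, 0.15, 0.05, 0.20, 0.25]

diff, lower, upper = mean_difference_ci(below_average, above_average)
print(round(diff, 3), round(lower, 3), round(upper, 3))
```

With real data the group sizes are far larger than five, which is what makes the normal approximation in this sketch tolerable; for small groups a t-based critical value is needed.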

g) Using your answers to question 2(f), what can we conclude about the association between drinking alcohol and the wound healing process?

Question 3 -

Note: This question does not require Stata.

Table 1 below gives the results for two randomised controlled trials comparing acupuncture treatments versus standard care in patients with back pain. The outcome measure was the SF-36 bodily pain score; this score is normally distributed and ranges from 0 to 100, where a score of 100 implies 'no pain'.  An increase of 10 units in the SF-36 bodily pain score corresponds to a clinically important difference.

Table 1.

Trial  n per group  Difference in sample means of SF-36     95% confidence interval for          p-value
                    (acupuncture - standard care)           difference in population means
1      ??           3.00                                    0.26, 5.74                           0.032
2      ??           2.75                                    -3.13, 8.63                          0.359

The above two trials are the only studies currently available with data comparing acupuncture treatments and standard care. After reviewing the findings of the above two trials, a general practitioner decides to recommend acupuncture treatments to patients suffering from back pain.

a) Do you agree with the general practitioner's decision? Using all the information provided in Table 1, give reasons as to why or why not.

b) Using the information provided in Table 1, which trial has the larger sample size? Explain your answer.
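
One way to reason about part (b) is through the width of each confidence interval, since a 95% interval is the estimate plus or minus a critical value times the standard error. The Python sketch below back-calculates the implied standard errors from the limits in Table 1. It assumes a normal critical value (about 1.96); if the trials used t-based intervals, the implied standard errors would be slightly smaller. The helper name implied_se is hypothetical.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)  # about 1.96

# 95% confidence limits as reported in Table 1.
trials = {1: (0.26, 5.74), 2: (-3.13, 8.63)}


def implied_se(lower, upper, crit=z):
    """Standard error implied by a symmetric confidence interval."""
    return (upper - lower) / (2 * crit)


se_by_trial = {trial: implied_se(lo, hi) for trial, (lo, hi) in trials.items()}
midpoints = {trial: (lo + hi) / 2 for trial, (lo, hi) in trials.items()}

for trial in trials:
    print(trial, round(midpoints[trial], 2), round(se_by_trial[trial], 2))
```

The midpoints recover the reported differences (3.00 and 2.75), which is a useful consistency check. For two groups of equal size n, the standard error of a difference in means is sqrt(2 × SD² / n), so if the variability of SF-36 scores is comparable in the two trials, a smaller standard error points to a larger n.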

Question 4 -

Note: This question does not require Stata.

A random sample of 61 airline pilots, working for British Airways, had their systolic blood pressure measured. The sample mean was 107.4 mmHg and the sample standard deviation was 6.1 mmHg.

Assume that systolic blood pressure is normally distributed within the population and that the sample mean and sample standard deviation provide reasonable estimates of the population parameters.  

a) Estimate the proportion of British Airways pilots with a systolic blood pressure between 100 mmHg and 118 mmHg.

b) Calculate the range of systolic blood pressures within which the middle 90% of airline pilots lie.

c) Calculate and interpret a 99% confidence interval for the population mean systolic blood pressure.

d) Calculate and interpret a (two-sided) p-value to test the null hypothesis that the population mean systolic blood pressure is 105.8 mmHg.
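
Although this question does not require software, the four calculations can be checked with a short script. The sketch below uses only the Python standard library. Parts (a) and (b) treat the sample mean and standard deviation as the population parameters, as the question instructs. Parts (c) and (d) are done here with the t distribution on n - 1 = 60 degrees of freedom; that choice is an assumption, and a z-based calculation would give marginally different numbers. The standard library has no t distribution, so the helpers t_pdf, t_cdf and t_quantile are hypothetical functions written for this sketch using numerical integration.

```python
import math
from statistics import NormalDist

n, xbar, s = 61, 107.4, 6.1          # values given in the question
pop = NormalDist(mu=xbar, sigma=s)   # population model for (a) and (b)

# (a) Proportion with systolic blood pressure between 100 and 118 mmHg.
prop_100_118 = pop.cdf(118) - pop.cdf(100)

# (b) Range containing the middle 90%: 5th to 95th percentile.
middle90 = (pop.inv_cdf(0.05), pop.inv_cdf(0.95))


def t_pdf(x, df):
    """Density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Quantile by bisection on t_cdf, searching between -20 and 20."""
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


se = s / math.sqrt(n)   # standard error of the sample mean
df = n - 1

# (c) 99% confidence interval for the population mean.
t_crit = t_quantile(0.995, df)
ci99 = (xbar - t_crit * se, xbar + t_crit * se)

# (d) Two-sided p-value for the null hypothesis that the mean is 105.8 mmHg.
t_stat = (xbar - 105.8) / se
p_value = 2 * (1 - t_cdf(abs(t_stat), df))

print(prop_100_118, middle90, ci99, t_stat, p_value)
```

The script produces numbers only; the interpretation asked for in parts (c) and (d) still has to be written in words.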


This assignment assesses material from the first six lectures, the first six tutorials and first two Stata Practical Sessions. Your assignment should be submitted via LMS as a Microsoft Word document and should not be longer than 8 pages. Unless you are asked to do so, please do not include any Stata output in your assignment document. Instead, format any results you want to show in a way that would be suitable for inclusion in a study report or research paper. Note: you do not need to estimate the sample sizes for the two trials. It is enough to state which trial has a larger sample size (Trial 1 or Trial 2) and why you have chosen that answer.
