Reference no: EM132411876
Bayesian Data Analysis
Question 1. The Department of Transportation in Florida has collected information about accidents that occurred during 2017. Safety engineers are interested in estimating whether the use of seat belts is associated with the severity of crashes that involve injuries. The data are shown in the following table.
| | Fatal injuries | Non-fatal injuries | Total injury crashes |
| No seat belt | 142 | 1425 | 1567 |
| Seat belt used | 91 | 1834 | 1925 |
Let n_no and y_no denote the number of accidents and the number of fatalities, respectively, when the seat belt is not in use. Similarly, let n_yes and y_yes denote the number of accidents and the number of fatalities, respectively, when the seat belt is in use. Assume that y_no and y_yes are independent, with binomial sampling distributions and probabilities of success θ_no and θ_yes, respectively.
Questions
(a) Find the posterior distributions of θ_no and θ_yes when the priors are independent uniform distributions.
(b) In your own words, explain how you might draw values of θ_no from its posterior distribution. Be precise and write down all the steps.
(c) Plot the posterior distributions of θ_no and θ_yes in the same graph, and interpret the results.
(d) The relative risk of a fatality is computed as the ratio λ = θ_no/θ_yes. Find the posterior distribution of the relative risk in this study, and plot the resulting posterior distribution. Suppose that you are explaining the results of the study to a person who does not know about Bayesian statistics. In your own words, explain what the values in the plot of the posterior distribution mean and how they can be interpreted in the context of severity of crashes.
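As a minimal sketch of parts (a), (b), and (d), assuming only Python's standard library: a uniform prior is a Beta(1, 1) distribution, and combined with a binomial likelihood it gives a Beta(y + 1, n - y + 1) posterior. With the counts in the table this is Beta(143, 1426) for the no-seat-belt group and Beta(92, 1835) for the seat-belt group. The relative risk has no simple closed form, so one common approach is to draw from each posterior independently and take the ratio of the draws. The helper names, the number of draws, and the seed below are illustrative choices, not part of the assignment.

```python
import random
import statistics

# Counts from the table: fatal injuries and total injury crashes per group.
Y_NO, N_NO = 142, 1567
Y_YES, N_YES = 91, 1925


def beta_posterior(y, n, a=1.0, b=1.0):
    """Beta(a, b) prior with y successes in n binomial trials
    gives a Beta(a + y, b + n - y) posterior. a = b = 1 is the uniform prior."""
    return a + y, b + n - y


def draw_relative_risk(num_draws=20000, seed=0):
    """Draw theta_no and theta_yes independently from their Beta posteriors
    and return the ratio theta_no / theta_yes for each pair of draws."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    a_no, b_no = beta_posterior(Y_NO, N_NO)
    a_yes, b_yes = beta_posterior(Y_YES, N_YES)
    draws = []
    for _ in range(num_draws):
        theta_no = rng.betavariate(a_no, b_no)
        theta_yes = rng.betavariate(a_yes, b_yes)
        draws.append(theta_no / theta_yes)
    return draws


def summarize(draws):
    """Posterior median, an approximate central 95% interval,
    and the share of draws with relative risk above 1."""
    ordered = sorted(draws)
    n = len(ordered)
    return {
        "median": statistics.median(ordered),
        "lower": ordered[int(0.025 * n)],
        "upper": ordered[int(0.975 * n) - 1],
        "prob_gt_1": sum(d > 1.0 for d in ordered) / n,
    }


if __name__ == "__main__":
    print(summarize(draw_relative_risk()))
```

The list returned by `draw_relative_risk` can be passed to any plotting tool to produce the histogram asked for in (d). The exact interval endpoints vary with the seed and the number of draws.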
Question 2. This part requires an in-depth data analysis. The main objective is to develop a Bayesian model for assessing treatment effects on the presence of Haemophilus influenzae type b infection. The model should account for the treatment, possible effects of treatment compliance, and the timing of the tests given to participants. Haemophilus influenzae type b, or Hib, is a common bacterium that causes many infections, including serious diseases such as meningitis. Infants and young children are at highest risk, although anyone can get Hib. The dataset (infection.txt, found in Canvas) contains information on the presence of Hib in children from one particular study. The study tested the effects of a drug on 50 children with a history of otitis media. The subjects were randomized to either the drug or the placebo group and to whether they received reminders to take the drug. Data were collected at weeks 0, 2, 4, 6 and 11.
The dataset (infection.txt) contains 220 observations on the 6 variables defined below. Note that y is the response variable.
| Variable | Description | Values |
| y | Presence (y) or absence (n) | y, n |
| ap | Active (a) or placebo (p) | a, p |
| hilo | High or low compliance | hi, lo |
| week | Week of test | 0, 2, 4, 6, 11 |
| ID | Subject ID | Xmm, Ymm, Zmm |
| trt | Recoded using ap and hilo | placebo, drug, drug+ |
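The variable definitions above can be turned into a small loading and checking step. The sketch below uses only Python's standard library and rests on several assumptions that should be checked against the actual file: that infection.txt is whitespace-delimited with a header row naming the six variables, that string values may be wrapped in double quotes, and that trt was built as placebo for ap = p, drug for ap = a with hilo = lo, and drug+ for ap = a with hilo = hi. The function names are hypothetical.

```python
from collections import defaultdict


def recode_trt(ap, hilo):
    """Assumed recoding rule: placebo if ap == 'p'; otherwise 'drug+' for
    high compliance and 'drug' for low compliance."""
    if ap == "p":
        return "placebo"
    return "drug+" if hilo == "hi" else "drug"


def load_rows(path):
    """Read a whitespace-delimited file with a header row into a list of dicts.
    Assumes columns y, ap, hilo, week, ID, trt; adjust if the file differs."""
    rows = []
    with open(path) as handle:
        header = [name.strip('"') for name in handle.readline().split()]
        for line in handle:
            fields = [field.strip('"') for field in line.split()]
            if not fields:
                continue  # skip blank lines
            # Keep the last len(header) fields, so an extra leading
            # row-name column (if present) is ignored.
            row = dict(zip(header, fields[-len(header):]))
            row["week"] = int(row["week"])
            row["trt_recoded"] = recode_trt(row["ap"], row["hilo"])
            rows.append(row)
    return rows


def infection_rate_by(rows, *keys):
    """Proportion of observations with y == 'y' within each group
    defined by the given column names."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in rows:
        group = tuple(row[key] for key in keys)
        counts[group][0] += row["y"] == "y"
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}
```

For example, `infection_rate_by(load_rows("infection.txt"), "trt", "week")` would give the observed share of positive tests for each treatment and week, and comparing `row["trt"]` with `row["trt_recoded"]` is one way to test the assumed recoding rule.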
Questions
The Bayesian analysis of the data described above will be organized as follows:
(a) First, perform exploratory data analysis using graphs and numerical summaries, along with any preliminary data preparation needed to clean the data,
(b) Second, develop and specify a fully Bayesian model for assessing the treatment effects on the presence of Haemophilus influenzae type b infection,
(c) Third, describe the details of the fitted models and compare the models, if more than one is fitted,
(d) Fourth, give a summary of the Bayesian analysis with appropriate interpretations, and
(e) Finally, put the code used in the analysis and any additional information in the Appendix.
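For item (b), one possible starting point is a Bayesian logistic regression of infection presence on treatment and week. The sketch below is a deliberately simplified illustration, not a recommended final model. It ignores the repeated measurements on each child (there is no subject-level effect), places independent Normal(0, 2.5^2) priors on the coefficients, rescales week by dividing by 11, and samples with a random-walk Metropolis algorithm written with Python's standard library. The function names, prior scale, step size, and seed are all assumptions made for illustration.

```python
import math
import random


def design_row(trt, week):
    """Covariates: intercept, drug indicator, drug+ indicator, week / 11.
    Placebo is the reference level (an illustrative choice)."""
    return [1.0, float(trt == "drug"), float(trt == "drug+"), week / 11.0]


def log_posterior(beta, X, y, prior_sd=2.5):
    """Log posterior up to a constant: Bernoulli likelihood with logit link
    plus independent Normal(0, prior_sd^2) priors on each coefficient."""
    value = -sum(b * b for b in beta) / (2.0 * prior_sd ** 2)
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        # log p(y | eta) = y * eta - log(1 + exp(eta)), computed stably
        value += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return value


def metropolis(X, y, num_draws=5000, step=0.3, seed=0):
    """Random-walk Metropolis: propose beta + Normal(0, step^2) noise and
    accept with probability min(1, posterior ratio)."""
    rng = random.Random(seed)
    beta = [0.0] * len(X[0])
    current = log_posterior(beta, X, y)
    draws = []
    for _ in range(num_draws):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, X, y)
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        if math.log(1.0 - rng.random()) < candidate - current:
            beta, current = proposal, candidate
        draws.append(list(beta))
    return draws
```

Given rows with `trt`, `week`, and `y` fields, `X = [design_row(r["trt"], r["week"]) for r in rows]` and `y = [int(r["y"] == "y") for r in rows]` would be the inputs. Early draws should be discarded as burn-in, and convergence should be checked before interpreting any coefficient. A full analysis would more likely add a subject-level random effect and use a dedicated probabilistic programming tool.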