Derive the full posterior conditional distribution

Reference no: EM131462820

Question 1. The table below gives the number of fatal accidents on scheduled airline flights per year over a ten-year period.

Year | Accidents
1976 | 24
1977 | 25
1978 | 31
1979 | 31
1980 | 22
1981 | 21
1982 | 26
1983 | 20
1984 | 16
1985 | 22

(a) Assume that the number of fatal accidents each year independently follows a Poisson(θ) distribution. Derive Jeffreys' prior for this model. Derive the posterior distribution of θ under this prior.
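As a worked sketch of the derivation that part (a) asks for: the notation n for the number of years and y_1, ..., y_n for the yearly counts is an assumption introduced here, and the Gamma distribution is written in the shape-rate parameterization.

```latex
\begin{align*}
\log p(y \mid \theta) &= \Big(\sum_{i=1}^{n} y_i\Big)\log\theta - n\theta + \text{const},\\
I(\theta) &= -\mathrm{E}\!\left[\frac{\partial^2 \log p(y \mid \theta)}{\partial \theta^2}\right]
           = \frac{\mathrm{E}\big[\sum_{i} y_i\big]}{\theta^2} = \frac{n}{\theta},\\
p_J(\theta) &\propto \sqrt{I(\theta)} \propto \theta^{-1/2},\\
p(\theta \mid y) &\propto \theta^{\sum_i y_i - 1/2}\, e^{-n\theta}
  \quad\Longrightarrow\quad
  \theta \mid y \sim \mathrm{Gamma}\Big(\sum_{i} y_i + \tfrac{1}{2},\; n\Big).
\end{align*}
```

For the table above, the counts sum to 238 over n = 10 years, so under these assumptions the posterior is Gamma(238.5, 10).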

(b) Obtain posterior samples from the model described in part (a). Provide the density plot of your samples as well as the 95% posterior credible interval and MAP estimate for θ|data.
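A minimal sketch of part (b) in standard-library Python, assuming the Gamma(Σy + 1/2, n) posterior (shape, rate) that Jeffreys' prior yields for a Poisson model. The function names `sample_theta_posterior` and `credible_interval`, the seed, and the number of draws are illustrative choices, not part of the assignment, which expects R.

```python
import random

# Fatal accident counts for 1976-1985, copied from the table in Question 1.
accidents = [24, 25, 31, 31, 22, 21, 26, 20, 16, 22]

def sample_theta_posterior(y, n_draws=20000, seed=1):
    """Draw from Gamma(sum(y) + 1/2, rate=len(y)), the assumed Jeffreys posterior."""
    rng = random.Random(seed)
    shape = sum(y) + 0.5
    rate = len(y)
    # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(n_draws)]

def credible_interval(draws, level=0.95):
    """Equal-tailed interval from the empirical quantiles of the draws."""
    s = sorted(draws)
    lo = s[int((1 - level) / 2 * len(s))]
    hi = s[int((1 + level) / 2 * len(s)) - 1]
    return lo, hi

theta_draws = sample_theta_posterior(accidents)
theta_ci = credible_interval(theta_draws)
# Closed-form mode of a Gamma(shape, rate) density: (shape - 1) / rate.
theta_map = (sum(accidents) + 0.5 - 1) / len(accidents)
print("95% interval:", theta_ci, "MAP:", theta_map)
```

A histogram or kernel density estimate of `theta_draws` would serve as the requested density plot.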

(c) Now obtain samples from the posterior predictive to infer on the number of fatal accidents in 1986. Provide the density plot of your samples as well as the 95% posterior credible interval and MAP estimate for ỹ|data.
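One way to sketch the posterior predictive sampling in part (c): draw θ from its posterior, then draw a 1986 count from Poisson(θ). This again assumes the Gamma(Σy + 1/2, n) posterior from the Jeffreys-prior model. Python's standard library has no Poisson sampler, so the helper `poisson_draw` below (a hypothetical name) uses Knuth's multiplication method.

```python
import math
import random
from collections import Counter

accidents = [24, 25, 31, 31, 22, 21, 26, 20, 16, 22]

def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for means of a few dozen."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_predictive(y, n_draws=20000, seed=2):
    """For each draw: theta ~ Gamma posterior, then y_new ~ Poisson(theta)."""
    rng = random.Random(seed)
    shape, rate = sum(y) + 0.5, len(y)
    draws = []
    for _ in range(n_draws):
        theta = rng.gammavariate(shape, 1.0 / rate)  # scale = 1 / rate
        draws.append(poisson_draw(rng, theta))
    return draws

pred_draws = sample_predictive(accidents)
pred_sorted = sorted(pred_draws)
pred_ci = (pred_sorted[int(0.025 * len(pred_sorted))],
           pred_sorted[int(0.975 * len(pred_sorted)) - 1])
# For a discrete predictive, the most frequent sampled value estimates the mode.
pred_mode = Counter(pred_draws).most_common(1)[0][0]
print("95% interval:", pred_ci, "mode:", pred_mode)
```

Because the predictive distribution is discrete, a bar chart of the sampled counts is a natural stand-in for the density plot.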

(d) Assume now that the number of fatal accidents in each year t independently follows a Poisson(θt) distribution, where log(θt) = α + βt. Choose a reasonable noninformative prior p(α, β). Write out the joint posterior p(α, β|data) and formally write out a Metropolis algorithm that updates α and β together (be sure to be specific about the iteration index).

(e) Implement your algorithm in (d). Provide discussion and plots regarding your tuning parameter(s), burn-in, autocorrelation, acceptance, and thinning. Obtain 2000 independent posterior samples from p(α, β|data) and plot the joint and marginal posterior densities. Obtain MAP estimates and 95% credible intervals for the posterior rate of fatal accidents per year (i.e., θt|data) for each year from 1976 to 1985. Discuss what happens to θt|data over time in the context of the problem.
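A minimal sketch of a joint random-walk Metropolis sampler for parts (d) and (e), under several assumptions that are not in the problem statement: a flat prior p(α, β) ∝ 1, the time index t = 1, ..., 10 with t = 1 standing for 1976, and hand-picked proposal standard deviations, burn-in, and thinning interval. These tuning values are starting points to be checked with trace and autocorrelation plots, not recommended settings.

```python
import math
import random

accidents = [24, 25, 31, 31, 22, 21, 26, 20, 16, 22]
years = list(range(1, len(accidents) + 1))  # assumption: t = 1 is 1976

def log_posterior(alpha, beta):
    """Log posterior under the assumed flat prior, up to an additive constant."""
    total = 0.0
    for t, y in zip(years, accidents):
        eta = alpha + beta * t          # log(theta_t)
        total += y * eta - math.exp(eta)
    return total

def metropolis(n_iter=55000, step_alpha=0.08, step_beta=0.012, seed=3):
    rng = random.Random(seed)
    alpha, beta = math.log(sum(accidents) / len(accidents)), 0.0
    current = log_posterior(alpha, beta)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Symmetric random-walk proposal that moves alpha and beta together.
        prop_a = alpha + rng.gauss(0.0, step_alpha)
        prop_b = beta + rng.gauss(0.0, step_beta)
        proposed = log_posterior(prop_a, prop_b)
        # Accept with probability min(1, posterior ratio), on the log scale.
        # 1 - random() lies in (0, 1], which keeps log() defined.
        if math.log(1.0 - rng.random()) < proposed - current:
            alpha, beta, current = prop_a, prop_b, proposed
            accepted += 1
        chain.append((alpha, beta))
    return chain, accepted / n_iter

chain, acc_rate = metropolis()
kept = chain[5000::25]  # assumed burn-in of 5000 and thinning by 25
beta_mean = sum(b for _, b in kept) / len(kept)
theta_1976 = [math.exp(a + b * 1) for a, b in kept]
print("acceptance:", round(acc_rate, 3), "mean beta:", round(beta_mean, 4))
```

Each retained pair (α, β) gives a draw of θt = exp(α + βt) for any year, so per-year credible intervals follow from the empirical quantiles of those transformed draws.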

(f) Use your posterior samples of α and β to predict the number of fatal accidents in the year 1986. Provide the density plot of your predicted samples as well as the 95% posterior credible interval and MAP estimate. Discuss and compare these results to the results in (c). Which model seems more appropriate for these data? Defend your answer.

Question 2. The data file hearing.txt is from an experiment to calibrate word lists used to measure the hearing ability of subjects. The four word lists had been designed so that they should be equally difficult to perceive, but were designed for normal-hearing subjects in an environment without background noise. The data in this experiment were collected in the presence of a noisy background. Each column is a word list, and each row is a subject. Each entry is the subject's score on that list (each subject was tested on all four lists). We will consider a two-way ANOVA model in which each score has a Normal likelihood with a mean that depends on both the subject and the list. In other words, we will consider both a subject effect (θh) and a list effect (Φj). We will assume conjugate priors. The full hierarchical model is given by:

y_hj | θ_h, Φ_j, σ² ~ N(θ_h + Φ_j, σ²)

θ_h | μ, σ² ~ N(μ, σ²)

Φ_j | σ² ~ N(0, σ²/4)

μ | σ² ~ N(30, σ²/9)

σ² ~ Γ⁻¹(1, 1)

for h = 1, ..., n and j = 1, ..., k, with n = 24 and k = 4.

(a) Write out the joint likelihood, f(y | θ, Φ, σ²).

(b) Derive the full posterior conditional distribution for θ_h. That is, find the form of f(θ_h | θ_{-h}, Φ, μ, σ², y), where θ_{-h} denotes all subject effects other than θ_h.

(c) Derive the full posterior conditional distribution for Φ_j. That is, find the form of f(Φ_j | Φ_{-j}, θ, μ, σ², y), where Φ_{-j} denotes all list effects other than Φ_j.

(d) Derive the full posterior conditional distributions for the hyperparameters: f(μ | Φ, θ, σ², y) and f(σ² | Φ, θ, μ, y).
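A sketch of the conditionals that parts (b) through (d) lead to under the hierarchical model stated above. It assumes that the second argument of N(·, ·) is a variance and that Γ⁻¹(1, 1) is the inverse-gamma distribution in its shape-scale form, with density proportional to (σ²)⁻² exp(−1/σ²). The symbol S is shorthand introduced here, and "· " stands for all other parameters and the data.

```latex
\begin{align*}
\theta_h \mid \cdot &\sim N\!\left(\frac{\mu + \sum_{j=1}^{k}(y_{hj} - \Phi_j)}{k+1},\; \frac{\sigma^2}{k+1}\right),\\
\Phi_j \mid \cdot &\sim N\!\left(\frac{\sum_{h=1}^{n}(y_{hj} - \theta_h)}{n+4},\; \frac{\sigma^2}{n+4}\right),\\
\mu \mid \cdot &\sim N\!\left(\frac{9 \cdot 30 + \sum_{h=1}^{n}\theta_h}{n+9},\; \frac{\sigma^2}{n+9}\right),\\
\sigma^2 \mid \cdot &\sim \Gamma^{-1}\!\left(1 + \frac{nk + n + k + 1}{2},\; 1 + \frac{S}{2}\right),\\
S &= \sum_{h=1}^{n}\sum_{j=1}^{k}(y_{hj} - \theta_h - \Phi_j)^2 + \sum_{h=1}^{n}(\theta_h - \mu)^2
     + 4\sum_{j=1}^{k}\Phi_j^2 + 9(\mu - 30)^2 .
\end{align*}
```

Each Normal conditional follows from adding the prior precision to the data precision. The σ² conditional collects one factor of (σ²)^{-1/2} from every Normal density in the model, which gives nk + n + k + 1 such factors in total.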

(e) Fit the model with MCMC. Show your trace plots for μ, for at least three θh's, and for at least two Φj's of your choice. Remove burn-in as appropriate. Be sure you obtain at least 2000 independent posterior samples.
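A minimal Gibbs sampler sketch for part (e) in standard-library Python. It assumes the Normal and inverse-gamma full conditionals implied by the conjugate model above (variance parameterization for the Normals, shape-scale for Γ⁻¹). Because hearing.txt is not reproduced here, the sketch runs on simulated scores. The simulated values, the function name `gibbs_two_way`, the seeds, and the iteration counts are all hypothetical.

```python
import random

def gibbs_two_way(y, n_iter=6000, burn_in=1000, seed=4):
    """Gibbs sampler for y[h][j] ~ N(theta_h + phi_j, sigma2) with conjugate priors."""
    rng = random.Random(seed)
    n, k = len(y), len(y[0])
    theta = [sum(row) / k for row in y]  # start at the subject means
    phi = [0.0] * k
    mu, sigma2 = 30.0, 1.0
    draws = []
    for it in range(n_iter):
        sd = sigma2 ** 0.5
        # theta_h | rest ~ N((mu + sum_j (y_hj - phi_j)) / (k + 1), sigma2 / (k + 1))
        for h in range(n):
            resid = sum(y[h][j] - phi[j] for j in range(k))
            theta[h] = rng.gauss((resid + mu) / (k + 1), sd / (k + 1) ** 0.5)
        # phi_j | rest ~ N(sum_h (y_hj - theta_h) / (n + 4), sigma2 / (n + 4))
        for j in range(k):
            resid = sum(y[h][j] - theta[h] for h in range(n))
            phi[j] = rng.gauss(resid / (n + 4), sd / (n + 4) ** 0.5)
        # mu | rest ~ N((9 * 30 + sum_h theta_h) / (n + 9), sigma2 / (n + 9))
        mu = rng.gauss((sum(theta) + 270.0) / (n + 9), sd / (n + 9) ** 0.5)
        # sigma2 | rest ~ Inverse-Gamma(shape, scale), drawn as 1 / Gamma(shape, rate=scale)
        shape = 1.0 + (n * k + n + k + 1) / 2.0
        ss = sum((y[h][j] - theta[h] - phi[j]) ** 2 for h in range(n) for j in range(k))
        ss += sum((t - mu) ** 2 for t in theta)
        ss += 4.0 * sum(p * p for p in phi) + 9.0 * (mu - 30.0) ** 2
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / (1.0 + ss / 2.0))
        if it >= burn_in:
            draws.append((mu, sigma2, list(theta), list(phi)))
    return draws

# Simulated stand-in for hearing.txt: 24 subjects by 4 lists (hypothetical values).
data_rng = random.Random(0)
true_phi = [4.0, -4.0, 0.0, 0.0]
true_theta = [data_rng.gauss(28.0, 3.0) for _ in range(24)]
fake_scores = [[th + ph + data_rng.gauss(0.0, 3.0) for ph in true_phi] for th in true_theta]

draws = gibbs_two_way(fake_scores)
phi_means = [sum(d[3][j] for d in draws) / len(draws) for j in range(4)]
sigma2_mean = sum(d[1] for d in draws) / len(draws)
print("posterior means of the list effects:", [round(p, 2) for p in phi_means])
```

With the real data, `fake_scores` would be replaced by the 24 × 4 table read from the file. Trace plots are then just the stored draws of μ, selected θh's, and selected Φj's plotted against iteration number.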

(f) What are the maximum likelihood estimates of the θh's? Make a plot comparing the MLEs to your estimated posterior means of the θh's.

Use abline(0,1) to add the y = x line to your plot. Comment on what you see. How does this Bayesian analysis compare to a simple frequentist (MLE) one?

(g) Of interest to the researchers is whether the lists have the same level of difficulty. Plot the posterior densities for all four Φj's. Construct 95% credible intervals for each Φj and see if they include zero. What can you conclude about the lists?

Question 3. Consider the Load.txt dataset which was collected from a study that examined the heating load and cooling load requirements of buildings (that is, energy efficiency) as a function of building parameters. The dataset contains eight (p = 8) attributes (or features, denoted by X1...X8) and two responses (or outcomes, denoted by y1 and y2). The aim is to use the eight features to predict each of the two responses. There are a total of n = 768 cases.

X1 | Relative Compactness
X2 | Surface Area
X3 | Wall Area
X4 | Roof Area
X5 | Overall Height
X6 | Orientation
X7 | Glazing Area
X8 | Glazing Area Distribution
y1 | Heating Load
y2 | Cooling Load

Source: A. Tsanas, A. Xifara: Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools, Energy and Buildings, Vol. 49, pp. 560-567, 2012.

For this exam, you will explore which explanatory variables are important in predicting the heating load and the cooling load via Bayesian lasso regression. Specifically, you will fit the following model:

y ~ N(1_n μ + Xβ, σ² I_{n×n})

β | Σ_0 ~ N(0, σ² Σ_0)

where Σ_0 = diag(τ_1², ..., τ_p²)

τ² | λ ~ ∏_{j=1}^{p} Exp(λ²/2); note that λ²/2 is the rate parameter.

Assume the following priors: p(μ) ∝ 1, p(σ²) ∝ (σ²)⁻¹, λ² ~ Γ(0.01, 0.01). Provide a detailed analysis of lasso variable selection on these data.
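A sketch of the full conditionals that a Gibbs sampler for this model could cycle through, derived directly from the model and priors stated above. It assumes Γ(0.01, 0.01) is in shape-rate form, Γ⁻¹ is in shape-scale form, and no β_j is exactly zero. The matrix A is shorthand introduced here, and "·" stands for all other parameters and the data.

```latex
\begin{align*}
\beta \mid \cdot &\sim N\!\big(A^{-1}X^{\top}(y - 1_n\mu),\; \sigma^2 A^{-1}\big),
  \qquad A = X^{\top}X + \Sigma_0^{-1},\\
\mu \mid \cdot &\sim N\!\big(\tfrac{1}{n}\,1_n^{\top}(y - X\beta),\; \sigma^2/n\big),\\
\sigma^2 \mid \cdot &\sim \Gamma^{-1}\!\Big(\tfrac{n+p}{2},\;
  \tfrac{1}{2}\lVert y - 1_n\mu - X\beta\rVert^2 + \tfrac{1}{2}\beta^{\top}\Sigma_0^{-1}\beta\Big),\\
1/\tau_j^2 \mid \cdot &\sim \text{Inverse-Gaussian}\Big(\text{mean} = \sqrt{\lambda^2\sigma^2/\beta_j^2},\;
  \text{shape} = \lambda^2\Big), \qquad j = 1, \dots, p,\\
\lambda^2 \mid \cdot &\sim \Gamma\!\Big(p + 0.01,\; 0.01 + \tfrac{1}{2}\sum_{j=1}^{p}\tau_j^2\Big).
\end{align*}
```

Standardizing the columns of X before fitting is a common choice that puts the shrinkage on a comparable scale across the eight features. Variable importance can then be judged from whether each β_j's posterior credible interval excludes zero.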

(a) Fit the model above to the Load dataset using y1 as the response and X1 - X8 as explanatory variables. Summarize your results via plots/tables and discussion.

(b) Fit the model above to the Load dataset using y2 as the response and X1 - X8 as explanatory variables. Summarize your results via plots/tables and discussion.

(c) Compare the lasso results in (a) and (b).

Question 4. Read carefully through Roderick Little's 2011 paper Calibrated Bayes, for Statistics in General, and Missing Data in Particular. Provide a detailed report (minimum 1 full page) of the issues and ideas presented in this paper. Summarize the pros and cons of the various imputation methods. What is your personal opinion on missing data imputation?

Article - Calibrated Bayes, for Statistics in General, and Missing Data in Particular by Roderick Little

https://www.dropbox.com/s/3mngxati2qr9gyy/Homework.zip?dl=0

Verified Expert

In this assignment we have analyzed Bayesian lasso regression and other Bayesian methods in R. All the results are in the Microsoft Word file. We have also summarized the insights of a research paper.

Reviews

inf1462820

5/8/2017 6:23:14 AM

I am attaching the textbook and the hearing data in CSV format: 22776912_1GelmanBayesianDataAnalysis.pdf, 22776986_2hearing.csv. Incomplete work. Question 2 is missing parts: (a) write the joint likelihood, (b) derive the full posterior of theta, (c) derive the full posterior, (d) likewise. Need clarification on Question 3. Please respond ASAP.

In Question 2 part (a), the joint likelihood, mean, and standard deviation are calculated as asked in the question. In part (b) I have derived the posterior distribution, and similarly in part (c). All the parts are complete and the solution is in R code. Please check and let me know if there is any further doubt.

len1462820

4/14/2017 6:27:20 AM

This is a Bayesian statistics course (graduate level). Requires R programming for some of the problems. Need step-by-step solutions along with the R file back.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd