Homework
Question 1. Give an advantage and a disadvantage of the following methods: Maximum a posteriori estimation, Numerical integration, Gibbs sampling, Metropolis-Hastings.
Question 2. Consider the sample x = (0.12, 0.17, 0.32, 0.56, 0.98, 1.03, 1.10, 1.18, 1.23, 1.67, 1.68, 2.33), generated from the exponential mixture x ∼ Π g(x) + (1 − Π) h(x). All parameters Π, λ, µ are unknown. Note that xi | Zi = 1 ∼ Exp(λ) with density g(x), and xi | Zi = 2 ∼ Exp(µ) with density h(x).
(a). Show that the likelihood L(Π, λ, µ; x) can be expressed as L(Π, λ, µ; x) = E[Lc(x, Z)], where Z = (Z1, . . . , Z12) is the vector of allocations of the observations xi to the first and second components of the mixture; that is, for i = 1, . . . , 12,
P(Zi = 1 | xi) = 1 − P(Zi = 2 | xi) = Πλ exp(−λxi) / (Πλ exp(−λxi) + (1 − Π)µ exp(−µxi)).
(b). Construct an EM algorithm for this model, and derive the maximum likelihood estimates of the parameters for the sample provided above. Hint: see Sections 5.4.2 and 5.4.3 of Christian Robert and George Casella, p. 166.
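A minimal sketch of such an EM iteration, in Python (the starting values and iteration count are arbitrary choices; the E-step computes the allocation probabilities above and the M-step applies weighted exponential MLEs):

```python
import numpy as np

x = np.array([0.12, 0.17, 0.32, 0.56, 0.98, 1.03, 1.10, 1.18,
              1.23, 1.67, 1.68, 2.33])

def em_exp_mixture(x, pi=0.5, lam=2.0, mu=0.5, n_iter=500):
    """EM for the two-component mixture pi*Exp(lam) + (1-pi)*Exp(mu)."""
    for _ in range(n_iter):
        # E-step: posterior probability that x_i came from the first component
        f1 = pi * lam * np.exp(-lam * x)
        f2 = (1 - pi) * mu * np.exp(-mu * x)
        gamma = f1 / (f1 + f2)
        # M-step: weighted maximum likelihood updates
        pi = gamma.mean()
        lam = gamma.sum() / (gamma * x).sum()
        mu = (1 - gamma).sum() / ((1 - gamma) * x).sum()
    return pi, lam, mu

pi_hat, lam_hat, mu_hat = em_exp_mixture(x)
print(pi_hat, lam_hat, mu_hat)
```

As usual with mixtures, EM may converge to a local maximum, so it is worth trying a few different starting values.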
Question 3. Generate a sample X = (x1, . . . , xn) of size n = 5000 from the N(µ, σ²) distribution, where µ ∼ N(µ0, τ²) and σ² ∼ Inv-Gamma(n0/2, S0/2) are independent. (You are free to choose the values of the hyperparameters µ0, τ², n0, and S0.)
(a). Obtain the posterior mode of p(µ, σ²|X) using Newton-Raphson. (b). Calculate E(µ|X) and E(σ²|X) using the Normal approximation.
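A sketch of part (a) in Python. The hyperparameter values and true parameters are arbitrary choices (the question leaves them free); the gradient and Hessian of the log posterior in (µ, v), v = σ², are computed analytically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyperparameters (free choice per the question) and simulated data
mu0, tau2, n0, S0 = 0.0, 10.0, 4.0, 4.0
n = 5000
true_mu, true_sig2 = 1.5, 2.0
X = rng.normal(true_mu, np.sqrt(true_sig2), size=n)

def grad_hess(mu, v):
    """Gradient and Hessian of log p(mu, v | X) up to a constant."""
    r = X - mu
    g = np.array([
        r.sum() / v - (mu - mu0) / tau2,
        -(n / 2 + n0 / 2 + 1) / v + (np.sum(r**2) + S0) / (2 * v**2),
    ])
    H = np.array([
        [-n / v - 1 / tau2, -r.sum() / v**2],
        [-r.sum() / v**2,
         (n / 2 + n0 / 2 + 1) / v**2 - (np.sum(r**2) + S0) / v**3],
    ])
    return g, H

mu, v = X.mean(), X.var()          # start the iteration from the MLE
for _ in range(50):
    g, H = grad_hess(mu, v)
    step = np.linalg.solve(H, g)   # Newton-Raphson step
    mu, v = mu - step[0], v - step[1]
    if np.max(np.abs(step)) < 1e-10:
        break
print(mu, v)                       # posterior mode
```

For part (b), the Normal approximation uses this mode and the inverse of the negative Hessian evaluated there as mean and covariance.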
Question 4. Consider a random variable x described by an exponential distribution with parameter λ, x ∼ Exp(λ). We are uncertain about the value of λ and choose to model this uncertainty with a Gamma prior, λ ∼ Gamma(α, β).
(a). Derive the maximum likelihood estimate (MLE), λ̂MLE.
(b). Obtain an analytic form of the posterior distribution, and derive the maximum a posteriori (MAP) estimator λ̂MAP as a function of α and β.
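For reference, the standard conjugate computation (a sketch, assuming the rate parameterization of the Gamma prior, p(λ) ∝ λ^{α−1} e^{−βλ}; it does not substitute for the requested derivation) gives:

```latex
\mathcal{L}(\lambda; x) = \lambda^{n} e^{-\lambda \sum_i x_i}
\;\Rightarrow\;
\hat\lambda_{\mathrm{MLE}} = \frac{n}{\sum_{i=1}^{n} x_i},
\qquad
p(\lambda \mid x) \propto \lambda^{\alpha+n-1} e^{-(\beta + \sum_i x_i)\lambda}
\;\Rightarrow\;
\hat\lambda_{\mathrm{MAP}} = \frac{\alpha+n-1}{\beta + \sum_{i=1}^{n} x_i}.
```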
(c). Generate N = 20 samples drawn from an exponential distribution with parameter λ = 0.2. Fix β = 100 and vary α over the range 1-40 with a step size of 1.
• Compute the corresponding MLE and MAP estimates for λ.
• For each α, compute the mean squared error (MSE) of both estimates compared against the true value and then plot the mean squared error as a function of α.
• Now fix α = 30, β = 100 and vary N over the range 1-50 with a step size of 1. Plot the mean squared error of the corresponding estimates for each N and explain under what conditions the MAP estimator is better. The mean squared error (MSE) is defined as:
MSE = (1/N) Σ_{i=1}^{N} (Yi − Ŷi)²
where Yi is the true value and Ŷi is the estimated value.
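A sketch of the α-sweep in Python, averaging the squared error over repeated datasets (the number of replications is my choice; the MAP formula assumes the conjugate Gamma posterior, rate parameterization):

```python
import numpy as np

rng = np.random.default_rng(1)
true_lam, N, beta = 0.2, 20, 100.0
alphas = np.arange(1, 41)          # alpha = 1, ..., 40
reps = 500                         # average squared error over many datasets

se_mle = np.zeros(len(alphas))
se_map = np.zeros(len(alphas))
for _ in range(reps):
    # NumPy parameterizes the exponential by its scale 1/lambda
    x = rng.exponential(1.0 / true_lam, size=N)
    s = x.sum()
    se_mle += (N / s - true_lam) ** 2                          # MLE: n / sum(x_i)
    se_map += ((alphas + N - 1) / (beta + s) - true_lam) ** 2  # MAP of Gamma(alpha+n, beta+sum x_i)
mse_mle, mse_map = se_mle / reps, se_map / reps
print(mse_map.argmin() + 1)        # alpha with the smallest MAP MSE
```

Plotting (e.g. `plt.plot(alphas, mse_map)` with matplotlib) completes the exercise; the N-sweep in the last bullet is the same loop with α, β fixed and N varying.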
Question 5. Consider the model yi|σ² ∼ N(0, σ²), i = 1, . . . , n, where σ²|b ∼ Inv-Gamma(a, b) and b ∼ Gamma(1, 1).
(a). Derive the full conditional posterior distributions for σ² and b.
(b). Write pseudo code for Gibbs sampling, i.e., describe in detail each step of the Gibbs sampling algorithm.
(c). Write your own Gibbs sampling code in R, report the posterior summary, and plot the marginal posterior density, boxplot, and trace for each parameter. Assume n = 10, a = 10, and yi = i for i = 1, . . . , 10.
(d). Repeat the analysis with a = 1 and comment on the convergence of the MCMC chain.
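A minimal Gibbs sketch (in Python rather than the requested R, for illustration; the chain length and burn-in are arbitrary choices), using the full conditionals σ²|b, y ∼ Inv-Gamma(a + n/2, b + Σyi²/2) and b|σ² ∼ Gamma(a + 1, rate = 1 + 1/σ²):

```python
import numpy as np

rng = np.random.default_rng(2)
y = np.arange(1.0, 11.0)      # y_i = i for i = 1, ..., 10
n, a = len(y), 10.0
ssy = np.sum(y ** 2)

iters, burn = 5000, 1000
sig2 = np.empty(iters)
b = np.empty(iters)
sig2_cur, b_cur = 1.0, 1.0
for t in range(iters):
    # sigma^2 | b, y  ~  Inv-Gamma(a + n/2, b + sum(y_i^2)/2)
    # (drawn as the reciprocal of a Gamma; numpy's gamma takes shape, scale)
    sig2_cur = 1.0 / rng.gamma(a + n / 2, 1.0 / (b_cur + ssy / 2))
    # b | sigma^2     ~  Gamma(a + 1, rate = 1 + 1/sigma^2)
    b_cur = rng.gamma(a + 1, 1.0 / (1.0 + 1.0 / sig2_cur))
    sig2[t], b[t] = sig2_cur, b_cur

print(sig2[burn:].mean(), b[burn:].mean())   # posterior means after burn-in
```

The density, boxplot, and trace plots requested in (c) can then be produced from the stored `sig2` and `b` arrays.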
Question 6. Consider the model yi = βxi + ui, ui ∼ N(0, 1), i = 1, . . . , n, with the gamma prior distribution β ∼ Gamma(2, 1), β > 0. Verify the posterior distribution
p(β|y) ∝ β exp(−β) exp{−(1/2) Σ_{i=1}^{n} (yi − βxi)²} 1{β ∈ (0, ∞)}.
Note that this distribution does not have a standard form. Construct an MH algorithm to sample from it with an independence kernel, where the kernel is a Student-t distribution truncated to the region (0, ∞), with five degrees of freedom, mean equal to the value of β that maximizes the posterior distribution (β̂), and scale factor equal to the negative inverse of the second derivative of the log posterior distribution evaluated at β̂. Verify that
β̂ = [(Σ_{i=1}^{n} xiyi − 1) + √((Σ_{i=1}^{n} xiyi − 1)² + 4 Σ_{i=1}^{n} xi²)] / (2 Σ_{i=1}^{n} xi²)
and that the scale factor is (1/β̂² + Σ_{i=1}^{n} xi²)^{−1}. Generate a data set by choosing n = 50, drawing the xi from N(0, 1), and drawing a value of β from its prior distribution. Write a program implementing your algorithm and see how well β is determined. You may try larger values of n to explore the effect of sample size and, depending on the acceptance rate, you may wish to adjust the scale factor.
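A sketch of the independence-MH sampler in Python, using the mode and curvature formulas above (the seed and chain length are arbitrary choices; the t density is coded up to constants, which cancel in the acceptance ratio):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data: n = 50, x_i ~ N(0,1), beta drawn from its Gamma(2,1) prior
n = 50
x = rng.normal(size=n)
beta_true = rng.gamma(2.0, 1.0)
y = beta_true * x + rng.normal(size=n)
Sxx, Sxy = np.sum(x ** 2), np.sum(x * y)

def log_post(b):
    # log posterior up to a constant: log b - b - (1/2) sum (y_i - b x_i)^2, b > 0
    return np.log(b) - b - 0.5 * np.sum((y - b * x) ** 2)

# Mode (positive root of the quadratic from d/db log p = 0) and
# curvature-based scale: -d^2/db^2 log p = 1/b^2 + sum x_i^2 at beta_hat
beta_hat = ((Sxy - 1) + np.sqrt((Sxy - 1) ** 2 + 4 * Sxx)) / (2 * Sxx)
scale = 1.0 / np.sqrt(1.0 / beta_hat ** 2 + Sxx)

def draw_prop():
    while True:                          # t_5 proposal truncated to (0, inf)
        b = beta_hat + scale * rng.standard_t(5)
        if b > 0:
            return b

def log_q(b):
    # t_5 log-density up to additive constants (the truncation normalizer
    # and gamma-function constants cancel in the independence-MH ratio)
    z = (b - beta_hat) / scale
    return -3.0 * np.log1p(z * z / 5.0)  # exponent -(nu + 1)/2 with nu = 5

iters, accept = 10000, 0
beta = np.empty(iters)
cur = beta_hat
for t in range(iters):
    prop = draw_prop()
    log_r = (log_post(prop) - log_post(cur)) + (log_q(cur) - log_q(prop))
    if np.log(rng.uniform()) < log_r:
        cur, accept = prop, accept + 1
    beta[t] = cur

print(beta.mean(), accept / iters)       # posterior mean and acceptance rate
```

Because the proposal is matched to the Laplace approximation of the posterior, the acceptance rate should be high; comparing `beta.mean()` with `beta_true` shows how well β is determined at this sample size.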