Econometric Methods
Assignment
1. Let A be a symmetric (n × n)-matrix with spectral decomposition
A = Σ_{i=1}^n λi vi vi',
where λi and vi are the eigenvalues and eigenvectors, respectively, of A such that v1, . . . , vn form an orthonormal basis of Rn. Define
A− = Σ_{i: λi ≠ 0} (1/λi) vi vi'.
(a) Show that A− is a generalized inverse of A, that is, AA−A = A.
(b) Verify that A− is a symmetric and reflexive generalized inverse (i.e., (A−)' = A− and A is a generalized inverse of A−, so that A−AA− = A−).
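As a numerical sanity check (not a substitute for the proof), the spectral construction of A− can be tried out in Python with NumPy on a hypothetical rank-deficient symmetric matrix; the matrix below is made up for illustration:

```python
import numpy as np

# Hypothetical symmetric 3x3 matrix of rank 2 (eigenvalues 0, 2, 2).
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

lam, V = np.linalg.eigh(A)  # eigenvalues and orthonormal eigenvectors
# Sum 1/λi * vi vi' only over the nonzero eigenvalues.
A_minus = sum((1.0 / l) * np.outer(v, v)
              for l, v in zip(lam, V.T) if abs(l) > 1e-10)

assert np.allclose(A @ A_minus @ A, A)            # part (a): AA−A = A
assert np.allclose(A_minus, A_minus.T)            # part (b): symmetry
assert np.allclose(A_minus @ A @ A_minus, A_minus)  # part (b): reflexivity
```

For a symmetric matrix this construction actually yields the Moore–Penrose pseudoinverse, which is one particular (symmetric, reflexive) g-inverse.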
2. Let X be an (n × k)-matrix (not necessarily of full rank) and P = X(X'X)−X'.
(a) Explain why a generalized inverse (g-inverse) of X'X does exist.
(b) Show that P is invariant w.r.t. the choice of the g-inverse of X'X (that is, any choice of the g-inverse of X'X gives the same matrix P).
Hint: Verify first that PX = X holds independently of the choice of the g-inverse.
(c) Show that P is the orthogonal projection from Rn onto R(X).
(d) Show that P is symmetric and idempotent.
Hint: Use Problem 1.
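A small numerical illustration of the invariance claim in (b), using a made-up rank-deficient X: starting from one g-inverse G of S = X'X, every matrix of the form G + V − GSVSG is again a g-inverse of S (since SGS = S), yet it produces the same P:

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.standard_normal((8, 2))
# Third column is the sum of the first two, so rk(X) = 2 < 3.
X = np.column_stack([X1, X1[:, 0] + X1[:, 1]])
S = X.T @ X

G = np.linalg.pinv(S)                 # one particular g-inverse of S
V = rng.standard_normal((3, 3))       # arbitrary perturbation
G2 = G + V - G @ S @ V @ S @ G        # another g-inverse of S
assert np.allclose(S @ G2 @ S, S)

P = X @ G @ X.T
P2 = X @ G2 @ X.T
assert np.allclose(P, P2)             # (b): P does not depend on the g-inverse
assert np.allclose(P @ X, X)          # hint: PX = X
assert np.allclose(P, P.T)            # (d): symmetric
assert np.allclose(P @ P, P)          # (d): idempotent
```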
3. Show that for any (m × n)-matrix A, there exists a g-inverse A−.
Hint: Set A− = (A'A)−A' (why does (A'A)− exist?).
4. Prove that an (n × m)-matrix A− is a g-inverse of a given (m × n)-matrix A if, and only if, x = A−b is a solution to the system of linear equations Ax = b for all b ∈ R(A) ⊆ R^m.
5. Let Yi ∼ Pθ = N(θ, 1) i.i.d., i = 1, . . . , n, and consider the null hypothesis H0 : θ = θ0. With the test statistic λ = √n(Ȳ − θ0) and its observed value λobs = √n(ȳ − θ0), the p-value for the right-tailed test H0 vs. Hr1 : θ > θ0 is given by pr = pr(y) = Pθ0(λ > λobs).
(a) Define the p-values pl and p for the left-tailed test (H0 vs. Hl1 : θ < θ0) and the two-sided test (H0 vs. H1 : θ ≠ θ0), respectively. Represent the p-values pr, pl and p by means of the distribution function Φ of the standard normal distribution, and show that p = 2 min{pl, pr}.
(b) Invert the UMP α-test δ∗ for testing H0 vs. Hr1 (which rejects H0 if and only if λobs > z1−α , cp. slide 39 of Chapter 2 and Problem 11 of Assignment No. 1) in order to get a (1 − α)-confidence interval for θ. Is that confidence interval exact, and does it make sense to assess its properties by the (expected) length? Why is it reasonable to speak in this case about a (1 − α) (lower or upper?) confidence bound for θ?
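The representations asked for in part (a) can be checked numerically; the sketch below uses the identity Φ(x) = (1 + erf(x/√2))/2 from the standard library, with made-up sample values:

```python
import math

def Phi(x):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_values(ybar, theta0, n):
    lam_obs = math.sqrt(n) * (ybar - theta0)
    p_right = 1.0 - Phi(lam_obs)              # pr = P(λ > λobs) = 1 − Φ(λobs)
    p_left = Phi(lam_obs)                     # pl = P(λ ≤ λobs) = Φ(λobs)
    p_two = 2.0 * (1.0 - Phi(abs(lam_obs)))   # p = P(|λ| ≥ |λobs|)
    return p_left, p_right, p_two

# Illustrative numbers only: ȳ = 0.5, θ0 = 0, n = 16, so λobs = 2.0.
pl, pr, p = p_values(ybar=0.5, theta0=0.0, n=16)
assert abs(p - 2.0 * min(pl, pr)) < 1e-12     # p = 2 min{pl, pr}
```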
6. Suppose it holds
yi = α1 + α2xi2 + α3xi3 + α4zi + α5xi2zi + ui with E(ui |xi2, xi3, zi) = 0,
but zi is not observable [so that the parameters α = (α1, . . . , α5)' cannot be estimated by OLS]. Assume that zi depends linearly on the observable variables xi2, xi3 by
zi = γ1 + γ2xi2 + γ3xi3 + vi , E(vi |xi2, xi3) = 0.
(a) Show that the model can be rewritten as
yi = β1 + β2xi2 + β3xi3 + β4xi2² + β5xi2xi3 + εi with E(εi |xi2, xi3) = 0.
(b) Find the partial effects of xi2 and xi3 on E(yi |xi).
(c) Can the OLS estimator of β = (β1, . . . , β5)' be given in closed form? Explain briefly.
(d) Show that E(εi) = 0.
(e) What can you conclude about E(εi |xi2, xi3, xi2², xi2xi3)?
(f) Show that ‘any’ function f(xi2, xi3), possibly vector-valued with E‖f(xi2, xi3)εi‖² < ∞, is uncorrelated with εi.
7. (a) Show that the OLS estimators of the slopes in a multiple linear model that contains an intercept are obtained by transforming the data to deviations from their means and then regressing the dependent variable y in deviation form on the explanatory variables, also in deviation form (without intercept). Explain how you can get the OLS estimator of the intercept from the slope estimators.
(b) Verify the bias formula on slide 37 of Chapter 3.
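The equivalence claimed in part (a), the Frisch–Waugh demeaning result, is easy to confirm numerically on simulated data (the coefficients below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = rng.standard_normal((n, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.standard_normal(n)

# Full regression with intercept.
Z = np.column_stack([np.ones(n), X])
beta_full, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Regression in deviations from means, without intercept.
Xd = X - X.mean(axis=0)
yd = y - y.mean()
slopes, *_ = np.linalg.lstsq(Xd, yd, rcond=None)

# Intercept recovered from the slopes: ȳ − x̄'β̂.
intercept = y.mean() - X.mean(axis=0) @ slopes

assert np.allclose(beta_full[1:], slopes)
assert np.allclose(beta_full[0], intercept)
```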
8. Show that the coefficient of determination, R2, for a linear regression model containing an intercept is invariant with respect to linear transformations of the dependent variable, whereas the uncentered coefficient of determination, R̃2, is not. That is, R2 remains unchanged after changing the scale and units of yi by γ and α, respectively, leading to the new observations
yi∗ = α + γyi (α, γ ∈ R, γ ≠ 0),
whereas a change of units of yi (by α, say) leads to a change of R̃2.
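Both claims can be illustrated on simulated data (values below are arbitrary); the centered R2 uses deviations from ȳ in the denominator, the uncentered R̃2 uses Σyi²:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x = rng.standard_normal(n)
y = 3.0 + 2.0 * x + rng.standard_normal(n)

def r2_pair(y, x):
    Z = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ beta
    r2 = 1.0 - (e @ e) / ((y - y.mean()) @ (y - y.mean()))  # centered R^2
    r2_unc = 1.0 - (e @ e) / (y @ y)                        # uncentered R~^2
    return r2, r2_unc

r2_a, r2u_a = r2_pair(y, x)
r2_b, r2u_b = r2_pair(5.0 + 2.0 * y, x)   # y* = alpha + gamma*y

assert abs(r2_a - r2_b) < 1e-10           # centered R^2 is invariant
assert abs(r2u_a - r2u_b) > 1e-6          # uncentered R~^2 changes
```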
9. Consider the linear regression model with intercept
y = Xβ + ε, rk(X) = k, (1)
Extending this model by one regressor leads to the model
y = Xθ + zγ + u, rk(X, z) = k + 1, (2)
where z is an n-dimensional column vector. Let e and û denote the vectors of OLS residuals in models (1) and (2), respectively.
(a) Verify that ‖û‖² = ‖e‖² − γ̂²‖Mz‖², where γ̂ denotes the OLS estimator of γ in model (2) and M = In − X(X'X)−1X' is the residual maker in (1).
(b) Let R̄2(i) be the adjusted coefficient of determination in model (i), i = 1, 2. Use the result in part (a) to show that R̄2(2) > R̄2(1) if and only if tz² > 1, where tz denotes the t-statistic for testing H0 : γ = 0 in model (2).
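The residual-sum-of-squares identity in part (a) follows from the Frisch–Waugh theorem and can be verified numerically (simulated data, arbitrary coefficients):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 30, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
z = rng.standard_normal(n)
y = X @ np.array([1.0, 0.5, -0.5]) + 0.3 * z + rng.standard_normal(n)

# Residual maker of model (1) and its residual vector e.
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
e = M @ y

# OLS in the extended model (2); gamma_hat is the last coefficient.
W = np.column_stack([X, z])
coef, *_ = np.linalg.lstsq(W, y, rcond=None)
u_hat = y - W @ coef
gamma_hat = coef[-1]

Mz = M @ z
assert np.allclose(u_hat @ u_hat, e @ e - gamma_hat**2 * (Mz @ Mz))
```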
10. Verify that, under the assumptions of Section 2.2.5, the best linear prediction P∗(X) = α∗ + X'β∗ is also the best linear approximation to P(X) = E[Y |X], that is,
min_{α,β} E[(P(X) − α − X'β)²] = E[(P(X) − P∗(X))²].
11. (a) Show that in the regression model
yi = (∏_{j=1}^k xij^{βj}) εi with xij > 0 and E(εi |xi) = 1 (i = 1, . . . , n)
the parameter βj is exactly the elasticity of E(yi |xi) with respect to xij.
(b) Verify that the parameter βj in the model yi = exp(x'iβ)εi with E(εi |xi) = 1 is exactly the semielasticity of E(yi |xi) with respect to xij .
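Both interpretations can be checked by a central-difference derivative of the conditional mean (parameter values below are hypothetical):

```python
import math

x0, h = 0.7, 1e-6

# (a) Elasticity model: E(y|x) = x^b, so dlog E / dlog x = b.
b = 0.8
def Ey_a(x):
    return x ** b
elast = (Ey_a(x0 + h) - Ey_a(x0 - h)) / (2 * h) * x0 / Ey_a(x0)
assert abs(elast - b) < 1e-6

# (b) Semielasticity model: E(y|x) = exp(b1 + b2*x), so dlog E / dx = b2.
b1, b2 = 0.5, 1.2
def Ey_b(x):
    return math.exp(b1 + b2 * x)
semi = (Ey_b(x0 + h) - Ey_b(x0 - h)) / (2 * h) / Ey_b(x0)
assert abs(semi - b2) < 1e-6
```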
12. Read carefully the following statements about the normal linear model y = Xβ + ε, ε ∼ N (0, σ2 In), where the (n × k)-matrix X has full rank k and β = (β1, . . . , βk)'. Are they true or false? Explain.
(a) Since the OLS estimator β̂ = (β̂1, . . . , β̂k)' is BLUE of β, γ̂1 = Σ_{j=1}^k β̂j and γ̂2 = β̂2² are also BLUEs for estimating γ1 = Σ_{j=1}^k βj and γ2 = β2², respectively.
(b) The hypothesis that the OLS estimator is equal to zero can be tested by means of a t-test.
(c) If the absolute t-value of a coefficient is smaller than the 0.975-quantile of the t-distribution with n − k degrees of freedom, we accept the null hypothesis that the coefficient is zero with 95% confidence.
(d) If an explanatory variable has a significant impact on the dependent variable at the 5% level, then its impact is also significant at the 10% level.
(e) If an explanatory variable has a significant impact on the dependent variable at the level α and its estimated coefficient is positive, then it has also a significant positive impact at the level α.
(f) The smaller the p-value, the stronger the evidence against the null hypothesis provided by the data.
(g) The p-value is the probability that the null hypothesis is true.
13. Empirical example (you can use any of the following software: EViews, Stata, R). Consider the following linear regression model (LRM)
lwage = β1 + β2 · educ + β3 · exper + ε (3)
and use the data in “em hw2 lwage.raw” (or “em hw2 lwage.xls”) for this exercise to answer the following questions.
Variable name | Variable description
------------- | ------------------------------
lwage         | Logarithm of hourly wage
educ          | Years of education
exper         | Work experience in years
urban         | Dummy for urban areas
married       | Dummy for married individuals
meduc         | Mother's education
feduc         | Father's education
(a) Estimate the linear regression model in (3) using OLS. Give an economic interpretation of the estimated coefficients and comment on their significance. Do education and experience explain much of the variation in lwage?
(b) Extend equation (3) to allow the effect of educ to depend on the level of work experience and estimate the new model by OLS. Let θ be the semielasticity of wage w.r.t. education after 10 years of work experience. Estimate θ and test whether it is significantly different from zero at the 1% significance level.
(c) State [in the model of part (b)] the null hypothesis that the return to education does not depend on the level of experience. What do you think is an appropriate alternative?
(d) Extend the LRM in (3) in a way that allows lwage to differ across four groups of individuals: married and urban, married and rural, non-married and urban, non-married and rural. What is the estimated lwage differential between married urban and non-married urban? Is it significant at the 5% level?
(e) Extend the LRM in (3) by the variables urban, married, feduc and meduc and estimate the model again. Comment on differences compared to your answer from part (a).
(f) For the model considered in part (e), state the null hypothesis that mother’s and father’s education have jointly no impact on lwage. Perform a test at the 5% significance level.
(g) What would you assume about the error term ε to justify the OLS estimation and the tests carried out in the above models?
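A minimal sketch of the computation in part (b), written with NumPy instead of the listed packages. Since the data file is not reproduced here, the sketch uses simulated stand-in variables with made-up coefficients; only the mechanics (interaction term, linear combination θ = β2 + 10·β4, classical standard error, t-statistic) carry over to the real data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
# Simulated stand-ins for the variables educ, exper (hypothetical values).
educ = rng.integers(8, 18, n).astype(float)
exper = rng.integers(0, 30, n).astype(float)
lwage = (0.5 + 0.08 * educ + 0.02 * exper + 0.003 * educ * exper
         + 0.3 * rng.standard_normal(n))

# Extended model of part (b): lwage = b1 + b2*educ + b3*exper + b4*educ*exper + e
Z = np.column_stack([np.ones(n), educ, exper, educ * exper])
b, *_ = np.linalg.lstsq(Z, lwage, rcond=None)
resid = lwage - Z @ b
s2 = resid @ resid / (n - Z.shape[1])
V = s2 * np.linalg.inv(Z.T @ Z)       # classical OLS covariance estimate

# theta = semielasticity of wage w.r.t. educ at exper = 10: b2 + 10*b4.
a = np.array([0.0, 1.0, 0.0, 10.0])
theta_hat = a @ b
se_theta = np.sqrt(a @ V @ a)
t_stat = theta_hat / se_theta         # compare with the t-quantile at the 1% level
```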