Reference no: EM13895288
The following results are based on a random sample of 534 people from the Current Population Survey (CPS). The dependent variable in each regression is log(wage). (See results on the next page).
A. How do we interpret the coefficients for education, male, and married in Model 2? Would changing the reference category from female to male or unmarried to married impact our estimation? Why would we choose one or the other as the reference/base category?
B. One might suspect that age and experience are both good predictors of someone’s wage. Why, then, do they both appear to be statistically insignificant in Model 3, and what, if anything, can we do to fix this?
C. Why would we want to include a quadratic version of a variable in a regression model and should we be worried about violating any of our Gauss-Markov assumptions as a result? Is agesqr appropriate to include in this series of models? Why or why not?
D. How do we interpret the coefficients for age and agesqr in Model 4? Based on Model 4, at what point is there a change in direction of the impact of age on someone’s wage in this sample?
E. What is the purpose of including the interaction term edmale in our regression? How do we interpret its coefficient and what is the data telling us about the impact sex has on the returns to education?
F. What is the difference between the R² and the adjusted R²? Why are the adjusted R² estimates lower than the R² estimates in each model, and which measure should we trust more and why?
G. Many labor economists have studied the “gender pay gap,” the phenomenon where women are paid less than men on average. It is commonly expressed as female earnings as a percentage of male earnings. Given these results, is there evidence of a gender pay gap in our sample? Are there any other important factors we should consider that were not included in our models concerning why women might be paid less than men? [Hint: you may want to think back to micro/macro theory here, but it’s not necessary].
H. Is there any evidence of a “marriage premium” in our models? Why would married be statistically significant in Models 2 and 3 but not in Models 4 and 5?
I. Suppose you are worried about heteroskedasticity in your models. You generate residuals from Model 5 and run a regression of the squared residuals on your independent variables and get an R² of 0.015. Is there heteroskedasticity present, and is this result expected? Explain.
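The auxiliary regression described in part I matches the setup of a Breusch-Pagan style LM test, where the test statistic is the sample size times the auxiliary R² and is compared against a chi-square critical value. The sketch below is illustrative only: it uses the sample size (534) and auxiliary R² (0.015) given above, but the number of regressors in Model 5 is not stated in the questions, so the degrees of freedom `k` is treated as an assumption and several values are tried. The function names are hypothetical, and the critical values are the standard 5% chi-square table entries.

```python
# Standard 5% critical values of the chi-square distribution,
# keyed by degrees of freedom (number of regressors in the auxiliary regression).
CHI2_CRIT_5PCT = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488,
    5: 11.070, 6: 12.592, 7: 14.067, 8: 15.507,
}


def bp_lm_statistic(n_obs, aux_r2):
    """LM statistic for the auxiliary regression: n times its R-squared."""
    return n_obs * aux_r2


def rejects_homoskedasticity(lm, k):
    """True if LM exceeds the 5% chi-square critical value with k degrees of freedom."""
    return lm > CHI2_CRIT_5PCT[k]


# Values taken from the question: 534 observations, auxiliary R-squared of 0.015.
lm = bp_lm_statistic(534, 0.015)

# k is an assumption: the regressor count of Model 5 is not given in the text.
for k in sorted(CHI2_CRIT_5PCT):
    verdict = "reject" if rejects_homoskedasticity(lm, k) else "fail to reject"
    print(f"k={k}: LM={lm:.2f}, critical={CHI2_CRIT_5PCT[k]:.3f} -> {verdict} homoskedasticity")
```

Under these assumptions the conclusion depends on how many regressors Model 5 contains, so the count should be read off the results table before drawing a conclusion.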