Estimate the effect of DST on fatal accidents

Reference no: EM132300081

Assignment Questions -

1. In equation 15.13 Wooldridge gives the asymptotic variance of the IV estimator $\hat{\beta}_1$ in the simple regression case: $\hat{\sigma}^2/(SST_x \cdot R^2_{x,z})$. Then in equation 15.43 he gives the asymptotic variance of the IV estimator $\hat{\beta}_1$ in the multiple regression case: $\sigma^2/(\widehat{SST}_2(1-\hat{R}_2^2))$.

Obviously, one difference between the two equations is the presence of $\hat{\sigma}^2$ versus $\sigma^2$. Beyond this difference, however, is 15.13 a special case of 15.43? Or is the simple regression case different? That is, if we applied equation 15.43 to the simple regression case, would we obtain the same or a different standard error compared to equation 15.13? Explain precisely, showing exactly why 15.13 is or is not a special case of 15.43.

Hint: Note carefully the elements of these formulas as defined in the text and course notes, in particular the definitions of $R^2_{x,z}$ and $\hat{R}_2^2$.
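One way to organize the comparison is to write both formulas in common notation and then specialize 15.43 to a model with no other exogenous regressors. The following is a sketch, not the official key. The symbols $\hat{x}_i$ (first-stage fitted values) and $\bar{x}$ are introduced here for illustration, and the definitions of $\widehat{SST}_2$ and $\hat{R}_2^2$ are stated as they are usually given for 15.43; check them against the text and course notes, as the hint advises.

```latex
% 15.13: simple regression, z instruments for x
\widehat{\mathrm{Avar}}(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{SST_x \, R^2_{x,z}}

% 15.43: multiple regression, y_2 endogenous
\mathrm{Var}(\hat{\beta}_1) \approx \frac{\sigma^2}{\widehat{SST}_2 \,(1 - \hat{R}_2^2)}
% \widehat{SST}_2 : total variation in the first-stage fitted values \hat{y}_2
% \hat{R}_2^2     : R-squared from regressing \hat{y}_2 on the other exogenous
%                   regressors in the structural equation

% Simple case: the only other regressor is the intercept, so \hat{R}_2^2 = 0,
% and the fitted values \hat{x}_i come from regressing x on z:
\widehat{SST}_2 = \sum_i (\hat{x}_i - \bar{x})^2 = R^2_{x,z} \, SST_x

% Substituting into 15.43:
\frac{\sigma^2}{\widehat{SST}_2 \,(1 - \hat{R}_2^2)} = \frac{\sigma^2}{SST_x \, R^2_{x,z}}
```

Under these definitions the two denominators coincide, so any remaining difference comes from $\hat{\sigma}^2$ versus $\sigma^2$.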

2. Recall the key to Homework Problem 9, Question 4. In part D, we considered a model that had fixed effects for cities and different time trends for each city:

(2) ln(uclms_it) = a_i + c_i·time + β₁ez_it + u_it

In the key, I showed that after first-differencing, this resulted in the following model:

Δln(uclms_it) = gr_rate_it = c_i + β₁Δez_it + η_y + Δu_it

where c_i represents fixed effects for cities, η_y is the appropriate set of year dummies, and gr_rate_it is the growth rate of unemployment claims.

Now I ask a related question, but assuming a different model. Specifically, suppose that we had instead postulated a model explicitly for the growth rate of unemployment claims as the dependent variable:

gr_rate_it = c_i + β₁ez_it + η_y + u_it

Again, c_i represents fixed effects for cities, η_y is the appropriate set of year dummies, and gr_rate_it is the growth rate of unemployment claims, but the variable ez_it is not first-differenced.

Hint: To answer this question below, I suggest you first explicitly create this growth rate variable:

. gen gr_rate = D.lnuclms

(22 missing values generated)

Given this model, either the within transformation or the FD transformation could be applied to difference out the fixed effects for cities. But it is inappropriate to apply -cluster- to make standard errors robust to within group (within city) correlation of the error terms because the number of groups is small, just 22 cities. That we do not have a number of groups sufficiently large to apply -cluster- makes the choice of estimation methods, within transformation or FD transformation, especially important. Which transformation is preferable here? Explain.
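The comparison between the two transformations can be explored along the following lines. This is only a sketch under stated assumptions: the panel identifiers `city` and `year` and the residual name `du_hat` are hypothetical names not given in the assignment, while `gr_rate` and `ez` are the variables referred to above.

```stata
* Assumed identifiers: city (panel) and year (time); adjust to the actual data set
xtset city year

* Within (fixed effects) estimator of the growth-rate model, with year dummies
xtreg gr_rate ez i.year, fe

* First-difference estimator of the same model; the year dummies absorb
* the differenced year effects
regress D.gr_rate D.ez i.year
predict du_hat, residuals

* Regress the FD residuals on their own lag. If the original errors u_it are
* serially uncorrelated, the differenced errors have first-order
* autocorrelation of -0.5
regress du_hat L.du_hat
```

A lag coefficient near -0.5 would point toward serially uncorrelated u_it, the case in which the usual within-estimator standard errors are appropriate; a coefficient near zero would point toward first-differencing.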

3. Recall the example on the earnings and education of twins that I included in the course notes "IV Estimation and Measurement Error."

A. Although we applied IV estimation to address measurement error in self-reported education, we also estimated a model which incorporated fixed effects for pairs of twins to control for ability. Recall that wage_it is the wage of twin t in family i, educ_i2,2 is the education level of twin 2 as reported by twin 2, and educ_i1,1 is the education level of twin 1 as reported by twin 1.

A student was interested in the example and asked for the data set I used in those notes. The student later came back with a question.

The student was correct that the difference is not due to machine imprecision. Answer the student's question.

B. The set up: To address measurement error, recall that in the course notes I applied the first-difference transformation and used the first difference of the variable educt_t, "the other twin's report of this twin's education", as the instrument for an individual's self-reported education. That is, in estimating the first differenced model

log(wage_i2) - log(wage_i1) = β₁(educ_i2,2 - educ_i1,1) + (u_i2 - u_i1)

we used (educ_i2,1 - educ_i1,2) as an instrument for (educ_i2,2 - educ_i1,1). Further, we showed that panel data had the further advantage of allowing us to also obtain consistent estimates if we assumed a model for the measurement error that included a fixed effect α_i allowing for both twins in family i to overstate or understate their education:

educ_i2,2 = educ^*_i2 + α_i + v_i2,2

educ_i2,1 = educ^*_i2 + α_i + v_i2,1

educ_i1,1 = educ^*_i1 + α_i + v_i1,1

educ_i1,2 = educ^*_i1 + α_i + v_i1,2

In the course notes we showed that

(i) the first-differenced equation, log(wage_i2) - log(wage_i1) = δ₀ + β₁(educ_i2,2 - educ_i1,1) + (u_i2 - u_i1), and the first-differenced instrument, (educ_i2,1 - educ_i1,2), have differenced away the fixed effect α_i and that

(ii) the differenced instrument (educ_i2,1 - educ_i1,2) is uncorrelated with the resulting error term in the first-differenced model so long as the v terms are uncorrelated with each other and with u.

Now the question for you:

Now we make our measurement error model more sophisticated by adding a fixed effect for twin t in family i (γ_i,t, t = 1, 2), which allows twin t in family i to overstate or understate education for both self and twin. These two fixed effects would imply the following equations for reported education levels:

educ_i2,2 = educ^*_i2 + α_i + γ_i,2 + v_i2,2

educ_i2,1 = educ^*_i2 + α_i + γ_i,1 + v_i2,1

educ_i1,1 = educ^*_i1 + α_i + γ_i,1 + v_i1,1

educ_i1,2 = educ^*_i1 + α_i + γ_i,2 + v_i1,2

where the v terms are the error terms in this measurement error model, which we assume are uncorrelated with each other and with u.

As we did in the course notes, use these equations to substitute ("plug in") for the true education levels educ^*_it in the first-differenced equation containing educ^*_it: log(wage_i2) - log(wage_i1) = β₁(educ^*_i2 - educ^*_i1) + (u_i2 - u_i1) (e.g., substitute educ_i2,2 - α_i - γ_i,2 - v_i2,2 for educ^*_i2). Demonstrate that the differenced instrument (educ_i2,1 - educ_i1,2) is now correlated with the resulting error term in the first-differenced model.
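The substitution can be laid out as below. This is a sketch, not the official key: the composite error $e_i$ is a symbol introduced here, and the covariance line additionally assumes that the γ terms are uncorrelated with the true education levels, with u, and with the v terms.

```latex
% Substitute educ^*_{i2} = educ_{i2,2} - \alpha_i - \gamma_{i,2} - v_{i2,2}
%        and educ^*_{i1} = educ_{i1,1} - \alpha_i - \gamma_{i,1} - v_{i1,1}
\log(wage_{i2}) - \log(wage_{i1}) = \beta_1 (educ_{i2,2} - educ_{i1,1}) + e_i

e_i = (u_{i2} - u_{i1}) - \beta_1 (\gamma_{i,2} - \gamma_{i,1}) - \beta_1 (v_{i2,2} - v_{i1,1})

% The differenced instrument, written in terms of the same components
educ_{i2,1} - educ_{i1,2} = (educ^*_{i2} - educ^*_{i1}) - (\gamma_{i,2} - \gamma_{i,1}) + (v_{i2,1} - v_{i1,2})

% \alpha_i cancels, but (\gamma_{i,2} - \gamma_{i,1}) appears in both expressions
\mathrm{Cov}(educ_{i2,1} - educ_{i1,2},\; e_i) = \beta_1 \, \mathrm{Var}(\gamma_{i,2} - \gamma_{i,1})
```

The covariance is nonzero whenever $\beta_1 \neq 0$ and the two twins' reporting effects differ.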

C. Finally, you should recognize that there is an alternate differenced regressor and alternate differenced instrument such that the alternate instrument is uncorrelated with the resulting error term in the first-differenced model. Explain, making sure you explicitly state the alternate regressor and the alternate instrument.
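One candidate pairing, offered as a sketch to check against the course notes, is to difference reports made by the same twin, so that the reporter-specific effect γ cancels within each difference. The symbol $e_i^{alt}$ is introduced here for the resulting error term.

```latex
% Alternate regressor: both reports made by twin 2
educ_{i2,2} - educ_{i1,2} = (educ^*_{i2} - educ^*_{i1}) + (v_{i2,2} - v_{i1,2})

% Alternate instrument: both reports made by twin 1
educ_{i2,1} - educ_{i1,1} = (educ^*_{i2} - educ^*_{i1}) + (v_{i2,1} - v_{i1,1})

% Resulting first-differenced equation and error term
\log(wage_{i2}) - \log(wage_{i1}) = \beta_1 (educ_{i2,2} - educ_{i1,2}) + e_i^{alt}

e_i^{alt} = (u_{i2} - u_{i1}) - \beta_1 (v_{i2,2} - v_{i1,2})
```

Neither α_i nor the γ terms remain, and the instrument involves only v_i2,1 and v_i1,1, which are assumed uncorrelated with u and with the v terms in the error.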

4. In January 2005, Italy introduced regulations banning smoking in all indoor public places, with the aim of limiting the adverse health effects of second-hand smoke. Our research question concerns the effect of the smoking ban on hospital admissions for acute coronary events (aces). Acute coronary events are a short-term outcome with rapid onset, and we will assume that the acute effects of both active and passive smoking disappear quickly after the exposure is removed; that is, we assume that the effect of the ban on aces is immediate (rather than occurring with a lag, such as for cancer outcomes).

We examine time-series evidence on aces in one region of Italy. The data set smoking_ban.dta contains monthly data for the following variables:

year	years 2002-2006
month	months denoted 1-12
time	time variable = 1 in January 2002, and so on
aces	number of hospital admissions for acute coronary events
ban	dummy variable =1 starting in January 2005 when the smoking ban began
stdpop	age standardised population

(The age-standardized population is the number of individuals at risk for an acute coronary event, adjusted for differences in the age of the population in the governmental region over time.)

We assume the following model to estimate the effect of the smoking ban on aces.

(1) ln(aces_t) = β₀ + β₁ln(stdpop_t) + β₂ban_t + β₃time_t + θ_t + u_t

where θ_t is a set of monthly dummy variables.

A. Estimate the model using OLS. Give the precise interpretation for $\hat{\beta}_1$ and $\hat{\beta}_2$.

B. Test for AR(1) serial correlation. Explain the conclusion of your test.

C. Now estimate the same model using -newey- with the number of lags equal to 12 (-lag(12)-). Then, given these estimation results, test again for serial correlation. Did serial-correlation robust estimation work? Explain.

D. Now, I introduce a piece of new information. Our theory of risk for an ace implies that β₁ should equal 1. Estimate a version of (1) that incorporates the restriction β₁ = 1. Further, in accordance with your conclusions from the earlier parts of this question, apply either (i) OLS, (ii) serial-correlation robust standard errors, or (iii) serial-correlation robust standard errors after first-differencing.
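The steps in parts A through D might be set up roughly as follows. This is a sketch rather than a model answer: the generated names `ln_aces`, `ln_stdpop`, `uhat`, and `ln_rate` are hypothetical, and the choice of estimator in part D is left as plain OLS here only for illustration.

```stata
use smoking_ban, clear
tsset time

gen ln_aces   = ln(aces)
gen ln_stdpop = ln(stdpop)

* Part A: OLS with a linear trend and monthly dummies
regress ln_aces ln_stdpop ban time i.month

* Part B: AR(1) check, regressing the OLS residuals on their own lag
* and the regressors
predict uhat, residuals
regress uhat L.uhat ln_stdpop ban time i.month

* Part C: same specification with Newey-West standard errors, 12 lags
newey ln_aces ln_stdpop ban time i.month, lag(12)

* Part D: impose beta_1 = 1 by moving ln(stdpop) to the left-hand side
gen ln_rate = ln(aces) - ln(stdpop)
regress ln_rate ban time i.month
```

In part D, `regress` could be replaced by `newey` or by a regression on differenced variables, depending on what parts B and C suggest.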

5. Education level is thought to affect women's fertility. In particular, higher levels of education increase market wages available and thus increase the opportunity cost of time away from work rearing children. Suppose that our research question involved how women's fertility is affected by their education level but also whether there has been structural change in this effect over time. The data set kids.dta contains extracts of data from seven independent cross-sectional surveys of women from 35 to 54 years of age, in even numbered calendar years from 1972 to 1984. Type -describe- to get these variable descriptions:

variable name	variable label
year	72 to 84, even
educ	years of schooling
age	in years
kids	# children ever born
black	= 1 if black
east	= 1 if lived in east at 16
northcen	= 1 if lived in nc at 16
west	= 1 if lived in west at 16
farm	= 1 if on farm at 16
othrural	= 1 if other rural at 16
town	= 1 if lived in town at 16
smcity	= 1 if in small city at 16
meduc	mother's education
feduc	father's education

A. Suppose our model is (1)

kids_i = β₀ + β₁educ_i + β₂age_i + β₃age_i² + β₄black_i + β₅east_i + β₆northcen_i + β₇west_i + β₈farm_i + β₉othrural_i + β₁₀town_i + β₁₁smcity_i + θ_t + u_i

where the variables are as described above and θ_t represents dummy variables for years.

Estimate model (1) assuming that the zero conditional mean assumption holds, but obtain standard errors robust to heteroskedasticity. Give the precise interpretation of β₁.

B. Also of interest is whether there has been structural change over time in the effect of education on fertility. Estimate a model similar to (1) but modify it to allow the effect of education on fertility to be different in each year. Formulate and test the null hypothesis of no structural change in the effect of education on fertility and explain the result of your test.

C. Now revert back to equation (1) but recognize that education is likely an endogenous regressor. Explain briefly why we should suspect endogeneity.

D. Note that we have variables representing the education level of each woman's mother and father. Estimate the reduced form equation and verify that the identification condition is met and that the instruments are not weak. Then estimate (1) using mother's education and father's education as instruments for educ_i. Again, apply heteroskedasticity-robust estimation. Give the precise interpretation of β₁. Is the change in the estimate of β₁ relative to the estimate in Part A consistent with your explanation of the nature of the endogeneity above?

E. In the model estimated in part D, is a test of overidentifying restrictions possible? If not explain why not. If so, apply the test and briefly explain the conclusion you draw from the test.
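The estimation steps in parts A, B, D, and E could be sketched as below, using the variable names from the -describe- listing. This is an illustrative outline intended for a .do file (it uses `///` line continuation), not the course's official solution.

```stata
use kids, clear

* Part A: OLS with heteroskedasticity-robust standard errors
regress kids educ age c.age#c.age black east northcen west ///
    farm othrural town smcity i.year, vce(robust)

* Part B: let the education slope differ by year, then test the
* year-by-education interactions jointly
regress kids i.year##c.educ age c.age#c.age black east northcen west ///
    farm othrural town smcity, vce(robust)
testparm i.year#c.educ

* Part D: reduced form for educ and joint significance of the instruments
regress educ meduc feduc age c.age#c.age black east northcen west ///
    farm othrural town smcity i.year, vce(robust)
test meduc feduc

* Part D: 2SLS with both parents' education as instruments for educ
ivregress 2sls kids age c.age#c.age black east northcen west ///
    farm othrural town smcity i.year (educ = meduc feduc), vce(robust)

* Part E: two instruments for one endogenous regressor leave one
* overidentifying restriction to test
estat overid
```

After `ivregress`, `estat firststage` reports first-stage diagnostics that can supplement the joint test on `meduc` and `feduc`.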

6. Daylight Saving Time (DST) is well described by the phrase "spring forward, fall back." Each year on the spring transition date, clocks are moved forward by one hour, from 2 am to 3 am. This shifts clock time relative to solar time by an hour, moving sunlight from the morning to the evening. But springing forward for DST likely also reduces the amount of sleep people get, and this may have various deleterious effects, one of which is an increase in motor vehicle accidents.

The objective in this question is to estimate the effect of DST on the number of fatal accidents that occur in the United States. Our data set contains the total number of accidents per week involving one or more fatalities in the US from 2002 to 2011. The data set accidents_dst has the following variables (see the attached file for an illustrative listing):

year is obvious enough, again data include the years 2002-2011
accidents is the count of the number of fatal accidents in the U.S. during the given week
dst is a dummy variable =1 if the DST is in effect that week
week is a variable representing weeks of the year constructed such that DST always begins at the beginning of week 24; i.e., in each year dst is always first equal to 1 in week 24.
holiday is a variable representing the number of holidays that fall in a given week; we expect that weeks with a greater number of holidays will tend to have a higher number of accidents.

I ask you to use these data to estimate the effect of DST on fatal accidents, measured as ln(accidents), using the Regression Discontinuity Design model. Make certain that your model

includes fixed effects for years
incorporates the variable representing the number of holidays occurring in a week
includes the appropriate quadratic terms allowing for a non-linear relationship between the number of accidents and the forcing variable week.

In addition to estimating the model, give the precise interpretation of your estimate of the treatment effect of DST.
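A regression discontinuity specification meeting these requirements might be set up as follows. This is a sketch under assumptions: the generated names (`ln_acc`, `wk`, `wk2`, `dst_wk`, `dst_wk2`) are hypothetical, letting the quadratic differ on each side of the cutoff is one common RDD form that may not match the course notes exactly, and `vce(robust)` is an added choice rather than something the question specifies.

```stata
use accidents_dst, clear

gen ln_acc  = ln(accidents)
gen wk      = week - 24      // forcing variable centered at the cutoff (week 24)
gen wk2     = wk^2
gen dst_wk  = dst*wk         // linear term may differ once DST is in effect
gen dst_wk2 = dst*wk2        // quadratic term may differ once DST is in effect

* The coefficient on dst is the estimated jump in ln(accidents) at the cutoff
regress ln_acc dst wk wk2 dst_wk dst_wk2 holiday i.year, vce(robust)

* Exact percentage change in accidents implied by the dst coefficient
display 100*(exp(_b[dst]) - 1)
```

Because the forcing variable is centered at week 24, the coefficient on `dst` measures the discontinuity at the transition itself; multiplied by 100, it approximates the percentage change in weekly fatal accidents when DST begins.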

Attachment:- Assignment & Data Files and Course Notes.rar

Reviews

len2300081

5/6/2019 3:36:48 AM

Instructions: The attached assignment has to be done in Stata; three .dta files are attached that are the data sets for answering the assignment questions. The course notes are also attached. Please make sure the solutions are aligned with the course notes. Please send me the .do file as well.
