Reference no: EM131095216
Major Assignment 1: Educational Attainment in Canada
Use the same data as you used for Class Assignment 2.
You should already have generated a years-of-schooling variable.
1. Tabulate your estimate of overall years of schooling against highest degree obtained - Table 1. Do the results make sense? Explain.
2. Produce a table showing the means, standard deviations, minimum and maximum of: (a) overall years of schooling; (b) age; and (c) total income for men and women separately - Table 2. Are men or women more educated on average in your sample?
Hint: first generate an age variable from the midpoint of each age group.
generate age = 27 if AGEGRP==[value for agegroup 25-29]
replace age = 32 if AGEGRP==[value for agegroup 30-34]
.... etc
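One possible way to produce the numbers for Table 2 is sketched below. It assumes that yrschool and age already exist and that total income is stored in a variable called TOTINC; that name is an assumption, so check it against your own file. The two special codes it excludes are the ones described in the note on total income.

```stata
* Sketch only. TOTINC is an assumed variable name; yrschool, age and SEX
* are the names used elsewhere in this assignment.
* 8888888 = not available, 9999999 = not applicable (see the income note).
generate totinc_clean = TOTINC if !inlist(TOTINC, 8888888, 9999999)

* Mean, standard deviation, minimum and maximum, separately by sex.
tabstat yrschool age totinc_clean, by(SEX) statistics(mean sd min max) columns(statistics)
```

Check the value labels on SEX before deciding which block of the output refers to men and which to women.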
3. Construct a graph showing mean years of schooling by age, with separate lines for men and women - Figure 1.
i. preserve
ii. collapse (mean) yrschool, by(AGEGRP SEX)
iii. tabstat yrschool if SEX==1, by(AGEGRP)
iv. twoway (line yrschool AGEGRP if SEX==1, lcolor(red)) (line yrschool AGEGRP if SEX==2, lcolor(blue)), ytitle("Average years of education") xtitle("Age group")
v. restore
vi. Note: you can graph the data either in STATA, or by copying the results from the table into Excel or another spreadsheet package and graphing them there.
4. Construct a similar graph showing the percentage of the cohort that has achieved a particular highest level of education (your choice) - Figure 2
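Figure 2 can follow the same preserve / collapse / restore pattern as Figure 1. The sketch below is only an illustration: hdgree and deg_code are placeholders, not names or codes taken from the data file, so substitute your own highest-degree variable and the code of the level you chose.

```stata
* Sketch only, written for a do-file (/// continues a line).
* hdgree and deg_code are hypothetical placeholders; replace them with the
* real highest-degree variable and the code for your chosen level, and set
* any "not available" codes to missing first.
local deg_code 0

preserve
* 100 if the person's highest level is the chosen one, 0 otherwise.
generate has_level = 100 * (hdgree == `deg_code') if !missing(hdgree)
* The mean of a 0/100 indicator is the percentage of the cohort.
collapse (mean) pct_level = has_level, by(AGEGRP SEX)
twoway (line pct_level AGEGRP if SEX==1, lcolor(red)) ///
       (line pct_level AGEGRP if SEX==2, lcolor(blue)), ///
       ytitle("Percent with chosen level") xtitle("Age group")
restore
```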
5. Briefly describe the trends you see in your graphs.
6. Run two Mincerian log income regressions using only education, experience and experience squared, one for men and one for women.
You will need to first create an (approximate) experience variable (equal to age, less years of schooling, less 6), and then experience squared
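A minimal sketch of these steps follows. It assumes the income variable is called TOTINC (an assumed name) and that age and yrschool already exist; exper, exper2 and lninc are names chosen here for illustration only.

```stata
* Sketch only. TOTINC is an assumed variable name; exper, exper2 and lninc
* are illustrative names.
generate exper  = age - yrschool - 6     // approximate (potential) experience
generate exper2 = exper^2

* The log is defined only for positive incomes; also drop the special codes
* 8888888 (not available) and 9999999 (not applicable).
generate lninc = ln(TOTINC) if TOTINC > 0 & !inlist(TOTINC, 8888888, 9999999)

* One regression per sex; check the value labels to see which code is which.
regress lninc yrschool exper exper2 if SEX == 1
regress lninc yrschool exper exper2 if SEX == 2
```

Observations with zero or negative total income drop out of the log specification, which is worth mentioning when you describe your sample.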
7. Run a wage regression including some additional controls (your choice). Explain why you chose these.
8. Run a wage regression that might help to identify nonlinearities in the return to education. Is there any evidence of this?
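One common way to look for nonlinearity, though not the only one, is to add a squared schooling term. The sketch below assumes you have already created a log income variable and the two experience variables, here called lninc, exper and exper2 (illustrative names).

```stata
* Sketch only. lninc, exper and exper2 are illustrative names for variables
* you created for the basic regressions.
* c.yrschool##c.yrschool enters yrschool and yrschool squared.
regress lninc c.yrschool##c.yrschool exper exper2 if SEX == 1

* Estimated return to one more year of schooling at selected schooling levels.
margins, dydx(yrschool) at(yrschool = (8 12 16))
```

Repeat with SEX == 2 for the other group. Replacing years of schooling with a set of highest-degree dummies is another option and may be easier to interpret.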
9. Report the results of the wage regressions you have done in a table similar to those in economics journal articles - Table 3
Note: your table should have 6 columns - three regressions each for men and women.
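As a starting point for Table 3, stored estimates can be lined up side by side with built-in commands. The sketch below shows two of the six columns and again assumes illustrative variable names (lninc, exper, exper2); the stored-result names basic_1 and basic_2 are also just examples.

```stata
* Sketch only. lninc, exper and exper2 are illustrative variable names.
regress lninc yrschool exper exper2 if SEX == 1
estimates store basic_1

regress lninc yrschool exper exper2 if SEX == 2
estimates store basic_2

* Coefficients with standard errors beneath, plus sample size and R-squared.
estimates table basic_1 basic_2, b(%9.3f) se(%9.3f) stats(N r2)
```

Store the other four regressions the same way and add their names to the estimates table command, then tidy the output in your word processor.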
10. Interpret the coefficient on the education variable in the basic regression specification, for both men and women. What sorts of problems are there in interpreting this figure as a return to education?
11. What is your estimated effect of another year of experience? To answer this, draw a graph showing the marginal effect of an additional year of experience on income (for years of experience ranging from 1 to 40 years), with separate lines for women and men.
Note: you should do this in a separate spreadsheet; write out your estimating equation very clearly before you try it.
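As a guide to writing out the estimating equation, the marginal effect in the basic specification can be derived as follows. The notation matches the Mincerian equation used in this course; which of the two versions below you graph is your choice, as the assignment does not specify it.

```latex
% Basic specification
\ln Y = a + b_1\,ED + b_2\,EXP + b_3\,EXP^2

% Marginal effect of experience (derivative form)
\frac{\partial \ln Y}{\partial EXP} = b_2 + 2\,b_3\,EXP

% Exact change in ln Y when experience rises from EXP to EXP + 1
\Delta \ln Y = b_2 + b_3\,(2\,EXP + 1)
```

In the spreadsheet, evaluate the chosen expression for EXP = 1, ..., 40, using the estimated b2 and b3 from the men's regression for one line and from the women's regression for the other.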
12. Interpret the coefficient on the extra variable you included in your regression. What do you think this means?
13. Was there any evidence of non-linearity in the returns to education?
14. Overall, how do your estimates look compared with other estimates of the Mincerian log wage regression we have looked at?
15. There are arguments that this type of regression will not provide a good estimate of the causal effects of an additional year of education on income. What is the key reason for this? What sorts of methods have applied economists used to try to estimate a causal effect of education on income? Have they shown very different results, in general? Discuss briefly.
Assignment 2: Common comments.
Describe the chart. Including a title is also good practice.
Significant digits
"Holding all other factors constant" - what other factors exactly are you holding constant, here? (Answer: none, so it is just a correlation.)
Men and women vs male and female. (Please use men and women as nouns. BTW, also don't call me "miss")
Did not run the correct regression for men and women together. (The interaction between the female dummy and years of education was missing.)
Potential experience is not the same as age (see do file).
Prediction incorrect (see spreadsheet).
Mincerian log wage regression: lnY = a + b1*ED + b2*EXP + b3*EXP^2
Problem with the maximum?
Note on Total Income:
https://odesi1.scholarsportal.info/webview/index/en/Odesi/ODESI-Click-to-View-Categories-.d.6/Social-Surveys.d.30/CANADA.d.31/National-Household-Survey-NHS-.d.1315/2011.d.1318/Public-Use-Microdata-File-PUMF-.d.1417/National-Household-Survey-2011-Canada-Public-Use-Microdata-File-PUMF-Individuals-File.s.NHS-99M001X-E-2011-pumf-individuals/Income.h.13/Income-Total-income/fVariable/NHS-99M001X-E-2011-pumf-individuals_V117
"The value 8,888,888 stands for not available. The value 9,999,999 stands for not applicable and is applied to all persons aged less than 15 years. Otherwise, this variable could be positive, negative or zero and is a rounded value of the amount received by the individual in 2010. Values that would have been rounded to zero have been replaced by 1. In some cases, high values have been top coded and low values have been bottom coded in this file."
From what I can tell, they have pretty much done what I guessed: total income appears to be top coded at approximately the top-percentile cutoff by province and sex. (E.g., for BC, 1.15% of women are recorded as earning 235725.) There is also some grouping of incomes evident for the 98th-99th percentiles (incomes are recorded to the nearest $1,000 or $10,000, except for the top 2%, which are recorded to $10 or $5). It is likely that whenever only 1-5 people would have had a particular income, the value was rounded.
This course is about the economics of education; the assignment may use STATA 13.
https://goo.gl/pExOOU