Create an intermediary report focused on model building

Lab: Reducing Crime

Introduction - Your team has been hired to provide research for a political campaign. The campaign has obtained a dataset of crime statistics for a selection of counties in North Carolina.

Your task is to examine the data to help the campaign understand the determinants of crime and to generate policy suggestions that are applicable to local government.

You may work in a team of up to 3 students. This is not a requirement, but we strongly encourage you to form a group and believe it will add considerable value to the exercise.

When working in a group, do not use a "division-of-labor" approach to complete the lab. All students should participate in all aspects of the final report.

Timeline - The lab takes place over three weeks, with a deliverable due each week.

Stage 1: Draft Report. You will create an intermediary report focused on model building but without statistical inference (no standard errors).

Stage 2: Peer Feedback. Teams will exchange reports and provide each other with feedback.

Stage 3: Final Report. You will create a final report, which includes a complete assessment of the classical linear model assumptions, standard errors, and other elements of statistical inference.

The Data - The data is provided in a file, crime.csv. It was first used in a study by Cornwell and Trumbull, researchers from the University of Georgia and West Virginia University, respectively.

Stage 1: Draft Report -

In the first stage of the project, you will create a draft report that addresses the concerns of the political campaign. Your report will include a model building process, culminating in a well-formatted regression table that displays a minimum of three model specifications. In fact, your draft report will be very similar in structure to your final report, but it won't include standard errors or a full assessment of the classical linear model assumptions, which we will cover in units 12 and 13.

Here are some things to keep in mind during your model building process:

1. What do you want to measure? Make sure you identify variables that will be relevant to the concerns of the political campaign.

2. What transformations should you apply to each variable? This is very important because transformations can reveal linearities in the data, make our results relevant, or help us meet model assumptions.

3. Are your choices supported by EDA? You will likely start with some general EDA to detect anomalies (missing values, top-coded variables, etc.). From then on, your EDA should be interspersed with your model building. Use visual tools to guide your decisions.

4. What covariates help you identify a causal effect? What covariates are problematic, either due to multicollinearity, or because they will absorb some of a causal effect you want to measure?
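As a reminder for point 2, the natural log is one common transformation, and it changes how a slope is read. The summary below is standard regression algebra rather than anything specific to crime.csv; y and x are generic placeholders, and the approximations are reasonable only for small changes.

```latex
\begin{align*}
\text{level-level:} \quad & y = \beta_0 + \beta_1 x + u, & \Delta y &= \beta_1 \,\Delta x \\
\text{log-level:} \quad & \log(y) = \beta_0 + \beta_1 x + u, & \%\Delta y &\approx (100\,\beta_1)\,\Delta x \\
\text{level-log:} \quad & y = \beta_0 + \beta_1 \log(x) + u, & \Delta y &\approx (\beta_1/100)\,\%\Delta x \\
\text{log-log:} \quad & \log(y) = \beta_0 + \beta_1 \log(x) + u, & \%\Delta y &\approx \beta_1\,\%\Delta x
\end{align*}
```

Whether any of these forms is appropriate for a given variable is a judgment your EDA should support.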

At the same time, it is important to remember that you are not trying to create one perfect model. You will create several specifications, giving the reader a sense of how robust your results are (that is, how sensitive they are to modeling choices) and showing that you're not just cherry-picking the specification that leads to the largest effects.

At a minimum, you should include the following three specifications:

  • One model with only the explanatory variables of key interest (possibly transformed, as determined by your EDA), and no other covariates.
  • One model that includes key explanatory variables and only covariates that you believe increase the accuracy of your results without introducing substantial bias (for example, you should not include outcome variables that will absorb some of the causal effect you are interested in). This model should strike a balance between accuracy and parsimony and reflect your best understanding of the determinants of crime.
  • One model that includes the previous covariates, and most, if not all, other covariates. A key purpose of this model is to demonstrate the robustness of your results to model specification.
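The logic of the three specifications can be illustrated with a minimal sketch on synthetic data. The lab itself is to be written in R on crime.csv; the Python below is only an illustration, and every name in it (policy, control, extra, outcome, the ols helper) and every number used to simulate the data is a made-up assumption. It fits a key-variable-only model, a model with one relevant covariate, and a model with an additional irrelevant covariate, then prints the coefficient on the key variable in each.

```python
import random

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]  # prepend intercept
    k = len(X[0])
    # Augmented system [X'X | X'y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic data (hypothetical, NOT crime.csv): 'control' is correlated with
# 'policy' and affects the outcome; 'extra' is unrelated to everything.
random.seed(0)
n = 500
policy = [random.gauss(0, 1) for _ in range(n)]
control = [0.6 * p + random.gauss(0, 1) for p in policy]
extra = [random.gauss(0, 1) for _ in range(n)]
outcome = [2.0 - 0.5 * p + 0.8 * c + random.gauss(0, 0.1)
           for p, c in zip(policy, control)]

spec1 = ols([policy], outcome)                  # key variable only
spec2 = ols([policy, control], outcome)         # plus the relevant covariate
spec3 = ols([policy, control, extra], outcome)  # plus everything else

for label, coefs in [("(1)", spec1), ("(2)", spec2), ("(3)", spec3)]:
    print(f"{label} coefficient on policy = {coefs[1]:.3f}")
```

In this simulated setup the first specification differs noticeably from the other two because it omits a covariate that is correlated with the key variable, while the second and third roughly agree. Seeing that kind of pattern across columns is what the robustness comparison is for.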

Guided by your background knowledge and your EDA, other specifications may make sense. You are trying to choose points that encircle the space of reasonable modeling choices, to give an overall understanding of how these choices impact results.

You will display all of your model specifications in a regression table, using a package like stargazer to format your output. It should be easy for the reader to find the coefficients that represent key effects near the top of the regression table, and scan horizontally to see how they change from specification to specification. Since we won't cover inference for linear regression until unit 12, you should not display any standard errors at this point. You should also avoid conducting statistical tests for now (but please do point out what tests you think would be valuable).
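The layout described above (key coefficients at the top, one column per specification, no standard errors) can be sketched in a few lines. In the lab you would use an R package such as stargazer; the Python helper below, coefficient_table, is a hypothetical stand-in that only illustrates the layout, and the coefficients passed to it are placeholder numbers, not results from crime.csv.

```python
def coefficient_table(models, key_terms):
    """Return a plain-text table: one row per term, one column per specification.

    models: dict mapping a specification label to a {term: coefficient} dict.
    key_terms: list of terms to show first, so key effects sit at the top.
    """
    others = sorted({t for m in models.values() for t in m} - set(key_terms))
    terms = list(key_terms) + others
    labels = list(models)
    width = max(len(t) for t in terms)
    lines = [" " * width + "".join(f"{lab:>12}" for lab in labels)]
    for term in terms:
        cells = "".join(
            f"{models[lab][term]:>12.3f}" if term in models[lab] else " " * 12
            for lab in labels
        )
        lines.append(f"{term:<{width}}" + cells)
    return "\n".join(lines)

# Placeholder coefficients for three hypothetical specifications.
models = {
    "(1)": {"policy": -0.020, "intercept": 2.010},
    "(2)": {"policy": -0.500, "control": 0.800, "intercept": 2.000},
    "(3)": {"policy": -0.499, "control": 0.801, "extra": 0.001, "intercept": 2.000},
}
table = coefficient_table(models, ["policy"])
print(table)
```

Leaving a cell blank when a term is absent from a specification makes it easy to scan a row horizontally and see both where a variable enters and how its coefficient moves.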

After your model building process, you should include a substantial discussion of omitted variables. Identify what you think are the 5-10 most important omitted variables that bias results you care about. For each variable, you should estimate what direction the bias is in. If you can argue whether the bias is large or small, that is even better. State whether you have any variables available that may proxy (even imperfectly) for the omitted variable. Pay particular attention to whether each omitted variable bias is towards zero or away from zero. You will use this information to judge whether the effects you find are likely to be real, or whether they might be entirely an artifact of omitted variable bias.
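The direction-of-bias reasoning can be written out for the simplest case. The derivation below is the textbook result for one included regressor x_1 and one omitted variable x_2; the symbols are generic, and with additional covariates the same logic applies only approximately, through partial relationships.

```latex
\begin{align*}
\text{True model:} \quad & y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \\
\text{Omitted variable on included:} \quad & x_2 = \delta_0 + \delta_1 x_1 + v \\
\text{Substituting:} \quad & y = (\beta_0 + \beta_2 \delta_0) + (\beta_1 + \beta_2 \delta_1)\, x_1 + (\beta_2 v + u) \\
\text{So the short regression gives:} \quad & E[\tilde{\beta}_1] = \beta_1 + \beta_2 \delta_1,
\qquad \text{bias} = \beta_2 \delta_1
\end{align*}
```

The sign of the bias is the sign of \beta_2 times the sign of \delta_1. If the bias has the same sign as \beta_1, the estimate is pushed away from zero; if it has the opposite sign, the estimate is pulled toward zero (and, if the bias is large enough, past it).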

Do not use techniques that are not covered in this course, unless you have obtained prior approval.

Stage 2: Peer Feedback -

In Stage 2, you will provide feedback on another team's draft report. We will ask you to comment separately on different sections. The following list is very similar to the rubric we will use when grading your final report.

1. Introduction. As you understand it, what is the motivation for this team's report? Does the introduction as written make the motivation easy to understand? Is the analysis well-motivated? Note that we're not necessarily expecting a long introduction. Even a single paragraph is probably enough for most reports.

2. The Initial EDA. Is the EDA presented in a systematic and transparent way? Did the team notice any anomalous values? Is there a sufficient justification for any data points that are removed? Did the report note any coding features that affect the meaning of variables (e.g. top-coding or bottom-coding)? Can you identify anything the team could do to improve its understanding or treatment of the data?

3. The Model Building Process. Overall, is each step in the model building process supported by EDA? Is the outcome variable (or variables) appropriate? Did the team consider available variable transformations and select them with an eye towards model plausibility and interpretability? Are transformations used to expose linear relationships in scatterplots? Is there enough explanation in the text to understand the meaning of each visualization?

4. The Regression Table. Are the model specifications properly chosen to outline the boundary of reasonable choices? Is it easy to find key coefficients in the regression table? Does the text include a discussion of practical significance for key effects?

5. The Omitted Variables Discussion. Did the report miss any important sources of omitted variable bias? For each omitted variable, is there a complete discussion of the direction of bias? Are the estimated directions of bias correct? Does the team consider possible proxy variables, and if so, do you find these choices plausible? Is the discussion of omitted variables linked back to the presentation of main results? In other words, does the team adequately re-evaluate its estimated effects in light of the sources of bias?

6. Conclusion. Does the conclusion address the big-picture concerns that would be at the center of a political campaign? Does it raise interesting points beyond numerical estimates? Does it place relevant context around the results?

7. Throughout the report, do you find any errors, faulty logic, unclear or unpersuasive writing, or other elements that leave you less convinced by the conclusions?

Please be thorough and read the report critically, actively trying to find weaknesses. Your comments will directly help your peers get the most value out of the project.

Stage 3: Final Report -

In the final stage of the project, you will incorporate the feedback you receive, and use what you've learned about OLS inference to create a final report.

One of the most important tasks at this stage is to add valid standard errors to your regression table.

In a new section of the report, please choose one of your most important model specifications, and present a detailed assessment of all 6 classical linear model assumptions. Use plots and other diagnostic tools to assess whether the assumptions appear to be violated, and follow best practices in responding to any violations you find. Note that we only want to see this level of detail for one model specification.

For the other specifications, you should also conduct a full assessment of the CLM assumptions, but highlight in your text only the major surprises that you notice.

Note that you may need to change your model specifications in response to violations of the CLM. At this point, you should also consider whether changes are appropriate to decrease standard errors for your estimates. These decisions involve tradeoffs and you should strive to be transparent about them in your report.

Note also that you may need to adjust your conclusions in response to statistical significance. Make sure that you discuss both statistical and practical significance for your key effects of interest.

You may want to include statistical tests besides the standard t-tests for regression coefficients.

We will assess your final report using a rubric that includes the elements listed above. We will also consider whether you have correctly included elements of statistical inference in your report. In particular, we will look to see whether you have correctly assessed the CLM assumptions and whether you have responded appropriately to any violations.

Submission Requirements (Stage 1) - Submit your lab via ISVC; please do not submit via email. Each group only needs to submit one report. That report must include two files:

1. A pdf file including the summary, the details of your analysis, and all the R code used to produce the analysis. Please do not suppress the code in your pdf file.

2. The Rmd source file used to produce the pdf file.

Be sure to include the names of all team members in your report. Place the word ‘draft’ in the file names. Page limit for Stage 1: 20 pages. Please observe these requirements carefully: we will deduct points from your score if they are not met.

Submission Requirements (Stage 3) - The submission requirements are the same as for Stage 1, except for the following: Page limit for Stage 3: 30 pages. Place the word ‘final’ in the file names.