Assignment
In this assignment, you will answer questions using the data exploration, data management, and descriptive analysis techniques learnt in class. The dataset concerns mass shootings in the US (Mass Shootings in America). It includes 307 cases of mass shootings as described in the news media, with information collected from various electronic sources. Some cases go back as far as 1966.
Answer the questions in a separate document. Start with a title page listing both names (if you work as a team), and include all relevant information, such as tables and figures, where indicated. When completed, upload your document to Brightspace by the deadline.
1. Briefly discuss (half page, single spaced) the limitations inherent to the sampling method used for this study, as they pertain to external validity: To what extent is it possible to generalize from this sample? What are the primary obstacles to doing so?
2. Report the prevalence of missing data for the variable linked to the age of the shooter (ShooterAges). Include raw frequency and percentage.
3. For the same variable, determine whether there is a pattern of missingness by documenting its associations with race (Race), outcome for the shooter (Fate) and number of victims (TotalNumberofVictims), as these factors may influence what information is reported in the media. For Race and Fate, use the versions of the variables located at the end of the dataset.
4. What is the average age of the shooters (ShooterAges), and how is this variable distributed? Can we speak of a normal distribution in this case?
5. Using only cases with valid data (i.e., where the information is available), in what proportion of cases did the shooter commit suicide (Killedself)?
6. What would be the best way to provide descriptive statistics on the variable number of firearms (TotalNumberofGuns), based on its distribution?
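As a starting point for questions 2, 3 and 6, the mechanics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records in `rows` are invented toy values (not taken from the real dataset), the Fate category labels are made up, the helper names are hypothetical, and the |skewness| > 1 cut-off is a common rule of thumb rather than a requirement of the assignment.

```python
from collections import Counter
from statistics import mean, median, pstdev, quantiles

# Hypothetical toy records; values and Fate labels are invented for illustration.
# None marks a missing value.
rows = [
    {"ShooterAges": 23,   "Fate": "Killed",   "TotalNumberofGuns": 1},
    {"ShooterAges": None, "Fate": "Arrested", "TotalNumberofGuns": 1},
    {"ShooterAges": 41,   "Fate": "Suicide",  "TotalNumberofGuns": 2},
    {"ShooterAges": 19,   "Fate": "Arrested", "TotalNumberofGuns": 1},
    {"ShooterAges": None, "Fate": "Arrested", "TotalNumberofGuns": 3},
    {"ShooterAges": 35,   "Fate": "Suicide",  "TotalNumberofGuns": 1},
    {"ShooterAges": 28,   "Fate": "Killed",   "TotalNumberofGuns": 2},
    {"ShooterAges": 52,   "Fate": "Arrested", "TotalNumberofGuns": 9},
]


def missing_summary(rows, column):
    """Return (raw frequency, percentage) of missing values in `column`."""
    n_missing = sum(1 for r in rows if r[column] is None)
    return n_missing, 100.0 * n_missing / len(rows)


def missing_by_group(rows, column, group):
    """Percentage of missing values in `column` within each level of `group`."""
    totals, missing = Counter(), Counter()
    for r in rows:
        totals[r[group]] += 1
        if r[column] is None:
            missing[r[group]] += 1
    return {g: 100.0 * missing[g] / totals[g] for g in totals}


def skewness(values):
    """Moment-based (population) skewness; assumes the values are not all equal."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def describe(values, threshold=1.0):
    """Report mean/SD if roughly symmetric, median/IQR if strongly skewed.

    The threshold is a rule-of-thumb assumption, not a fixed standard.
    """
    skew = skewness(values)
    if abs(skew) > threshold:
        q1, _, q3 = quantiles(values, n=4)
        return {"skewness": skew, "median": median(values), "iqr": q3 - q1}
    return {"skewness": skew, "mean": mean(values), "sd": pstdev(values)}


# Valid (non-missing) values only.
ages = [r["ShooterAges"] for r in rows if r["ShooterAges"] is not None]
guns = [r["TotalNumberofGuns"] for r in rows if r["TotalNumberofGuns"] is not None]

print(missing_summary(rows, "ShooterAges"))
print(missing_by_group(rows, "ShooterAges", "Fate"))
print(describe(ages))
print(describe(guns))
```

With the real data, the same logic applies once the file is loaded and missing entries are coded consistently. A histogram or a formal normality test would complement the skewness check.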
Attachment: Assignment Files.rar