Reference no: EM132221907, Length: 2000 words
Assignment -
Project Topic: One of the topics I want to research is: "Has the violent crime rate in the United States fallen sharply or increased over the past quarter century (1993-2018)?" Today there are a number of influential factors, such as social networking platforms that people can access easily and that notify them immediately when an incident occurs. In 1993, people relied on landlines, newspapers, and television, and the lucky ones had a car phone or brick phone to communicate. I remember the debut of the cool, long-lasting Motorola StarTAC in 1996; those phones lasted FOREVER!
- Audience level: both technical and non-technical readers should be able to understand the data being presented.
- Property crime: has it truly declined over the long term? RColorBrewer (ROYGB scatter) makes it easy to take advantage of one of R's great strengths: manipulating colors in plots, graphs, and maps. I like being able to use color, especially when working with different time series; having the color reflect sequential and qualitative palettes adds more power to the story.
- Public perceptions about crime: do they truly align with the data? A sunburstR (d3.js) chart can be powerful for displaying how numbers shift over time.
- Geographic variations in crime types - city to city, states
- Map package: a Leaflet GeoJSON map offers a lightweight but powerful way to build interactive maps. I have used something similar in Tableau but never in R.
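As a rough sketch of the color idea in the list above, assuming the RColorBrewer package is installed: a qualitative palette can separate crime types in a scatter plot. The `toy` data frame and its values are invented placeholders, not real crime statistics.

```r
library(RColorBrewer)

# Placeholder data: every value here is invented for illustration only.
toy <- data.frame(
  year = rep(c(1993, 2003, 2013), times = 2),
  type = rep(c("violent", "property"), each = 3),
  rate = c(10, 8, 6, 40, 30, 20)
)

# Qualitative palette: brewer.pal() requires n >= 3, so keep the first two colors.
qual_cols <- brewer.pal(3, "Set2")[1:2]

# Map each row's crime type to a color (factor levels are sorted alphabetically).
type_factor <- factor(toy$type)
point_cols <- qual_cols[as.integer(type_factor)]

plot(toy$year, toy$rate, col = point_cols, pch = 19,
     xlab = "Year", ylab = "Rate (placeholder units)",
     main = "Toy crime rates by type")
legend("topright", legend = levels(type_factor), col = qual_cols, pch = 19)
```

Swapping "Set2" for a sequential palette such as "Blues" would suit an ordered variable (for example, coloring points by year) rather than an unordered category.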
Most crimes are not reported to police and therefore are not solved. People have a variety of reasons for not reporting: they feel it is a personal issue, or they believe the police would not or could not do anything to help them.
Correlations and/or standard deviation: see if I can find US API connections to credit, income, weather, or previous acts of violence data, to check whether crime rates are related to any of them.
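A minimal sketch of that calculation in base R, assuming the external data has already been joined into one data frame. The column names `income` and `crime_rate` and all of the numbers are hypothetical, chosen only to show the calls.

```r
# Toy numbers invented for illustration; not real statistics.
toy <- data.frame(
  income = c(30, 40, 50, 60, 70),
  crime_rate = c(9, 8, 6, 5, 3)
)

# Standard deviation: how spread out each variable is.
sd_income <- sd(toy$income)
sd_crime <- sd(toy$crime_rate)

# Pearson correlation between the two columns (always between -1 and 1).
r <- cor(toy$income, toy$crime_rate)

# Pairwise correlation matrix across all numeric columns.
cor_matrix <- cor(toy)

print(round(c(sd_income = sd_income, sd_crime = sd_crime, r = r), 3))
print(round(cor_matrix, 3))
```

A strong correlation here would only show that two series move together; it would not by itself show that one causes the other.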
Sources for data (here are a few I gathered) that can be used, or explore others from the list below:
- Kaggle
- Gun violence
- School shootings
- Homicide Reports
- Open Data Network: FBI Uniform Crime Reporting
Section 1 -
1. Identify at least 3 different datasets and perform some initial exploration.
Potential Sources for Datasets to list:
- Kaggle
- Open Data Network
- American Community Survey
- Bureau of Labor Statistics
- Bureau of Economic Analysis
- Open Data Cincinnati
- Data.gov
- Healthdata.gov
- Amazon Web Services Datasets
- The General Social Survey
2. Put into practice what you have learned so far in the course:
a. Importing the data.
b. Identifying and reviewing the schema (codebook) (if available).
c. Identifying missing data.
d. Learn about the data visually (plotting) and numerically (descriptive stats).
3. Complete Section 1 of the Final Project Template and submit by the end of the week: Report and discuss all of your calculations and critiques using R Markdown.
- Original source where the data was obtained is cited and, if possible, hyperlinked.
- Source data is thoroughly explained (i.e. what was the original purpose of the data, when was it collected, how many variables did the original have, explain any peculiarities of the source data such as how missing values are recorded, or how data was imputed, etc.).
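The exploration steps in item 2 (import, review the structure, find missing data, describe numerically and visually) might look roughly like this in base R. The inline `csv_text` is a made-up stand-in for a downloaded file, and its column names are assumptions; with a real dataset the call would be `read.csv()` on the file path instead.

```r
# Stand-in for a downloaded CSV; values are invented for illustration only.
csv_text <- "state,year,violent_rate
A,1993,10
A,2018,
B,1993,12
B,2018,7"

# a. Import the data (a blank numeric field is read as NA).
crimes <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# b. Review the structure: column names, types, and a few values.
str(crimes)

# c. Count missing values per column.
missing_per_column <- colSums(is.na(crimes))
print(missing_per_column)

# d. Numeric description, then a quick visual check.
print(summary(crimes$violent_rate))
hist(crimes$violent_rate,
     xlab = "Violent crime rate (placeholder units)",
     main = "Distribution of toy rates")
```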
Section 2 -
- Explain how your analysis may help the consumer of your research findings (recall your target audience from Section 1).
- What types of plots and tables will help you to illustrate the findings to your research questions?
- What do you not know how to do right now that you need to learn to answer your research questions?
Section 3 -
- Data importing and cleaning steps are explained in the text and in the DataCamp exercises (tell me why you are doing the data cleaning activities that you perform) and follow a logical process.
- With a clean dataset, show what the final data set looks like. However, do not print off a data frame with 200+ rows; show me the data in the most condensed form possible.
- What do you not know how to do right now that you need to learn to import and clean up your dataset?
Report and discuss all of your calculations and critiques using R Markdown.
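One possible shape for the cleaning and condensed-display steps above, sketched in base R. The `raw` data frame is invented, and the assumption that `-99` marks a missing value is hypothetical; a real source's codebook would say how missing values are recorded.

```r
# Invented messy input: inconsistent case, stray spaces, text years, a sentinel value.
raw <- data.frame(
  State = c("a ", "B", "b", "A"),
  Year = c("1993", "1993", "2018", "2018"),
  Rate = c(10, -99, 7, 8),  # assumption: -99 is this toy source's code for "missing"
  stringsAsFactors = FALSE
)

clean <- raw
# Consistent lower-case column names are easier to type and to join on.
names(clean) <- tolower(names(clean))
# Trim whitespace and unify case so "a " and "A" count as the same state.
clean$state <- toupper(trimws(clean$state))
# Years stored as text cannot be sorted or plotted numerically.
clean$year <- as.integer(clean$year)
# Recode the sentinel to NA so it does not distort means and plots.
clean$rate[clean$rate == -99] <- NA

# Condensed views: structure plus the first few rows, not the whole data frame.
str(clean)
print(head(clean, 3))
```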
Section 4 -
Discuss how you plan to uncover new information in the data that is not self-evident.
- What are different ways you could look at this data to answer the questions you want to answer?
- Do you plan to slice and dice the data in different ways, create new variables, or join separate data frames to create new summary information? Explain.
- How could you summarize your data to answer key questions?
- What types of plots and tables will help you to illustrate the findings to your questions? Ensure that all plots have axis titles, a legend if necessary, appropriate scales, appropriate geoms, etc.
- What do you not know how to do right now that you need to learn to answer your questions?
- Do you plan on incorporating any machine learning techniques to answer your research questions? Explain.
Some additional questions you may want to consider as you work through this section of the project:
1) What features could you filter on?
2) How could arranging your data in different ways help?
3) Can you reduce your data by selecting only certain variables?
4) Could creating new variables add new insights?
5) Could summary statistics at different categorical levels tell you more?
6) How can you incorporate the pipe (%>%) operator to make your code more efficient?
7) Report and discuss all of your calculations and critiques using R Markdown.
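Questions 1 through 6 above map fairly directly onto dplyr verbs chained with the pipe. The following is a sketch only, assuming the dplyr package is installed; the `toy` data frame, its column names, and its values are invented for illustration.

```r
library(dplyr)

# Invented placeholder data; not real crime counts or populations.
toy <- data.frame(
  state = c("A", "A", "B", "B", "C", "C"),
  year = c(1993, 2018, 1993, 2018, 1993, 2018),
  incidents = c(100, 60, 300, 240, 50, 55),
  population = c(1000, 1200, 2000, 2400, 500, 500),
  note = "placeholder"
)

by_year <- toy %>%
  filter(state != "C") %>%                             # 1) filter on a feature
  select(state, year, incidents, population) %>%       # 3) keep only needed variables
  mutate(rate = incidents / population * 1000) %>%     # 4) new variable: rate per 1,000
  group_by(year) %>%                                   # 5) summarise at a categorical level
  summarise(mean_rate = mean(rate), n_states = n()) %>%
  arrange(desc(mean_rate))                             # 2) arrange to surface the highest

print(by_year)
```

Each step passes its result to the next, so the chain reads top to bottom in the order the transformations happen and needs no intermediate variables.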
Section 5 -
- Overall, write a coherent narrative that tells a story with the data as you complete this section.
- Summarize the problem statement you addressed.
- Summarize how you addressed this problem statement (the data used and the methodology employed).
- Summarize the interesting insights that your analysis provided.
- Summarize the implications to the consumer (target audience) of your analysis.
- Discuss the limitations of your analysis and how you, or someone else, could improve or build on it.
- In addition, submit your completed Project using R Markdown or provide a link to where it can also be downloaded from and/or viewed.
Note - A 2000-word report is needed. There are five sections in total, and assistance is needed in RStudio; it is a project with milestones.
Attachment: Assignment File.rar