Data Cleaning-Preparation and Visualization Lab Assignment


Lab - Data Cleaning/Preparation and Visualization

Stats 10: Introduction to Statistical Reasoning

Objectives -

1. Understand logical statements and subsetting.

2. Reinforce knowledge on visualization techniques.

Exercise 1 - We will be working with lead and copper data collected from residents of Flint, Michigan, in January and February 2017. Data are reported in PPB (parts per billion, or µg/L) from each residential testing kit. Remember that "Pb" denotes lead and "Cu" denotes copper.

a. Download the data from CCLE and read it into R. When you read in the data, name your object "flint".

b. The EPA states a water source is especially dangerous if the lead level is 15 PPB or greater. What proportion of the locations tested were found to have dangerous lead levels?

c. Report the mean copper level for only test sites in the North region.

d. Report the mean copper level for only test sites with dangerous lead levels (at least 15 PPB).

e. Report the mean lead and copper levels.

f. Create a box plot with a good title for the lead levels.

g. Based on what you see in part (f), does the mean seem to be a good measure of center for the data? Report a more useful statistic for this data.
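The techniques behind parts (b) through (g) can be sketched on a small made-up data frame. This is only an illustration, not the assignment's data or its answers: the object flint_toy and the column names Pb, Cu, and Region are assumptions, so check names(flint) on the real file before adapting anything.

```r
# Toy stand-in for the Flint data. The column names Pb, Cu, and Region
# are assumptions; the real file may use different names.
flint_toy <- data.frame(
  Pb     = c(0, 1, 2, 3, 15, 22, 4, 150),
  Cu     = c(10, 50, 120, 300, 20, 80, 40, 60),
  Region = c("North", "South", "North", "East",
             "North", "South", "East", "North")
)

# A comparison returns a logical vector (TRUE/FALSE per row).
# The mean of a logical vector is the proportion of TRUE values.
dangerous <- flint_toy$Pb >= 15
prop_dangerous <- mean(dangerous)

# A logical vector inside [ ] keeps only the elements where it is TRUE.
mean_cu_north     <- mean(flint_toy$Cu[flint_toy$Region == "North"])
mean_cu_dangerous <- mean(flint_toy$Cu[dangerous])

# Box plot with a descriptive title and a labeled axis.
boxplot(flint_toy$Pb,
        main = "Lead levels in toy data",
        ylab = "Lead (PPB)")

# One very large value pulls the mean upward; the median barely moves.
mean_pb   <- mean(flint_toy$Pb)
median_pb <- median(flint_toy$Pb)
```

In this toy data a single extreme reading makes the mean much larger than the median, which is the kind of pattern a box plot makes visible.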

Exercise 2 - The data here represent life expectancies (Life) and per capita income (Income) in 1974 dollars for 101 countries in the early 1970s. The source of these data is: Leinhardt and Wasserman (1979), New York Times (September 28, 1975, p. E-3). They also appear in Regression Analysis by Ashish Sen and Muni Srivastava. You can access these data in R using: life <- read.table

a. Construct a scatterplot of Life against Income. Note: Income should be on the horizontal axis. How does income appear to affect life expectancy?

b. Construct the boxplot and histogram of Income. Are there any outliers?

c. Split the data set into two parts: One for which the Income is strictly below $1000, and one for which the Income is at least $1000. Come up with your own names for these two objects.

d. Use the data for which the Income is below $1000. Plot Life against Income and compute the correlation coefficient. Hint: use the function cor().
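As a rough sketch of the steps in this exercise, the code below uses a tiny made-up data frame in place of the real file. The values and the object names life_toy, low_income, and high_income are hypothetical; only the column names Life and Income come from the exercise text.

```r
# Toy stand-in for the life expectancy data (values are invented).
life_toy <- data.frame(
  Income = c(200, 400, 600, 800, 1000, 2500, 4000),
  Life   = c(40, 50, 55, 62, 66, 70, 72)
)

# "Life against Income": Income on the horizontal axis, Life on the vertical.
plot(life_toy$Income, life_toy$Life,
     xlab = "Per capita income (1974 dollars)",
     ylab = "Life expectancy (years)",
     main = "Life expectancy against income (toy data)")

# boxplot.stats() lists the points a box plot would flag as outliers.
income_outliers <- boxplot.stats(life_toy$Income)$out

# Split the rows with logical conditions. The comma after the condition
# means "keep all columns".
low_income  <- life_toy[life_toy$Income < 1000, ]
high_income <- life_toy[life_toy$Income >= 1000, ]

# Correlation coefficient for the low-income subset only.
r_low <- cor(low_income$Income, low_income$Life)
```

Using < for one subset and >= for the other guarantees that every row lands in exactly one of the two objects, including a country whose income is exactly $1000.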

Exercise 3 - Use R to access the Maas river data. These data contain the concentrations of lead and zinc in ppm at 155 locations along the banks of the Maas river in the Netherlands. You can read the data in R as follows: maas <- read.table

a. Compute the summary statistics for lead and zinc using the summary() function.

b. Plot two histograms: one of lead and one of log(lead).

c. Plot log(lead) against log(zinc). What do you observe?

d. The level of risk for surface soil based on lead concentration in ppm is given in the table below:

Mean concentration (ppm) - Level of risk

Below 150 - Lead-free

Between 150-400 - Lead-safe

Above 400 - Significant environmental lead hazard

Use techniques similar to those from the last lab to give different colors and sizes to the lead concentrations at these 155 locations. You do not need to use the maps package to create a map of the area. Just plot the points without a map.
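One possible way to approach parts (a) through (d) is sketched below on invented values. The coordinate column names x and y, the column names lead and zinc, and the object maas_toy are assumptions, as are the chosen colors and point sizes. The table does not say which category a value of exactly 400 belongs to; this sketch treats it as lead-safe.

```r
# Toy stand-in for the Maas river data (values and column names assumed).
maas_toy <- data.frame(
  x    = c(1, 2, 3, 4, 5, 6),
  y    = c(2, 1, 4, 3, 6, 5),
  lead = c(50, 149, 150, 399, 400, 650),
  zinc = c(120, 300, 350, 800, 900, 1500)
)

# Five-number summary plus the mean.
lead_summary <- summary(maas_toy$lead)

# Two histograms side by side: raw scale and log scale.
par(mfrow = c(1, 2))
hist(maas_toy$lead, main = "Lead", xlab = "Lead (ppm)")
hist(log(maas_toy$lead), main = "log(Lead)", xlab = "log(lead)")
par(mfrow = c(1, 1))

# "log(lead) against log(zinc)": log(zinc) on the horizontal axis.
plot(log(maas_toy$zinc), log(maas_toy$lead),
     xlab = "log(zinc)", ylab = "log(lead)")

# Start every point in the lowest category, then overwrite the entries
# that satisfy each logical condition.
n <- nrow(maas_toy)
pt_col  <- rep("green", n)
pt_size <- rep(1, n)

safe   <- maas_toy$lead >= 150 & maas_toy$lead <= 400  # 400 counted as lead-safe (assumption)
hazard <- maas_toy$lead > 400

pt_col[safe]    <- "orange"
pt_size[safe]   <- 1.5
pt_col[hazard]  <- "red"
pt_size[hazard] <- 2

# Plot the locations, colored and sized by risk level, without a map.
plot(maas_toy$x, maas_toy$y, col = pt_col, cex = pt_size, pch = 19,
     xlab = "x coordinate", ylab = "y coordinate",
     main = "Lead risk by location (toy data)")
```

The & operator combines two conditions element by element, so a point is marked lead-safe only when both comparisons are TRUE for that row.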

Exercise 4 - The data for this exercise give the approximate centers (longitude and latitude) of each of the City of Los Angeles neighborhoods. See also the Los Angeles Times project on the City of Los Angeles neighborhoods. You can access these data at: LA <- read.table

a. Plot the data point locations. Use good formatting for the axes and title. Then add the outline of LA County by typing: map("county", "california", add = TRUE)

b. Do you see any relationship between income and school performance? Hint: Plot the variable Schools against the variable Income and describe what you see. Ignore the data points on the plot for which Schools = 0. Use what you learned about subsetting with logical statements to first create the objects you need for the scatter plot. Then, create the scatter plot. Alternate methods may only receive half credit.
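The subsetting step described in part (b) might look like the following on made-up values. The column names Income and Schools come from the exercise text; the numbers and the object names la_toy, has_score, income_ok, and schools_ok are hypothetical.

```r
# Toy stand-in for the LA neighborhoods data (values are invented).
la_toy <- data.frame(
  Income  = c(30000, 45000, 60000, 80000, 120000),
  Schools = c(650, 0, 720, 0, 880)
)

# Logical statement: TRUE for rows whose Schools value is not 0.
has_score <- la_toy$Schools != 0

# Build the objects for the plot first, using the same logical vector
# for both so the pairs stay matched.
income_ok  <- la_toy$Income[has_score]
schools_ok <- la_toy$Schools[has_score]

# Scatter plot of Schools against Income, excluding Schools = 0.
plot(income_ok, schools_ok,
     xlab = "Income", ylab = "Schools",
     main = "Schools against Income (toy data)")
```

Applying one logical vector to both columns avoids the mistake of filtering only one of them, which would leave the two vectors with different lengths.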
