Reference no: EM133234682
Pick a built-in dataset (you can view the list of built-in datasets by typing `data()` in the R console) and perform Exploratory Data Analysis on it. Make sure the dataset has at least two numeric variables and at least one categorical variable. If your dataset does not have a categorical variable, you can define one based on a continuous variable (hint: one way to achieve that is to bin the continuous variable into ranges). The objective of the analysis is to understand the data and communicate initial findings about the data in a written format. The analysis should meet the following criteria:
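As a rough sketch of defining a categorical variable from a continuous one, the snippet below bins a numeric column with base R's `cut()`. The choice of `mtcars`, the column name `mpg_class`, and the break points 20 and 25 are illustrative assumptions, not part of the assignment.

```r
# mtcars ships with base R (package 'datasets'); all of its columns are numeric.
df <- mtcars

# Bin the continuous mpg column into three labelled ranges.
# The break points (20 and 25) are arbitrary choices for illustration only.
df$mpg_class <- cut(
  df$mpg,
  breaks = c(-Inf, 20, 25, Inf),
  labels = c("low", "medium", "high"),
  right = FALSE  # intervals are closed on the left: [20, 25)
)

# Count how many observations fall into each category.
table(df$mpg_class)
```

Quantile-based breaks (for example from `quantile(df$mpg)`) are an alternative when roughly equal group sizes are preferred.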
Perform checks to determine quality of the data (missing values, outliers, etc.)
Description of the data:
how big is it (number of observations, variables),
how many numeric variables,
how many categorical variables,
description of the variables, if available
Are there any missing values?
Any duplicate rows?
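The description and quality checks listed above could be sketched in base R roughly as follows. The `airquality` dataset is used only as an example because it contains missing values, and the 1.5 * IQR rule for outliers is one common convention, assumed here rather than required by the assignment.

```r
df <- airquality  # built-in dataset, used here only as an example

# Size of the data: number of observations (rows) and variables (columns).
dim(df)

# Type of each variable, to separate numeric from categorical columns.
str(df)
n_numeric <- sum(sapply(df, is.numeric))
n_categorical <- sum(sapply(df, function(col) is.factor(col) || is.character(col)))

# Missing values per column, and the number of fully duplicated rows.
na_per_column <- colSums(is.na(df))
n_duplicates <- sum(duplicated(df))

# Flag potential outliers in one numeric column with the 1.5 * IQR rule.
x <- df$Ozone
q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
fence <- 1.5 * (q[2] - q[1])
outliers <- x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)]

na_per_column
n_duplicates
outliers
```

Variable descriptions for a built-in dataset, where available, can be read from its help page, for example `?airquality`.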
Compute summary statistics (mean, median, mode, standard deviation, variance, range).
Select one categorical variable and compute these statistics on a numeric variable, grouped by that categorical variable.
Visualize and transform the data to answer the questions asked. The visualizations should illustrate:
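A minimal sketch of the overall and grouped summary statistics, assuming the `iris` dataset with `Sepal.Length` as the numeric variable and `Species` as the categorical one. Base R's `mode()` returns a storage type rather than the most frequent value, so `stat_mode` is a hypothetical helper defined here for that purpose.

```r
# Hypothetical helper: most frequent value of a vector.
# On ties it returns the tied value that appears first in the data.
stat_mode <- function(v) {
  v <- v[!is.na(v)]
  u <- unique(v)
  u[which.max(tabulate(match(v, u)))]
}

# All requested statistics for one numeric vector.
summarise_numeric <- function(v) {
  c(
    mean = mean(v),
    median = median(v),
    mode = stat_mode(v),
    sd = sd(v),
    variance = var(v),
    range = diff(range(v))  # max minus min
  )
}

# Statistics for the whole column.
overall <- summarise_numeric(iris$Sepal.Length)

# The same statistics computed separately for each species.
grouped <- aggregate(Sepal.Length ~ Species, data = iris, FUN = summarise_numeric)

# aggregate() stores the vector result as a matrix column; flatten it
# into ordinary columns such as Sepal.Length.mean.
grouped <- do.call(data.frame, grouped)

overall
grouped
```

If missing values are present, `na.rm = TRUE` would need to be passed to `mean()`, `median()`, `sd()`, `var()` and `range()`.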
Relationship between variables
Trend
Distribution of the variable(s)
Comparison of summary statistics across categories.
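The four kinds of visualization listed above could be sketched with base R graphics as shown below. The `airquality` dataset and the choice of variables are assumptions made for illustration; `Month` is stored as an integer, so it is converted to a factor to serve as the categorical variable.

```r
df <- airquality
# Treat Month (5 to 9) as a categorical variable with readable labels.
df$Month <- factor(df$Month, labels = month.abb[5:9])

# Relationship between variables: scatter plot of ozone against temperature.
# Rows with missing Ozone are simply not drawn.
plot(df$Temp, df$Ozone,
     xlab = "Temperature (F)", ylab = "Ozone (ppb)",
     main = "Ozone vs. temperature")

# Trend: temperature in observation order (daily readings, May to September).
plot(df$Temp, type = "l",
     xlab = "Day index", ylab = "Temperature (F)",
     main = "Daily temperature over time")

# Distribution of a variable: histogram of temperature.
hist(df$Temp,
     xlab = "Temperature (F)", main = "Distribution of temperature")

# Comparison across categories: box plots of temperature by month.
boxplot(Temp ~ Month, data = df,
        xlab = "Month", ylab = "Temperature (F)",
        main = "Temperature by month")
```

Box plots compare medians and spread across categories; a bar plot of group means from `aggregate()` is another option for comparing summary statistics.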