Reference no: EM132363416
Problem Set
1 Instructions
Two files are needed for this assessment: the Roe2006Data CSV file and Roe2006Codebook.txt.
Roe2006Data provides the data that you will use to complete this assessment, while Roe2006Codebook.txt provides the "codebook" needed to make sense of the variables included in the CSV file.
2 Data Management
1. Import the data into R. After importing the data, use R to:
a) Count the number of observations (i.e. countries) in the dataset.
b) Count the number of observations with any missing data.
c) What percentage of observations have missing data?
2. Generate new variables. Please generate the following additional variables:
a) Generate a new variable named common equal to 1 if the lorigin variable equals "Common" and 0 otherwise.
b) Generate a new variable named stable equal to 1 if the military variable equals "Stable" and 0 otherwise.
3. Save the data. Use R to save your modified dataset as a CSV.
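The data management steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: the small data frame is invented toy data standing in for the real file, the country column and the file names are hypothetical, and only the variable names lorigin, military, and gdp and the values "Common" and "Stable" come from the assignment text.

```r
# Toy stand-in for the real data; every value here is invented for illustration.
roe <- data.frame(
  country  = c("A", "B", "C", "D", "E"),            # hypothetical identifier column
  lorigin  = c("Common", "Civil", "Common", NA, "Civil"),
  military = c("Stable", "Unstable", "Stable", "Stable", NA),
  gdp      = c(21000, 18500, NA, 9700, 30200),
  stringsAsFactors = FALSE
)
# With the real file one would import instead, for example:
# roe <- read.csv("Roe2006Data.csv", stringsAsFactors = FALSE)  # file name assumed

# 1a) Number of observations (rows).
n_obs <- nrow(roe)

# 1b) Number of rows with at least one missing value.
n_missing <- sum(!complete.cases(roe))

# 1c) Percentage of rows with missing data.
pct_missing <- 100 * n_missing / n_obs

# 2) Indicator variables. ifelse() propagates NA, so a country whose
#    lorigin or military value is missing gets NA rather than 0.
roe$common <- ifelse(roe$lorigin == "Common", 1, 0)
roe$stable <- ifelse(roe$military == "Stable", 1, 0)

# 3) Save the modified data as a CSV (written to a temp folder in this sketch).
out_file <- file.path(tempdir(), "roe_modified.csv")  # hypothetical output name
write.csv(roe, out_file, row.names = FALSE)
```

Note the design choice in step 2: read literally, "0 otherwise" would also code missing values as 0, whereas this sketch leaves them as NA. Which behaviour is intended should be checked against the codebook.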
3 Describing the Data
1. Cross-tabulation. Use R to generate a cross-tabulation of the variables common and stable that you created in section 2.2. What proportion of "common" law countries were also "stable"?
2. Histogram. Generate a histogram for the variable gdp.
a) Is the distribution for gdp symmetric or skewed? Please explain your answer.
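A minimal base R sketch of the cross-tabulation and histogram steps is given here. The data frame is invented toy data, not the Roe data, and comparing the mean with the median is only one informal hint about skewness, not a substitute for inspecting the plot.

```r
# Toy data for illustration only; the real values come from the Roe data file.
d <- data.frame(
  common = c(1, 1, 1, 0, 0, 0, 1, 0),
  stable = c(1, 1, 0, 1, 0, 0, 1, 0),
  gdp    = c(2000, 3500, 4100, 5200, 6800, 9000, 15000, 42000)
)

# Cross-tabulation: rows are common (0/1), columns are stable (0/1).
tab <- table(common = d$common, stable = d$stable)

# Row proportions: within each value of common, the share in each stable category.
row_props <- prop.table(tab, margin = 1)

# Share of common-law countries that were also stable.
share_stable_given_common <- row_props["1", "1"]

# Histogram of gdp, written to a temporary PDF so the sketch also runs
# non-interactively; drop the pdf()/dev.off() lines to draw on screen.
pdf(file.path(tempdir(), "gdp_hist.pdf"))
hist(d$gdp, main = "Histogram of gdp", xlab = "gdp")
invisible(dev.off())

# Informal skewness hint: a mean well above the median suggests a right skew.
mean_above_median <- mean(d$gdp, na.rm = TRUE) > median(d$gdp, na.rm = TRUE)
```

Using margin = 1 matters here: the question asks for a proportion within common-law countries, so the table is normalised by row rather than by the grand total.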
Data Analysis Essay
1 Instructions
Roe2006Data provides the data that you will use to complete this assessment, while Roe2006Codebook.txt provides the "codebook" needed to make sense of the variables included in the CSV file.
2 Data Management
1. Import the data into R. After importing the data, use R to:
a) Count the number of observations (i.e. countries) in the dataset.
b) Count the number of observations with any missing data.
c) What percentage of observations have missing data?
2. Generate new variables. Please generate the following additional variables:
a) Generate a new variable named common equal to 1 if the lorigin variable equals "Common" and 0 otherwise.
b) Generate a new variable named stable equal to 1 if the military variable equals "Stable" and 0 otherwise.
3. Save the data. Use R to save your modified dataset as a CSV.
3 Describing the Data
1. Cross-tabulation. Use R to generate a cross-tabulation of the variables common and stable that you created in section 2.2. What proportion of "common" law countries were also "stable"?
2. Histogram. Generate a histogram for the variable gdp.
a) Is the distribution for gdp symmetric or skewed? Please explain your answer.
3. Statistics. Use R to produce the number of observations, the mean, the standard deviation, the minimum, and the maximum value for each of the following variables: gdp, separation, laborpower, laborreg, common, and stable.
a) How would you interpret the mean and standard deviation for gdp?
4. Bivariate description.
a) Generate a scatter plot for the variables laborpower and gdp. Would you say that there is a positive relationship, negative relationship, or no relationship?
b) Calculate the correlation between laborpower and gdp. How would you interpret this correlation?
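The summary statistics, scatter plot, and correlation requested above might be produced along these lines in base R. The values are invented toy data, and only two of the six variables are shown; extending the vars vector to the others is assumed to work the same way once they exist in the data frame.

```r
# Toy data for illustration only (one gdp value is missing on purpose).
d <- data.frame(
  gdp        = c(12000, 8000, NA, 20000, 15000, 5000),
  laborpower = c(0.2, 0.5, 0.4, 0.1, 0.3, 0.7)
)

# Summary table: one row per variable; n counts non-missing observations.
vars <- c("gdp", "laborpower")
summ <- t(sapply(d[vars], function(x) c(
  n    = sum(!is.na(x)),
  mean = mean(x, na.rm = TRUE),
  sd   = sd(x, na.rm = TRUE),
  min  = min(x, na.rm = TRUE),
  max  = max(x, na.rm = TRUE)
)))

# Scatter plot of gdp against laborpower, written to a temporary PDF so the
# sketch also runs non-interactively.
pdf(file.path(tempdir(), "laborpower_gdp.pdf"))
plot(d$laborpower, d$gdp, xlab = "laborpower", ylab = "gdp")
invisible(dev.off())

# Pearson correlation, using only rows where both variables are observed.
r <- cor(d$laborpower, d$gdp, use = "complete.obs")
```

The sign of r gives the direction of the linear relationship and its absolute value the strength. A value near 0 indicates little linear association, which is not the same as no relationship at all.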
4 Inference
1. Calculate the 95% confidence interval for gdp.
a) Let's say that we wanted to be more than 95% confident. Estimate the 99% confidence interval. Is the 99% confidence interval wider or narrower than the 95% confidence interval?
b) Let's say that I thought that the true underlying population mean of gdp was $25,000. Does the data support this position? Please explain.
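One way to approach the confidence interval questions is a t-based interval for the mean, sketched below. The helper ci_mean is a hypothetical name introduced here, the gdp values are invented toy data, and the t interval itself is an assumption about the intended method; it is cross-checked against R's built-in t.test().

```r
# t-based confidence interval for a mean: mean +/- t_crit * sd / sqrt(n).
# ci_mean is a hypothetical helper, not something named in the assignment.
ci_mean <- function(x, level = 0.95) {
  x    <- x[!is.na(x)]                          # drop missing values
  n    <- length(x)
  se   <- sd(x) / sqrt(n)                       # standard error of the mean
  crit <- qt(1 - (1 - level) / 2, df = n - 1)   # two-sided critical value
  c(lower = mean(x) - crit * se, upper = mean(x) + crit * se)
}

# Toy data for illustration only.
gdp <- c(12000, 8000, 20000, 15000, 5000, 31000, 9500, 17500)

ci95 <- ci_mean(gdp, 0.95)
ci99 <- ci_mean(gdp, 0.99)

# A higher confidence level uses a larger critical value, hence a wider interval.
width95 <- unname(ci95["upper"] - ci95["lower"])
width99 <- unname(ci99["upper"] - ci99["lower"])

# The claimed mean of 25,000 is consistent with the data at the 95% level
# only if it falls inside the 95% interval.
claim <- 25000
claim_inside_95 <- unname(ci95["lower"] <= claim && claim <= ci95["upper"])

# The same 95% interval from the built-in one-sample t-test.
builtin95 <- as.numeric(t.test(gdp, conf.level = 0.95)$conf.int)
```

Checking whether 25,000 lies inside the 95% interval is equivalent to a two-sided one-sample t-test of that value at α = 0.05.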
5 Hypothesis Testing
1. One sample test. Let's say that we wanted to test the hypothesis that most nations were unstable during the WWII period. Use the stable variable generated above to test this hypothesis at the 5% significance level.
a) Please state the null and alternative hypothesis. Is this a one- or two-tailed test?
b) Can we reject the null hypothesis at α = 0.05? What about α = 0.01?
2. Two sample test. Let's say that we are interested in examining whether the average intensity of labor market regulation (i.e. the laborreg variable) differs across common and civil law countries.
a) Please state the null and alternative hypothesis. Is this a one- or two-tailed test?
b) Use common and laborreg to test this hypothesis at α = 0.05. Can we reject the null hypothesis?
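The two tests in this section could be set up in base R roughly as follows. This is a sketch under several assumptions: the data are invented toy values, "most nations were unstable" is read as the proportion of stable nations being below 0.5 (a one-tailed test), an exact binomial test is used for the proportion, and Welch's unequal-variance t-test is used for the two-sample comparison. Other reasonable choices, such as prop.test() or a pooled-variance t-test, exist.

```r
# Toy data for illustration only.
d <- data.frame(
  stable   = c(1, 0, 0, 1, 0, 0, 0, 1, 0, 0),
  common   = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
  laborreg = c(0.8, 1.1, 0.9, 1.3, 1.0, 1.9, 2.2, 1.7, 2.4, 2.0)
)

# One-sample test on the proportion of stable nations.
# H0: p = 0.5   H1: p < 0.5   (one-tailed, under the reading described above)
one <- binom.test(
  x = sum(d$stable == 1, na.rm = TRUE),   # number of stable nations
  n = sum(!is.na(d$stable)),              # number of non-missing observations
  p = 0.5,
  alternative = "less"
)
reject_one_05 <- one$p.value < 0.05
reject_one_01 <- one$p.value < 0.01

# Two-sample test: does mean laborreg differ between common-law (1) and
# other (0) countries?
# H0: the two group means are equal   H1: they differ   (two-tailed)
two <- t.test(laborreg ~ common, data = d)   # Welch test is R's default
reject_two_05 <- two$p.value < 0.05
```

In both cases the decision rule is the same: reject the null hypothesis when the p-value is below the chosen α, so a result can be significant at 0.05 yet not at 0.01.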
Attachment: CodeBook.rar