Reference no: EM132864842
Roadmap: Confidence Intervals
The material presented this week assumes you are on top of the following terms: parameter, statistic, sampling distribution of the mean, standard error of the mean, central limit theorem, normality, and how to check whether your data follow a normal distribution.
Learning Objective 1: At the end of this week you will be able to compute a confidence interval for the population mean using 'pen, paper and tables' and by using R.
It is important to work through the pen+paper+tables approach first, as this will teach you important concepts and processes that underpin the computation of a confidence interval. It will also make it much easier to set arguments in R and interpret the subsequent output. This is helpful because, in real life, you will use a statistical program to compute the confidence interval.
Learning Objective 2: In addition to one mean, your textbook shows you that you can compute a confidence interval for two means, paired means, one proportion, two proportions, one variance, two variances, and the regression slope and intercept.
Although we will not study how to compute these confidence intervals, you should be able to explain in one or two sentences how to interpret these types of confidence intervals and when they would be useful to apply in practice.
1. Write a formula for computing a confidence interval for μ when the population standard deviation is unknown.
2. Write a sentence that explains each symbol in the confidence interval for μ.
3. Assuming that it would be appropriate to construct a confidence interval using the t-distribution
Exercise (pen and paper): Atmospheric Nitrogen Levels
Samples of Nitrogen (%) were captured from the resin in trees and used to represent the Nitrogen level in the atmosphere from the Cretaceous era, some 90 million years ago. The mean Nitrogen content of these 9 samples was 59.59% with a standard deviation of 6.26.
a. Compute a 99% confidence interval for the mean Nitrogen (%) from the Cretaceous period. Show all relevant working; this includes writing your formula and specifying the numerical value of each symbol in the formula. (Hint: you will need to use the t-table below.) [Answer = (52.6%, 66.6%)]
b. State any assumptions required to support the validity of the confidence interval.
c. Today, the atmosphere has around 78% Nitrogen. If the assumptions associated with the confidence interval procedure are valid, discuss how the Nitrogen levels from the Cretaceous era differ from today.
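Once the pen-and-paper working for part (a) is done, the arithmetic can be checked in R. The following is a minimal sketch that uses only the summary statistics quoted in the exercise; the variable names are illustrative, and qt() stands in for the t-table lookup.

```r
# Summary statistics taken from the exercise text
n    <- 9       # number of samples
ybar <- 59.59   # sample mean Nitrogen (%)
s    <- 6.26    # sample standard deviation
conf <- 0.99    # confidence level

df     <- n - 1                        # degrees of freedom
se     <- s / sqrt(n)                  # standard error of the mean
t_crit <- qt(1 - (1 - conf) / 2, df)   # t-quantile (replaces the t-table lookup)
ci     <- ybar + c(-1, 1) * t_crit * se

round(ci, 1)   # 52.6 66.6, in line with the stated answer
```

This only confirms the numbers; the assumptions in part (b) and the comparison in part (c) still need to be argued in words.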
Exercise (R): Life Expectancy in non-OECD countries
This exercise demonstrates the workflow required to compute a confidence interval on real data. Follow the procedure carefully.
Dataset: Life_exp_non_OECD_subset.csv. Variables: male life expectancy.
Task: According to the OECD annual report in 2014, the life expectancy of OECD countries was 77.86 for males.
A simple random sample of non-OECD countries is given in the file Life_exp_non_OECD_subset.csv, which has been taken from the World Health Organisation.
Use this dataset to construct a 95% confidence interval for the population mean male life expectancy for non-OECD countries.
Interpret your findings.
Here are some steps to guide you through the process:
1. Load the data into R and check that your data have been loaded correctly.
A good tip is to use the read.csv() function to import the data. This will make it easier for you to reproduce your results later.
If you are not sure of the syntax to import a file, you can cheat and take a peek at the syntax used when you use the File - Import Dataset - Browse menu option in RStudio.
2. Perform an exploratory data analysis to better understand your data.
3. Remove any outliers if appropriate.
4. Compute the sample mean and standard deviation.
5. Compute the t-value (the t-quantile) for the 95% confidence interval.
6. Compute the confidence interval. This can be done by using the formula ȳ ± t × s/√n, where:
ȳ is the mean male life expectancy of the sample,
s is the sample standard deviation,
n is the sample size,
df is the degrees of freedom, which is equal to n - 1,
s/√n is equal to the standard error of the mean, SE(ȳ),
t is the t-quantile from the t-distribution. This value depends on the degrees of freedom (df) and the level of confidence in the confidence interval.
Once you understand how a confidence interval is calculated, you may find it more efficient to use the t.test() function to compute a confidence interval for μ.
7. Interpret your findings
8. Comment on the validity of your assumptions
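The computational steps above can be sketched in R as follows. This is an illustration only: the real file's column names are not given here, so the column name male_life_exp and the simulated values are assumptions made so that the code runs on its own. Replace the simulated file with the real CSV when working on the exercise.

```r
# HYPOTHETICAL stand-in for Life_exp_non_OECD_subset.csv: the column name
# `male_life_exp` and all values are simulated for illustration only.
set.seed(1)
demo <- data.frame(male_life_exp = round(rnorm(30, mean = 65, sd = 8), 1))
csv_path <- tempfile(fileext = ".csv")
write.csv(demo, csv_path, row.names = FALSE)

# Step 1: load the data and check that it loaded correctly
life <- read.csv(csv_path)
str(life)
head(life)

# Step 2: exploratory summaries (hist() and qqnorm() are also useful here)
y <- life$male_life_exp
summary(y)
boxplot.stats(y)$out   # values flagged by the boxplot rule, if any

# Step 4: sample size, mean and standard deviation
n    <- length(y)
ybar <- mean(y)
s    <- sd(y)

# Step 5: t-quantile for a 95% confidence interval
conf   <- 0.95
t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)

# Step 6: confidence interval from the formula
ci_manual <- ybar + c(-1, 1) * t_crit * s / sqrt(n)

# The same interval from t.test()
ci_ttest <- t.test(y, conf.level = conf)$conf.int

ci_manual
ci_ttest
```

The manual interval and the one returned by t.test() should agree, which is a useful check that the formula has been applied correctly. Whether to remove outliers (step 3), the interpretation and the assumption checks still depend on the real data.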
Use R Markdown to complete your assignment, then knit your R Markdown file to Word (PDF is OK too) and upload it to LJCU. Please ensure your assignment contains the assignment questions, relevant working, comments, discussions, code and outputs.
PLEASE DO NOT ATTEMPT THIS ASSESSMENT UNTIL YOU HAVE COMPLETED ALL OF THIS ROADMAP
Dataset: skulls.csv.
Task: These data show the maximum breadths of samples of male Egyptian skulls from 4000 B.C. and 150 A.D. Do you think the breadth of skulls changed over time? To help answer this question, produce 99% confidence intervals of skull breadth separately for each time period. Use the t.test() function to compute the confidence intervals.
Here is the workflow you should follow to answer this question.
1. Load the data into R and check that your data have been loaded correctly.
2. Perform an exploratory data analysis to better understand your data.
3. Remove any outliers if appropriate.
4. Compute the confidence intervals. Hint: use the t.test() function.
5. Interpret your findings
6. Comment on the validity of your assumptions.
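As a rough sketch of step 4, one way to get a separate interval for each time period is to split the measurements by period and call t.test() on each group. The layout of skulls.csv is not described here, so the column names epoch and breadth and all the values below are invented for illustration; adapt them to the real file.

```r
# HYPOTHETICAL stand-in for skulls.csv: the column names `epoch` and
# `breadth` and all values are invented for illustration only.
set.seed(2)
skulls <- data.frame(
  epoch   = rep(c("4000BC", "150AD"), each = 30),
  breadth = c(rnorm(30, mean = 131, sd = 5), rnorm(30, mean = 136, sd = 5))
)

# One 99% confidence interval for the mean breadth in each time period
ci_by_epoch <- lapply(
  split(skulls$breadth, skulls$epoch),
  function(x) t.test(x, conf.level = 0.99)$conf.int
)

ci_by_epoch
```

Comparing the two intervals (for example, whether they overlap) is a starting point for the interpretation in step 5, not a substitute for it.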
Attachment:- Confidence Intervals.rar