Calculate the sample mean and the sample variance

Reference no: EM132583633

Assignment - Practice Problems

Problem 1: Standardizing and Transforming

Part (a) The variable problem.1.a.data contains a sample of values drawn from a normal distribution with known expected value E[X] = 57.3 and variance Var[X] = 495.2. Perform a standardization transformation to generate a new vector consisting of values that are distributed as a standard normal distribution. Calculate the sample mean and the sample variance, and report both values using separate cat() statements, rounding to 5 decimal places.
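
A minimal sketch of the standardization in part (a). The vector problem.1.a.data is not reproduced here, so `x` below is a simulated stand-in (an assumption), and the names `mu`, `sigma.sq`, and `z` are illustrative only.

```r
# Hypothetical stand-in for problem.1.a.data (the real vector is not shown here)
set.seed(1)
mu <- 57.3          # known expected value E[X]
sigma.sq <- 495.2   # known variance Var[X]
x <- rnorm(200, mean = mu, sd = sqrt(sigma.sq))

# Standardize: subtract the known mean, divide by the known standard deviation
z <- (x - mu) / sqrt(sigma.sq)

cat("Sample mean:", round(mean(z), 5), "\n")
cat("Sample variance:", round(var(z), 5), "\n")
```

Note that the known parameters are used here, not the sample mean and sample standard deviation, so the reported values should be close to 0 and 1 without being exactly equal to them.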

Part (b) Create a histogram of the standardized values you created in part (a). Then superimpose a standard normal density curve over this graph. (Note: the sample is much smaller than what we usually generate in our simulations, so this graph will be much noisier.)

Part (c) The vector problem.1.c.data contains a sample of values from a standard normal distribution. Perform a general normal transformation to generate a new vector consisting of values from a general normal distribution with expected value E[X] = -143.2 and variance Var[X] = 791.6. Calculate the sample mean and the sample variance, and report both values using separate cat() statements, rounding to 5 decimal places.

Part (d) Create a histogram of the transformed values you created in part (c). Then superimpose a density curve for the general normal distribution in part (c) over this graph. (Note: the sample is much smaller than what we usually generate in our simulations, so this graph will be much noisier.)
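
One possible sketch of parts (c) and (d), assuming a simulated stand-in `z` for problem.1.c.data. The sample size, the number of histogram breaks, and the variable names are arbitrary choices, not part of the assignment.

```r
# Hypothetical stand-in for problem.1.c.data (standard normal draws)
set.seed(2)
z <- rnorm(200)

mu <- -143.2        # target expected value
sigma.sq <- 791.6   # target variance

# General normal transformation: scale by the standard deviation, then shift
x <- mu + sqrt(sigma.sq) * z

cat("Sample mean:", round(mean(x), 5), "\n")
cat("Sample variance:", round(var(x), 5), "\n")

# Histogram on the density scale, with the target normal density superimposed
hist(x, freq = FALSE, breaks = 20,
     main = "Transformed values", xlab = "x")
curve(dnorm(x, mean = mu, sd = sqrt(sigma.sq)), add = TRUE, lwd = 2)
```

Setting `freq = FALSE` puts the histogram on the density scale, which is what allows the curve from dnorm() to be drawn over it directly.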

Problem 2: Method of Moments for a Weibull Distribution

A Weibull distribution with scale parameter θ and shape parameter τ has the density function:

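$$f(x) = \frac{\tau}{x} \left(\frac{x}{\theta}\right)^{\tau} \exp\left\{-\left(\frac{x}{\theta}\right)^{\tau}\right\}, \quad x > 0$$

Then the expected value of X is:

$$E[X] = \theta \cdot \Gamma(1 + 1/\tau)$$

Using the first moment, the method-of-moments estimator is:

$$\hat{\theta} = \frac{\bar{X}}{\Gamma(1 + 1/\tau)}$$

In the special case where τ has the known value τ = 2, the density reduces to:

$$f(x) = \frac{2}{x} \left(\frac{x}{\theta}\right)^{2} \exp\left\{-\left(\frac{x}{\theta}\right)^{2}\right\}, \quad x > 0$$

Also, the expected value is:

$$E[X] = \theta \cdot \Gamma(1 + 1/2)$$

Then the method-of-moments estimator is:

$$\hat{\theta} = \frac{\bar{X}}{\Gamma(1 + 1/2)}$$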

Part (a) The variable problem.2.data contains data from a Weibull distribution with known shape parameter τ = 2 and unknown scale parameter θ. Calculate the method-of-moments estimator for the scale parameter θ. Use the built-in R function gamma() to calculate the value of the gamma function in the denominator. Report your result using a cat() statement, rounding to 5 decimal places.

Part (b) Construct a histogram of the data in problem.2.data. Then superimpose the density curve for a Weibull distribution using the built-in R function dweibull(), using your parameter estimate from part (a) as the scale parameter θ and a known shape parameter τ = 2.
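
A hedged sketch of both parts, using simulated Weibull draws in place of problem.2.data. The scale value 3.5 used to generate the stand-in data is arbitrary, and `theta.hat` is an illustrative name. The `shape` and `scale` arguments of rweibull() and dweibull() correspond to τ and θ in the density above.

```r
# Hypothetical stand-in for problem.2.data: Weibull draws with shape 2.
# The scale 3.5 used to simulate is arbitrary and only for illustration.
set.seed(3)
tau <- 2
x <- rweibull(500, shape = tau, scale = 3.5)

# Method of moments: match the sample mean to theta * Gamma(1 + 1/tau)
theta.hat <- mean(x) / gamma(1 + 1 / tau)
cat("Estimated scale:", round(theta.hat, 5), "\n")

# Histogram on the density scale with the fitted Weibull density on top
hist(x, freq = FALSE, breaks = 25, main = "Weibull data", xlab = "x")
curve(dweibull(x, shape = tau, scale = theta.hat), add = TRUE, lwd = 2)
```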

Problem 3: Mean Squared Error

Obi is measuring an enzyme level in his laboratory. The true value (unknown to him) is µ = 100. He has three measuring devices:

The first measuring device, denoted W, produces measurements that have an expected value of E[W] = 90 and a variance of Var[W] = 100.

The second measuring device, denoted X, produces measurements that have an expected value of E[X] = 100 and a variance of Var[X] = 500.

The third measuring device, denoted Y, produces measurements that have an expected value E[Y] = 95 and a variance of Var[Y] = 350.

For each device, all measurements are independent of one another.

Part (a) Obi decides to take one measurement and then uses this to estimate the enzyme level.

What is the mean squared error (MSE) when he uses 1 measurement from W?

What is the MSE when he uses 1 measurement from X?

What is the MSE when he uses 1 measurement from Y?

Which estimator gives the best estimate of the true enzyme level?

Part (b) Next, Obi decides to take 5 independent measurements and then use their average as the estimate of the true enzyme level.

What is the MSE when he averages 5 measurements from W?

What is the MSE when he averages 5 measurements from X?

What is the MSE when he averages 5 measurements from Y?

Which estimator gives the best estimate of the true enzyme level?

Part (c) Finally, Obi decides to take 10 independent measurements and then use their average as the estimate of the true enzyme level.

What is the MSE when he averages 10 measurements from W?

What is the MSE when he averages 10 measurements from X?

What is the MSE when he averages 10 measurements from Y?

Which estimator gives the best estimate of the true enzyme level?
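
All three parts rest on the decomposition MSE = bias² + variance, where averaging n independent measurements leaves the bias unchanged and divides the variance by n. A short sketch of that arithmetic follows; the helper `mse()` and the vector names are illustrative, not part of the assignment.

```r
# MSE of the average of n independent measurements:
#   MSE = bias^2 + variance / n, where bias = E[device] - true value
true.value <- 100
device.mean <- c(W = 90, X = 100, Y = 95)
device.var  <- c(W = 100, X = 500, Y = 350)

mse <- function(n) (device.mean - true.value)^2 + device.var / n

for (n in c(1, 5, 10)) {
  result <- mse(n)
  cat("n =", n, ":", paste(names(result), round(result, 5), collapse = ", "),
      "; smallest MSE:", names(which.min(result)), "\n")
}
```

The ranking changes with n: the squared bias of a device is a floor that averaging cannot lower, while the variance term keeps shrinking.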

Problem 4: Sampling Distribution of Sample Minimum

So far in MATH E-156, we've mainly been focused on the sampling distribution of the sample mean, although we've explored the sample median for normal distributions and the sample maximum for uniform distributions. Now let's investigate a remarkable result for exponential distributions:

Suppose X1, X2, ..., Xn are independent random variables that are all exponentially distributed with rate parameter λ. Then the sample minimum is also exponentially distributed, with rate parameter nλ.
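
This result follows from independence in a few steps. A short derivation, writing $X_{(1)}$ for the sample minimum (a notation assumed here, not taken from the assignment):

```latex
\begin{aligned}
P\left(X_{(1)} > x\right)
  &= P(X_1 > x, \dots, X_n > x)
   = \prod_{i=1}^{n} P(X_i > x) \\
  &= \left(e^{-\lambda x}\right)^{n}
   = e^{-n\lambda x}, \qquad x \ge 0 .
\end{aligned}
```

This is the survival function of an exponential distribution with rate nλ, so the sample minimum has expected value 1/(nλ) and variance 1/(nλ)².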

Part (a) Let's work through a simple example. Suppose we draw samples of size n = 8 from an exponential distribution with rate parameter λ = 1.5, and we calculate the sample minimum of this sample.

What is the distribution of this sample minimum? Give the name of the distribution, along with the numerical value of any parameters.

What is the expected value of this sample minimum random variable?

What is the variance of this sample minimum random variable?

Describe the distribution of the sample minimum in one or two sentences. Report the expected value and variance of the sample minimum using separate cat() statements for each value, rounding to 5 decimal places.

Part (b) Construct a simulation that generates random sample minimums from an exponential distribution with rate parameter λ = 1.5, for samples of size n = 8:

For each iteration of your for loop, draw a sample of size n = 8 from an exponential distribution with rate parameter λ = 1.5.

Calculate the sample minimum of this sample, and then store this value in an outcome vector.

When the simulation is done, your outcome.vector will be populated with random sample minimums. Then report the sample mean and sample variance of the outcome.vector using a separate cat() statement for each, rounding to 5 decimal places.
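
A minimal sketch of the simulation in part (b), printed alongside the theoretical values from part (a) for comparison. The number of iterations `n.sims` and the seed are arbitrary choices.

```r
set.seed(4)
n <- 8
lambda <- 1.5
n.sims <- 10000   # number of iterations (an arbitrary choice)

outcome.vector <- numeric(n.sims)
for (i in 1:n.sims) {
  sample.values <- rexp(n, rate = lambda)   # one sample of size n
  outcome.vector[i] <- min(sample.values)   # store its sample minimum
}

# Theory from part (a): the minimum is exponential with rate n * lambda
cat("Theoretical mean:", round(1 / (n * lambda), 5), "\n")
cat("Theoretical variance:", round(1 / (n * lambda)^2, 5), "\n")
cat("Simulated mean:", round(mean(outcome.vector), 5), "\n")
cat("Simulated variance:", round(var(outcome.vector), 5), "\n")
```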

Part (c) Construct a histogram of the random values you generated in part (b). Then superimpose a density curve using the distribution you specified in part (a).

Part (d) The vector problem.4.data contains data from an exponential distribution with a rate parameter of λ. Use a method-of-moments estimator to estimate the rate parameter of the distribution of the sample minimums. Report your result using a cat() statement, rounding to 5 decimal places.

Part (e) As a check on your work in part (d), construct a histogram of the values in problem.4.data. Then superimpose the density curve for the distribution that you estimated in part (d).

Part (f) In part (d), you estimated the rate parameter λ for the data in the variable problem.4.data. In fact, I generated this data by first drawing random samples of size n = 8 from an exponential distribution with rate parameter θ, and then calculating the sample minimum. Use your estimate from part (d) to estimate the value of the rate parameter θ. (Hint: this is very easy, and requires one line of code, if that; don't overthink this.)
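
A hedged sketch of parts (d) and (f), with simulated sample minimums standing in for problem.4.data. The underlying rate 0.5 used to generate the stand-in data is arbitrary, and the names `lambda.hat` and `theta.hat` are illustrative.

```r
# Hypothetical stand-in for problem.4.data: minimums of samples of size 8.
# The underlying rate 0.5 is arbitrary and only used to simulate data.
set.seed(5)
n <- 8
x <- replicate(1000, min(rexp(n, rate = 0.5)))

# Exponential mean is 1 / rate, so the method-of-moments estimate is 1 / mean
lambda.hat <- 1 / mean(x)
cat("Estimated rate of the minimums:", round(lambda.hat, 5), "\n")

# The minimum of n exponentials has rate n * theta, so divide by n
theta.hat <- lambda.hat / n
cat("Estimated underlying rate:", round(theta.hat, 5), "\n")
```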

Problem 5: Constructing a One-Sample Test

Marie is a field biologist who is studying armadillos, and she is wondering if the local armadillo population has on average a different weight than normal, although she doesn't know if it's higher or lower. She knows that armadillo weights are normally distributed and also that the variance of armadillo weights is Var[X] = 10000, but she's not sure about the expected value. The standard weight of armadillos is µ = 4500 grams, and Marie would like strong evidence before she rules out this standard value. She draws a sample of size n = 27 from the population, and then using the sample mean as her test statistic she performs a two-sided test of the null hypothesis. She calibrates the test so that the probability of a Type I error is 5%.

Part (a) What is the null hypothesis for this test? State the null hypothesis in a sentence. Then define a variable to store the expected value of the distribution, given that this null hypothesis is true.

Part (b) What is the significance level of the hypothesis test? Define a variable to store this significance level, and report your result using a cat() statement, rounding to 5 decimal places.

Part (c) What is the variance of the test statistic? Report your result using a cat() statement, rounding to 5 decimal places.

Part (d) What is the lower critical value for this test? Report your result using a cat() statement, rounding to 5 decimal places.

Part (e) What is the upper critical value for this test? Report your result using a cat() statement, rounding to 5 decimal places.

Part (f) Construct a simulation to show that the upper and lower critical values that you calculated in parts (d) and (e) have the correct tail probabilities. For each iteration of the for loop, the simulation should first draw a sample of size n = 27 from the probability distribution under the null hypothesis that you defined in part (a), then calculate the sample mean, and finally store this in an outcome.vector. When the simulation has finished, the outcome.vector will consist of random sample means of samples of size n = 27. Perform vectorized operations on this outcome.vector to show that the proportion of values that are less than the lower critical value is correct, given the significance level that you defined in part (b). Finally, do the same for the upper critical value. Report each result separately using a cat() statement, rounding to 5 decimal places.
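
One way parts (c) through (f) might be sketched, assuming the null distribution is normal with expected value 4500 and variance 10000. The variable names, the seed, and the number of iterations are illustrative choices.

```r
set.seed(6)
mu.0 <- 4500          # expected value under the null hypothesis
sigma.sq <- 10000     # known population variance
n <- 27
alpha <- 0.05

# Variance of the sample mean, and the two critical values
var.xbar <- sigma.sq / n
lower.cv <- qnorm(alpha / 2, mean = mu.0, sd = sqrt(var.xbar))
upper.cv <- qnorm(1 - alpha / 2, mean = mu.0, sd = sqrt(var.xbar))
cat("Variance of test statistic:", round(var.xbar, 5), "\n")
cat("Lower critical value:", round(lower.cv, 5), "\n")
cat("Upper critical value:", round(upper.cv, 5), "\n")

# Simulate sample means under the null hypothesis
n.sims <- 10000
outcome.vector <- numeric(n.sims)
for (i in 1:n.sims) {
  outcome.vector[i] <- mean(rnorm(n, mean = mu.0, sd = sqrt(sigma.sq)))
}

# Vectorized comparisons give the proportion in each tail
cat("Proportion below lower:", round(mean(outcome.vector < lower.cv), 5), "\n")
cat("Proportion above upper:", round(mean(outcome.vector > upper.cv), 5), "\n")
```

Each proportion should land near alpha / 2 = 0.025, with some simulation noise.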

Part (g) Draw a graph of the sampling distribution of the test statistic under the null hypothesis. Include vertical lines indicating the lower and upper critical values, and shade the graph under the rejection region.
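
A rough sketch of such a graph using base R graphics, where polygon() fills each tail. The plotting range, colors, and titles are arbitrary choices, and the critical values are recomputed here so that the block stands alone.

```r
mu.0 <- 4500
se <- sqrt(10000 / 27)                 # standard deviation of the sample mean
lower.cv <- qnorm(0.025, mu.0, se)
upper.cv <- qnorm(0.975, mu.0, se)

# Density curve of the sample mean under the null hypothesis
x <- seq(mu.0 - 4 * se, mu.0 + 4 * se, length.out = 400)
plot(x, dnorm(x, mu.0, se), type = "l",
     main = "Sampling distribution under the null",
     xlab = "Sample mean", ylab = "Density")
abline(v = c(lower.cv, upper.cv), lty = 2)

# Shade each tail: trace the curve, then return along the baseline
left <- x[x <= lower.cv]
polygon(c(left, rev(left)),
        c(dnorm(left, mu.0, se), rep(0, length(left))), col = "gray")
right <- x[x >= upper.cv]
polygon(c(right, rev(right)),
        c(dnorm(right, mu.0, se), rep(0, length(right))), col = "gray")
```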

Problem 6: Conducting the One-Sample Hypothesis Test

Part (a) Use the data in the variable problem.6.data to calculate the observed value of the test statistic.

Part (b) Based on the observed value of the test statistic that you calculated in part (a), do you think that this data constitutes strong evidence against the null hypothesis? Explain your answer with one or two sentences.

Part (c) Now we'll construct a 90% confidence interval for the population expected value. For this part, calculate the lower endpoint of this confidence interval, given the information in problem 5 and the observed value of the test statistic. Report your result using a cat() statement.

Part (d) We continue with our construction of a 90% confidence interval for the population expected value. For this part, calculate the upper endpoint of this confidence interval, given the information in problem 5 and the observed value of the test statistic. Report your result using a cat() statement.

Part (e) Using the confidence interval that you calculated in parts (c) and (d), perform a test of the null hypothesis you defined in Problem 5, part (a). Report your conclusion and explain your reasoning using a few sentences.

Part (f) Using the distribution of the null hypothesis that you defined in Problem 5, along with the observed value of the test statistic from part (a), calculate the p-value for this data. Report your result using a cat() statement.

Part (g) Using your result from part (f), perform a test of the null hypothesis. Report your conclusion using one or two sentences.
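
A minimal sketch of the interval and p-value calculations in parts (c), (d), and (f), assuming a simulated stand-in for problem.6.data. The mean 4530 used to generate it is arbitrary. Because the variance is known, normal quantiles are used.

```r
# Hypothetical stand-in for problem.6.data (27 armadillo weights in grams)
set.seed(7)
x <- rnorm(27, mean = 4530, sd = 100)   # the mean 4530 is arbitrary

mu.0 <- 4500
se <- sqrt(10000 / length(x))   # known variance, so a z-based interval
x.bar <- mean(x)                # observed value of the test statistic

# 90% confidence interval: leave 5% in each tail
z.star <- qnorm(0.95)
cat("Lower endpoint:", round(x.bar - z.star * se, 5), "\n")
cat("Upper endpoint:", round(x.bar + z.star * se, 5), "\n")

# Two-sided p-value: twice the tail area beyond the observed sample mean
p.value <- 2 * pnorm(abs(x.bar - mu.0) / se, lower.tail = FALSE)
cat("p-value:", round(p.value, 5), "\n")
```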

Part (h) At the beginning of problem 5, the problem statement indicated that Marie constructed her test so that the probability of a Type I error is 5%. Suppose Marie decides that's insufficiently stringent, and instead wants to conduct her test with a Type I error rate of 1%. What would her conclusion be now? You can answer this question with just a few sentences; do not perform any further R calculations. (Hint: think about part (f).)

Problem 7: Constructing a Two-Sample Test

Tyrone is conducting an experiment to compare a new agricultural fertilizer with the standard fertilizer. First, nX = 100 plants are treated with the standard fertilizer. Then another set of nY = 100 plants is treated with the new fertilizer.

For the plants treated with the standard fertilizer, the crop yields are normally distributed, with an unknown expected value of µX and a known variance of σX² = 1500.

For the plants treated with the new fertilizer, the crop yields are normally distributed, with an unknown expected value of µY and a known variance of σY² = 1750.

All of the crop yields are independent of one another.

Tyrone wants to show that there is a difference in the crop yields between the two fertilizers. Therefore, he wants to falsify the hypothesis that the two unknown expected values µX and µY are equal.

Part (a) Let δ = µY - µX denote the difference in the expected values of the crop yields between the two fertilizers. Tyrone wants to perform a two-sided test to demonstrate that there is a non-zero difference in the true expected values of the crop yields. What should he use as the null hypothesis for this test?

Part (b) To test the null hypothesis in part (a), Tyrone decides to use the observed difference of the sample means $D = \bar{Y} - \bar{X}$ as the test statistic. What is the expected value of this test statistic under the null hypothesis? Explain your answer with one or two sentences.

Part (c) What is the variance of the test statistic $D = \bar{Y} - \bar{X}$? Report your result using a cat() statement, rounding to 5 decimal places.

Part (d) Tyrone wants to design his experiment so that it has a significance level of α = 0.10. Using your answers from parts (b) and (c), determine the lower critical value L that will ensure that the two-sided test will have this appropriate Type I error rate.

Part (e) Tyrone wants to design his experiment so that it has a significance level of α = 0.10. Using your answers from parts (b) and (c), determine the upper critical value U that will ensure that the two-sided test will have this appropriate Type I error rate.
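
A short sketch of the calculations in parts (c) through (e), taking 1500 and 1750 as variances, as the problem statement describes them. The variable names are illustrative.

```r
n.x <- 100
n.y <- 100
var.x <- 1500   # known variance, standard fertilizer
var.y <- 1750   # known variance, new fertilizer
alpha <- 0.10

# Independent samples: the variances of the two sample means add
var.d <- var.x / n.x + var.y / n.y
cat("Variance of D:", round(var.d, 5), "\n")

# Under the null hypothesis, D is normal with expected value 0
lower.cv <- qnorm(alpha / 2, mean = 0, sd = sqrt(var.d))
upper.cv <- qnorm(1 - alpha / 2, mean = 0, sd = sqrt(var.d))
cat("Lower critical value:", round(lower.cv, 5), "\n")
cat("Upper critical value:", round(upper.cv, 5), "\n")
```

Because the null distribution is symmetric about 0, the two critical values differ only in sign.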

Part (g) Draw a picture of the sampling distribution of the test statistic D. Draw the density curve of the distribution using a solid line, and indicate the lower and upper critical values using vertical lines with text annotation. Shade under the curve for the rejection region. Finally, be sure you include a main title, as well as axis titles.

Problem 8: Conducting the Two-Sample Test

Part (a) Calculate the sample mean of the variable problem.8.x.data. Calculate the sample mean of the variable problem.8.y.data. Then use these two sample means to calculate the test statistic D. Report your final result using a cat() statement.

Part (b) Does the observed value of the test statistic in part (a) constitute strong evidence against the null hypothesis, given the pre-specified significance level? Explain your answer with one or two sentences.

Part (c) Calculate a two-sided 90% confidence interval for the true difference δ. Report the lower and upper endpoints of this confidence interval using separate cat() statements, rounding to 5 decimal places.

Part (d) Using the confidence interval you calculated in part (c), perform a test of the null hypothesis of no difference in expected crop yields between the two fertilizers. Report your result with one or two sentences.

Part (e) Calculate the two-sided p-value for this observed data. Report your result using a cat() statement, rounding to 5 decimal places.

Part (f) Using your result from part (e), perform a two-sided hypothesis test at the α = 0.05 level. Report your conclusion with one or two sentences.
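
A hedged sketch of the Problem 8 calculations, with simulated vectors standing in for problem.8.x.data and problem.8.y.data. The means 500 and 515 used to generate them are arbitrary, and the names `d`, `se.d`, and `p.value` are illustrative.

```r
# Hypothetical stand-ins for problem.8.x.data and problem.8.y.data
set.seed(8)
x <- rnorm(100, mean = 500, sd = sqrt(1500))   # means 500 and 515 are arbitrary
y <- rnorm(100, mean = 515, sd = sqrt(1750))

# Observed test statistic and its standard deviation (known variances)
d <- mean(y) - mean(x)
se.d <- sqrt(1500 / length(x) + 1750 / length(y))
cat("Observed D:", round(d, 5), "\n")

# Two-sided 90% confidence interval for the true difference
z.star <- qnorm(0.95)
cat("Lower endpoint:", round(d - z.star * se.d, 5), "\n")
cat("Upper endpoint:", round(d + z.star * se.d, 5), "\n")

# Two-sided p-value under the null hypothesis of no difference
p.value <- 2 * pnorm(abs(d) / se.d, lower.tail = FALSE)
cat("p-value:", round(p.value, 5), "\n")
```

The interval excludes 0 exactly when the p-value falls below 0.10, so parts (d) and (e) should agree at that level.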


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd