Compute mean of the sample means and standard deviation

Reference no: EM132411335

Assignment

Part 1) Central Limit Theorem

The input data consists of the sequence from 11 to 20 (11:20). Show the following three plots in a single row.

a) Show the histogram of the densities of this distribution.
b) Using all samples of this data of size 2, show the histogram of the densities of the sample means.
c) Using all samples of this data of size 5, show the histogram of the densities of the sample means.
d) Compare the means and standard deviations of the above three distributions.
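
One way to approach Part 1 numerically is sketched below in Python, using only the standard library. The notation 11:20 and the sampling package suggest the assignment targets R, so treat this as an illustration rather than the expected solution. It assumes "all samples" means every distinct combination drawn without replacement; if ordered samples with replacement are intended, the counts and standard deviations differ. The name sample_means is made up for this sketch, and the histograms are omitted.

```python
from itertools import combinations
from statistics import mean, pstdev

population = list(range(11, 21))  # the sequence 11:20


def sample_means(data, size):
    """Means of every distinct sample of `size` items (assumption: no replacement)."""
    return [mean(combo) for combo in combinations(data, size)]


means_2 = sample_means(population, 2)  # C(10, 2) = 45 samples
means_5 = sample_means(population, 5)  # C(10, 5) = 252 samples

# Part d): compare the centre and spread of the three distributions.
for label, values in [("population", population), ("n = 2", means_2), ("n = 5", means_5)]:
    print(f"{label:>10}: mean = {mean(values):.4f}, sd = {pstdev(values):.4f}")
```

All three means equal 15.5, while the standard deviation shrinks as the sample size grows. Because the samples are drawn without replacement from a finite population of N = 10, the spread of the sample means is sigma / sqrt(n) multiplied by the finite population correction sqrt((N - n) / (N - 1)).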

Part 2) Central Limit Theorem

The data in the file queries.csv contains the number of queries Google has had each day for a one year period (365 days).

a) Show the histogram of the distribution of the number of queries. Compute the mean and standard deviation of the number of queries Google has had per day.

b) Draw 1000 samples of this data of size 5, show the histogram of the densities of the sample means. Compute the mean of the sample means and the standard deviation of the sample means.

c) Draw 1000 samples of this data of size 20, show the histogram of the densities of the sample means. Compute the mean of the sample means and the standard deviation of the sample means.

d) Compare the means and standard deviations of the above three distributions.
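
The resampling in Part 2 can be sketched as follows in standard-library Python. The file queries.csv is not reproduced here and its column layout is unknown, so the sketch uses a clearly synthetic placeholder list of 365 values; none of the numbers it prints say anything about the real data. Sampling with replacement is an assumption, and sample_mean_distribution is a hypothetical helper name.

```python
import random
from statistics import mean, pstdev


def sample_mean_distribution(data, sample_size, num_samples=1000, seed=0):
    """Means of `num_samples` random samples (assumption: drawn with replacement)."""
    rng = random.Random(seed)
    return [mean(rng.choices(data, k=sample_size)) for _ in range(num_samples)]


# PLACEHOLDER standing in for the 365 daily counts in queries.csv. NOT real data.
placeholder_rng = random.Random(42)
daily_queries = [placeholder_rng.randint(1000, 5000) for _ in range(365)]

# Part a): mean and standard deviation of the full data.
# pstdev divides by n; R's sd() divides by n - 1, so values differ slightly.
pop_mean, pop_sd = mean(daily_queries), pstdev(daily_queries)
print(f"data:   mean = {pop_mean:.1f}, sd = {pop_sd:.1f}")

# Parts b) and c): 1000 samples of size 5 and of size 20.
summary = {}
for n in (5, 20):
    means = sample_mean_distribution(daily_queries, n)
    summary[n] = (mean(means), pstdev(means))
    print(f"n = {n:>2}: mean = {summary[n][0]:.1f}, sd = {summary[n][1]:.1f}")
```

For part d), the mean of the sample means should stay close to the mean of the data, while their standard deviation should be close to the data's standard deviation divided by sqrt(n).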

Part 3) Central Limit Theorem - Negative Binomial distribution

Suppose the input data follows the negative binomial distribution with the parameters size = 5 and prob = 0.5.

a) Generate 1000 random numbers from this distribution. Show the barplot with the proportions of the distinct values of this distribution.

b) With sample sizes of 10, 20, 30, and 40, generate the data for 5000 samples using the same distribution. Show the histograms of the densities of the sample means. Use a 2 x 2 layout.

c) Compare the means and standard deviations of the data from a) with the four sequences generated in b).
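
A minimal Python sketch of Part 3 follows, again with the standard library only. Python's random module has no negative binomial generator, so the hypothetical helper negative_binomial simulates Bernoulli trials directly. It assumes the convention used by R's rnbinom, where the value is the number of failures before the size-th success; under that convention the theoretical mean is size * (1 - prob) / prob = 5 and the variance is size * (1 - prob) / prob^2 = 10. Plots are omitted.

```python
import random
from collections import Counter
from statistics import mean, pstdev

rng = random.Random(0)


def negative_binomial(size, prob):
    """Failures observed before the `size`-th success in Bernoulli(prob) trials."""
    failures = successes = 0
    while successes < size:
        if rng.random() < prob:
            successes += 1
        else:
            failures += 1
    return failures


SIZE, PROB = 5, 0.5

# Part a): 1000 draws and the proportion of each distinct value (barplot heights).
draws = [negative_binomial(SIZE, PROB) for _ in range(1000)]
proportions = {v: c / len(draws) for v, c in sorted(Counter(draws).items())}

# Part b): 5000 sample means for each sample size.
results = {}
for n in (10, 20, 30, 40):
    means = [sum(negative_binomial(SIZE, PROB) for _ in range(n)) / n
             for _ in range(5000)]
    results[n] = (mean(means), pstdev(means))

# Part c): compare the raw draws with the four sets of sample means.
print(f"draws : mean = {mean(draws):.3f}, sd = {pstdev(draws):.3f}")
for n, (m, s) in results.items():
    print(f"n = {n}: mean = {m:.3f}, sd = {s:.3f}")
```

The means should all sit near 5, and the standard deviation of the sample means should fall roughly as sqrt(10 / n), which is the behaviour the Central Limit Theorem comparison in part c) is meant to expose.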

Part 4) Sampling

Use the MU284 dataset from the sampling package. Use a sample size of 20 for each of the following.

a) Show the sample drawn using simple random sampling without replacement. Show the frequencies for each region (REG). Show the percentages of these with respect to the entire dataset.

b) Show the sample drawn using systematic sampling. Show the frequencies for each region (REG). Show the percentages of these with respect to the entire dataset.

c) Calculate the inclusion probabilities using the S82 variable. Using these values, show the sample drawn using systematic sampling. Show the frequencies for each region (REG). Show the percentages of these with respect to the entire dataset.

d) Order the data using the REG variable. Draw a stratified sample using proportional sizes based on the REG variable. Show the frequencies for each region (REG). Show the percentages of these with respect to the entire dataset.

e) Compare the means of RMT85 variable for these four samples with the entire data.
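
The four sampling schemes in Part 4 can be sketched in plain Python as below. The MU284 data ships with R's sampling package and is not available here, so the frame is a randomly generated stand-in with the same shape (284 rows, regions 1 to 8); its values are NOT the real data. All function names are hypothetical. The sketch also assumes that "percentages with respect to the entire dataset" means the sampled count per region divided by that region's count in the full data, and that no unit's inclusion probability exceeds 1.

```python
import random
from collections import Counter

rng = random.Random(1)
N, n = 284, 20

# HYPOTHETICAL stand-in for MU284. These values are NOT the real data.
frame = [{"REG": rng.randint(1, 8), "S82": rng.randint(20, 100),
          "RMT85": rng.uniform(50, 500)} for _ in range(N)]


def srswor(N, n):
    """Simple random sampling without replacement: n distinct row indices."""
    return sorted(rng.sample(range(N), n))


def systematic(N, n):
    """Equal-probability systematic sampling with a fractional step N / n."""
    step = N / n
    start = rng.random() * step
    return [min(int(start + i * step), N - 1) for i in range(n)]


def inclusion_probabilities(sizes, n):
    """pi_k = n * x_k / sum(x). Assumes no pi_k exceeds 1 (no capping done)."""
    total = sum(sizes)
    return [n * x / total for x in sizes]


def pps_systematic(pik):
    """Pick unit k when the running total of pik crosses u, u + 1, u + 2, ..."""
    chosen, cumulative, next_point = [], 0.0, rng.random()
    for k, p in enumerate(pik):
        cumulative += p
        if cumulative > next_point:
            chosen.append(k)
            next_point += 1.0
    return chosen


def stratified_proportional(frame, n):
    """Group by REG, allocate n proportionally, then SRSWOR in each stratum.
    Rounding means the total can differ slightly from n."""
    strata = {}
    for idx, row in enumerate(frame):
        strata.setdefault(row["REG"], []).append(idx)
    chosen = []
    for reg in sorted(strata):
        size = round(n * len(strata[reg]) / len(frame))
        chosen.extend(rng.sample(strata[reg], size))
    return chosen


def region_percentages(indices):
    """Sampled count per region as a percentage of that region's full count."""
    totals = Counter(row["REG"] for row in frame)
    counts = Counter(frame[i]["REG"] for i in indices)
    return {reg: 100 * counts[reg] / totals[reg] for reg in sorted(totals)}


pik = inclusion_probabilities([row["S82"] for row in frame], n)
samples = {
    "srswor": srswor(N, n),
    "systematic": systematic(N, n),
    "pps systematic": pps_systematic(pik),
    "stratified": stratified_proportional(frame, n),
}

# Part e): compare each sample's mean of RMT85 with the mean of the full data.
overall = sum(row["RMT85"] for row in frame) / N
for name, idx in samples.items():
    sample_mean = sum(frame[i]["RMT85"] for i in idx) / len(idx)
    print(name, dict(sorted(Counter(frame[i]["REG"] for i in idx).items())),
          f"mean RMT85 = {sample_mean:.1f} (all data: {overall:.1f})")
```

Calling region_percentages(samples["srswor"]) returns the per-region percentages for one scheme; the same call works for the other three. In R, the sampling package provides its own routines for these designs, which would normally be used instead of hand-written versions like these.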

Attachment: Central Limit Theorem.rar

Verified Expert

This paper demonstrates applications of the Central Limit Theorem (CLT) in a real-life scenario and shows how it helps with predictive modeling of large data sets where the population is not normal.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd