Plot the histogram of the samples


Reference no: EM132332201

Part 1: Law of Large Numbers

The first part of the empirical analysis is meant to familiarize you with the Law of Large Numbers (LLN). The LLN states the following:

Suppose {X_n} is a sequence of random variables drawn independently from an underlying population. Then the sample mean of {X_n}, defined as $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, converges in probability (weak LLN) or almost surely (strong LLN) to the population mean, provided that mean exists.

(a) Randomly draw 10,000,000 samples from N(4, 16), and plot the histogram of the samples together with the underlying population distribution. What are the theoretical expectation, variance, and standard deviation of the population?
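The theoretical values and the histogram comparison can be sketched as follows. The assignment asks for R; this illustration uses only Python's standard library. It assumes N(4, 16) is parameterised by mean and variance (so the standard deviation is 4), uses 100,000 draws instead of 10,000,000 purely for speed, and reports the gap between histogram and density numerically instead of plotting. The helper `histogram_density` and the bin range are hypothetical choices, not part of the assignment.

```python
from statistics import NormalDist

# Assumption: N(4, 16) means mean 4 and VARIANCE 16, so sigma = 4.
population = NormalDist(mu=4.0, sigma=4.0)
print("mean:", population.mean)
print("variance:", population.variance)
print("standard deviation:", population.stdev)

# Assumption: 100_000 draws stand in for the 10,000,000 in the assignment.
samples = population.samples(100_000, seed=1)

def histogram_density(data, lo, hi, bins):
    """Return (bin_centres, densities); densities are scaled to integrate to about 1."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x < hi:
            index = min(int((x - lo) / width), bins - 1)
            counts[index] += 1
    centres = [lo + (i + 0.5) * width for i in range(bins)]
    densities = [c / (len(data) * width) for c in counts]
    return centres, densities

# Bins cover mean +/- 4 standard deviations.
centres, densities = histogram_density(samples, -12.0, 20.0, 32)

# Largest vertical gap between the empirical histogram and the N(4, 16) density.
max_gap = max(abs(d - population.pdf(c)) for c, d in zip(centres, densities))
print(f"max |histogram - density| = {max_gap:.4f}")
```

In a plotting environment, `centres` and `densities` would be drawn as bars with `population.pdf` overlaid as a curve.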

Let the sample size n equal 1, 2, 4, 8, 16, ..., $2^{22}$ = 4,194,304, $2^{23}$ = 8,388,608 in turn. For each n, randomly choose n samples from the original pool of 10,000,000 samples and compute the sample mean $\bar{X}_n$. (Subsampling is for speed: you could draw n fresh samples from N(4, 16) each time, but that would take much longer.)

Plot the sample mean for each sample size together with the population mean. What do you conclude from the graph?
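The doubling-sample-size procedure can be sketched as below. This is a Python standard-library illustration, not the R code the assignment asks for. It assumes a pool of 2^17 draws in place of 10,000,000 (so n stops at 2^17 instead of 2^23) and assumes subsampling without replacement; `sample_mean` is a hypothetical helper name.

```python
import random

rng = random.Random(42)
POPULATION_MEAN = 4.0

# Assumption: a pool of 2**17 draws from N(4, 16) replaces the 10,000,000 in the text.
pool = [rng.gauss(4.0, 4.0) for _ in range(2**17)]

def sample_mean(pool, n, rng):
    """Mean of n values chosen from the pool without replacement."""
    return sum(rng.sample(pool, n)) / n

sizes = [2**k for k in range(18)]  # 1, 2, 4, ..., 131072
means = {n: sample_mean(pool, n, rng) for n in sizes}

for n in sizes:
    error = abs(means[n] - POPULATION_MEAN)
    print(f"n = {n:>6}  sample mean = {means[n]:8.4f}  |error| = {error:.4f}")
```

Plotting `means` against `sizes` on a logarithmic x-axis, with a horizontal line at the population mean, gives the graph the question asks about.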

Change the parameters of the underlying distribution N(4, 16) to whatever you like and follow the same procedure again. Can you draw the same conclusion?

(b) Repeat question (a) with the Binomial distribution B(50, 0.4) as the underlying population distribution.
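For B(50, 0.4) the theoretical mean is np = 20 and the variance is np(1 - p) = 12. A minimal Python sketch (standard library only; the assignment itself asks for R) draws a pool by summing Bernoulli trials and checks how the subsample means approach 20. The pool size of 20,000 and the subsample sizes are assumptions chosen for speed, and `binomial_variate` is a hypothetical helper.

```python
import math
import random
from statistics import fmean

rng = random.Random(2)
trials, p = 50, 0.4

theoretical_mean = trials * p                # n * p
theoretical_variance = trials * p * (1 - p)  # n * p * (1 - p)
theoretical_sd = math.sqrt(theoretical_variance)
print(theoretical_mean, theoretical_variance, theoretical_sd)

def binomial_variate(trials, p, rng):
    """One draw from B(trials, p) as a sum of independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(trials))

# Assumption: 20_000 draws stand in for the 10,000,000 in the assignment.
pool = [binomial_variate(trials, p, rng) for _ in range(20_000)]

# Sample means for a few subsample sizes, drawn without replacement.
subsample_means = {n: fmean(rng.sample(pool, n)) for n in (1, 16, 256, 4096, 20_000)}
for n, m in subsample_means.items():
    print(f"n = {n:>6}  sample mean = {m:.4f}")
```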

(c) Repeat question (a) with the t-distribution t(10) as the underlying population distribution. When you choose your own parameters, as in the last step of question (a), make sure the degrees of freedom are greater than 2; otherwise the sample mean may not converge. Think about why.
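As a hint for the "why": the mean of t(ν) exists only for ν > 1, and its variance ν/(ν - 2) is finite only for ν > 2. The sketch below is a Python standard-library illustration (the assignment asks for R). It builds t draws from the standard construction Z / sqrt(V/ν) with V chi-square distributed, and compares the sample mean and variance of t(10) with the theoretical 0 and 1.25. The sample size of 50,000 and the helper names are assumptions.

```python
import math
import random
from statistics import fmean, pvariance

rng = random.Random(5)

def t_variate(df, rng):
    """One draw from t(df), built as Z / sqrt(V / df) with V ~ chi-square(df)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape df/2, scale 2)
    return z / math.sqrt(v / df)

def t_variance(df):
    """Theoretical variance of t(df): df/(df-2) if df > 2, infinite if 1 < df <= 2."""
    if df > 2:
        return df / (df - 2)
    if df > 1:
        return math.inf
    return math.nan  # for df <= 1 even the mean does not exist

# Assumption: 50_000 draws, far fewer than the assignment's 10,000,000.
draws = [t_variate(10, rng) for _ in range(50_000)]
sample_mean = fmean(draws)
sample_variance = pvariance(draws)
print(f"sample mean = {sample_mean:.4f} (theory 0)")
print(f"sample variance = {sample_variance:.4f} (theory {t_variance(10)})")
```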

(d) Repeat question (a) with the F-distribution F(9, 7) as the underlying population distribution. When you choose your own parameters, as in the last step of question (a), make sure the second degrees-of-freedom parameter df2 is greater than 4; otherwise the sample mean may not converge. Think about why.
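For F(d1, d2) the mean d2/(d2 - 2) exists only for d2 > 2, and the variance 2·d2²(d1 + d2 - 2) / (d1(d2 - 2)²(d2 - 4)) is finite only for d2 > 4. The sketch below is a Python standard-library illustration (the assignment asks for R). It draws F variates as a ratio of scaled chi-square draws and compares the sample mean of F(9, 7) with the theoretical 1.4. The sample size and helper names are assumptions.

```python
import math
import random
from statistics import fmean

rng = random.Random(9)

def f_variate(d1, d2, rng):
    """One draw from F(d1, d2) as (chi2(d1)/d1) / (chi2(d2)/d2)."""
    numerator = rng.gammavariate(d1 / 2.0, 2.0) / d1
    denominator = rng.gammavariate(d2 / 2.0, 2.0) / d2
    return numerator / denominator

def f_mean(d2):
    """Theoretical mean of F(d1, d2); it does not depend on d1 and needs d2 > 2."""
    return d2 / (d2 - 2) if d2 > 2 else math.inf

def f_variance(d1, d2):
    """Theoretical variance of F(d1, d2); finite only when d2 > 4."""
    if d2 > 4:
        return 2 * d2**2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))
    return math.inf if d2 > 2 else math.nan

# Assumption: 100_000 draws, far fewer than the assignment's 10,000,000.
draws = [f_variate(9, 7, rng) for _ in range(100_000)]
sample_mean = fmean(draws)
print(f"sample mean = {sample_mean:.4f}, theory = {f_mean(7):.4f}")
print(f"theoretical variance = {f_variance(9, 7):.4f}")
```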

Part 2: Central Limit Theorem

The second part of the empirical analysis is meant to familiarize you with the Central Limit Theorem (CLT). The CLT states the following:

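Suppose {X_n} is a sequence of random variables drawn independently from an underlying population with mean µ = E[X_i] and finite variance Var[X_i]. Then, for large n, the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is approximately distributed as N(µ, σ²), where σ² = Var[X_i]/n. More precisely, the normalized sample mean $(\bar{X}_n - \mu)/\sigma$ converges in distribution to N(0, 1).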

(a) Randomly draw 10,000,000 samples from N(20, 25), and plot the histogram of the samples together with the underlying population distribution.

What are the theoretical expectation, variance, and standard deviation of the population?

Let the sample size n equal 10 and 100,000 respectively. We want to conduct the following experiment for both the small-sample case (n = 10) and the large-sample case (n = 100,000).

Repeat the following process 2000 times: randomly choose n samples from the original pool of 10,000,000 samples and compute the sample mean $\bar{X}_n$. (Subsampling is for speed: you could draw n fresh samples from N(20, 25) each time, but that would take much longer.)

As a result, we will have 2000 values of $\bar{X}_{10}$ and 2000 values of $\bar{X}_{100000}$. Write down the first 10 and the last 10 of each.
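The repeated-sampling experiment can be sketched as follows. This is a Python standard-library illustration, not the R code the assignment asks for. To keep it fast it assumes a pool of 200,000 draws and a "large" sample size of 1,000 instead of 100,000; `repeated_sample_means` is a hypothetical helper, and subsampling is done without replacement.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(7)

# Assumption: a pool of 200_000 draws from N(20, 25) (sigma = 5) replaces the 10,000,000.
pool = [rng.gauss(20.0, 5.0) for _ in range(200_000)]

def repeated_sample_means(pool, n, repetitions, rng):
    """Return `repetitions` sample means, each computed from n values of the pool."""
    return [fmean(rng.sample(pool, n)) for _ in range(repetitions)]

means_small = repeated_sample_means(pool, 10, 2000, rng)
# Assumption: n = 1_000 stands in for the assignment's n = 100,000.
means_large = repeated_sample_means(pool, 1_000, 2000, rng)

print("first 10 (n = 10):", [round(m, 3) for m in means_small[:10]])
print("last 10  (n = 10):", [round(m, 3) for m in means_small[-10:]])
print("first 10 (n = 1000):", [round(m, 3) for m in means_large[:10]])
print("last 10  (n = 1000):", [round(m, 3) for m in means_large[-10:]])

# The variance of the sample means should be close to 25 / n.
print("variance of means, n = 10:", pvariance(means_small), "theory:", 25 / 10)
print("variance of means, n = 1000:", pvariance(means_large), "theory:", 25 / 1000)
```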

Plot the histogram of the 2000 sample means $\bar{X}_{10}$ together with N(20, 25/10). What do you conclude from the graph?

Plot the histogram of the 2000 sample means $\bar{X}_{100000}$ together with N(20, 25/100000). What do you conclude from the graph?

Plot the histogram of the 2000 normalized sample means $(\bar{X}_{10} - 20)/\sqrt{25/10}$ together with N(0, 1). What do you conclude from the graph?

Plot the histogram of the 2000 normalized sample means $(\bar{X}_{100000} - 20)/\sqrt{25/100000}$ together with N(0, 1). What do you conclude from the graph?
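The normalization step can be checked numerically as well as visually. The sketch below is a Python standard-library illustration (the assignment asks for R). It assumes fresh draws from N(20, 25) with n = 10, standardizes each of the 2000 sample means, and compares their empirical distribution with N(0, 1) using the largest gap between the two cumulative distribution functions. That gap is an illustrative summary chosen here, not something the assignment prescribes.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(11)
mu, variance, n, repetitions = 20.0, 25.0, 10, 2000
sigma = math.sqrt(variance)

def normalized_sample_mean(rng):
    """(sample mean - mu) / sqrt(variance / n) for one sample of size n."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return (fmean(sample) - mu) / math.sqrt(variance / n)

z_values = sorted(normalized_sample_mean(rng) for _ in range(repetitions))

# Share of normalized means inside +/-1.96; about 0.95 under N(0, 1).
coverage = sum(abs(z) <= 1.96 for z in z_values) / repetitions

# Largest gap between the empirical CDF and the standard normal CDF.
standard_normal = NormalDist()
cdf_gap = max(
    max(abs((i + 1) / repetitions - standard_normal.cdf(z)),
        abs(i / repetitions - standard_normal.cdf(z)))
    for i, z in enumerate(z_values)
)
print(f"coverage of +/-1.96: {coverage:.3f}")
print(f"max CDF gap: {cdf_gap:.4f}")
```

A small gap and a coverage near 0.95 correspond to a histogram that sits close to the N(0, 1) curve.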

Change the parameters of the underlying distribution N(20, 25) to whatever you like and follow the same procedure again. Can you draw the same conclusion?

(b) Repeat question (a) with the Binomial distribution B(40, 0.2) as the underlying population distribution.

(c) Repeat question (a) with the t-distribution t(10) as the underlying population distribution. When you choose your own parameters, as in the last step of question (a), make sure the degrees of freedom are greater than 2; otherwise the sample mean may not converge in distribution to a Normal distribution. Think about why.

(d) Repeat question (a) with the F-distribution F(8, 6) as the underlying population distribution. When you choose your own parameters, as in the last step of question (a), make sure the second degrees-of-freedom parameter df2 is greater than 4; otherwise the sample mean may not converge in distribution to a Normal distribution. Think about why.

Part 3: Extra Credit

Find a new distribution, draw samples from it, and follow the same procedure as in the previous two parts. Check whether the LLN and CLT still hold.
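As one possible illustration, the sketch below uses the exponential distribution Exp(1), which has mean 1 and variance 1. The choice of distribution, the sample sizes 10 and 400, and the helper names are all assumptions made for this example, and the code is Python standard library, not the R the assignment asks for. It checks the LLN through the average of the sample means, and the CLT through their skewness, which is 2/√n in theory and should shrink toward the Normal value of 0 as n grows.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(3)
rate = 1.0  # hypothetical choice: Exp(1) has mean 1 and variance 1

def sample_means(n, repetitions, rng):
    """Return `repetitions` sample means, each from n fresh Exp(rate) draws."""
    return [fmean([rng.expovariate(rate) for _ in range(n)]) for _ in range(repetitions)]

def skewness(values):
    """Standardized third moment; 0 for a symmetric distribution such as the Normal."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

means_10 = sample_means(10, 2000, rng)
means_400 = sample_means(400, 2000, rng)

print(f"average of sample means, n = 400: {fmean(means_400):.4f} (population mean 1)")
print(f"skewness of sample means, n = 10:  {skewness(means_10):.3f}")
print(f"skewness of sample means, n = 400: {skewness(means_400):.3f}")
```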

Requirement: Please submit your R code and final report to Sakai. The final report must be converted to PDF format. You are encouraged to collaborate with each other, but you must write your own code and report.

Attachment:- Assignment Files - Central Limit Theorem.rar
