Reference no: EM131003294
Problem 1 [EM]
Write a 2-dimensional RNG for a Gaussian mixture model (GMM) pdf with 2 sub-populations. Use any function/sub-routine available in your language of choice.
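One way to build such a generator is a minimal sketch like the one below, assuming NumPy is the library of choice. The function name `sample_gmm` and its arguments are hypothetical, chosen for illustration only. It first draws a sub-population label for each sample from the mixing weights, then draws the sample from that sub-population's Gaussian.

```python
import numpy as np

def sample_gmm(n, weights, means, covs, rng=None):
    """Draw n samples from a Gaussian mixture; returns (samples, labels).

    Hypothetical helper: weights has shape (K,), means (K, 2), covs (K, 2, 2).
    """
    rng = np.random.default_rng(rng)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    # Step 1: pick a sub-population for every sample using the mixing weights.
    labels = rng.choice(len(weights), size=n, p=weights)
    samples = np.empty((n, means.shape[1]))
    # Step 2: draw each sample from the Gaussian of its sub-population.
    for k in range(len(weights)):
        idx = labels == k
        samples[idx] = rng.multivariate_normal(means[k], covs[k], size=idx.sum())
    return samples, labels

# Example: one spherical and one ellipsoidal sub-population (illustrative values).
X, z = sample_gmm(
    300,
    weights=[0.4, 0.6],
    means=[[0.0, 0.0], [5.0, 5.0]],
    covs=[np.eye(2), [[2.0, 0.8], [0.8, 1.0]]],
    rng=0,
)
```

Returning the labels alongside the samples makes it possible to check later how well the EM estimate recovers the true sub-populations.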
Implement the expectation maximization (EM) algorithm for estimating the pdf parameters of 2-D GMMs from samples.
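The EM iteration can be sketched as follows, assuming NumPy and a relative log-likelihood change as the stopping rule. The names `gaussian_pdf` and `fit_gmm_em`, the farthest-point initialization, the tolerance, and the small ridge added to each covariance are all assumptions of this sketch, not part of the assignment. The sketch works with raw densities, so it may underflow for points very far from every component.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    quad = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def fit_gmm_em(X, k=2, max_iter=500, tol=1e-6, seed=0):
    """Fit a k-component GMM by EM; returns (weights, means, covs, loglik, n_iter)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(k, 1.0 / k)
    # Assumed initialization: a random data point, then the farthest remaining points.
    means = [X[rng.integers(n)]]
    for _ in range(1, k):
        d2 = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(X[np.argmax(d2)])
    means = np.array(means)
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    prev_ll = -np.inf
    for it in range(1, max_iter + 1):
        # E-step: responsibility of each component for each sample.
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(k)]
        )
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        ll = np.log(total).sum()  # log-likelihood under the current parameters
        # M-step: re-estimate weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
        # Stop when the log-likelihood changes by less than a relative tolerance.
        if abs(ll - prev_ll) < tol * abs(ll):
            break
        prev_ll = ll
    return weights, means, covs, ll, it
```

The returned iteration count gives a simple speed measure, and the distance between estimated and true parameters gives a quality measure for the comparisons in the next item.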
Compare the quality and speed of your GMM-EM estimation on 300 samples drawn from different GMM distributions (e.g. spherical vs. ellipsoidal covariance, close vs. well-separated sub-populations).
Problem 2 [Testing Faith]
Download the "Old Faithful" data set from Blackboard. It contains samples of a 2-D random variable: the first dimension is the duration of the geyser eruption, and the second is the waiting time until the next eruption. Apply your GMM-EM algorithm to fit a GMM pdf to the data.
How many EM iterations are needed for convergence? Plot a contour plot of your final GMM pdf. Overlay the contour plot with a scatterplot of the data set. How would you use the GMM pdf estimates to cluster the data?
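One common answer to the clustering question is to assign each sample to the sub-population with the largest posterior probability (responsibility) under the fitted GMM. The sketch below assumes NumPy. The function names are hypothetical, and the parameter values in the example are placeholders chosen for illustration, not fitted results.

```python
import numpy as np

def responsibilities(X, weights, means, covs):
    """Posterior probability of each component for each row of X."""
    cols = []
    for w, m, c in zip(weights, means, covs):
        diff = X - m
        quad = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(c), diff)
        norm = np.sqrt((2 * np.pi) ** X.shape[1] * np.linalg.det(c))
        cols.append(w * np.exp(-0.5 * quad) / norm)
    dens = np.column_stack(cols)
    return dens / dens.sum(axis=1, keepdims=True)

def cluster(X, weights, means, covs):
    """Hard cluster label: index of the most probable component per sample."""
    return responsibilities(X, weights, means, covs).argmax(axis=1)

# Placeholder parameters and points (duration, waiting time); illustration only.
weights = np.array([0.5, 0.5])
means = np.array([[2.0, 55.0], [4.3, 80.0]])
covs = np.array([np.diag([0.1, 30.0]), np.diag([0.2, 35.0])])
points = np.array([[2.0, 55.0], [4.5, 80.0]])
labels = cluster(points, weights, means, covs)
```

The same weighted densities, summed over components and evaluated on a grid, give the values needed for the contour plot.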
Problem 3 [Noise in GMM-EM]
Modify your GMM-EM routine by sampling Gaussian noise and injecting it into the Old Faithful data at each iteration. Scale the noise to a fraction of the data's standard deviation in each dimension, and let the noise standard deviation decay with each iteration (e.g. inversely proportional to the square of the iteration counter).
Compare the average convergence time of the GMM-EM with and without noise. Plot the average convergence time for different initial noise standard deviations.
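A minimal sketch of the noise-injection experiment, assuming NumPy and two components, is given below. The names `weighted_dens`, `noisy_em`, and `mean_iterations` are hypothetical. The choice to run the E- and M-steps on the noisy data while checking convergence on the clean data is one possible design, not necessarily the one intended by the assignment or by the papers it cites. Setting `sigma0=0` reduces the routine to plain EM, which gives the noiseless baseline.

```python
import numpy as np

def weighted_dens(X, weights, means, covs):
    """n-by-K matrix whose (i, j) entry is w_j * N(x_i | mean_j, cov_j)."""
    cols = []
    for w, m, c in zip(weights, means, covs):
        diff = X - m
        quad = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(c), diff)
        norm = np.sqrt((2 * np.pi) ** X.shape[1] * np.linalg.det(c))
        cols.append(w * np.exp(-0.5 * quad) / norm)
    return np.column_stack(cols)

def noisy_em(X, sigma0=0.1, max_iter=500, tol=1e-6, seed=0):
    """Two-component GMM-EM with decaying injected noise.

    Returns (weights, means, covs, n_iter).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    scale = X.std(axis=0)  # per-dimension standard deviation of the data
    weights = np.full(2, 0.5)
    # Assumed initialization: a random point and the point farthest from it.
    first = X[rng.integers(n)]
    means = np.array([first, X[np.argmax(((X - first) ** 2).sum(axis=1))]])
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d)] * 2)
    prev_ll = -np.inf
    for t in range(1, max_iter + 1):
        # Noise std is sigma0 * data std, decaying as 1 / t^2.
        Y = X + rng.normal(size=X.shape) * (sigma0 * scale / t**2)
        # E-step on the noisy data.
        dens = weighted_dens(Y, weights, means, covs)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step on the noisy data.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ Y) / nk[:, None]
        for j in range(2):
            diff = Y - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
        # Convergence is checked on the clean data's log-likelihood.
        ll = np.log(weighted_dens(X, weights, means, covs).sum(axis=1)).sum()
        if abs(ll - prev_ll) < tol * abs(ll):
            break
        prev_ll = ll
    return weights, means, covs, t

def mean_iterations(X, sigma0, n_runs=20):
    """Average iteration count over several seeds for one initial noise level."""
    return np.mean([noisy_em(X, sigma0, seed=s)[3] for s in range(n_runs)])
```

Calling `mean_iterations` for a range of `sigma0` values, including 0, yields the points for the requested plot of average convergence time against initial noise standard deviation.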
Turn in:
- A summary of your experiments, including any relevant plots.
- Brief discussions of the results.
- A printout of your code.
The following link gives a detailed tutorial on EM algorithms, with particular emphasis on EM for GMMs in Section 3.2: https://mayagupta.org/publications/EMbookGuptaChen2010.pdf
You are not required to go through the whole paper in detail, but I recommend it.
Section 2.2 of this next paper also discusses the update equations for the GMM-EM algorithm: https://sipi.usc.edu/~kosko/Noisy-Clustering-Neural-Networks.pdf
You can find more details about clustering with GMM-EM in the second paper linked above.