Produce a scatterplot of the data and overlay a contour plot

Reference no: EM132375108

Homework

Answer all questions specified on the problem and include a discussion on how your results answered/addressed the question.

Submit your .rmd file with the knitted PDF (or knitted Word Document saved as a PDF). If you are having trouble with .rmd, let us know and we will help you, but both the .rmd and the PDF are required.

This file can be used as a skeleton document for your code/write up. Please follow the instructions found under Content for Formatting and Guidelines. No code should be in your PDF write-up unless stated otherwise.

Please do the following problems from the textbook (R Handbook) as stated.

1. The galaxies data from MASS contains the velocities of 82 galaxies from six well-separated conic sections of space (Postman et al., 1986, Roeder, 1990). The data are intended to shed light on whether or not the observable universe contains superclusters of galaxies surrounded by large voids. The evidence for the existence of superclusters would be the multimodality of the distribution of velocities.

a) Construct histograms using the following functions:

- hist() and ggplot() + geom_histogram()

- truehist() and ggplot() + geom_histogram() (pay attention to the y-axis!)


Comment on the shape and distribution of the variable based on the three plots. (Hint: Also play around with binning)
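As a rough sketch of part a), assuming the MASS package is available (ggplot2 is used only if it is installed), the following draws the same data with hist() and truehist() and shows why the y-axes differ. The choice of 20 bins is an arbitrary starting point, not a recommendation.

```r
library(MASS)  # provides the galaxies data and truehist()

# hist() puts counts on the y-axis by default
h <- hist(galaxies, breaks = 20, xlab = "velocity (km/s)", main = "hist(): counts")

# truehist() rescales the bars to a density, so the total bar area is 1
truehist(galaxies, nbins = 20, xlab = "velocity (km/s)")

# ggplot2 equivalents: counts by default, density via after_stat(density)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  df <- data.frame(velocity = galaxies)
  p_count <- ggplot(df, aes(x = velocity)) + geom_histogram(bins = 20)
  p_density <- ggplot(df, aes(x = velocity)) +
    geom_histogram(aes(y = after_stat(density)), bins = 20)
  print(p_count)
  print(p_density)
}
```

Changing breaks, nbins and bins is the quickest way to see how sensitive the apparent number of modes is to the binning.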

b) Create a new variable loggalaxies = log(galaxies). Construct histograms using the functions in part a) and comment on the shape and differences.

c) Construct kernel density estimates using two different choices of kernel function and three choices of bandwidth (one that is too large and "oversmooths," one that is too small and "undersmooths," and one that appears appropriate). You should therefore have six different kernel density estimate plots. Discuss your results. You can use the log scale or the original scale for the variable.
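One way to organise the six estimates in part c) is a loop over kernels and bandwidths with density(). The sketch below is only illustrative: the Gaussian and Epanechnikov kernels and the factor of 5 around R's default bandwidth are assumptions chosen to make the over- and undersmoothing visible, and whether the middle bandwidth is "appropriate" is a judgment to make from the plots.

```r
library(MASS)  # galaxies data

kernels <- c("gaussian", "epanechnikov")
bw_ref <- bw.nrd0(galaxies)  # R's default rule-of-thumb bandwidth
bandwidths <- c(small = bw_ref / 5, reference = bw_ref, large = bw_ref * 5)

fits <- list()
op <- par(mfrow = c(2, 3))   # 2 kernels x 3 bandwidths
for (k in kernels) {
  for (b in names(bandwidths)) {
    d <- density(galaxies, bw = bandwidths[[b]], kernel = k)
    fits[[paste(k, b)]] <- d
    plot(d, main = sprintf("%s, bw = %.0f", k, bandwidths[[b]]))
    rug(galaxies)            # show the raw observations under each curve
  }
}
par(op)
```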

d) What is your conclusion about the possible existence of superclusters of galaxies? How many superclusters (1, 2, 3, ...)?

e) Fit a model-based clustering (Mclust) to the galaxies data. How many clusters did it find? Did it match your answer from (d) above? Report the parameter estimates and BIC of the best model.
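Assuming part e) refers to model-based clustering with Mclust() from the mclust package (the same tool named in problem 2 d), and that mclust is installed, a minimal sketch for the galaxies data could look like this. Note that mclust reports BIC on a scale where larger values indicate a better model.

```r
library(MASS)    # galaxies data
library(mclust)  # Mclust(); assumed to be installed

fit <- Mclust(galaxies)                 # selects the model and number of components by BIC
print(summary(fit, parameters = TRUE))  # mixing proportions, means, variances
print(fit$G)                            # number of components in the selected model
print(fit$bic)                          # BIC of the selected model
plot(fit, what = "BIC")                 # BIC across the candidate models
```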

2. The birthdeathrates data from HSAUR3 gives the birth and death rates for 69 countries (from Hartigan, 1975).

a) Produce a scatterplot of the data and overlay a contour plot of the estimated bivariate density.
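A minimal base-R sketch of part a), assuming the HSAUR3 package is installed and using kde2d() from MASS with its default bandwidths for the bivariate density estimate (the 50 x 50 grid is an arbitrary choice):

```r
library(MASS)  # kde2d()
data("birthdeathrates", package = "HSAUR3")  # assumes HSAUR3 is installed

# Bivariate kernel density estimate on a 50 x 50 grid
dens <- kde2d(birthdeathrates$birth, birthdeathrates$death, n = 50)

plot(birthdeathrates$birth, birthdeathrates$death,
     xlab = "birth rate", ylab = "death rate", pch = 19)
contour(dens, add = TRUE)  # overlay the estimated density contours
```

A ggplot2 version would typically combine geom_point() with geom_density_2d().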

b) Does the plot give you any interesting insights into the possible structure of the data?

c) Construct the perspective plot of the estimated bivariate density (persp() in R; ggplot is not required for this question).
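For the perspective plot, one possible sketch reuses a kde2d() estimate and passes its grid to persp(). The viewing angles theta and phi below are arbitrary and worth varying; HSAUR3 is again assumed to be installed.

```r
library(MASS)  # kde2d()
data("birthdeathrates", package = "HSAUR3")  # assumes HSAUR3 is installed

dens <- kde2d(birthdeathrates$birth, birthdeathrates$death, n = 50)

# Surface of the estimated density over the (birth, death) grid
persp(dens$x, dens$y, dens$z, theta = 30, phi = 30,
      xlab = "birth rate", ylab = "death rate", zlab = "density",
      ticktype = "detailed", expand = 0.6)
```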

d) Fit a model-based clustering (Mclust). Provide plots summarizing your fit (BIC, classification, uncertainty, and density).
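The four summary plots requested in part d) correspond to the what argument of the plot method for Mclust fits. A hedged sketch, assuming both mclust and HSAUR3 are installed and leaving the model choice to Mclust's defaults:

```r
library(mclust)  # Mclust(); assumed to be installed
data("birthdeathrates", package = "HSAUR3")  # assumes HSAUR3 is installed

fit <- Mclust(birthdeathrates)  # bivariate Gaussian mixture, selected by BIC
print(summary(fit))

plot(fit, what = "BIC")             # BIC for each model and number of components
plot(fit, what = "classification")  # points coloured by assigned cluster
plot(fit, what = "uncertainty")     # larger symbols mark less certain assignments
plot(fit, what = "density")         # contours of the fitted mixture density
```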

e) Discuss the results (structure of data, outliers, etc.). Write a discussion in the context of the problem.

3. A sex difference in the age of onset of schizophrenia was noted by Kraepelin (1919). Subsequent epidemiological studies of the disorder have consistently shown an earlier onset in men than in women. One model that has been suggested to explain this observed difference is known as the subtype model, which postulates two types of schizophrenia: one characterized by early onset, typical symptoms and poor premorbid competence, and the other by late onset, atypical symptoms and good premorbid competence. The early onset type is assumed to be largely a disorder of men and the late onset type largely a disorder of women. Fit finite mixtures of normal densities separately to the onset data for men and women given in the schizophrenia data from HSAUR3. See if you can produce some evidence for or against the subtype model.
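One possible starting point for problem 3 is to split the onset ages by gender and fit a univariate normal mixture to each group with Mclust(). This sketch assumes mclust and HSAUR3 are installed; restricting the search to one to three components (G = 1:3) is an assumption made here to keep the comparison with the two-component subtype model simple.

```r
library(mclust)  # Mclust(); assumed to be installed
data("schizophrenia", package = "HSAUR3")  # columns: age, gender

ages_by_gender <- split(schizophrenia$age, schizophrenia$gender)

fits <- list()
for (g in names(ages_by_gender)) {
  ages <- ages_by_gender[[g]]
  fits[[g]] <- Mclust(ages, G = 1:3)  # compare 1, 2 and 3 components by BIC
  cat("\n==", g, "==\n")
  print(summary(fits[[g]], parameters = TRUE))
}
```

Comparing the selected number of components, the component means and the mixing proportions between the two groups gives the kind of evidence the question asks about.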



For any question asking for plots/graphs, please do as the question asks and also produce the same plot using the corresponding commands in the ggplot2 library. (So if the question asks for one plot, your results should have two plots: one produced using the given R function and one produced with the ggplot2 equivalent.) This doesn't apply to questions that don't specifically ask for a plot; however, I would still encourage you to produce both.
