Produce and interpret perspective and contour plots

Reference no: EM132278467

Assignment -

Question 1 - The data file 'film.txt' contains data for five variables. The thickness of a plastic film is measured in 4 positions after being cut. The positions of the measurements are: top right, top left, bottom right and bottom left.

Provide R code, output and written interpretation for parts a) to d) of this question. Provide only output that is directly relevant to address each section.

Test for multivariate normality (MVN) by:

a) Describe the structure of the film.txt data.

b) Produce and interpret univariate QQ plots, histograms and univariate Shapiro-Wilk tests of normality for each of the four film thickness variables. Which is the most non-normally distributed variable?

c) Produce and interpret perspective and contour plots for the top-right and top-left film thickness variables. What is an inherent problem with using these plots to assess MVN?

d) Do the analysis necessary to provide the results of the Mardia, Henze-Zirkler and Royston tests of MVN based on all four film thickness variables. Include in your interpretation:

  • The Chi-Square QQ plot: describe how it is constructed and how it relates to the univariate normal QQ plots.
  • What is a key limitation of these MVN statistical tests?
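As a rough illustration of how a Chi-Square QQ plot can be constructed, the sketch below computes the plotted points for a two-variable case. The assignment itself requires R; this is only a language-neutral sketch in Python, the data values are hypothetical, and the function name `chi_square_qq_points` is an assumption made for this example. Each observation's squared Mahalanobis distance from the sample mean is sorted and paired with the matching chi-square quantile. With two variables the chi-square distribution has 2 degrees of freedom, whose quantile has the closed form -2 ln(1 - p); the four film variables would need 4 degrees of freedom and a statistical library.

```python
import math

def chi_square_qq_points(rows):
    """Return (chi-square quantile, ordered squared Mahalanobis distance) pairs
    for two-column data. Sketch only: handles exactly two variables."""
    n = len(rows)
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = sum(xs) / n, sum(ys) / n

    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    det = sxx * syy - sxy ** 2

    # Squared Mahalanobis distance using the explicit 2x2 inverse of S.
    d2 = sorted(
        ((x - mx) ** 2 * syy - 2 * (x - mx) * (y - my) * sxy + (y - my) ** 2 * sxx) / det
        for x, y in rows
    )

    # Chi-square quantiles with 2 df at probabilities (i - 0.5) / n.
    quantiles = [-2 * math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, d2))

# Hypothetical thickness pairs (not taken from film.txt).
toy_rows = [(1.0, 2.1), (2.0, 1.9), (3.0, 3.5), (4.0, 3.9), (5.0, 5.2)]
points = chi_square_qq_points(toy_rows)
```

Plotting the second element of each pair against the first gives the QQ plot; points lying close to the line y = x are consistent with multivariate normality, in the same way that points near the reference line in a univariate normal QQ plot are consistent with normality.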

e) One way to try to meet the MVN assumption could be to remove some of the variables from the multivariate analysis (do not perform this analysis). Suggest three additional ways that you might improve univariate and multivariate normality for data sets in general.

f) In part e) we suggested removing some variables to help the data approach MVN. Suggest one other reason why reducing the number of variables used in multivariate analysis may be important (this question does not relate to this particular data set).

Question 2 - The data file 'iris.txt' contains data for four flower characteristics variables for three species of iris.

Provide R code, output and written interpretation for parts a) to f) of this question.

a) Produce a draftsman display for the 4 flower characteristics variables. Interpret these plots, relating back to the original data where it may add to the interpretation. What are the y and x axes on plot [3,2] of the draftsman plot?

b) In the context of MANOVA, list the dependent and independent variables and define the relationship that the MANOVA would test.

c) Produce the correlation matrix for the flower characteristics variables. Provide an interpretation of the correlations and indicate what they suggest about the potential for the variables to be MVN distributed (do not test for MVN).

d) Using MANOVA in R, test for differences in 'flower characteristics' between the three species. Include tests using all four test statistics covered in this course and interpret the output (assume the assumption of MVN is met).

e) Why is a small Wilks' lambda statistic likely to indicate significant differences between at least some groups? Which of the four tests used in part d) would be the best to interpret if there are concerns about multivariate normality or covariance equality?

f) Produce output that specifically compares each of the groups with each other (you should have 3 comparisons) using Hotelling's T² test (the multivariate equivalent of the t-test) and a significance level of 0.05. Determine the multiple test corrected significance level. Do not provide R output; instead reproduce and complete the following table for all comparisons and interpret. How may sample sizes have affected these results and those in part c)?

| Comparison: Species 1 | Comparison: Species 2 | Hotelling's p-value | Significant (Y/N) | Significant after correction (Y/N) |
|---|---|---|---|---|
|  |  |  |  |  |
|  |  |  |  |  |
|  |  |  |  |  |
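The multiple test corrected significance level in part f) can be sketched as simple arithmetic. The question does not name a correction method, so the Bonferroni correction used here is an assumption, as are the function names and the example p-value. The assignment requires R; the sketch is in Python only to show the calculation.

```python
def n_pairwise(k):
    """Number of pairwise comparisons among k groups."""
    return k * (k - 1) // 2

def bonferroni_alpha(alpha, n_comparisons):
    """Bonferroni-corrected significance level (assumed correction method)."""
    return alpha / n_comparisons

def table_flags(p_value, alpha, n_comparisons):
    """Return the two Y/N table entries for one comparison."""
    before = "Y" if p_value < alpha else "N"
    after = "Y" if p_value < bonferroni_alpha(alpha, n_comparisons) else "N"
    return before, after

comparisons = n_pairwise(3)                     # three species give 3 comparisons
corrected = bonferroni_alpha(0.05, comparisons)  # 0.05 / 3, roughly 0.0167

# Hypothetical p-value, for illustration only (not a result from iris.txt).
flags = table_flags(0.03, 0.05, comparisons)
```

A comparison with a p-value between the corrected level and 0.05 would be marked significant in the third column but not in the fourth.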

Question 3 - The data file 'usair.dat' contains data for seven air quality variables measured across 41 United States cities. Provide R code, output and written interpretation for all analyses.

a) Produce the correlation and covariance matrices. Explain the difference between these matrices in detail (i.e. explain clearly how the values are adjusted mathematically and the effect of these changes). Would using the covariance matrix in PCA on the USair data be appropriate? Why?
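The mathematical link between the two matrices in part a) can be illustrated with a small sketch: a correlation is a covariance divided by the product of the two standard deviations, so it does not depend on the units of measurement. The numbers below are hypothetical (not from usair.dat) and the function names are assumptions for this example. The assignment requires R; Python is used here only to show the arithmetic.

```python
import math

def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance rescaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical measurements of two variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# The same x variable expressed in units 1000 times smaller.
x_rescaled = [1000 * v for v in x]

cov_original = covariance(x, y)
cov_rescaled = covariance(x_rescaled, y)   # grows by the factor 1000
cor_original = correlation(x, y)
cor_rescaled = correlation(x_rescaled, y)  # unchanged by the change of units
```

Because the covariance changes with the units while the correlation does not, variables measured on very different scales can dominate an analysis based on the covariance matrix.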

b) Perform PCA on the 7 variables using the prcomp function. Discuss the eigenvalues, percentage of variation and scree plot and how they influence your decision on how many PCs to interpret from this analysis. Remember to keep in mind the overall purpose of PCA.

c) Interpret the first PC. Include the Z equation and a plot of the loadings on the first PC in your answer.

d) What is the correlation between the first and second PCs and what does this tell you?

e) Produce and interpret a biplot based on the first 2 PCs. In particular, explain your interpretation of the air quality variables in city 1 compared to city 11 and city 9. Relate your interpretation back to the original data.

f) Was this a useful analysis for this data set? Explain.

Question 4 - For this question you will continue to use the data file 'usair.dat' from Question 3. Provide R code, output and written interpretation for all analyses.

a) Perform parallel analysis and evaluate how many PCs should be used in FA. Compare to your choice of number of PCs in Q3b).

b) Explain in your own words how the parallel analysis works.

c) Perform a Factor Analysis on all 7 variables (apply no rotation) using the number of factors you identified in part a). Interpret the output including the:

  • Variance explained
  • Chi-square test
  • Variable loadings
  • Difference in uniqueness values for the variables wind.speed and annual.precip

d) Repeat the FA with a varimax rotation and calculate the communalities. In your interpretation, address:

  • The aim and features of a varimax rotation
  • Changes in the variable loadings
  • The communalities


Reviews

len2278467

4/9/2019 9:36:18 PM

Instructions: Submit only one file in pdf format to the link on the Study Desk. Assume that your report will be read by someone familiar with the data sets but with limited statistical knowledge. Fully explain plots and, when stating statistics or results, explain what they mean statistically AND in the context of the data. Presentation should be neat, consistent, spell-checked and proofread. All questions should be clearly labelled, and all answers should clearly and concisely address the questions. If you convert a Word document to pdf for submission, check that all symbols, equations etc. have converted correctly, i.e., proofread your work.

All answers must be typed; do not include handwritten/scanned or stylus/tablet written responses in your document. If you do not use knitr to compile your submission, where asked to provide R code, paste the relevant code within the assignment document and italicise it (or otherwise highlight or distinguish it from other content). Do not include code in an appendix.

Do not include an appendix at all. Any work included in an appendix will not be marked. Please note that referencing textbooks and other resources is not the goal of this assessment. This work requires students to demonstrate their understanding of the analysis and interpretation, not provide quotes from resources. When interpreting output, you are expected to do so in the context of the data and the method (i.e. ensure you comment on aspects of the method that affect your interpretation with respect to the variables and sample). A maximum of 10 marks will be deducted from your total marks for poor presentation.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd