Provide an estimate of your error rate

Reference no: EM131314287

Modern Applied Statistics Assignment

Q1: During the past week you have been working on a group assignment to analyze a specific dataset. As part of this exam, please report on that project. This time the work has to be done individually. You can still use the code generated as a group and modify or improve on it.

a. Provide a two-page write-up (including graphs) explaining your analysis of the dataset and the conclusions you can draw from it.

b. As a secondary component provide annotated code that replicates your analysis.

Q2: The file "y.csv" contains a dataset of 200 observations and 6 variables. There are four groups in the dataset. The first variable gives the group membership (ID) of each observation. The true group memberships are given for the first 160 observations; the remaining 40 observations have an ID of "NA". Your task is to build a model to classify those 40 observations.

a. Provide a one-page write-up (including graphs) explaining what methods you used for exploratory analysis and for modeling the groups, and how you predicted the identity of the remaining 40 observations.

b. Provide an estimate of your error rate. Out of the 40 observations how many do you think you identified correctly?

c. As a secondary component provide annotated code that replicates your analysis.
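One common way to approach part (b) is k-fold cross-validation on the 160 labeled observations, then scaling the estimated error rate to the 40 unlabeled ones. The sketch below is only an illustration of that arithmetic: it is written in Python rather than the R the assignment expects, it uses synthetic data in place of "y.csv", and the nearest-centroid classifier and all function names are assumptions chosen for brevity, not the assignment's prescribed method.

```python
import random
from statistics import mean


def nearest_centroid_fit(X, y):
    """Return {label: centroid} computed from the training rows."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()}


def nearest_centroid_predict(centroids, row):
    """Predict the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cv_error_rate(X, y, k=10, seed=0):
    """Estimate the misclassification rate with k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    errors = 0
    for f in range(k):
        fold = idx[f::k]                      # every k-th shuffled index
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        centroids = nearest_centroid_fit([X[i] for i in train],
                                         [y[i] for i in train])
        errors += sum(nearest_centroid_predict(centroids, X[i]) != y[i]
                      for i in fold)
    return errors / len(X)


def expected_correct(error_rate, n_new=40):
    """Expected number of correct predictions among n_new new observations."""
    return n_new * (1 - error_rate)


# Synthetic stand-in for the 160 labeled rows: 4 groups, 5 measured variables.
rng = random.Random(1)
X, y = [], []
for g in range(4):
    for _ in range(40):
        X.append([rng.gauss(2.0 * g, 1.0) for _ in range(5)])
        y.append(g)

err = cv_error_rate(X, y, k=10)
print(f"CV error rate: {err:.3f}")
print(f"Expected correct out of 40: {expected_correct(err):.1f}")
```

The cross-validated error rate is an estimate, so the expected count is best reported as an approximate figure rather than an exact one.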

Q3: (Vole Data) - Consider the microtus dataset in the Flury library in R.

Background from Airoldi et al. 1995:

Discrimination Between Two Species of Microtus using both Classified and Unclassified Observations.

Introduction: Microtus subterraneus and M. multiplex are now considered to be two distinct species (Niethammer, 1982; Krapp, 1982), contrary to the older view of Ellerman & Morrison-Scott (1951). The two species differ in the number of chromosomes: 2n=52 or 54 for M. subterraneus, and 2n=46 or 48 for M. multiplex. Hybrids from the laboratory have reduced fertility (Meylan, 1972), and hybrids from the field, whose karyotypes would be clearly recognizable, have never been found (Krapp, 1982). The geographic ranges of distribution of M. subterraneus and M. multiplex overlap to some extent in the Alps of southern Switzerland and northern Italy (Niethammer, 1982; Krapp, 1982). M. subterraneus is smaller than M. multiplex in most measurements, and occurs at elevations from 1000 m to over 2000 m, except in the western part of its range (for example, Belgium and Brittany), where it is found in lower elevations. M. multiplex is found at similar elevations, but also at altitudes from 200-300 m south of the Alps (Ticino, Toscana). The two chromosomal types of M. subterraneus can be crossed in the laboratory (Meylan, 1970, 1972), but no hybrids have so far been found in the field. In M. multiplex, the two chromosomal types show a distinct distribution range, but they are morphologically indistinguishable, and a hybrid has been found in the field (Storch & Winking, 1977). No reliable criteria based on cranial morphology have been found to distinguish the two species. Saint Girons (1971) pointed out a difference in the sutures of the posterior parts of the premaxillary and nasal bones compared to the frontal one, but this criterion does not work well in many cases. For both paleontological and biogeographical research it would be useful to have a good rule for discriminating between the two species, because much of the data available are in form of skull remains, either fossilized or from owl pellets.

The present study was initiated by a data collection consisting of eight morphometric variables measured by one of the authors (Salvioni) using a Nikon measure-scope (accuracy 1/1000 mm) and dial calipers (accuracy 1/100 mm).

The sample consists of 288 specimens collected mostly in Central Europe (Alps and Jura mountains) and in Toscana. One peculiar aspect of this data set is that the chromosomes of 89 specimens were analyzed to identify the species. Only the morphometric characteristics are available for the remaining 199 specimens.

Develop a generalized linear model from the 89 specimens that you can use to predict the group membership of the remaining 199 specimens.

a. Explain your GLM and assess the quality of the fit on the classified observations. Use cross-validation to estimate the accuracy of your model.

b. Provide a one-page write-up (including graphs) explaining your analysis of the dataset and your recommendations on the usefulness of your predictions.

c. Provide predictions for the unclassified observations.

d. As a secondary component provide annotated code that replicates your analysis.
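For a two-species outcome, the natural GLM is a binomial model with a logit link (logistic regression), and its accuracy can be estimated by k-fold cross-validation on the 89 classified specimens. The following is a minimal sketch under stated assumptions: it is in Python rather than R, the data are synthetic (two hypothetical standardized measurements per specimen, not the microtus variables), and the model is fitted by plain batch gradient ascent instead of the iteratively reweighted least squares that R's `glm` uses. All function names are illustrative.

```python
import math
import random


def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(beta, row):
    """P(species == 1) for one specimen; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))


def fit_logistic(X, y, lr=0.1, n_iter=500):
    """Fit a logit-link binomial model by batch gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, target in zip(X, y):
            resid = target - predict_proba(beta, row)
            grad[0] += resid
            for j, v in enumerate(row):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def cv_accuracy(X, y, k=10, seed=0):
    """Estimate classification accuracy with k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        beta = fit_logistic([X[i] for i in train], [y[i] for i in train])
        # Classify as species 1 when the fitted probability is at least 0.5.
        correct += sum((predict_proba(beta, X[i]) >= 0.5) == (y[i] == 1)
                       for i in test)
    return correct / len(X)


# Synthetic stand-in for the 89 classified specimens (group sizes are made up).
rng = random.Random(2)
X, y = [], []
for label, centre, size in [(0, -1.0, 45), (1, 1.0, 44)]:
    for _ in range(size):
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)

acc = cv_accuracy(X, y, k=10)
print(f"Cross-validated accuracy: {acc:.3f}")

# Refit on all classified specimens before scoring unclassified ones.
beta_full = fit_logistic(X, y)
print(f"P(species 1) for a new specimen at (0.5, 0.2): "
      f"{predict_proba(beta_full, [0.5, 0.2]):.3f}")
```

Fitting on all 89 specimens after cross-validation, as in the last lines, is the usual way to obtain the model that produces the predictions asked for in part (c).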

Only Questions 2 and 3 need to be done.


Reviews

len1314287

12/16/2016 1:01:35 AM

In this assignment only Questions 2 and 3 need to be done; both code and a report are required. All work is to be performed individually without any human assistance or consultation. You may use textbooks, class notes, and software as necessary. You may ask me questions or request clarifications. For each question, please provide a write-up that includes an overall summary (one paragraph), an introduction, data analysis (exploratory data analysis and modeling), results, and a conclusion. Make sure to explain your models, assumptions, analysis of the data, recommendations, and challenges. Also provide the R Markdown file that generates the Word document (optionally, R code with proper annotation as a separate .R file).

