What is the cutoff for high leverage in the given scenario

Reference no: EM131155919

STATISTICS Homework

1. Load the dataset "PatientSatisfaction.txt" into R. The goal of this analysis is to determine the best subset of predictor variables for predicting patient satisfaction.

(a) Indicate which subset of predictor variables is optimal according to each of the following criteria: AICp, Mallows' Cp, BICp, and PRESSp (i.e. use best subsets variable selection).

(b) Do the four criteria listed above identify the same optimal model? Will this always be the case?

(c) Would forward stepwise regression have any advantages as a screening procedure over best subsets selection?
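
As a rough illustration of how the four criteria in (a) can be computed, the sketch below enumerates every subset of predictors in base R. The data are simulated and the names x1, x2, x3, and y are hypothetical; they are not the variables in PatientSatisfaction.txt. The formulas assume the common textbook definitions AICp = n ln(SSEp) - n ln(n) + 2p, BICp = n ln(SSEp) - n ln(n) + p ln(n), Cp = SSEp / MSE(full) - (n - 2p), and PRESSp as the sum of squared deleted residuals. The course may use slightly different conventions.

```r
# Hypothetical data: three candidate predictors, only x1 and x2 affect y.
set.seed(1)
n <- 40
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

predictors <- c("x1", "x2", "x3")

# MSE of the full model, needed for Mallows' Cp.
full <- lm(y ~ ., data = dat)
mse_full <- sum(resid(full)^2) / df.residual(full)

# Enumerate every non-empty subset of the predictors.
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit each subset model and compute the four criteria.
criteria <- do.call(rbind, lapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = "y"), data = dat)
  sse <- sum(resid(fit)^2)
  p <- length(coef(fit))  # number of coefficients, including the intercept
  data.frame(
    model = paste(vars, collapse = " + "),
    p     = p,
    AIC   = n * log(sse) - n * log(n) + 2 * p,
    BIC   = n * log(sse) - n * log(n) + log(n) * p,
    Cp    = sse / mse_full - (n - 2 * p),
    PRESS = sum((resid(fit) / (1 - hatvalues(fit)))^2)
  )
}))

criteria

# Models minimizing AIC, BIC, and PRESS (Cp is usually judged as small and close to p).
criteria$model[which.min(criteria$AIC)]
criteria$model[which.min(criteria$BIC)]
criteria$model[which.min(criteria$PRESS)]
```

Exhaustive enumeration is only practical for a handful of predictors. For larger problems a dedicated best-subsets routine would be the more usual choice.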

2. Load the dataset "Bears.csv". Data from n = 19 female wild bears of varying ages are used to estimate the relationship between Y = weight and X = neck circumference.

(a) One of the observations takes on the value (x, y) = (10.5, 140). Identify this observation in the dataset. Visually, does this observation appear to be an outlier with respect to any of the following: X, Y, or the general linear relationship between Y and X (i.e. Y | X)? Justify with the appropriate plots.

(b) Compute the leverage for the observation (x, y) = (10.5, 140). What is the cutoff for high leverage in this scenario? Using the rule of thumb for leverages presented in class, state whether this point has high leverage.

(c) What is the consequence of including a point with high leverage?

(d) Using the lm() output, calculate each of the following for the observation (x, y) = (10.5, 140). For some, you will also need the leverage that was calculated above.

i. Studentized residual
ii. Studentized deleted residual
iii. Standardized DFFITS value.

(e) Using the quantities calculated in the previous part, should (x, y) = (10.5, 140) be flagged as an outlier?

(f) For the observation (x, y) = (10.5, 140), calculate the following and justify whether this point has strong influence on the model fit.

i. DFBETA
ii. Cook's Distance
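
The quantities in parts (b), (d), and (f) can all be derived from the lm() fit and the leverage. The sketch below uses simulated data in place of Bears.csv (the variable names neck and weight and all numeric values other than the point (10.5, 140) are hypothetical). It assumes the 2p/n rule of thumb for high leverage, which may differ from the rule presented in class, and it follows the convention in which the "studentized residual" is internally studentized (rstandard() in R) and the "studentized deleted residual" is externally studentized (rstudent() in R).

```r
# Hypothetical data standing in for the bear measurements (not the real file).
set.seed(2)
n <- 19
neck <- runif(n, 15, 30)
weight <- -150 + 15 * neck + rnorm(n, sd = 20)
neck[1] <- 10.5     # place the point of interest at index 1
weight[1] <- 140

fit <- lm(weight ~ neck)
p <- length(coef(fit))  # 2: intercept and slope
i <- 1                  # index of the observation (10.5, 140)

# Leverage and an assumed rule-of-thumb cutoff of 2p/n.
h <- hatvalues(fit)
cutoff <- 2 * p / n

# Hand calculations from the residual, MSE, and leverage.
e <- resid(fit)
mse <- sum(e^2) / (n - p)
r_i <- e[i] / sqrt(mse * (1 - h[i]))               # studentized residual
t_i <- r_i * sqrt((n - p - 1) / (n - p - r_i^2))   # studentized deleted residual
dffits_i <- t_i * sqrt(h[i] / (1 - h[i]))          # DFFITS
cook_i <- (r_i^2 / p) * h[i] / (1 - h[i])          # Cook's distance

round(c(leverage = unname(h[i]), cutoff = cutoff,
        studentized = unname(r_i), deleted = unname(t_i),
        dffits = unname(dffits_i), cook = unname(cook_i)), 3)

# Built-in equivalents, useful for checking the hand calculations.
rstandard(fit)[i]
rstudent(fit)[i]
dffits(fit)[i]
cooks.distance(fit)[i]
dfbetas(fit)[i, ]   # standardized change in each coefficient when case i is dropped
```

Comparing the hand-computed values with the built-in functions is a quick way to confirm that the formulas were applied correctly before interpreting them against whichever cutoffs the course uses.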

3. Data were collected from n = 51 "states" (including the District of Columbia) on the salaries of public school teachers.

(a) Regress Y = average teacher annual salary on X1 = spending per pupil in dollars, X2 = a dummy indicator (1/0) for region 2, and X3 = a dummy indicator (1/0) for region 3. Plot the standardized residuals versus fitted values.

(b) Plot a histogram of the studentized deleted residuals. Are there any outliers in these data? If so, list their index numbers.

(c) Create a plot of leverages from this model. Are there any outliers with respect to the covariates?

(d) Create plots of Cook's Distances and DFFITS to determine whether any observations have strong influence on the model fit.
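
A minimal sketch of the workflow in parts (a) through (d) follows, again on simulated data rather than the actual salary file. The column names salary, spend, reg2, and reg3 are hypothetical, the 2p/n leverage line is an assumed rule of thumb, and the Bonferroni cutoff at the end is one common way to flag outlying studentized deleted residuals, not necessarily the one used in the course.

```r
# Hypothetical data: 51 units in three regions (not the real salary data).
set.seed(3)
n <- 51
region <- factor(sample(1:3, n, replace = TRUE))
spend <- runif(n, 3000, 8000)
dat <- data.frame(
  salary = 12000 + 3.3 * spend + rnorm(n, sd = 2000),
  spend  = spend,
  reg2   = as.integer(region == 2),  # X2: 1 if region 2, else 0
  reg3   = as.integer(region == 3)   # X3: 1 if region 3, else 0
)

fit <- lm(salary ~ spend + reg2 + reg3, data = dat)
p <- length(coef(fit))  # 4: intercept plus three predictors

op <- par(mfrow = c(2, 3))

# (a) Standardized residuals versus fitted values.
plot(fitted(fit), rstandard(fit),
     xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0, lty = 2)

# (b) Histogram of studentized deleted residuals.
hist(rstudent(fit), main = "", xlab = "Studentized deleted residuals")

# (c) Leverages by index, with the assumed 2p/n reference line.
plot(hatvalues(fit), type = "h", xlab = "Index", ylab = "Leverage")
abline(h = 2 * p / n, lty = 2)

# (d) Cook's distances and DFFITS by index.
plot(cooks.distance(fit), type = "h", xlab = "Index", ylab = "Cook's distance")
plot(dffits(fit), type = "h", xlab = "Index", ylab = "DFFITS")

par(op)

# Indices whose studentized deleted residual exceeds a Bonferroni cutoff (alpha = 0.05).
which(abs(rstudent(fit)) > qt(1 - 0.05 / (2 * n), n - p - 1))
```

Plotting against the observation index makes it straightforward to read off the index numbers that parts (b) through (d) ask for.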

Attachment:- HW_Data.zip
