Construct the appropriate dummy variable

Assignment Help Applied Statistics
Reference no: EM131376203

Question 1:

The production manager of American Tool and Castings Company is studying the relationship between the number of alloy caps milled on a lathe and the distance from specification of the outside cap diameter. The lathe uses a sharp steel cutting tool in a milling process to cut and shape raw alloy bars into caps. The tool turns at high speed while cutting into the alloy, in essence cutting the alloy down to size and shaping it into a round cap. A similar lathe tool cuts into the inside of the cap. The caps are later fitted with interior gaskets and permanently sealed onto airtight canisters.

After the steel cutting tool is used repeatedly, it begins to wear and cuts a larger outside cap diameter than desired. If the outside cap diameter is too large, the cap can't be properly affixed and sealed to the canister. The production manager would like to build a model to predict how many caps a tool can mill before it wears down so much that it mills caps that are too large in diameter and unusable. Each cap costs approximately $400 to mill, so defective caps are expensive. The main variable of interest (y) is the "distance from specification" of the outside cap diameter.

To conduct the study, 62 lathe tools were randomly sampled. Each lathe operator keeps a record of the number of caps milled by a particular tool. Each cap milled is measured to see how close its outside diameter is to specification. According to specification, each cap should be 6 inches in diameter. For example, the measure in record one is 0.36, meaning the cap was 0.36 inches larger than specification. When each tool was sampled, the number of caps milled by the tool was recorded, along with the distance from specification of the diameter of the last cap milled by that tool. The number of caps milled by each sampled tool and the distance from specification of the last cap it milled are in the spreadsheet labeled American.

1. Scatter plot

Construct a scatter plot revealing the relationship between the number of caps milled by a tool and the distance from specification of the outside cap diameter.  Make sure the x variable is on the x-axis and the y-variable is on the y-axis.  Move the chart so that it starts in cell E3.  Do not resize the chart beyond the red shaded region.

2. Correlation

Using a built-in Excel function in cell F22, calculate the correlation (r) between the number of caps milled by a tool and the distance from specification of the outside cap diameter.

In cell F23, indicate the strength of the linear relationship as very strong, relatively strong, very weak, relatively weak, or no relationship.

In cell F24, indicate if the relationship is positive or negative.
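As a cross-check on the spreadsheet result, the correlation in step 2 can be sketched in Python. This is only an illustration: the function name `pearson_r` and the five data points are made up here and are not the assignment data. The formula is the standard sample Pearson correlation, the quantity Excel's CORREL function computes.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only, NOT the American worksheet data.
caps_milled = [5, 10, 15, 20, 25]
distance = [0.10, 0.18, 0.33, 0.38, 0.52]

r = pearson_r(caps_milled, distance)
# The sign of r gives the direction asked for in cell F24.
direction = "positive" if r > 0 else "negative"
print(round(r, 4), direction)
```

The strength label for cell F23 depends on the cutoffs used in your course, so the sketch deliberately does not assign one.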

3. Anchoring the output in cell P3, generate the regression output.  Make sure you select an appropriate "Residual Plot," and place the residual plot in the designated area near cell E32.

4. Output

In cells J23 and J24, enter the value of the intercept and slope (respectively) by referencing the appropriate cells in the regression output.

In cell K24, enter the value of the t test statistic for testing the slope significance by referencing the appropriate cell from the regression output.

In cell L24, enter the p-value regarding the slope significance by referencing the appropriate cell from the regression output.

In cell M24, indicate with the word "Yes" or "No" if the slope coefficient is significant.  Assume α = 0.01.

5. In cell F29, provide the predictive power (a.k.a. the coefficient of determination) of the model by referencing the appropriate cell from the regression output.

6. In cell J29, write the prediction equation relating NM to DS using the intercept and slope values.  This is a text input that starts with a number, so you must start the input with a space to trick Excel into interpreting the input as text.  For example, if a = 4 and b = 10, enter 4 + 10(NM), placing a space before the value 4.
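The quantities requested in steps 4 through 6 (intercept, slope, t statistic for the slope, coefficient of determination, and the prediction equation) can be sketched with the textbook least-squares formulas. The function name `simple_ols` and the four data points below are assumptions for illustration and are not taken from the assignment. The p-value is omitted because it needs a t-distribution routine that the Python standard library does not provide.

```python
import math

def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x with basic summary values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    # Standard error of the slope uses n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "t_slope": slope / se_slope,
        "r_squared": 1 - sse / sst,
    }

# Hypothetical values for illustration only.
fit = simple_ols([1, 2, 3, 4], [3, 5, 7, 10])
# Text form of the prediction equation, in the style asked for in cell J29.
equation = f"{fit['intercept']:.4g} + {fit['slope']:.4g}(NM)"
print(fit, equation)
```

In the worksheet itself these values should still come from cell references into the regression output, as the steps require; the sketch is only a way to sanity-check them.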

7. Residual Plot

Cell E32 should contain the residual plot.  Keep the plot within the red shaded area.

In cell F48, comment on the assumption of linearity as interpreted using this residual plot.

In cell F49, comment on the assumption of constant variance as interpreted using this residual plot.

8. Prediction and Residual

In cell F53, predict the distance from specification of a cap milled by a tool when the cap is the 20th cap to be milled.

In cells F54 and F55, calculate the lower and upper values for the range of definition for this data set.

9. Prediction Interval

Using the table in cells J52:K53 as the Prediction Data Set and StatTools, calculate the lower limit and upper limit for a 95% prediction interval for the DS of a cap that is the 20th cap milled.  Anchor your StatTools Regression output in cell A1 of the Regression Worksheet.  Place the values in cells J58 and K58 by referencing the appropriate cells in the StatTools output.  Note that this will shift the columns of your worksheet.
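The point prediction and prediction interval in steps 8 and 9 can be cross-checked with the standard simple-regression formula: predicted y plus or minus t times s times sqrt(1 + 1/n + (x0 - mean of x)^2 / Sxx), where s is the residual standard error with n - 2 degrees of freedom. The sketch below is not the StatTools procedure the assignment asks for. The function name, the toy data, and the choice to pass the critical t value as an argument are assumptions. The critical value can be looked up in a t table or with Excel's T.INV.2T; for 62 tools (60 degrees of freedom) the two-sided 95% value is about 2.000. The sketch also reads "range of definition" as the smallest and largest observed x values, which is an interpretation rather than something the assignment states.

```python
import math

def prediction_interval(x, y, x0, t_crit):
    """Return (lower, predicted, upper) for a new observation at x0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    predicted = intercept + slope * x0
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    return predicted - half_width, predicted, predicted + half_width

# Hypothetical values for illustration only; t_crit = 4.303 is the
# two-sided 95% critical value for 2 degrees of freedom (n = 4).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]
lower, predicted, upper = prediction_interval(xs, ys, x0=2.5, t_crit=4.303)

# Range of observed x values, read here as the "range of definition".
x_low, x_high = min(xs), max(xs)
print(lower, predicted, upper, x_low, x_high)
```

A prediction at an x0 outside `x_low` to `x_high` would be an extrapolation, which is why the range is worth computing alongside the interval.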

Question 2:

A mental health agency measured the self-esteem score for randomly selected individuals with disabilities who were involved in some work activity within the past year.  The spreadsheet named Self Esteem provides the data, including each individual's self-esteem measure (y), years of education (YrsEdu), age, months worked in the last year (MonWork), marital status dummy variables (MS2, MS3, MS4) indicating if the individual is single, married, separated, or divorced, and a support level (SL) dummy variable indicating if the level of job support (counseling, etc.) was provided directly (1) or indirectly (0).  Regarding marital status, if single all MS indicators are 0, while MS2 = 1 indicates married, MS3 = 1 indicates separated, and MS4 = 1 indicates divorced.
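The marital status coding described above uses three dummy variables for four categories, with single as the baseline. A minimal Python sketch of that scheme follows; the function name `marital_dummies` and the exact spelling of the category labels are assumptions, since the worksheet itself is not reproduced here.

```python
def marital_dummies(status):
    """Map a marital status label to the MS2/MS3/MS4 indicators.

    Single is the baseline category, so it yields all zeros.
    """
    label = status.strip().lower()
    return {
        "MS2": int(label == "married"),
        "MS3": int(label == "separated"),
        "MS4": int(label == "divorced"),
    }

for status in ["Single", "Married", "Separated", "Divorced"]:
    print(status, marital_dummies(status))
```

Using one fewer dummy than the number of categories avoids a perfect linear dependence among the indicators when the model includes an intercept.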

a. In cell N4, use Excel's "Correlation" Data Analysis tool to construct a correlation matrix for all the variables.  Note that the categories in columns I and J should not be included since the data are already represented as dummy variables in columns E through H.

b. Considering the correlation between self-esteem and each x variable, identify the three variables that, based on correlation with y alone, should be considered the best candidates for inclusion in the model.  Shade the appropriate cells containing the correlation values in yellow.  Ignore any multicollinearity concerns for this part.

c. Considering the correlation between each pair of x variables, identify the variables that would possibly cause multicollinearity problems if included in the model.  Shade the appropriate cells containing the correlation values in green.

d. Based on your conclusions in parts b and c, shade in red the names of any variables that should not be included in the initial model because of possible multicollinearity problems.

e. With cell N19 as the upper-left corner of the output, fit the full regression model.  (Do not include a residual plot.)

f. Considering the regression output from part e, shade (in yellow) the name of any x variable that appears significant and should remain in the model.  Also shade the t stat and p-value.  Consider the p-value small if it is less than 0.05.

g. Partial Regression Model: With cell N51 as the upper-left corner of the output, fit the model including only the x variable(s) that were found to be significant in part f.  (Do not include a residual plot.)
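The correlation screening in Question 2 (ranking predictors by their correlation with y, then checking predictor pairs for possible multicollinearity) can be sketched as follows. The function names, the toy columns, and the 0.8 cutoff are all assumptions for illustration; the assignment does not state a numeric cutoff, so use whatever rule your course specifies.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_candidates(columns, response):
    """Predictor names sorted by |r| with the response, strongest first."""
    scores = {
        name: pearson_r(values, columns[response])
        for name, values in columns.items()
        if name != response
    }
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)

def collinear_pairs(columns, response, threshold):
    """Predictor pairs whose |r| meets or exceeds the (assumed) threshold."""
    predictors = [name for name in columns if name != response]
    return [
        (a, b)
        for a, b in combinations(predictors, 2)
        if abs(pearson_r(columns[a], columns[b])) >= threshold
    ]

# Hypothetical columns for illustration only, NOT the Self Esteem data.
toy = {
    "y": [1, 2, 3, 4, 5],
    "x1": [2, 4, 6, 8, 11],
    "x2": [4, 8, 12, 16, 21],
    "x3": [5, 1, 4, 2, 3],
}
ranking = rank_candidates(toy, "y")
flagged = collinear_pairs(toy, "y", threshold=0.8)
print(ranking, flagged)
```

In the toy data, `x1` and `x2` are nearly proportional, so they are flagged as a pair; in practice only one of such a pair would usually go into the initial model.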

Question 3:

A bank must prepare for a discrimination suit filed on behalf of female employees who claim that females are paid less than male employees.  The bank manager sampled employee files to see if he could build a useful model for predicting salary as a function of gender and other characteristics.  For each employee, the data include salary (y, in thousands of dollars), years of experience (YrsExp), years of prior experience (YrsPrior), and Gender.  The data are in the spreadsheet named Bank.

1. Since Gender is a categorical variable, construct the appropriate dummy variable in column E to indicate gender as female = 1 and male = 0.  You must use an "IF" statement in the appropriate cell(s) to indicate the correct dummy value based on gender.

2. With cell H7 as the upper-left corner of the output, fit the full model.  (Do not include a residual plot.)

3. Based on the regression output from part 2, shade (in yellow) the name of any x variable that appears significant and should remain in the model.  Also shade the t stat and p-value.
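The dummy coding in step 1 (female = 1, male = 0) is a simple conditional. In the worksheet it would be an IF formula along the lines of =IF(D2="Female",1,0), where the cell reference D2 and the label spelling are assumptions about the Bank sheet layout. The same logic in Python, with a hypothetical function name and made-up labels:

```python
def female_dummy(gender):
    """Return 1 for female and 0 otherwise, mirroring a spreadsheet IF."""
    return 1 if gender.strip().lower() == "female" else 0

# Hypothetical labels for illustration only.
genders = ["Female", "Male", "female ", "Male"]
dummy = [female_dummy(g) for g in genders]
print(dummy)
```

With this coding, the Gender coefficient in the fitted model estimates the salary difference for females relative to males, holding the other predictors fixed.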

Attachment:- Assignment.rar





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd