Write the least squares equation

Reference no: EM131931057

Regression Analysis - Estimating Relationships Assignment

I. Oil Problem - A small data sample was collected for two variables. Y represents the price of all gas types per gallon (Bureau of Labor Statistics), and X represents the average price of crude oil per barrel (Department of Energy), over a period of ten years.

Year    y       x
2000    1.56    27.39
2001    1.53    23.00
2002    1.44    22.81
2003    1.64    27.69
2004    1.92    37.66
2005    2.34    50.04
2006    2.63    58.30
2007    2.85    64.20
2008    3.32    91.48
2009    2.40    53.48

1. Plot the Scattergram.

2. Fit the model that best fits the data using the least squares method, and obtain the summary output.

3. Write the Least Squares equation.

4. Predict the price of gas at the pump in a year in which the price of crude oil is $100 per bbl.

5. Superimpose the least squares line on the scatter diagram above.

6. Interpret the intercept and slope in the context of the problem.

7. What is the value of SSE and what does it mean?

8. What is the value of the variance of regression, S^2, and what does it mean in the context of the problem? What other notation (abbreviation) is used to represent the variance of regression?

9. What is the value of S, the standard error of estimate, and explain its meaning.

10. What is the value of the coefficient of correlation, and what does it mean in the context of the problem?

11. Compute the coefficient of determination and interpret its meaning in the context of the problem.

Note that the above are not steps required in every regression analysis. These are just some questions posed in this problem.
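The quantities asked for in Problem I can be sketched in a few lines of Python. This is a minimal illustration using the textbook formulas for simple linear regression, not the Excel workflow the assignment expects; the function name `simple_ols` and the dictionary keys are made up for this example.

```python
import math

# Problem I data: x = crude oil price ($/barrel), y = gas price ($/gallon).
x = [27.39, 23.00, 22.81, 27.69, 37.66, 50.04, 58.30, 64.20, 91.48, 53.48]
y = [1.56, 1.53, 1.44, 1.64, 1.92, 2.34, 2.63, 2.85, 3.32, 2.40]


def simple_ols(x, y):
    """Fit y = b0 + b1*x by least squares and return summary quantities."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx                # slope
    b0 = y_bar - b1 * x_bar           # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                # variance of regression (MSE)
    return {
        "b0": b0, "b1": b1, "SSE": sse, "s2": s2, "s": math.sqrt(s2),
        "r": ss_xy / math.sqrt(ss_xx * ss_yy),   # coefficient of correlation
        "r2": 1 - sse / ss_yy,                   # coefficient of determination
    }


fit = simple_ols(x, y)
pred_100 = fit["b0"] + fit["b1"] * 100   # predicted gas price at $100 per bbl
for name, value in fit.items():
    print(f"{name:>4} = {value:.6f}")
print(f"predicted y at x = 100: {pred_100:.3f}")
```

The slope is read as the estimated change in gas price (dollars per gallon) for a one-dollar increase in the crude oil price. Note that x = 100 lies above the largest observed crude price (91.48), so the prediction is an extrapolation.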

II. A random sample of ten used cars (Corvettes) between 1 and 6 years old was selected from a used car dealership. The following data were obtained, where x represents age, in years, and y represents sales price, in hundreds of dollars.

x      6     6     6     4     2     5     4     5     1     2
y    125   115   130   160   219   150   190   163   260   260

1. Graph the data in a scatterplot to determine whether there is a possible linear relationship between the two variables. Draw the proposed model in the scatter diagram.

2. Fit a linear model to the data.

3. Write the least squares equation.

4. Interpret the regression coefficients in the context of the problem.

5. What is the value of the correlation coefficient? What does it mean in terms of the strength and nature of the relationship between the two variables?

6. Compute the coefficient of determination and interpret its meaning.

7. Based on the coefficient of determination and the standard error of estimate, how good is the model?

III. To predict the peak power load needed, ABC Power Authority has selected a sample of 6 summer days. The data are listed below:

Temperature (degrees F)    Peak Load (megawatts)
67                         97.0
108                        190.1
86                         105
100                        159.1
90                         132.1
76                         101.0

1. Construct a scatterplot for the data and graph a second-order polynomial.

2. Fit a second-order model to the data.

3. Write the least squares equation.

4. What is the SSE and what does it mean?

5. What is S^2, and what does it mean?

6. What is the standard error of estimate, and what does it mean?

7. What is the coefficient of determination, and what does it mean in the context of this problem?

8. Predict the peak power load needed on a day when the temperature is 105 degrees F.
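A second-order fit for Problem III can be sketched by solving the normal equations directly. The code below is an illustration, not the Excel procedure the assignment asks for: the helper names `solve` and `fit_quadratic` are invented here, and the temperature is centered at its mean before fitting (a numerical-stability choice of this sketch) and then converted back to coefficients in the original units.

```python
temp = [67, 108, 86, 100, 90, 76]
load = [97.0, 190.1, 105.0, 159.1, 132.1, 101.0]


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(x, y):
    """Return (b0, b1, b2) for y = b0 + b1*x + b2*x^2 in original units."""
    mu = sum(x) / len(x)
    rows = [[1.0, xi - mu, (xi - mu) ** 2] for xi in x]   # centered design
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    c0, c1, c2 = solve(xtx, xty)
    # Expand c0 + c1*(x - mu) + c2*(x - mu)^2 back into powers of x.
    return c0 - c1 * mu + c2 * mu * mu, c1 - 2 * c2 * mu, c2


b0, b1, b2 = fit_quadratic(temp, load)


def predict(t):
    return b0 + b1 * t + b2 * t * t


n = len(temp)
y_bar = sum(load) / n
sse = sum((yi - predict(xi)) ** 2 for xi, yi in zip(temp, load))
s2 = sse / (n - 3)            # three estimated parameters
s = s2 ** 0.5
r2 = 1 - sse / sum((yi - y_bar) ** 2 for yi in load)
pred_105 = predict(105)

print(f"y-hat = {b0:.3f} + ({b1:.4f}) x + ({b2:.5f}) x^2")
print(f"SSE = {sse:.3f}, s^2 = {s2:.3f}, s = {s:.3f}, R^2 = {r2:.4f}")
print(f"predicted load at 105 F: {pred_105:.1f}")
```

With only six observations and three parameters, the error degrees of freedom are 6 - 3 = 3, which is why `s2` divides SSE by `n - 3`.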

IV. The quality of a product depends on temperature and pressure (in PSI). Use the 27 observations in the table to answer the following questions:

Quality    Temp    PSI        Quality    Temp    PSI
50.80      80      50         97.40      90      55
50.70      80      50         70.90      90      60
49.40      80      50         68.80      90      60
93.70      80      55         71.30      90      60
90.90      80      55         46.60      100     50
90.90      80      55         49.10      100     50
74.50      80      60         46.60      100     50
73.00      80      60         69.80      100     55
71.20      80      60         72.50      100     55
63.40      90      50         73.20      100     55
61.60      90      50         38.70      100     60
63.40      90      50         42.50      100     60
93.80      90      55         41.40      100     60
92.10      90      55





1. Fit a first-order model to the data (make sure you include both independent variables).

2. Report the equation of the model.

3. Interpret the estimated regression coefficients.

4. Report the coefficient of determination and the standard error of estimate for the first-order model. Based on these, how good is the model?

5. Fit an interaction model to the data.

6. Report the coefficient of determination and the standard error of estimate for the interaction model. Based on these, how good is the model?

7. Fit a complete second-order model to the data.

8. Report the coefficient of determination and the standard error of estimate for the complete second-order model. Based on these, how good is the model?

9. Which of the three models do you prefer? Why? Explain.
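The three models in Problem IV can be compared side by side with a small least squares routine. This is a sketch under stated assumptions: `solve`, `fit`, and the `models` dictionary are names invented for the example, and both predictors are centered at their means for numerical stability. Centering leaves R^2, the standard error, and the first-order slopes unchanged, but the intercept and the coefficients of the higher-order models refer to the centered variables.

```python
import math

# (quality, temperature, pressure) for the 27 observations in Problem IV.
data = [
    (50.8, 80, 50), (50.7, 80, 50), (49.4, 80, 50),
    (93.7, 80, 55), (90.9, 80, 55), (90.9, 80, 55),
    (74.5, 80, 60), (73.0, 80, 60), (71.2, 80, 60),
    (63.4, 90, 50), (61.6, 90, 50), (63.4, 90, 50),
    (93.8, 90, 55), (92.1, 90, 55), (97.4, 90, 55),
    (70.9, 90, 60), (68.8, 90, 60), (71.3, 90, 60),
    (46.6, 100, 50), (49.1, 100, 50), (46.6, 100, 50),
    (69.8, 100, 55), (72.5, 100, 55), (73.2, 100, 55),
    (38.7, 100, 60), (42.5, 100, 60), (41.4, 100, 60),
]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(rows, y):
    """Least squares fit; returns (coefficients, R^2, standard error s)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)
    y_bar = sum(y) / len(y)
    sse = sum((yi - sum(bj * rj for bj, rj in zip(b, r))) ** 2
              for r, yi in zip(rows, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b, 1 - sse / sst, math.sqrt(sse / (len(y) - k))


y = [q for q, _, _ in data]
t_mean = sum(t for _, t, _ in data) / len(data)
p_mean = sum(p for _, _, p in data) / len(data)

# Design-matrix rows as functions of centered temperature t and pressure p.
models = {
    "first-order": lambda t, p: [1.0, t, p],
    "interaction": lambda t, p: [1.0, t, p, t * p],
    "second-order": lambda t, p: [1.0, t, p, t * p, t * t, p * p],
}

results = {}
for name, row in models.items():
    rows = [row(t - t_mean, p - p_mean) for _, t, p in data]
    results[name] = fit(rows, y)
    _, r2, s = results[name]
    print(f"{name:>12}: R^2 = {r2:.4f}, s = {s:.3f}")
```

Because the three models are nested, R^2 can only stay the same or rise as terms are added, so the standard error of estimate (which penalizes the extra parameters through its degrees of freedom) is the more informative of the two when choosing among them.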

V. Part of an Excel output relating X (independent variable) and Y (dependent variable) is shown below. Fill in all the blanks marked with "?".

Note: Please have your formula sheet handy and watch the "Regression Output Analysis" Excel Video demo file, under the Excel link, before attempting this and next problems.

Hint: MSE = (Standard error of estimate)^2

Summary Output

Regression Statistics
Multiple R           ?
R Square             0.980237
Adjusted R Square    ?
Standard Error       0.096067
Observations         10

ANOVA
              df    SS         MS    F    Significance F
Regression    ?     ?          ?     ?    0.0000
Residual      ?     ?          ?
Total         ?     3.73581

              Coefficients    Standard Error    t Stat      P-value
Intercept     ?               0.072432          11.77961    0.0000
x             ?               0.001442          19.91967    0.00000
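The blanks in Problem V follow from standard identities for a simple regression output (one independent variable). The sketch below works through them in Python; the variable names are illustrative, and Multiple R is taken as the positive square root, which is how Excel reports it.

```python
import math

# Values given in the Problem V output.
n, k = 10, 1                      # observations, independent variables
r2 = 0.980237                     # R Square
s = 0.096067                      # Standard Error (of estimate)
sst = 3.73581                     # Total SS
se_b0, t_b0 = 0.072432, 11.77961  # intercept: standard error, t Stat
se_b1, t_b1 = 0.001442, 19.91967  # slope: standard error, t Stat

# Degrees of freedom.
df_reg, df_res, df_tot = k, n - k - 1, n - 1

# Regression statistics.
multiple_r = math.sqrt(r2)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

# ANOVA table, starting from the hint MSE = s^2.
mse = s ** 2
sse = mse * df_res
ssr = sst - sse
msr = ssr / df_reg
f_stat = msr / mse

# Coefficient = t Stat * Standard Error.
b0 = t_b0 * se_b0
b1 = t_b1 * se_b1

print(f"Multiple R = {multiple_r:.6f}, Adjusted R^2 = {adj_r2:.6f}")
print(f"df: regression {df_reg}, residual {df_res}, total {df_tot}")
print(f"SSR = {ssr:.5f}, SSE = {sse:.5f}, MSR = {msr:.5f}, MSE = {mse:.6f}")
print(f"F = {f_stat:.3f}, intercept = {b0:.5f}, slope = {b1:.6f}")
```

Two quick consistency checks are available: SSR / SST should reproduce R Square, and in simple regression F should equal the square of the slope's t statistic, up to rounding in the printed output.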


VI. Part of an Excel Summary Output relating X (independent variable) and Y (dependent variable) is shown below. Fill in all the blanks marked with "?".

Summary Output

Regression Statistics
Multiple R           0.1347
R Square             ?
Adjusted R Square    ?
Standard Error       3.3838
Observations         ?

ANOVA
              df    SS        MS       F    Significance F
Regression    ?     2.7500    ?        ?    0.632246859
Residual      ?     ?         11.45
Total         14    ?

              Coefficients    Standard Error    t Stat    P-value
Intercept     8.6             2.2197            ?         0.0019
x             0.25            0.5101            ?         0.6322
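Problem VI gives a different set of known values, so the blanks are filled in the opposite direction: the sample size comes from the total degrees of freedom, SSE from MSE, and the t statistics from the coefficients. A minimal sketch, with illustrative variable names and assuming a single independent variable as the output shows:

```python
# Values given in the Problem VI output.
multiple_r = 0.1347
df_tot = 14
ssr = 2.75                    # Regression SS
mse = 11.45                   # Residual MS
b0, se_b0 = 8.6, 2.2197       # intercept and its standard error
b1, se_b1 = 0.25, 0.5101      # slope and its standard error
k = 1                         # one independent variable

n = df_tot + 1                # Observations
df_reg, df_res = k, df_tot - k

r2 = multiple_r ** 2
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

sse = mse * df_res            # Residual SS
sst = ssr + sse               # Total SS
msr = ssr / df_reg
f_stat = msr / mse

t_b0 = b0 / se_b0             # t Stat = coefficient / standard error
t_b1 = b1 / se_b1

print(f"n = {n}, df regression = {df_reg}, df residual = {df_res}")
print(f"R^2 = {r2:.6f}, Adjusted R^2 = {adj_r2:.6f}")
print(f"SSE = {sse:.2f}, SST = {sst:.2f}, MSR = {msr:.4f}, F = {f_stat:.4f}")
print(f"t(intercept) = {t_b0:.4f}, t(x) = {t_b1:.4f}")
```

The adjusted R Square comes out negative here, which is possible when R Square is very small relative to the number of observations; it signals that x explains essentially none of the variation in y.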


VII. A chain of clothing stores wants to develop a model that can predict sales based on the store's location. A sample of past December sales in the four stores is given below.

Store    2011    2010    2009    2008    2007
1        31      41      39      36      32
2        24      31      34      28      23
3        54      60      57      52      62
4        34      42      40      46      47

1. How many independent variables are needed in this problem?

2. Identify the coding scheme (introduce dummy variables and define them).

3. Propose a model that can show this relationship.

4. Fit the model to the data of the problem.

5. Interpret all coefficients in the problem.

6. Graph the model.
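For a model whose only predictor is a qualitative variable, the least squares fit reduces to group means, which makes the dummy coding in Problem VII easy to illustrate. The sketch below assumes a coding with three dummy variables and store 4 as the base level; that choice of base, and the variable names, are assumptions of this example rather than something the problem specifies.

```python
from statistics import mean

# December sales by store, 2011 through 2007, from the Problem VII table.
sales = {
    1: [31, 41, 39, 36, 32],
    2: [24, 31, 34, 28, 23],
    3: [54, 60, 57, 52, 62],
    4: [34, 42, 40, 46, 47],
}

base = 4  # assumed base level: x1 = x2 = x3 = 0 identifies store 4

# Dummy variables: x_i = 1 if the observation is from store i, else 0.
# Model: E(y) = b0 + b1*x1 + b2*x2 + b3*x3.
means = {store: mean(values) for store, values in sales.items()}

# Least squares estimates for a dummy-only model:
# b0 is the base-level mean; b_i is (mean of store i) - (base-level mean).
b0 = means[base]
coef = {store: means[store] - b0 for store in sales if store != base}

print(f"b0 (mean sales, store {base}) = {b0:.1f}")
for store, b in coef.items():
    print(f"b{store} (store {store} minus store {base}) = {b:+.1f}")
```

A variable with four levels needs three dummy variables; choosing a different base store changes the individual coefficients but not the fitted mean for any store.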

VIII. A fast food restaurant chain is interested in modeling the mean weekly sales of a restaurant, E(y), as a function of the weekly traffic flow on the street where the restaurant is located and the city in which the restaurant is located. The table contains data that were collected on 24 restaurants in four cities. The model that has been proposed is

[Figure 433_figure.png: equation of the proposed model]

CITY    TRAFFIC FLOW          WEEKLY SALES        CITY    TRAFFIC FLOW          WEEKLY SALES
        (thousands of cars)   y ($ thousands)             (thousands of cars)   y ($ thousands)
1       59.3                  6.3                 3       75.8                  8.2
1       60.3                  6.6                 3       48.3                  5.0
1       82.1                  7.6                 3       41.4                  3.9
1       32.3                  3.0                 3       52.5                  5.4
1       98.0                  9.5                 3       41.0                  4.1
1       54.1                  5.9                 3       29.6                  3.1
1       54.4                  6.1                 3       49.5                  5.4
1       51.3                  5.0                 4       73.1                  8.4
1       36.7                  3.6                 4       81.3                  9.5
2       23.6                  2.8                 4       72.4                  8.7
2       57.6                  6.7                 4       88.4                  10.6
2       44.6                  5.2                 4       23.2                  3.3

1. Write the equation of the model based on the above data. Interpret the coefficients of the model in the context of the problem.

2. If the traffic flow in front of all stores is the same, 80,000 cars, predict sales for city 1 and city 4 based on the fitted model. Which of the four cities has the lowest expected sales?

3. Use the prediction equation to graph (by hand or in Excel) the response lines that relate predicted weekly sales to traffic flow for each of the cities.

4. Write a model that includes interaction between city and traffic flow.

5. Fit the model of part 4 to the data.

6. Graph the interaction model.
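A main-effects fit for Problem VIII can be sketched as follows. The proposed model appears only as a figure, so the form used here is an assumption: E(y) = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x4, with x1, x2, x3 as dummy variables for cities 1 to 3 (city 4 as the base level) and x4 as traffic flow in thousands of cars. The helper names `solve`, `design_row`, and `predict` are invented for the example.

```python
# (city, traffic flow in thousands of cars, weekly sales in $ thousands)
data = [
    (1, 59.3, 6.3), (1, 60.3, 6.6), (1, 82.1, 7.6), (1, 32.3, 3.0),
    (1, 98.0, 9.5), (1, 54.1, 5.9), (1, 54.4, 6.1), (1, 51.3, 5.0),
    (1, 36.7, 3.6), (2, 23.6, 2.8), (2, 57.6, 6.7), (2, 44.6, 5.2),
    (3, 75.8, 8.2), (3, 48.3, 5.0), (3, 41.4, 3.9), (3, 52.5, 5.4),
    (3, 41.0, 4.1), (3, 29.6, 3.1), (3, 49.5, 5.4), (4, 73.1, 8.4),
    (4, 81.3, 9.5), (4, 72.4, 8.7), (4, 88.4, 10.6), (4, 23.2, 3.3),
]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(city, traffic):
    """[1, x1, x2, x3, x4] under the assumed coding (city 4 is the base)."""
    return [1.0, float(city == 1), float(city == 2), float(city == 3), traffic]


rows = [design_row(c, t) for c, t, _ in data]
y = [s for _, _, s in data]
k = len(rows[0])
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
b = solve(xtx, xty)


def predict(city, traffic):
    return sum(bj * xj for bj, xj in zip(b, design_row(city, traffic)))


print("coefficients b0..b4:", [round(v, 4) for v in b])
pred_city1 = predict(1, 80)   # 80,000 cars means x4 = 80
pred_city4 = predict(4, 80)
print(f"predicted weekly sales at 80 thousand cars: "
      f"city 1 = {pred_city1:.2f}, city 4 = {pred_city4:.2f}")
```

Under this coding each city coefficient is the estimated difference in mean weekly sales between that city and city 4 at the same traffic flow, so the response lines are parallel. An interaction model would add the products x1*x4, x2*x4, and x3*x4, allowing each city its own slope.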

Multiple Choice Questions:

1. The error term in simple regression represents:

a. the difference between the estimated regression line and the population line.

b. the vertical distance from any point to the mean value of Y's.

c. the vertical distance from any point to the population regression line.

d. the vertical distance from any point to the estimated regression line.

e. none of the above.

2. In a multiple regression problem involving two quantitative independent variables, if β1 is computed to be -2, it means that

a. the relationship between x1 and y is significant.

b. y decreases by 2 units for each increase of one unit of x1, holding x2 constant.

c. the value of y is -2 when x1 equals zero.

d. none of the above.

3. The coefficient of multiple determination

a. measures the variation around the predicted regression equation.

b. measures the proportion of the variation in y that is explained by all the independent variables in the model.

c. measures the proportion of the variation in y that is explained by x1 holding x2 constant.

d. will have the same sign as β1.

4. If we want to add the independent variable, gender, to our existing model, how many variables are needed to represent its two levels?

a. 1

b. 2

c. 4

d. need more information

e. none of the above

5. The graph of a model with one qualitative independent variable with three levels is

a. three lines

b. three nonlinear functions

c. two parallel lines

d. a bar chart

e. cannot be graphed

6. The least squares method guarantees that the

a. sum of absolute deviations between each observation and the model is least compared to any other model

b. sum of the deviations between each observation and the mean is zero

c. sum of squared deviations between each observation and the model is minimum compared to any other model

d. all of the above

e. none of the above


Reviews

len1931057

4/6/2018 4:57:48 AM

Please complete the assignment fully in Excel using formulas. Use one sheet for each of Problems 1-8; the multiple-choice answers can go on one sheet labeled MC. Note: Please keep a copy of the computer summary outputs for future reference. You will need them when solving the HW Set #3 problems.
