Calculating the least squares regression equation

Reference no: EM132216058

STATISTICAL ANALYSIS PROJECT -

This project leads you through a statistical analysis of fuel price data from an Australian state. This data was obtained from PetrolSpy Australia for a given day from a randomly selected sample of petrol stations in an Australian state with the price per litre of Unleaded 91 and Diesel recorded.

Part A covers parts of Topic 1, Part B parts of Topics 5 and 6 and Part C parts of Topics 7 to 9.

Project Situation -

Oz-Fuel-Watch regularly analyses fuel prices in various Australian states.

As a research assistant for Oz-Fuel-Watch, you are analysing the data for the day and state specified by your sample. For example, if your student ID number ends in 8 your sample is Sample 8. That is, you will be analysing fuel prices for New South Wales on 4 September 2018, using the sample data in columns AO to AR.

In each part of the project you are required to analyse your sample data in response to the given questions and provide a written answer. You can assume that the written answers are components of a longer report on fuel prices.

Project Preparation -

You are expected to use Excel when completing the project.

Your written answers presenting your findings and conclusions should be considered as a part of a larger report on Australian fuel prices. Each written answer should be a Word document into which your Excel output has been copied.

In addition, your statistical workings for Parts B and C should appear as appendices to your written answers. These should include all necessary steps and appropriate Excel output.

Each part of the project should be submitted as a SINGLE Word document, with appropriate Excel output added.

Notes

  • You should not need to read beyond the study guide and textbook to complete the project.

Data Analysis Project - Part A

Purpose: To

  • introduce you to the project data, situation and Excel
  • use Excel to graph data and calculate summary statistics
  • interpret and communicate Excel results.

Part A Question - Oz-Fuel-Watch has asked you to provide information on the price of either Unleaded 91 or Diesel on the day and in the state specified by your sample. In particular, it requires the minimum, maximum and average price, together with an estimated price range for your given fuel.

Note:

  • If your family name begins with A to M analyse Unleaded 91 prices.
  • If your family name begins with N to Z analyse Diesel prices.
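The project itself must be completed in Excel, but the Part A summary statistics can be illustrated with a minimal Python sketch. The prices below are hypothetical, the function name `summarise_prices` is invented for this example, and reading the "estimated price range" as the mean plus or minus two standard deviations is an assumption, not something the brief specifies.

```python
from statistics import mean, stdev


def summarise_prices(prices):
    """Return the summary statistics asked for in Part A."""
    avg = mean(prices)
    sd = stdev(prices)  # sample standard deviation (n - 1 divisor)
    return {
        "min": min(prices),
        "max": max(prices),
        "mean": avg,
        "sd": sd,
        # Assumption: "price range" read as mean +/- 2 standard deviations
        "range_low": avg - 2 * sd,
        "range_high": avg + 2 * sd,
    }


# Hypothetical prices in dollars per litre, not taken from the project data
sample = [1.45, 1.50, 1.55, 1.60, 1.40]
summary = summarise_prices(sample)
print(summary)
```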

Data Analysis Project - Part B

Purpose: To

  • obtain feedback on your submission in Part A and to gain experience in self-evaluation of submitted work
  • apply your knowledge of statistical inference to answer questions about fuel prices by analysing the data and communicating the results.

Tasks -

Task 1 Part A Self-Marking -

When directed to do so during Week 5, complete the following tasks:

1) Open your saved copy of your submission for Part A.

2) Replace the Part A coversheets (three pages) with the Part B coversheets (first four pages).

3) Rename and save this file as "Family Name_First Name_Part_B_Campus".

4) Use the solution template and marking guide provided to mark your submission for Part A. Enter recommended marks on the self-marking sheet for Part A, page 3 of the file in 3) above.

5) Write a short (approximately 200 words) reflection/feedback on your submission and marking of Part A. In particular:

  • consider the good aspects of your submission, what did you do well
  • identify where you made mistakes, and how you would avoid them in the future
  • consider what you learnt from submitting and self-marking Part A.

This is to be entered in the space at the bottom of the self-marking sheet for Part A.

6) Save the file. It is to be submitted with Part B, due Sunday 13 January 2019.

Task 2 Part B Appendix - Statistical Inference Tasks

The following statistical tasks should appear as appendices to your written answer. This should include all necessary steps and appropriate Excel output.

These appendices should come after your written answer within your single Word document for Part B.

Statistical Inference

Choose a level of confidence for the confidence interval in Question 1 and a level of significance for the hypothesis test in Question 2. Enter these values on page 2 of the Part B coversheets along with the sample number and fuel from Part A.

Question 1 and 2 Situation

Previous research undertaken by Oz-Fuel-Watch shows that motorists consider a fuel to be expensive if its price is $1.50 per litre or more. That is, at least $1.50.

Question 1 - Topic 5

Oz-Fuel-Watch has asked you whether motorists would consider the average price of your fuel expensive on the day and in the state specified by your sample.

To enable you to answer this question use Unleaded 91 (third column of your data) or Diesel (fourth column of your data) and an appropriate statistical inference technique to:

Estimate the population mean price of your fuel, Unleaded 91 or Diesel, on the day and in the state specified by your sample.
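One technique that may suit Question 1 is a confidence interval for the mean based on the t distribution. The sketch below only illustrates the arithmetic and is no substitute for the required Excel output. The sample is hypothetical, `mean_confidence_interval` is a name made up for this example, and the critical value is passed in by hand (in Excel it could be obtained with `T.INV.2T`).

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(prices, t_crit):
    """Confidence interval for the population mean: x_bar +/- t * s / sqrt(n)."""
    n = len(prices)
    centre = mean(prices)
    std_error = stdev(prices) / sqrt(n)  # estimated standard error of the mean
    margin = t_crit * std_error
    return centre - margin, centre + margin


# Hypothetical prices in dollars per litre, not taken from the project data
sample = [1.45, 1.50, 1.55, 1.60, 1.40]

# 2.776 is the two-tailed 95% t critical value for n - 1 = 4 degrees of freedom
lower, upper = mean_confidence_interval(sample, t_crit=2.776)
print(f"95% CI for the mean price: ${lower:.3f} to ${upper:.3f}")
```

An interval lying entirely below $1.50 would suggest the average price is not expensive by the stated definition, while an interval that includes $1.50 leaves the question open at the chosen confidence level.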

Question 2 - Topic 6

Past research shows that even when the average price is less than $1.50 per litre, motorists perceive fuel prices to be expensive when the price of a fuel is at least $1.50 at more than 25% of petrol stations in a state.

Oz-Fuel-Watch wishes to know if, using this criterion, the price of your fuel, Unleaded 91 or Diesel, was expensive on the day, and in the state, specified by your sample.

To enable you to provide this information use Unleaded 91 (third column of your data) or Diesel (fourth column of your data) and an appropriate statistical inference technique to answer the following question:

On the specified day was the price of your fuel at least $1.50 per litre at more than 25% of petrol stations in the state specified by your sample?
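Question 2 can be read as a one-tailed test of a proportion, with H0: p <= 0.25 against H1: p > 0.25, where p is the proportion of stations charging at least $1.50. The following is a minimal sketch of that test using the normal approximation. The data are hypothetical, the function name is invented, and the choice of technique remains yours to justify.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(prices, threshold=1.50, p0=0.25):
    """Upper-tail z test of H0: p <= p0 against H1: p > p0."""
    n = len(prices)
    expensive = sum(1 for price in prices if price >= threshold)
    p_hat = expensive / n
    std_error = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / std_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return p_hat, z, p_value


# Hypothetical sample: 14 stations at $1.52 and 26 stations at $1.45
sample = [1.52] * 14 + [1.45] * 26
p_hat, z, p_value = proportion_z_test(sample)
print(f"p_hat = {p_hat:.2f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

The null hypothesis is rejected when the p-value falls below the level of significance entered on the coversheet. The normal approximation is usually considered reasonable only when both n × p0 and n × (1 − p0) are at least 5.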

Task 3 - Part B Written Task - Components of a report

For each question, present the results of your calculations, with your interpretation and conclusion as components of a longer report on fuel prices.

Use the instructions given on page five of the Part B coversheets.

This should be one to three pages and 200 to 400 words.

It should be submitted as a Word file with Excel output included.

Make sure you:

  • Introduce each question and put it in context.
  • Answer the question in non-statistical language.
  • Present the results of your intervals or tests without unnecessary statistical jargon.
  • Include conclusions which answer the given questions.

Data Analysis Project - Part C

Purpose: To answer questions about fuel prices by applying your knowledge of statistical inference, and regression and correlation, and to communicate the results.

Part C Preparation

While the submission date for Part C is Sunday 3 February 2019, you should be working on Part C during Weeks 9 to 11.

It is recommended that you follow this timetable:

  • Question 1 covering Topic 7 should be attempted in Week 9
  • Question 2 covering Topic 8 should be attempted in Week 10
  • Question 3 covering Topic 9 should be attempted in Week 11

Task 1 Part C - Appendix Statistical Inference and Regression and Correlation Tasks

The following statistical tasks should appear as appendices to your written answer. This should include all necessary steps and appropriate Excel output.

These appendices should come after your written answer within your single Word document for Part C.

Question 1 Statistical Inference Topic 7

Capital city fuel prices are often less than elsewhere in the state.

Oz-Fuel-Watch wishes to know if on the day and in the state specified by your sample the mean price of your fuel, Unleaded 91 or Diesel, was less in the capital city than elsewhere in the state.

To enable you to provide this information use Location (second column of your data) and either Unleaded 91 (third column of your data) or Diesel (fourth column of your data) with an appropriate statistical inference technique to answer the following question:

On the specified day was the mean price of your fuel less in the capital city than elsewhere in the state specified by your sample?
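A two-sample t test of H0: mean(capital) >= mean(elsewhere) against H1: mean(capital) < mean(elsewhere) is one way to approach this question. The sketch below computes the unequal-variance (Welch) test statistic and its degrees of freedom. Whether to assume equal or unequal variances is your decision, and Excel's Data Analysis tools offer both versions. The prices and the function name `welch_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(capital, elsewhere):
    """Welch t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(capital), len(elsewhere)
    v1 = variance(capital) / n1    # squared standard error, group 1
    v2 = variance(elsewhere) / n2  # squared standard error, group 2
    t_stat = (mean(capital) - mean(elsewhere)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical prices in dollars per litre, split by the Location column
capital = [1.40, 1.42, 1.44, 1.46, 1.48]
elsewhere = [1.50, 1.53, 1.56, 1.59, 1.62]

t_stat, df = welch_t(capital, elsewhere)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

A large negative t statistic supports the claim that the capital city mean is lower. The statistic would then be compared with a lower-tail t critical value, or converted to a one-tailed p-value, at your chosen level of significance.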

Questions 2 and 3 Simple and Multiple Linear Regression

Oz-Fuel-Watch is interested in exploring the relationship between Unleaded 91 and Diesel prices.

You are asked to construct a model of this relationship. To do this, first develop a simple linear regression model between Unleaded 91 and Diesel prices. Then develop a multiple linear regression model with location as a second independent variable. Finally choose and interpret the linear model that best fits your data.

Question 2 Simple Linear Regression Model Topic 8

To explore the relationship between Unleaded 91 and Diesel prices, use your fuel (Unleaded 91/Diesel) as the independent variable and the remaining fuel (Diesel/Unleaded 91) as the dependent variable.

Using this data develop and then explore a simple linear relationship between the two variables by:

  • Plotting the data with a scatter plot.
  • Calculating the least squares regression equation, correlation coefficient and coefficient of determination.
  • Interpreting the gradient and vertical intercept of the simple linear regression equation.
  • Interpreting the correlation coefficient and coefficient of determination. Are these values consistent with your scatter plot?
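The least squares calculations listed above can be sketched as follows. This is an illustration of the formulas only, since marks depend on using Excel. The prices are hypothetical and `simple_linear_regression` is a name chosen for this example.

```python
from math import sqrt
from statistics import mean


def simple_linear_regression(x, y):
    """Return slope, intercept, r and r squared for the least squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # gradient b1
    intercept = y_bar - slope * x_bar  # vertical intercept b0
    r = sxy / sqrt(sxx * syy)          # correlation coefficient
    return slope, intercept, r, r ** 2


# Hypothetical prices: Unleaded 91 as independent, Diesel as dependent
unleaded = [1.40, 1.45, 1.50, 1.55, 1.60]
diesel = [1.50, 1.52, 1.58, 1.60, 1.65]

slope, intercept, r, r_squared = simple_linear_regression(unleaded, diesel)
print(f"diesel = {intercept:.2f} + {slope:.2f} * unleaded")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```

With these made-up numbers the gradient of 0.76 would be read as: for each extra dollar per litre in the Unleaded 91 price, the predicted Diesel price rises by about 76 cents per litre. The intercept is the predicted Diesel price when Unleaded 91 costs $0, which lies far outside the observed prices and so has little practical meaning.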

Question 3 Multiple Linear Regression Model Topic 9

To explore if location influences the relationship between Unleaded 91 and Diesel prices add Location (second column of your data) as a second independent variable to your simple linear regression model developed in Question 2.

Using this data develop and then explore the relationship between the three variables by:

  • Calculating the multiple regression equation, multiple correlation coefficient, and coefficient of multiple determination.
  • Interpreting the values of the multiple regression coefficients.
  • Interpreting the values of the multiple correlation coefficient and coefficient of multiple determination. Compare these values with the corresponding values for the simple linear regression model.
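As an illustration of the multiple regression step, the sketch below fits a model with two independent variables by solving the normal equations directly. It assumes Location has been recoded as a 0/1 dummy variable (1 = capital city, 0 = elsewhere), which is one common coding but not one the brief prescribes. The data are hypothetical and were constructed to follow diesel = 0.40 + 0.80 × unleaded − 0.05 × capital exactly, so the output is easy to check and R squared equals 1. Real data will not fit perfectly.

```python
from statistics import mean


def two_predictor_regression(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]  # centred first predictor
    d2 = [v - m2 for v in x2]  # centred second predictor
    dy = [v - my for v in y]   # centred response
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    syy = sum(c * c for c in dy)
    det = s11 * s22 - s12 ** 2  # zero if the predictors are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    r_squared = (b1 * s1y + b2 * s2y) / syy  # coefficient of multiple determination
    return b0, b1, b2, r_squared


# Hypothetical data: prices in dollars per litre, capital coded 1 or 0
unleaded = [1.40, 1.45, 1.50, 1.50, 1.55, 1.60]
capital = [1, 1, 1, 0, 0, 0]
diesel = [1.47, 1.51, 1.55, 1.60, 1.64, 1.68]

b0, b1, b2, r_squared = two_predictor_regression(unleaded, capital, diesel)
print(f"diesel = {b0:.2f} + {b1:.2f} * unleaded + {b2:.2f} * capital")
print(f"R squared = {r_squared:.4f}")
```

Under this coding, the coefficient on the dummy variable is the estimated difference in Diesel price between capital city and other stations at the same Unleaded 91 price. The multiple correlation coefficient is the square root of R squared.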

Then determine the best model to predict the price of your dependent fuel by:

  • Using appropriate tests to determine which independent variables make a significant contribution to the regression model.
  • Using the results of the above tests to state the simple or multiple regression equation which best fits the data.

Task 2 - Written Answer - Components of a report

For Question 1, and for Questions 2 and 3 combined, present the results of your calculations, with your interpretation and conclusions, as components of a longer report on fuel prices.

Use the instructions given on page four of the Part C coversheets.

This should be 300 to 700 words and three to six pages.

It should be submitted as a Word file with Excel output embedded.

Make sure you:

  • Introduce each question and put it in context.
  • Answer the questions in non-statistical language.
  • Present the result of your calculations and tests without unnecessary statistical jargon.
  • Include conclusions which answer the given questions.

In particular, for Question 2

  • Include your scatter plot and discuss any apparent relationship between Unleaded 91 and Diesel prices. Comment on the strength, shape and sign of the relationship.
  • Mention or explain your choice of independent and dependent variables

In particular, for Questions 2 and 3

  • Include and justify the best model.
  • Discuss and interpret the values of the regression and correlation coefficients of the best model.

Note - Please do part B & C.

Reviews

len2216058

1/15/2019 11:50:28 PM

Please quote for part B & C. Referencing - You are not required to reference. However, as the format of your written answer is a component of a longer report it may be appropriate to reference. In this case, use any consistent referencing style. Furthermore, you are not required to use real references. That is, any reference can be fictitious/fake. For example in my sample solutions, I use the following: Oz-Fuel-Watch (2018). Fuel Price Survey. Oz-Fuel-Watch Reports.

len2216058

1/15/2019 11:50:21 PM

Project Submission - Each part of the project should be a SINGLE Word file with Excel output included. The given cover sheets should be the first pages of your submitted project and are not part of the page limit. DO NOT submit your appendices, which are not part of the page or word limit, for either Part B or C as separate files. Ensure that the page setup of your submitted document is A4 Portrait, with an appropriate format so that it is easily readable if printed. Use line spacing of at least 1.5. Please name your file "Family Name_First Name_Part_A/B/C_Campus", for example: Jayne_Nicola_Part_A_Lismore.

len2216058

1/15/2019 11:50:14 PM

Penalties For Incorrect Sample - If you use a sample that does not correspond to the last digit of your student ID number, to be entered on the cover sheet, a maximum of two marks may be deducted, as this causes the marker extra work and frustration. Incorrect Format - If the page setup of your submitted Word file is not as required (that is, A4 Portrait, with appropriate format so that it is easily readable if printed), with at least 1.5 line spacing or your project is not submitted as a single Word document a maximum of two marks may be deducted, as this causes the marker extra work and frustration. In addition, if your file is not named as requested or the required cover sheets are not included or correctly completed a maximum of two marks may also be deducted, as this can cause the marker extra work and frustration.

len2216058

1/15/2019 11:50:07 PM

Part B Submission - You should submit one Word document consisting of: the Part B coversheets (first four pages), including the completed self-marking sheet for Part A with reflection; a copy of your Part A submission; your written answers for Part B as components of a report, following the format given on page 5 of the Part B coversheets; and the appendices for Part B, which contain full statistical working for the required statistical tasks.

len2216058

1/15/2019 11:49:57 PM

Notes: You may need to transform or manipulate your sample data, before using Excel for the required statistical calculations. Use Excel for statistical calculations. You do not need to repeat any Excel calculations by hand. However, make sure that you define your random variables and include any steps not given by Excel. For example, in a hypothesis test include the null and alternative hypotheses, along with the decision to reject or not reject the null hypothesis. Mention any assumptions you need to make. Comment on why the test or confidence interval has been chosen. Make sure you interpret confidence intervals and write a conclusion to hypothesis tests.

len2216058

1/15/2019 11:49:49 PM

Marking Criteria - Part B - Part A Self-Marking - Full marks will be given for an "acceptable self-marking and reflection". This means that the majority of errors (in particular major or obvious errors) are recognised and considered in the marking and reflection. Zero or partial marks will be given if there is no or minimal reflection, there is no self-marking, or major errors are not recognised. Statistical Calculation - For the intervals and tests marks will be given for: Choice of appropriate statistical technique/s. Random variables defined. Correct hypotheses for a test. Correct Excel output. Correct interpretation of results.

len2216058

1/15/2019 11:49:42 PM

Written Task - Components of a longer report - 200 to 400 words and one to three pages - marks will be deducted if this is greatly exceeded. To obtain full marks your answer must: Be well structured and analysed. Answer the questions and clearly communicate the results of the Excel output in language appropriate for your audience. Include an introduction to and conclusion for each question. Include appropriate Excel output. For each question the following rubric will be used. For each major spelling and/or grammatical error half a mark will be deducted, up to a maximum of two marks. Also up to two marks may be deducted for poor structure and presentation.

len2216058

1/15/2019 11:49:34 PM

Notes: You may need to transform or manipulate the given data, before using Excel for the corresponding statistical calculations. Use Excel for the statistical calculations. You do not need to repeat any Excel calculations by hand. However, make sure that you define your random variables and include any steps not given by Excel. For example, in a hypothesis test include the null and alternative hypotheses, along with the decision to reject or not reject the null hypothesis. Mention any assumptions you need to make. In Question 2 fit a linear model even if from your scatter plot you decide that a non-linear relationship better fits the data or that no apparent relationship exists. However, mention this in your written answer and/or corresponding appendix. Comment on why a test has been chosen. Make sure you write conclusions to hypothesis tests.

len2216058

1/15/2019 11:49:25 PM

Marking Criteria - Part C - Read these marking criteria carefully and consider them when preparing Part C. See the marking and feedback sheet, page 3 of the Part C coversheets, for allocation of marks. Statistical Calculations - For the statistical inference calculations (Questions 1 and 3) marks will be given for: Choice of appropriate statistical technique/s. Random variable/s defined. Correct hypotheses for tests. Correct Excel output. Correct interpretation of results. To obtain full marks your scatter plot (Question 2) must be correct, including correct labels on both axes and a title. Marks will be deducted if: the graph is incorrect, Excel is not used, the axes are incorrectly labelled or not labelled, the independent and dependent variables are incorrect, there is no title, or the scale on the axes distorts the graph.

len2216058

1/15/2019 11:49:17 PM

For the regression and correlation coefficients (Questions 2 and 3) use either: The Regression command in Data Analysis and copy the resultant tables. Or the simple/multiple regression command in PhStat and copy the resultant tables. Or the Simple Linear and Multiple Regression workbooks and copy the resultant tables. Or for simple linear regression (Question 2) insert a trendline on a scatter plot, with both the equation and the R² value showing; you will then need to manually calculate the value of r. Note: Check the sign of r. For the regression and correlation coefficients (Questions 2 and 3) marks will be deducted if Excel is not used and also for incorrect equations or coefficients, so check: Your independent and dependent variables. Your sample size.
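The manual step mentioned for the trendline option can be sketched as follows: r is the square root of the R-squared value shown on the chart, carrying the same sign as the gradient of the trendline. The function name and the input values here are hypothetical.

```python
from math import copysign, sqrt


def r_from_trendline(r_squared, slope):
    """Recover r from R squared, taking its sign from the trendline gradient."""
    return copysign(sqrt(r_squared), slope)


# Hypothetical trendline output: R squared of 0.81
print(r_from_trendline(0.81, slope=0.76))   # positive gradient, positive r
print(r_from_trendline(0.81, slope=-0.76))  # negative gradient, negative r
```

Taking the square root alone always gives a positive number, which is why the sign needs to be checked against the gradient.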

len2216058

1/15/2019 11:49:08 PM

Written Answer - Components of a Report - 300 to 700 words and three to six pages - marks will be deducted if this is greatly exceeded. To obtain full marks your answer must: Be well structured and analysed. Clearly communicate the results of the Excel output in language appropriate for your audience. Include an introduction to each question and your conclusions. Include appropriate Excel output. Answer the questions in non-statistical language. Marks will be deducted if: There is little or no comment on, or interpretation of, the Excel output. Unnecessary statistical jargon and equations appear. It is confusing or not readable. For each major spelling and/or grammatical error half a mark will be deducted, up to a maximum of two marks. Also up to two marks may be deducted for poor structure and presentation.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd