STATISTICAL ANALYSIS PROJECT
This project leads you through a statistical analysis of fuel price data from an Australian state. The data was obtained from PetrolSpy Australia for a given day from a randomly selected sample of petrol stations in an Australian state, with the price per litre of Unleaded 91 and Diesel recorded.
Part A covers parts of Topic 1, Part B covers parts of Topics 5 and 6, and Part C covers parts of Topics 7 to 9.
You will need to work on this project throughout Session 3.
Project Data
The data set provided contains 10 samples of fuel prices from 80 randomly selected petrol stations within an Australian state on a specified day.
To obtain your data
(1) Click on the Project Datafile. This will download an Excel file.
(2) Select the 4 columns (Town/Suburb to Diesel (Cents per Litre)) of data for the sample specified by the last digit of your student ID number.
(3) Copy this into a new Excel file.
There are 10 sample data sets, each of 4 columns (Town/Suburb to Diesel (Cents per Litre)).
Your sample number matches the last digit of your SCU student ID number. For example, if your student ID number ends in 1 your sample is Sample 1. For this sample you will be analysing fuel prices for New South Wales on 8 June 2018, using the sample data in columns F to I.
Project Situation
Oz-Fuel-Watch regularly analyses fuel prices in various Australian states.
As a research assistant for Oz-Fuel-Watch, you are analysing the data for the day and state specified by your sample. For example, if your student ID number ends in 8 your sample is Sample 8. That is, you will be analysing fuel prices for New South Wales on 4 September 2018, using the sample data in columns AO to AR.
In each part of the project you are required to analyse your sample data in response to the given questions and provide a written answer. You can assume that the written answers are components of a longer report on fuel prices.
Project Preparation
You are expected to use Excel when completing the project.
Your written answers presenting your findings and conclusions should be considered as a part of a larger report on Australian fuel prices. Each written answer should be a Word document into which your Excel output has been copied.
In addition, your statistical workings for Parts B and C should appear as appendices to your written answers. These should include all necessary steps and appropriate Excel output.
Each part of the project should be submitted as a SINGLE Word document, with appropriate Excel output added.
Notes
- You should not need to read beyond the study guide and textbook to complete the project.
Data Analysis Project - Part A
Purpose: To
- introduce you to the project data, situation and Excel
- use Excel to graph data and calculate summary statistics
- interpret and communicate Excel results.
Full marks for Part A will be given for successful submission of an acceptable attempt.
Part A Question
Oz-Fuel-Watch has asked you to provide information on the price of either Unleaded 91 or Diesel on the day and in the state specified by your sample. In particular, it requires the minimum, maximum and average price, together with an estimated price range for your given fuel.
Note:
- If your family name begins with A to M analyse Unleaded 91 prices.
- If your family name begins with N to Z analyse Diesel prices.
Complete the following tasks
1) Download and save your data.
2) Download the Project Part A cover sheets, name and save this file as
"Family Name_First Name_Part_A_Campus".
3) Enter your Sample Number and Fuel on page 2 of the Part A coversheets.
4) Statistical Tasks
Using Unleaded 91 (third column of your data) or Diesel (fourth column of your data), explore the price of your fuel by using Excel to:
- Construct a frequency histogram or polygon
- Calculate descriptive statistics
5) Written Task - Component of a Longer Report
Using the instructions given on page four of the Part A coversheets, introduce your data and the results of your investigation of prices of your specified fuel on the day and in the state specified by your sample.
This should be one to three pages and 300 to 500 words.
Use an appropriate style, without statistical jargon and equations, to clearly communicate your results.
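The project expects Excel for the statistical tasks in 4), but the same figures can be cross-checked elsewhere. The following is a minimal Python sketch of the descriptive statistics and frequency counts. It assumes the prices have been copied into a list. The prices, the bin start (135) and the bin width (5) are made-up values for illustration, not project data.

```python
import statistics

# Hypothetical prices in cents per litre; NOT taken from the project data file.
prices = [139.9, 142.5, 144.9, 145.9, 147.7, 149.9, 149.9, 151.9, 153.5, 157.9]

def describe(values):
    """Return basic summary statistics for a list of prices."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

def frequency_table(values, start, width, bins):
    """Count values in [start + i*width, start + (i+1)*width) for each bin i."""
    counts = [0] * bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < bins:
            counts[i] += 1
    return counts

summary = describe(prices)
counts = frequency_table(prices, start=135.0, width=5.0, bins=5)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# A rough text histogram: one '#' per petrol station in each price class.
for i, count in enumerate(counts):
    low = 135.0 + 5.0 * i
    print(f"{low:.0f} to under {low + 5.0:.0f}: {'#' * count}")
```

The class boundaries are a choice, so a histogram built in Excel with different classes will look different while the summary statistics stay the same.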
Written Task - Component of a Longer Report
- 300 to 500 words and one to three pages - marks will be deducted if this is greatly exceeded.
- To obtain full marks, your answer must:
- Be well structured.
- Clearly communicate the results of the Excel output in language appropriate for your audience.
- Include appropriate graph and descriptive statistics.
- Provide information on the average price of your specified fuel on the day and in the state specified by your sample, on how the price varies, and on whether there is any pattern.
- Provide an estimated range for the price of your specified fuel on the day and in the state specified by your sample.
- Marks will be deducted if:
- There is little or no comment on, or interpretation of, the Excel output.
- Unnecessary statistical jargon and equations appear.
- It is confusing or not readable.
- It is handwritten.
- For each major spelling and/or grammatical error half a mark will be deducted, up to a maximum of two marks.
- Also up to two marks may be deducted for poor structure and/or presentation.
Data Analysis Project - Part B
Purpose: To
- obtain feedback on your submission in Part A and to gain experience in self-evaluation of submitted work
- apply your knowledge of statistical inference to answer questions about fuel prices by analysing the data and communicating the results.
Part B Submission
You should submit one Word document consisting of:
- Part B coversheets - first four pages, including completed self-marking sheet for Part A with reflection.
- Copy of your Part A submission.
- Written answers for Part B as components of a report - this should follow the format given on page 5 of Part B coversheets.
- Appendices for Part B, which contain full statistical working for the required statistical tasks.
Part B Preparation
While the submission date for Part B is Sunday 13 January 2019, you should be working on Part B during Weeks 5 to 8.
It is recommended that you follow this timetable:
- Self-marking of, and reflection on, Part A should be completed in Week 5
- Question 1, covering Topic 5, should be completed in Week 6
- Question 2, covering Topic 6, should be completed in Week 8
Tasks
Task 1 Part A Self-Marking - 5 marks
When directed to do so during Week 5, complete the following tasks:
1) Open your saved copy of your submission for Part A.
2) Replace the Part A coversheets (three pages) with the Part B coversheets (first four pages).
3) Rename and save this file as
"Family Name_First Name_Part_B_Campus".
4) Use the solution template and marking guide provided to mark your submission for Part A. Enter recommended marks on the self-marking sheet for Part A, page 3 of the file in 3) above.
5) Write a short (approximately 200 words) reflection/feedback on your submission and marking of Part A. In particular:
- consider the good aspects of your submission and what you did well
- identify where you made mistakes, and how you would avoid them in the future
- consider what you learnt from submitting and self-marking Part A.
This is to be entered in the space at the bottom of the self-marking sheet for Part A.
6) Save file. To be submitted with Part B - due Sunday 13 January 2019.
Task 2 Part B Appendix - Statistical Inference Tasks
The following statistical tasks should appear as appendices to your written answer. This should include all necessary steps and appropriate Excel output.
These appendices should come after your written answer within your single Word document for Part B.
Statistical Inference
Choose a level of confidence for the confidence interval in Question 1 and a level of significance for the hypothesis test in Question 2. Enter these values on page 2 of the Part B coversheets along with the sample number and fuel from Part A.
Questions 1 and 2 Situation
Previous research undertaken by Oz-Fuel-Watch shows that motorists consider a fuel to be expensive if its price is $1.50 per litre or more. That is, at least $1.50.
Question 1 - Topic 5
Oz-Fuel-Watch has asked you whether motorists would consider the average price of your fuel expensive on the day and in the state specified by your sample.
To enable you to answer this question use Unleaded 91 (third column of your data) or Diesel (fourth column of your data) and an appropriate statistical inference technique to:
Estimate the population mean price of your fuel, Unleaded 91 or Diesel, on the day and in the state specified by your sample.
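One way to cross-check an interval estimate of the mean outside Excel is sketched below in Python. It rests on several assumptions: the prices are made-up values (ten values repeated to give n = 40), the confidence level is a parameter you choose, and the critical value comes from the normal distribution as a large-sample approximation to the t distribution. A t-based interval from Excel will be slightly wider.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical prices in cents per litre (ten made-up values repeated, n = 40).
prices = [139.9, 142.5, 144.9, 145.9, 147.7, 149.9, 149.9, 151.9, 153.5, 157.9] * 4

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Uses a normal critical value, which is only a large-sample
    approximation to the t critical value.
    """
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

lower, upper = mean_confidence_interval(prices, confidence=0.95)
print(f"Interval for the mean price: {lower:.1f} to {upper:.1f} cents per litre")
```

The resulting interval can then be compared with 150 cents per litre, the price at which motorists consider a fuel expensive.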
Question 2 - Topic 6
Past research shows that even when the average price is less than $1.50 per litre, motorists perceive fuel prices to be expensive when the price of a fuel is at least $1.50 at more than 25% of petrol stations in a state.
Oz-Fuel-Watch wishes to know if, using this criterion, the price of your fuel, Unleaded 91 or Diesel, was expensive on the day, and in the state, specified by your sample.
To enable you to provide this information, use Unleaded 91 (third column of your data) or Diesel (fourth column of your data) and an appropriate statistical inference technique to answer the following question:
On the specified day was the price of your fuel at least $1.50 per litre at more than 25% of petrol stations in the state specified by your sample?
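A question of this form can be framed as an upper-tailed test of a single proportion, with the null hypothesis that the proportion is 0.25 and the alternative that it is greater than 0.25. The Python sketch below is one possible cross-check under that assumption, not the required method. The prices are made-up values, and the 150 cent threshold is the $1.50 figure expressed in cents per litre.

```python
import math
from statistics import NormalDist

# Hypothetical prices in cents per litre (ten made-up values repeated, n = 40).
prices = [139.9, 142.5, 144.9, 145.9, 147.7, 149.9, 149.9, 151.9, 153.5, 157.9] * 4

def proportion_z_test(values, threshold=150.0, p0=0.25):
    """Upper-tailed z test of H0: p = p0 against H1: p > p0.

    p is the proportion of stations with a price of at least `threshold`.
    """
    n = len(values)
    successes = sum(1 for v in values if v >= threshold)
    p_hat = successes / n
    # The standard error uses p0 because the test assumes H0 is true.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value

p_hat, z, p_value = proportion_z_test(prices)
print(f"sample proportion = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.3f}")
```

The p-value is then compared with the level of significance entered on the coversheets. The normal approximation is usually considered reasonable when both n times p0 and n times (1 - p0) are at least 5.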
Task 3 - Part B Written Task - Components of a report
For each question, present the results of your calculations, with your interpretation and conclusion as components of a longer report on fuel prices.
Use the instructions given on page five of the Part B coversheets.
This should be one to three pages and 200 to 400 words.
It should be submitted as a Word file with Excel output included.
Make sure you:
- Introduce each question and put it in context.
- Answer the question in non-statistical language.
- Present the results of your intervals or tests without unnecessary statistical jargon.
- Include conclusions which answer the given questions.
Read these marking criteria carefully and consider them when preparing Part B.
See the marking and feedback sheet, page 4 of Part B coversheets, for allocation of marks.
Part A Self-Marking
Full marks will be given for an "acceptable self-marking and reflection". This means that the majority of errors (in particular major or obvious errors) are recognised and considered in your marking and reflection.
Zero or partial marks will be given if:
- no or minimal reflection
- no self-marking
- major errors are not recognised.
Statistical Calculation
- For the intervals and tests marks will be given for:
- Choice of appropriate statistical technique/s.
- Random variables defined.
- Correct hypotheses for a test.
- Correct Excel output.
- Correct interpretation of results.
Written Task - Components of a longer report
- 200 to 400 words and one to three pages - marks will be deducted if this is greatly exceeded.
- To obtain full marks, your answer must:
- Be well structured and analysed.
- Answer the questions and clearly communicate the results of the Excel output in language appropriate for your audience.
- Include an introduction to and conclusion for each question.
- Include appropriate Excel output
- For each question the following rubric will be used
Data Analysis Project - Part C
Purpose: To answer questions about fuel prices by applying your knowledge of statistical inference, and regression and correlation. To communicate the results.
Part C Preparation
While the submission date for Part C is Sunday 3 February 2019, you should be working on Part C during Weeks 9 to 11.
It is recommended that you follow this timetable:
- Question 1 covering Topic 7 should be attempted in Week 9
- Question 2 covering Topic 8 should be attempted in Week 10
- Question 3 covering Topic 9 should be attempted in Week 11
Task 1 Part C - Appendix Statistical Inference and Regression and Correlation Tasks (31 marks)
The following statistical tasks should appear as appendices to your written answer. This should include all necessary steps and appropriate Excel output.
These appendices should come after your written answer within your single Word document for Part C.
Question 1 Statistical Inference Topic 7
Capital city fuel prices are often less than elsewhere in the state.
Oz-Fuel-Watch wishes to know if on the day and in the state specified by your sample the mean price of your fuel, Unleaded 91 or Diesel, was less in the capital city than elsewhere in the state.
To enable you to provide this information, use Location (second column of your data) and either Unleaded 91 (third column of your data) or Diesel (fourth column of your data) with an appropriate statistical inference technique to answer the following question:
On the specified day was the mean price of your fuel less in the capital city than elsewhere in the state specified by your sample?
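A comparison of two group means like this one can be cross-checked with a two-sample t statistic. The Python sketch below uses the unequal-variances (Welch) form, which is an assumption: your study guide may instead call for the pooled, equal-variances form. The records and the location labels "Capital" and "Other" are made up for illustration and may not match the coding in the data file.

```python
import math
import statistics

# Hypothetical (location, price) records; labels and prices are made up.
records = [
    ("Capital", 139.9), ("Capital", 142.5), ("Capital", 144.9),
    ("Capital", 145.9), ("Capital", 147.7),
    ("Other", 149.9), ("Other", 149.9), ("Other", 151.9),
    ("Other", 153.5), ("Other", 157.9),
]

def welch_t(sample_a, sample_b):
    """Return the Welch t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each term is a sample variance divided by its sample size.
    var_a = statistics.variance(sample_a) / n_a
    var_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df

capital = [price for location, price in records if location == "Capital"]
other = [price for location, price in records if location == "Other"]

t, df = welch_t(capital, other)
print(f"t = {t:.3f} with about {df:.1f} degrees of freedom")
```

Because the alternative hypothesis is that the capital city mean is lower, the test is lower-tailed: a large negative t supports the claim. The p-value comes from the t distribution with the degrees of freedom shown, for example from Excel's T.DIST function or the Excel t-test output.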
Questions 2 and 3 Simple and Multiple Linear Regression
Oz-Fuel-Watch is interested in exploring the relationship between Unleaded 91 and Diesel prices.
You are asked to construct a model of this relationship. To do this, first develop a simple linear regression model between Unleaded 91 and Diesel prices. Then develop a multiple linear regression model with location as a second independent variable. Finally, choose and interpret the linear model that best fits your data.
Question 2 Simple Linear Regression Model Topic 8
To explore the relationship between Unleaded 91 and Diesel prices, use your fuel (Unleaded 91/Diesel) as the independent variable and the remaining fuel (Diesel/Unleaded 91) as the dependent variable.
Using this data develop and then explore a simple linear relationship between the two variables by:
- Plotting the data with a scatter plot.
- Calculating the least squares regression equation, correlation coefficient and coefficient of determination.
- Interpreting the gradient and vertical intercept of the simple linear regression equation.
- Interpreting the correlation coefficient and coefficient of determination. Are these values consistent with your scatter plot?
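The least squares calculations listed above can be cross-checked with a short Python sketch. The five price pairs are made up for illustration and are not project data. Here x plays the role of the independent fuel and y the dependent fuel.

```python
import math

# Hypothetical prices in cents per litre; NOT taken from the project data file.
x = [140.0, 145.0, 150.0, 155.0, 160.0]  # independent variable (your fuel)
y = [150.0, 153.0, 157.0, 158.0, 162.0]  # dependent variable (the other fuel)

def simple_linear_regression(xs, ys):
    """Return the slope, intercept, correlation r and r squared."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r ** 2

slope, intercept, r, r_squared = simple_linear_regression(x, y)
print(f"predicted y = {intercept:.2f} + {slope:.3f} x")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```

The slope is the estimated change in the dependent fuel's price for each one cent rise in the independent fuel's price. The intercept is the predicted price when the independent price is zero, which lies far outside the observed range and so has little practical meaning on its own.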
Question 3 Multiple Linear Regression Model Topic 9
To explore if location influences the relationship between Unleaded 91 and Diesel prices, add Location (second column of your data) as a second independent variable to your simple linear regression model developed in Question 2.
Using this data develop and then explore the relationship between the three variables by:
- Calculating the multiple regression equation, multiple correlation coefficient, and coefficient of multiple determination.
- Interpreting the values of the multiple regression coefficients.
- Interpreting the values of the multiple correlation coefficient and coefficient of multiple determination. Compare these values with the corresponding values for the simple linear regression model.
Then determine the best model to predict the price of your dependent fuel by:
- Using appropriate tests to determine which independent variables make a significant contribution to the regression model.
- Using the results of the above tests to state the simple or multiple regression equation which best fits the data.
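The multiple regression steps can also be cross-checked outside Excel. The Python sketch below fits the model by solving the normal equations and reports the coefficients, the coefficient of multiple determination, its adjusted value and a t statistic for each coefficient. Several things here are assumptions: the data are made up, location is coded as a dummy variable (1 for capital city, 0 for elsewhere), and the helper names `invert` and `fit_ols` are hypothetical.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_ols(rows, y):
    """Least squares fit of y on the predictors in rows, with an intercept."""
    X = [[1.0] + list(r) for r in rows]
    n, k = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    mse = sse / (n - k)  # residual variance estimate
    t_stats = [b / math.sqrt(mse * inv[j][j]) for j, b in enumerate(beta)]
    return beta, r2, adj_r2, t_stats

# Hypothetical data: unleaded price, capital-city dummy (assumed coding), diesel price.
unleaded = [140.0, 142.0, 145.0, 147.0, 150.0, 152.0, 155.0, 158.0]
capital = [1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
diesel = [150.0, 152.0, 153.0, 157.0, 156.0, 160.0, 161.0, 164.0]

simple = fit_ols([[u] for u in unleaded], diesel)
multiple = fit_ols([[u, c] for u, c in zip(unleaded, capital)], diesel)
print("simple:   R2 = %.4f, adjusted R2 = %.4f" % (simple[1], simple[2]))
print("multiple: R2 = %.4f, adjusted R2 = %.4f" % (multiple[1], multiple[2]))
print("multiple model t statistics:", [round(t, 2) for t in multiple[3]])
```

Adding a variable never lowers the coefficient of multiple determination, so comparing it across the two models is not enough to choose between them. Each t statistic is compared with the t distribution on n - k degrees of freedom (k counts the intercept) to judge whether that variable makes a significant contribution.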