How will the final model be used

Reference no: EM132303212

Statistical Data Analysis Assignment -

In this assignment, you will apply your learning to further analyse the 2013-2014 emergency department (ED) demands in Perth and their connection with weather events. This activity builds on Assignment 1; you may want to review your Assignment 1 solution and identify any reusable code. Please start early so that you can identify any skill or knowledge gaps and seek support from the teaching staff and other students.

This assignment contains two optional tasks (Task 3.2 and 4.3). You should complete the prescribed tasks before attempting the optional ones.

Application scenario -

You work in a data science team that tries to model the ED demands in the Perth area to improve the demand prediction.

For your convenience, you are provided with the following data links, but you are encouraged to include other relevant data for your analyses.

1. The emergency department admissions and attendances data set provided by the Department of Health of Western Australia.

2. The daily temperature and precipitation data for the region accessible through the NOAA data APIs.

Of particular relevance is the "Global Historical Climatology Network - Daily" data.

Task 1: Source weather data

From Assignment 1, you have processed data for the ED demands. We still need to find local weather data from the same period. You are encouraged to find weather data online. Besides the NOAA data, you may also use data from the Bureau of Meteorology historical weather observations and statistics. (The NOAA Climate Data might be easier to process.)

Answer the following questions:

1. Which data source do you plan to use? Justify your decision.

2. From the data source identified, download daily temperature and precipitation data for the region during the relevant time period.

3. Answer the following questions:

  • How many rows are in the data?
  • What time period does the data cover?
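The two checks above (row count and date coverage) can be sketched in a few lines. The assignment itself expects R; the Python below is only an illustration of the logic. The function name `summarise_daily_weather`, the column name `DATE`, the station label, and the sample values are all assumptions made up for this example, not taken from the actual download.

```python
import csv
import io
from datetime import date


def summarise_daily_weather(csv_text, date_column="DATE"):
    """Return (row_count, first_date, last_date) for a daily weather CSV.

    Assumes one row per day and ISO-formatted dates (YYYY-MM-DD) in
    `date_column`; both are assumptions about the downloaded file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    dates = [date.fromisoformat(row[date_column]) for row in reader]
    if not dates:
        return 0, None, None
    return len(dates), min(dates), max(dates)


# Made-up sample rows, for illustration only
sample = """STATION,DATE,PRCP,TMAX,TMIN
STN_A,2013-07-01,0.0,18.2,7.1
STN_A,2013-07-02,3.4,17.5,9.0
STN_A,2013-07-03,0.0,19.1,6.4
"""

n_rows, start, end = summarise_daily_weather(sample)
print(n_rows, start, end)
```

Comparing the row count with the number of calendar days between the first and last date is a quick way to spot missing days.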

Task 2: Model planning

Careful planning is essential for a successful modelling effort. Please answer the following planning questions.

1. How will the final model be used? How will it be relevant to the overcrowding problems at our EDs? Who are the potential users of your model?

2. What relationship do you plan to model or what do you want to predict? What is the response variable? What are the predictor variables? Will the variables in your model be routinely collected and made available soon enough for prediction?

3. As you are likely to build your model on historical data, will the data in the future have similar characteristics?

4. What statistical method(s) will be applied to generate the model? Why?

Task 3: Model the ED demands

We will start with simple models and gradually improve them. We will focus on the ED demand variable(s) that you defined in Assignment 1. Let's denote it Y.

Task 3.1: Models for a single facility

Randomly pick a hospital from the ED dataset.

1. Which hospital do you pick?

2. Fit a linear model for Y using date as the predictor variable. Plot the fitted values and the residuals. Assess the model fit. Is a linear function sufficient for modelling the trend of Y? Support your conclusion with plots.

3. As we are not interested in the trend itself, relax the linearity assumption by fitting a generalised additive model (GAM). Assess the model fit. Do you see patterns in the residuals indicating insufficient model fit?

4. Augment the model to incorporate the weekly seasonality. Compare the models using the Akaike information criterion (AIC). Report the best-fitted model through coefficient estimates and/or plots.

5. Analyse the residuals. Do you see any remaining correlation patterns among the residuals?

6. Is your day-of-the-week variable numeric, ordinal, or categorical? Does the decision affect the model fit?
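As a rough illustration of steps 2 and 5, the sketch below fits a straight-line trend by ordinary least squares and then checks the residuals for a weekly pattern using the lag-7 autocorrelation. The assignment expects R, so this Python version is only a language-neutral sketch; the helper names `fit_line` and `autocorrelation` and the synthetic series are assumptions invented for the example, not real ED data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def autocorrelation(values, lag):
    """Sample autocorrelation of a series at the given lag."""
    v_bar = mean(values)
    denom = sum((v - v_bar) ** 2 for v in values)
    num = sum((values[i] - v_bar) * (values[i + lag] - v_bar)
              for i in range(len(values) - lag))
    return num / denom


# Made-up series: slow upward trend plus a bump every 7th day
days = list(range(56))
y = [100 + 0.5 * d + (10 if d % 7 == 0 else 0) for d in days]

a, b = fit_line(days, y)
residuals = [yi - (a + b * d) for d, yi in zip(days, y)]

print("slope:", round(b, 3))
print("lag-7 residual autocorrelation:", round(autocorrelation(residuals, 7), 3))
```

A large lag-7 autocorrelation in the residuals suggests that the trend-only model is missing weekly seasonality, which is the motivation for adding a day-of-the-week term in step 4.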

(Optional task) Task 3.2: Models for all hospitals

Now fit a GAM for each hospital.

1. Use the map function to rerun your Task 3.1 code on all nine Perth hospitals.

2. Plot the trends and residuals. What patterns do you see? Given what you found in Assignment 1, do you gain any new understanding of the ED demands?

Task 4: Heatwaves and ED demands

The connection between heatwaves and the ED demands is widely reported, as in this news article.

In this task, you will try to measure the heatwave and assess its impact on the ED demands.

Task 4.1: Measuring heatwave

1. John Nairn and Robert Fawcett from the Australian Bureau of Meteorology have proposed a measure of heatwave intensity, called the excess heat factor (EHF). Read the following article to understand the definition of the EHF.

2. Use the NOAA data to calculate the daily EHF values for the Perth area during the relevant time period. Plot the daily EHF values.
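The calculation can be sketched as follows, assuming the commonly cited form of the Nairn and Fawcett definition: EHF = EHI_sig × max(1, EHI_accl), where EHI_sig is the three-day mean of daily mean temperature minus the 95th percentile of daily mean temperature over a reference period (T95), and EHI_accl is the same three-day mean minus the mean of the preceding 30 days. Check the article for the exact windowing and for how the daily mean temperature is derived from the maximum and minimum. The assignment expects R; the Python below is illustrative only, and the function name, the T95 value, and the temperatures are made up for the example.

```python
from statistics import mean


def excess_heat_factor(daily_mean_temps, t95):
    """Daily EHF values; None where the 30-day history or 3-day window is incomplete.

    daily_mean_temps: daily mean temperatures (deg C), one per consecutive day.
    t95: 95th percentile of daily mean temperature over a reference period.
    Assumes the 3-day window covers days i, i+1 and i+2.
    """
    n = len(daily_mean_temps)
    ehf = [None] * n
    for i in range(30, n - 2):
        three_day = mean(daily_mean_temps[i:i + 3])
        previous_30 = mean(daily_mean_temps[i - 30:i])
        ehi_sig = three_day - t95            # significance index
        ehi_accl = three_day - previous_30   # acclimatisation index
        ehf[i] = ehi_sig * max(1.0, ehi_accl)
    return ehf


# Made-up series: 30 mild days followed by a 3-day hot spell
temps = [20.0] * 30 + [30.0, 31.0, 32.0]
values = excess_heat_factor(temps, t95=26.0)
print(values[30])  # (31 - 26) * max(1, 31 - 20) = 55.0
```

Under this definition, only positive EHF values indicate heatwave conditions, so it may be worth deciding whether negative values should be truncated to zero before using the EHF as a predictor.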

Task 4.2: Models with EHF

Use the EHF as an additional predictor to augment the model(s) that you fitted before. Report the estimated effect of the EHF on the ED demand. Does the extra predictor improve the model fit? What conclusions can you draw?

(Optional task) Task 4.3: Extra weather features

Can you think of extra weather features that may be more predictive of ED demands? Try incorporating your feature into the model and see if it improves the model fit.

Task 5: Reflection

Answer the following questions:

1. We used some historical data to fit regression models. What are the limitations of such data, if any?

2. Regression models can be used for 1) understanding a process, or 2) making predictions. In this assignment, do we have reasons to choose one objective over the other? How would the decision affect our models?

3. Overall, have your analyses answered the questions that you set out to answer?

What to submit - Submit the following files

1. An MS Word or PDF file containing your answers to all the assignment questions.

2. An R Notebook file Assignment2_submission.Rmd containing all your code. The file should be able to run. Include sufficient comments so that the script can be understood by your marker. Indicate all the packages that need to be installed separately.


SIT741 - Statistical Data Analysis Assignment, Deakin University, Australia.


Marking criteria - Your submission will be marked using the following criteria:

  • Showing good effort through completed tasks.
  • Applying statistical thinking to understand the problems and to identify solutions.
  • Applying statistical programming skills to obtain data and to process them for data analysis.
  • Applying regression modelling techniques to discover and quantify relationships among variables.
  • Demonstrating creativity and resourcefulness in solutions.
  • Showing attention to detail through a good-quality assignment report.

Bonus marks may be awarded for completing optional tasks.
