Write a set of SAS code to plot the two series

Reference no: EM132163733

Forecasting Assignment - Regression and Box-Jenkins Methodology

Question 1 - Save the file sexratio.sas7bdat in your TSWORK library. This data set consists of annual data for the Australian sex ratio, recorded as a percentage from 1796 to 2007; for example, a SexRatioPCT of 101 means 101 men for every 100 women.

(a) Run the following SAS code:

data temp; set tswork.sexratio;

proc print; run;

proc gplot;

plot SexRatioPCT*Year; run;

Produce the time series plot of the data.

Describe the general patterns you observe in this series.

Some would argue that the large values observed in the first 100 or so observations have little relevance to future values of this series. With reference to the history of British settlement of Australia, explain why this is the case.

(b) Run the following SAS code:

proc gplot;

where Year > 1900 ;

plot SexRatioPCT*Year; run;

Produce the time series plot of the data.

There were two significant troughs in the data. Explain what could have caused them.

(c) Run the following SAS code, which forecasts using Exponential Smoothing:

Proc forecast data = temp method=expo trend=2

out = tswork.exp outfull;

where Year > 1900 ;

var SexRatioPCT;

id Year;

run;

proc gplot data=tswork.exp;

where Year > 1900 ;

symbol i=line v="*" h=1;

plot SexRatioPCT*Year=_TYPE_;run;

Produce the SAS output plot generated from the above code, which contains both the observed and forecasted series, and the 95% CI associated with the forecasts.

What do your forecasts look like? Do you think this is a reasonable forecast (in particular, pay attention to the last 5 observations of your data)?
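As background for interpreting the forecasts, the recursions below sketch double exponential smoothing in Brown's form, which is what METHOD=EXPO with TREND=2 is generally understood to fit (a linear trend model). The symbols $y_t$ (observed SexRatioPCT), $\omega$ (smoothing weight), $S_t$ and $S_t^{[2]}$ are notation introduced here for illustration, not taken from the assignment.

```latex
\begin{aligned}
S_t        &= \omega\, y_t + (1-\omega)\, S_{t-1}           && \text{(first smoothing)} \\
S_t^{[2]}  &= \omega\, S_t + (1-\omega)\, S_{t-1}^{[2]}     && \text{(second smoothing)} \\
a_t        &= 2 S_t - S_t^{[2]}                             && \text{(level estimate)} \\
b_t        &= \frac{\omega}{1-\omega}\,\bigl(S_t - S_t^{[2]}\bigr) && \text{(slope estimate)} \\
\hat{y}_{t+h} &= a_t + b_t\, h                              && \text{(forecast $h$ steps ahead)}
\end{aligned}
```

Because the forecast is a straight line extrapolated from the smoothed level and slope, a small weight makes it respond slowly to a recent change in direction, which is worth keeping in mind when judging the last few observations.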

(d) Modify the above SAS code to use the LAST 10 observations only (Hint: change the line of code for "Year > ....").

Produce the SAS output plot generated from the modified code.

Compared with your result in part (c), what significant difference do you observe? Do you think this forecast is more reliable than the previous one?

Question 2 - Save the SAS data file ARRDEP.sas7bdat in your TSWORK library. This data set consists of three sets of monthly Australian arrivals and departures data. Specifically, there are permanent departures (PD) and arrivals (PA), long-term departures (LTD) and arrivals (LTA), and short-term departures (STD) and arrivals (STA). The data run from 1976 to 2011. In this question we will use only the short-term departures (STD) data.

(a) View the data through Explorer. You will see that there is a "month" variable and twelve dummy variables representing the twelve months of the year. Would you expect there to be trend and seasonality in the STD series? Why?

(b) Run the following SAS code and describe the general patterns that you see.

data temp; set tswork.ARRDEP;

symbol1 i=line;

proc gplot; plot STD*month; run;

(c) Run the following SAS code, which calculates a log transform of the STD series and plots the transformed series. What does this transformation achieve?

data temp; set tswork.ARRDEP;

logSTD = log(STD);

proc gplot; plot logSTD*month; run;

(d) Run the following SAS code, which fits a regression model for the log-transformed STD series. It specifies "month" and the dummy seasonal variables as predictors and asks for the Durbin-Watson statistic. Comment on your coefficient estimates and explain whether they are consistent with your expectations.

data temp; set tswork.ARRDEP;

logSTD = log(STD);

proc reg;

model logSTD = month jan feb mar apr may jun jul aug sep oct nov

dec/dw; run;

(e) Use the Durbin-Watson statistic and a suitable table of critical values to test for first-order serial correlation in the residuals. What does this value tell you about the reliability of your regression equation and any forecasts obtained using this model?
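For reference, the statistic requested by the dw option can be written as below. This is the standard textbook definition; the symbols $e_t$ (regression residual at time $t$), $n$ (number of observations) and $r_1$ (lag-1 sample autocorrelation of the residuals) are notation introduced here.

```latex
d \;=\; \frac{\sum_{t=2}^{n} \left(e_t - e_{t-1}\right)^2}{\sum_{t=1}^{n} e_t^{2}}
\;\approx\; 2\,(1 - r_1), \qquad 0 \le d \le 4
```

A value of $d$ near 2 is consistent with no first-order serial correlation, values well below 2 point to positive correlation, and values well above 2 point to negative correlation. The formal decision compares $d$ with the tabulated lower and upper critical values for the given $n$ and number of predictors.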

(f) Run the following SAS code, which progressively produces the ACF and PACF for the log-transformed STD series at different levels of differencing. Based on the ACF, comment on the stationarity of the series at each step.

data temp; set tswork.ARRDEP;

logSTD = log(STD);

/* no differencing */

proc arima data=temp;

identify var=logSTD; run;

/* 1st differenced */

proc arima data=temp;

identify var=logSTD(1); run;

/* both 1st and seasonally differenced */

proc arima data=temp;

identify var=logSTD(1,12); run;

(g) Based on the ACF and PACF from the last step of part (f), select a SARIMA model. Justify the values of p, d, q, P, D and Q you have chosen by commenting on the patterns observed in ACF and PACF.
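To make the roles of p, d, q, P, D and Q concrete, the block below writes the differencing steps of part (f) and the general multiplicative seasonal ARIMA form using the backshift operator $B$, where $B\,y_t = y_{t-1}$. Here $y_t$ stands for logSTD at time $t$ and $\varepsilon_t$ for the white-noise error; the polynomial names are conventional notation introduced here, and a constant term is omitted for simplicity.

```latex
\begin{aligned}
\texttt{logSTD(1)}    &:\quad (1 - B)\,y_t = y_t - y_{t-1} \\
\texttt{logSTD(1,12)} &:\quad (1 - B)(1 - B^{12})\,y_t \\[4pt]
\phi_p(B)\,\Phi_P(B^{12})\,(1 - B)^{d}\,(1 - B^{12})^{D}\,y_t
   &= \theta_q(B)\,\Theta_Q(B^{12})\,\varepsilon_t \\[4pt]
\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \quad
& \theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^{q} \\
\Phi_P(B^{12}) = 1 - \Phi_1 B^{12} - \dots - \Phi_P B^{12P}, \quad
& \Theta_Q(B^{12}) = 1 - \Theta_1 B^{12} - \dots - \Theta_Q B^{12Q}
\end{aligned}
```

In this form, non-seasonal orders are read from the behaviour of the ACF and PACF at low lags, and seasonal orders from their behaviour at lags 12, 24, and so on.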

(h) Now fit your selected model and forecast values for the suitably transformed series for 1 year ahead. Use blue for the forecasts and red for the prediction intervals. The skeleton of the code has been provided for you below, and you are to fill in the missing parts represented by "???".

data temp; set tswork.ARRDEP;

logSTD = log(STD);

proc arima data=temp;

identify var=logSTD(???,???);

estimate p=(???) q= (???) plot;

forecast lead=??? id=month out=forecast; run;

proc print data=forecast; run;

symbol1 i=line v=none c=???;

symbol2 i=line v=none c=???;

symbol3 i=line v=none c=???;

proc gplot data=forecast;

plot (U95 forecast L95)*month/overlay; run;

g) Run the codes you have completed in part (h), and write down the estimated equation for your transformed series using the backshift operator.

(i) Consider the ACF plot for your model residuals = e(t). Include this plot in your assignment and explain what it means. Do you think there is a need to revise your model?

(j) Finally, take the forecasts and data from part (i) and back-transform them (i.e., undo the log-transformation by taking the exponential of the data). Then plot your forecasts and confidence intervals. Use blue for the forecasts and red for the prediction intervals, and include the original data as black star points. The skeleton of the code has been provided for you below, and you are to fill in the missing parts represented by "???". Include the final plot in your answer.

data temp; set forecast;

expU95=???;

expL95=???;

expforecast = ???;

STD=exp(logSTD);

symbol1 i=line v=none c=???;

symbol2 i=line v=none c=???;

symbol3 i=line v=none c=???;

symbol4 i=none v=??? c=???;

proc gplot data=temp;

plot (expU95 expforecast expL95 STD)*month/overlay;

run;
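The reasoning behind the back-transformation can be stated briefly. Writing $Y_{t+h}$ for the future STD value and $[L, U]$ for the 95% limits on the log scale (notation introduced here), the exponential is strictly increasing, so the coverage of the interval carries over to the original scale:

```latex
P\bigl(L \le \log Y_{t+h} \le U\bigr) = 0.95
\;\;\Longrightarrow\;\;
P\bigl(e^{L} \le Y_{t+h} \le e^{U}\bigr) = 0.95
```

One caveat, assuming the forecast errors on the log scale are symmetric (for example, normal): the exponential of the log-scale point forecast estimates the median of $Y_{t+h}$ rather than its mean, and the back-transformed interval is no longer symmetric about the point forecast.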

Question 3 - Continue with the ARRDEP data set used in Question 2.

(a) Write a set of SAS code to plot the two series with black for arrivals (STA) and red for departures (STD). To obtain better resolution on more recent data, please plot for data observed after month=250.
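One possible starting point for part (a) is sketched below; it is an illustration under stated assumptions rather than a model answer. It assumes the TSWORK library has already been assigned and that ARRDEP contains the variables month, STA and STD as described in Question 2. The data set name recent is hypothetical, and i=join is used here as the interpolation option to connect the points.

```sas
/* Sketch only: assumes TSWORK is already assigned and that ARRDEP
   holds the variables month, STA and STD described in Question 2.  */

/* Keep only the more recent observations for better resolution.
   "recent" is a hypothetical data set name chosen for this sketch. */
data recent;
  set tswork.ARRDEP;
  where month > 250;
run;

/* One SYMBOL statement per overlaid series, in plot-request order. */
symbol1 i=join v=none c=black;  /* arrivals (STA) in black  */
symbol2 i=join v=none c=red;    /* departures (STD) in red  */

/* Overlay both series on a common time axis, with a legend. */
proc gplot data=recent;
  plot (STA STD)*month / overlay legend;
run;
quit;
```

Filtering in a separate data step keeps the plotting step unchanged if the cut-off month is later adjusted.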

(b) Run the code you have written and produce the plot.

(c) Describe any significant patterns observed in the data. Would you expect these two series to follow each other closely? Is there any sort of lag effect between the series? Why?

(d) In real-life consulting work, it is often the case that at the end of a project you will be asked to deliver a full set of code that covers the entire modelling process. Moreover, you are also often required to provide comments throughout the code so that whoever takes over the project after you will have no problem following what you have done and making changes if need be. In this question, you are required to write a full set of SAS code that will do the following for the permanent departure (PD) series:

1) Plot PD against time;

2) Log-transform PD (call it logPD);

3) Plot logPD against time;

4) Generate ACF and PACF for the log-transformed data on:

a. The Raw series

b. The 1st differenced series

c. The 1st and seasonally differenced series

5) Select a SARIMA model appropriately chosen based on patterns observed in the ACF and PACF;

6) Estimate the chosen SARIMA model;

7) Forecast for 2 years into the future;

8) Save your forecast output into a file called PDforecast;

9) Back-transform the data and your forecasts;

10) Plot the observed data, the forecasts, and the confidence interval.

You must provide an appropriate amount of comments at each step of the code to facilitate the reader's understanding of your code. You must ensure that there are no errors in your code and that I can test it on my machine.

Note: Include the important time series plots, ACF and PACF graphs as well as the important tables in your answers. Label and number all graphs and tables carefully and refer to them by number in your answer. Please try and submit your assignment before the due date (12th November) or else I may not be able to guarantee your assignment will be marked before your exam. Answers should be clear and concise. You are expected to use SAS for this assignment. Set up a library called TSWORK where all the SAS data files needed for this assignment are stored. Attach your SAS program in the Appendix including only the essential code for your final program. Please contact if you encounter any technical issues with SAS.
