Calculate and display the mean square error

Reference no: EM132221899

Project 1 -

Part I: Longitudinal data, sometimes referred to as panel data, track the same sample at different points in time. There are two common formats for longitudinal data, short and long format.

The short form is intuitive and good for presentation, but it is not suited to analysis procedures such as regression. PROC REG and PROC GLM accept only long-format data, in which each variable occupies one and only one column.

The dataset epilepsy.txt from Blackboard is recorded in the short form, where each time point has its own column for a given ID and Treatment.
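To make the difference between the two formats concrete, here is a minimal sketch of a short-to-long conversion. It is written in Python purely to show the logic, not as the DATA step solution the project asks for. The column names (id, trt, age, base, y1 to y4) and all values are assumptions made up for illustration; they are not taken from epilepsy.txt.

```python
# Toy rows in short (wide) form. Column names and values are hypothetical
# and are NOT taken from epilepsy.txt.
wide = [
    {"id": 1, "trt": 0, "age": 31, "base": 16, "y1": 4, "y2": 2, "y3": 6, "y4": 2},
    {"id": 2, "trt": 1, "age": 25, "base": 24, "y1": 6, "y2": 0, "y3": 4, "y4": 2},
]

# Assumed order of the count columns; time 0 is the baseline period.
COUNT_COLUMNS = ["base", "y1", "y2", "y3", "y4"]


def to_long(rows):
    """Return one record per patient per time point."""
    long_rows = []
    for row in rows:
        for time, column in enumerate(COUNT_COLUMNS):
            long_rows.append({
                "id": row["id"],
                "trt": row["trt"],
                "age": row["age"],
                "time": time,
                "count": row[column],
            })
    return long_rows


long_data = to_long(wide)
for record in long_data:
    print(record)
```

In the long form every variable (id, trt, age, time, count) sits in exactly one column, and each patient contributes one row per time point.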

1) Import the epilepsy data set.

2) Convert it into long format using DATA steps.

3) Since the baseline is an 8-week seizure count and the rest are 2-week counts, convert all seizure counts into weekly rate.

4) Create one table displaying average age and weekly seizure rate at baseline by treatment.

5) Create a scatter plot of age (x axis) vs weekly seizure rate at baseline (y axis) with colored dots based on treatment.

6) Fit a regression model (PROC REG or PROC GLM) with weekly rate as the response and age, treatment, and time as predictors*.

7) Create and display a data set containing the original and the predicted value for each patient.

8) Calculate and display the mean square error (MSE).

*Because of the repeated measurements and the type of response, the proper model would be more complicated than basic linear regression, but we ignore that here since the purpose of this project is practice.
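Steps 3 and 6 to 8 can be sketched as follows. This is an illustration in Python, not the PROC REG / PROC GLM solution; the toy records are made up and are not from epilepsy.txt, and the small least-squares helper (ols_fit) is a hypothetical stand-in for the SAS procedure. MSE is taken here as the plain average of squared differences between observed and predicted rates; note that the "Mean Square Error" in a SAS ANOVA table divides the error sum of squares by the error degrees of freedom instead.

```python
# Toy long-format records: (id, trt, age, time, count). Values are made up
# for illustration and are NOT taken from epilepsy.txt.
records = [
    (1, 0, 31, 0, 16), (1, 0, 31, 1, 4), (1, 0, 31, 2, 2),
    (2, 0, 22, 0, 40), (2, 0, 22, 1, 8), (2, 0, 22, 2, 6),
    (3, 1, 25, 0, 24), (3, 1, 25, 1, 6), (3, 1, 25, 2, 0),
    (4, 1, 40, 0, 8), (4, 1, 40, 1, 3), (4, 1, 40, 2, 1),
]


def weekly_rate(count, time):
    """Baseline (time 0) covers 8 weeks; each later visit covers 2 weeks."""
    return count / (8 if time == 0 else 2)


def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(X, beta):
    """Fitted values for each design-matrix row."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


def mse(y, y_hat):
    """Average squared difference between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


# Design matrix columns: intercept, age, treatment, time.
X = [[1.0, age, trt, time] for (_id, trt, age, time, _c) in records]
y = [weekly_rate(c, time) for (_id, _trt, _age, time, c) in records]

beta = ols_fit(X, y)
y_hat = predict(X, beta)
print("coefficients:", beta)
print("MSE:", mse(y, y_hat))
```

Pairing each entry of y with the matching entry of y_hat gives the "original and predicted value" data set asked for in step 7.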

Part II - We need to use cross-validation to test the predictive ability of our model, since it is not appropriate to test the model on the same data it was built on.

For each patient i,

1) Modify the original data by deleting his/her observation, so that the model-building process does not include this ith observation.

2) Build the model, output the estimated values, and save the predicted values of seizure count belonging to the ith patient.

3) Create a %macro to do (1) and (2), and use a %do loop to repeat these steps for each patient.

Combine the results:

4) Create and display a data set containing the original and the predicted value for each patient.

5) Merge them with original response values.

6) Calculate and display the mean square error (MSE).

Please clean your final output by suppressing any output that is not asked for. As usual, comment each statement you use.
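The cross-validation loop in Part II can be sketched as below. Again this is Python used only to show the logic, not the %macro / %do solution. It assumes that "deleting his/her observation" means dropping every row belonging to patient i before refitting; the toy rows are made up (not from epilepsy.txt), and ols_fit is a hypothetical least-squares helper standing in for the regression procedure.

```python
# Toy long-format rows: (id, age, trt, time, weekly_rate). Values are made up
# for illustration and are NOT taken from epilepsy.txt.
rows = [
    (1, 31, 0, 0, 2.0), (1, 31, 0, 1, 2.0), (1, 31, 0, 2, 1.0),
    (2, 22, 0, 0, 5.0), (2, 22, 0, 1, 4.0), (2, 22, 0, 2, 3.0),
    (3, 25, 1, 0, 3.0), (3, 25, 1, 1, 3.0), (3, 25, 1, 2, 0.0),
    (4, 40, 1, 0, 1.0), (4, 40, 1, 1, 1.5), (4, 40, 1, 2, 0.5),
]


def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def design(row):
    """Design-matrix row: intercept, age, treatment, time."""
    return [1.0, row[1], row[2], row[3]]


def leave_one_patient_out(data):
    """For each patient, refit without that patient and predict their rows."""
    results = []  # (id, time, observed, predicted)
    for pid in sorted({r[0] for r in data}):
        train = [r for r in data if r[0] != pid]   # step 1: drop patient i
        held_out = [r for r in data if r[0] == pid]
        beta = ols_fit([design(r) for r in train], [r[4] for r in train])
        for r in held_out:                         # step 2: save predictions
            pred = sum(b * x for b, x in zip(beta, design(r)))
            results.append((pid, r[3], r[4], pred))
    return results


cv = leave_one_patient_out(rows)
cv_mse = sum((obs - pred) ** 2 for _, _, obs, pred in cv) / len(cv)
for pid, time, obs, pred in cv:
    print(pid, time, obs, round(pred, 3))
print("cross-validated MSE:", cv_mse)
```

Each prediction comes from a model that never saw the patient it is predicting, which is why the resulting MSE is a fairer measure of predictive ability than the in-sample MSE from Part I.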

Project 2 -

Part I (Redo Project 1):

1) Import the epilepsy data previously used in project 1.

2) Convert from short to long format in R.

3) Redo Part II of project 1 in R.

Part II (Plot):

Using the results obtained and the ggplot2 package, create a scatter plot of predicted values vs. original values with:

1) ID numbers (1 to 59) as the markers.

2) The color of the markers depends on age.

3) Two panels based on treatment using facet_grid().

4) x and y variables labeled properly.

Attachment:- Assignment Files.rar
