A retrospective sample of males in a heart-disease

Reference no: EM13577553

All work must be done independently.

The data are a retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case of coronary heart disease (CHD). Many of the CHD-positive men have undergone blood pressure reduction treatment and other programs to reduce their risk factors after their CHD event. In some cases the measurements were made after these treatments. These data are taken from a larger dataset, described in Rousseauw et al., 1983, South African Medical Journal.

There are 462 observations in the dataset. The variables in the dataset are:

sbp - systolic blood pressure
tobacco - cumulative tobacco (kg)
ldl - low density lipoprotein cholesterol
adiposity
famhist - family history of heart disease (Present, Absent)
typea - type-A behavior
obesity
alcohol - current alcohol consumption
age - age at onset
chd - response, coronary heart disease

The data can be found and read into R by the following command:

SAheart <- read.table("https://www-stat.stanford.edu/~tibs/ElemStatLearn/datasets/SAheart.data", sep=",", header=TRUE, row.names=1)

If you would prefer to analyze these data using some other statistical package, you will need to export the data from R using something like a write.table command (or some variation thereof).
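One way the export step might look is sketched below. The helper name `export_csv`, the temporary file path, and the two toy rows are assumptions made for illustration (the toy values are invented, not taken from the dataset); in practice the data frame returned by read.table would be passed in instead.

```r
# Hypothetical helper: write a data frame to a comma-separated file that
# most other statistical packages can import directly.
export_csv <- function(df, path) {
  write.table(df, file = path, sep = ",",
              row.names = FALSE,  # drop R's row labels
              col.names = TRUE,   # keep the variable names as a header row
              quote = TRUE)       # quote strings such as "Present"
  invisible(path)
}

# Toy stand-in rows (made-up values, NOT real observations).
toy <- data.frame(sbp = c(120, 150),
                  famhist = c("Present", "Absent"),
                  chd = c(0L, 1L))

out <- export_csv(toy, tempfile(fileext = ".csv"))

# Read the file back to confirm the round trip preserved rows and columns.
back <- read.csv(out)
```

Setting row.names = FALSE keeps the file free of an unnamed first column, which some packages would otherwise import as an extra variable.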

The following questions are of practical interest:

1. What are the significant predictors of CHD? What would a final model look like, and can you provide an estimate of its predictive accuracy (i.e. do model selection and then evaluate predictive accuracy)? What functional forms are most appropriate for the various predictors in your final model?

2. Since high ldl often precedes a diagnosis of CHD, will a two-stage model, which first uses ldl as a response in stage 1 and then CHD as a response in stage 2, provide more accurate predictions of CHD than the model built in question 1 above?

3. There are often situations where finding just one obviously best sub-model is difficult. There may be many good competing sub-models. However, you might decide to bring together multiple models to improve predictive performance. Develop a strategy for doing this on this dataset, being careful to clearly compare and contrast (to the single model approach) predictive performance. Also, make sure to clearly motivate your strategy giving enough intuition so that I can follow things easily.
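All three questions ask for a comparison of predictive accuracy. One common way to estimate it, sketched here under stated assumptions, is k-fold cross-validation of a logistic regression. The function name `cv_accuracy`, the 0.5 classification threshold, and the simulated data frame `sim` are illustrative choices, not part of the assignment; the simulated data only stand in for the real dataset.

```r
# Estimate out-of-sample classification accuracy of a logistic regression
# by k-fold cross-validation. `data` must contain a 0/1 response, named on
# the left-hand side of `formula`.
cv_accuracy <- function(formula, data, k = 10, threshold = 0.5) {
  n <- nrow(data)
  folds <- sample(rep(seq_len(k), length.out = n))  # random fold labels
  response <- all.vars(formula)[1]
  correct <- logical(n)
  for (i in seq_len(k)) {
    test_idx <- folds == i
    # Fit on the other k - 1 folds, then predict the held-out fold.
    fit <- glm(formula, data = data[!test_idx, ], family = binomial)
    p <- predict(fit, newdata = data[test_idx, ], type = "response")
    correct[test_idx] <- (p > threshold) == (data[[response]][test_idx] == 1)
  }
  mean(correct)  # proportion of held-out cases classified correctly
}

# Simulated stand-in data (NOT the real dataset): one informative predictor.
set.seed(1)
sim <- data.frame(x = rnorm(200))
sim$y <- rbinom(200, 1, plogis(1.5 * sim$x))

acc <- cv_accuracy(y ~ x, sim, k = 5)
```

The same function can score competing models on identical terms by passing different formulas. For an honest estimate, any model selection step would need to be repeated inside each fold rather than done once on the full data.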

Please provide complete justifications for why you chose a particular modeling strategy, including the underlying assumptions you are making. Analyze the data and provide some overall inferences with regard to the questions being posed. Write a report of at most 5 pages (tables and figures inclusive) that details your analysis. Computer output may be attached as supplementary material.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd