Create categories of age and label the categories

Reference no: EM132260148

Statistics Assignment

You should include the relevant statistical (R and Stata) code; please set it in a different font (Courier 10pt is recommended) and format it neatly. It is recommended that you include full copies of your commented scripts as an appendix.

QUESTION 1 - Answer all parts of the question using R

The data set required for this question is the heart transplant data (HTD.csv). This data set contains 11 variables.

Table 1 - Description of variables in the HTD.csv dataset

| Variable | Description |
|---|---|
| Patient id | Unique patient identifier |
| Year of acceptance | Year admitted |
| Age | Age of the patient |
| Survival Status | Describes if the patient is alive or dead (1 = dead, 0 = alive) |
| Survival time | Number of months survived |
| Prior surgery | Surgery history (1 = Yes, 0 = No) |
| Transplant status | Has the patient received transplant or not (1 = transplanted, 0 = not transplanted) |
| Waiting time for transplant | Number of months the patient had to wait for transplant |
| Mismatch on alleles | |
| Mismatch on antigen | |
| Mismatch score | |

a. Read data into R

1. Label all variables and apply formats to the categorical variables.

2. Produce a list of the first 10 records in the dataset.

3. Produce a list of the first 10 records if the patient survived.

4. Produce two separate tables which show (1) the number of people transplanted and (2) the number of people who had prior surgery. Also do a cross table of transplanted and people who had prior surgery.

5. Create categories of age and label the categories (<40, 41-45, 46-50, 51-55, 56-60, 60+)
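As a rough illustration of the steps in part (a), the sketch below uses base R on a small invented data frame standing in for HTD.csv. The column names (`age`, `died`, `surgery`, `transplant`) and all values are assumptions, not the real file contents. The age bands as listed leave age 40 unassigned and mention 60 twice, so the sketch assumes right-closed bands (up to 40, 41-45, ..., 56-60, over 60); a different reading would need different breaks.

```r
# Toy stand-in for HTD.csv; column names and values are invented for illustration.
htd <- data.frame(
  id         = 1:8,
  age        = c(35, 40, 43, 48, 52, 57, 60, 64),
  died       = c(1, 0, 1, 1, 0, 1, 0, 1),
  surgery    = c(0, 0, 1, 0, 1, 0, 0, 1),
  transplant = c(1, 0, 1, 1, 1, 0, 1, 1)
)

# Value labels for the 0/1 variables, following Table 1.
htd$died       <- factor(htd$died, levels = c(0, 1), labels = c("alive", "dead"))
htd$surgery    <- factor(htd$surgery, levels = c(0, 1), labels = c("No", "Yes"))
htd$transplant <- factor(htd$transplant, levels = c(0, 1),
                         labels = c("not transplanted", "transplanted"))

# Age categories (assumed right-closed): (-Inf,40], (40,45], ..., (55,60], (60,Inf).
htd$age_cat <- cut(htd$age,
                   breaks = c(-Inf, 40, 45, 50, 55, 60, Inf),
                   labels = c("<=40", "41-45", "46-50", "51-55", "56-60", "61+"),
                   right  = TRUE)

head(htd, 10)                          # first 10 records
head(htd[htd$died == "alive", ], 10)   # first 10 records among survivors
table(htd$transplant)                  # number transplanted
table(htd$surgery)                     # number with prior surgery
table(htd$transplant, htd$surgery,     # cross table of the two
      dnn = c("Transplant", "Prior surgery"))
```

With the real file, the data frame would come from something like `read.csv("HTD.csv")`, and the column names would need to match whatever the file actually uses.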

b. Check to see whether any people have a record of transplant status after their death. Produce the list of records if any.

c. Are there any patients who were alive, had prior surgery, and were missing information about mismatch on antigen? Tabulate the percentage and state how many such cases there are.
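One way to approach part (c) is to build a logical flag for the three conditions and then count and average it. This is only a sketch: the column names (`died`, `surgery`, `mm_antigen`) and the values are invented, and it assumes missing antigen information is stored as `NA`.

```r
# Toy data; names and values are invented. NA marks a missing antigen mismatch.
htd <- data.frame(
  died       = c(0, 0, 1, 0, 1, 0),
  surgery    = c(1, 1, 0, 0, 1, 1),
  mm_antigen = c(NA, 1, NA, 0, 0, NA)
)

# TRUE for patients who are alive, had prior surgery, and lack the antigen value.
flag <- htd$died == 0 & htd$surgery == 1 & is.na(htd$mm_antigen)

n_cases <- sum(flag)          # number of such cases
pct     <- 100 * mean(flag)   # percentage of all patients

n_cases
round(prop.table(table(flag)) * 100, 1)   # percentage table
```

The percentage here is taken over all patients; if the intended denominator is only patients who were alive with prior surgery, the flag would be averaged within that subset instead.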

d. Find the mean, median, and inter-quartile range of waiting time for transplant among people who died.
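A minimal sketch of part (d) in base R follows. The column names (`died`, `wait`) and values are invented, and the sketch assumes patients without a transplant have a missing waiting time, which is why `na.rm = TRUE` is used.

```r
# Toy data; names and values are invented for illustration.
htd <- data.frame(
  died = c(1, 1, 0, 1, 1, 0, 1),
  wait = c(2, 4, 1, NA, 6, 3, 8)
)

# Waiting times among patients who died (missing values are kept for now).
wait_dead <- htd$wait[htd$died == 1]

wait_mean   <- mean(wait_dead, na.rm = TRUE)
wait_median <- median(wait_dead, na.rm = TRUE)
wait_iqr    <- IQR(wait_dead, na.rm = TRUE)   # 75th minus 25th percentile

c(mean = wait_mean, median = wait_median, IQR = wait_iqr)
quantile(wait_dead, c(0.25, 0.75), na.rm = TRUE)   # the quartiles themselves
```

`IQR()` uses R's default quantile definition (type 7); other software may give slightly different quartiles on small samples.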

e. Do a cross tabulation of survival status and transplant status; present the table with variable and value labels.

f. Present a table showing the number of people who survived within each age category.
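Parts (e) and (f) can be sketched with `table()`, using `dnn` for the variable labels and factor labels for the value labels. Column names and values below are invented, and the age bands again assume right-closed intervals.

```r
# Toy data; names and values are invented for illustration.
htd <- data.frame(
  died       = c(0, 1, 1, 0, 0, 1),
  transplant = c(1, 1, 0, 1, 0, 1),
  age        = c(38, 44, 47, 47, 58, 66)
)

status <- factor(htd$died, levels = c(0, 1), labels = c("Alive", "Dead"))
tx     <- factor(htd$transplant, levels = c(0, 1),
                 labels = c("Not transplanted", "Transplanted"))

# (e) Cross tabulation with variable labels (dnn) and value labels (factor labels).
xt <- table(status, tx, dnn = c("Survival status", "Transplant status"))
addmargins(xt)   # adds row and column totals for presentation

# (f) Number of survivors within each age category (assumed right-closed bands).
age_cat <- cut(htd$age,
               breaks = c(-Inf, 40, 45, 50, 55, 60, Inf),
               labels = c("<=40", "41-45", "46-50", "51-55", "56-60", "61+"))
survivors_by_age <- table(age_cat[status == "Alive"], dnn = "Age category")
survivors_by_age
```

Because `age_cat` is a factor, categories with no survivors still appear in the table with a count of zero.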

QUESTION 2 - Answer all parts of the question using Stata

The data set for this question is contained in the file breakfastdata.dat. This data set has 15 variables.

Table 2 - Description of variables in the breakfastdata.dat dataset

| Variable | Description |
|---|---|
| Name | Cereal name |
| Mfr | Cereal manufacturer (A = American Home Food Products, G = General Mills, K = Kelloggs, N = Nabisco, P = Post, Q = Quaker Oats, R = Ralston Purina) |
| Type | Type (C = cold, H = hot) |
| Calories | Calories (number) |
| Protein | Protein (g) |
| Fat | Fat (g) |
| Sodium | Sodium (mg) |
| Fiber | Dietary fiber (g) |
| Carbo | Complex carbohydrates (g) |
| Sugars | Sugars (g) |
| Shelf | Display shelf (coded as 1, 2, 3) |
| Potass | Potassium (mg) |
| Vitamins | Vitamins and minerals (0 = none added, 25 = "enriched, often to 25% FDA recommended", 100 = "100% of FDA recommended") |
| Weight | Weight (in ounces) |
| Cups | Cups per serving |

a) Apply variable and value labels to each of the variables using the descriptions provided in Table 2. Recode the -1 category in each variable to missing. Produce univariate statistics for the following variables: type, mfr, vitamins, sugars, calories, protein. Present your descriptive table as if it were ready to be published in a report.
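The question asks for Stata, where a command such as `mvdecode` is the usual tool for this kind of recode. Purely to illustrate the logic of turning -1 into missing across all numeric variables, here is a hedged base R sketch on invented data; the cereal names and values are placeholders, not records from breakfastdata.dat.

```r
# Toy data; names and values are invented. -1 stands for "missing" in the raw file.
cereal <- data.frame(
  name   = c("Cereal A", "Cereal B", "Cereal C"),
  sugars = c(6, -1, 3),
  potass = c(-1, 95, 40),
  stringsAsFactors = FALSE
)

# Apply the recode to numeric columns only, leaving text columns untouched.
num_cols <- sapply(cereal, is.numeric)
cereal[num_cols] <- lapply(cereal[num_cols], function(x) {
  x[x %in% -1] <- NA   # %in% avoids problems if x already contains NA
  x
})

colSums(is.na(cereal))   # number of missing values per variable after the recode
```

Checking the per-variable missing counts afterwards is a quick way to confirm that only the intended cells changed.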

b) Produce 8 records for cereals manufactured by General Mills and 8 records for cereals manufactured by Kelloggs, and in both sets show cereals with 100% FDA recommended vitamins and minerals.

c) Create sodium categories (6 possible categories, including 0 as one of the categories), describe the criteria used, define the variable and value labels.
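The question leaves the cut points to the student, so the bands below (0, 1-100, 101-150, 151-200, 201-250, over 250 mg) are an assumption chosen only to show the mechanics, and the sodium values are invented. The sketch is in base R to illustrate the logic; the same banding would be expressed with Stata's own recoding and labelling commands.

```r
# Toy sodium values in mg, invented for illustration.
# Any -1 codes should be recoded to NA before this step.
sodium <- c(0, 15, 130, 140, 200, 210, 290, 320)

# Six categories with 0 as its own category; the other cut points are assumed.
sodc <- cut(sodium,
            breaks = c(-Inf, 0, 100, 150, 200, 250, Inf),
            labels = c("0 mg", "1-100 mg", "101-150 mg",
                       "151-200 mg", "201-250 mg", "251+ mg"))

# Numeric codes 0 to 5, matching the row labels used in the table for part d).
sodc_code <- as.integer(sodc) - 1L

table(sodc)
```

Whatever bands are chosen, the criteria (for example equal widths or roughly equal group sizes) should be stated alongside the variable and value labels.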

d) Using the categorical sodium variable created above, provide the syntax to compute the following table.

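Columns: Type and Manufacturer (Type C or H, then Manufacturer A to R within each type).

| sodc | C: A | C: G | C: K | C: N | C: P | C: Q | C: R | H: A | H: G | H: K | H: N | H: P | H: Q | H: R |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | | 6 | 4 | 2 | 3 | 1 | 1 | | | 1 | | 1 | |
| 1 | | 9 | 6 | 1 | 5 | 2 | 1 | | | | | | | |
| 2 | | 7 | 3 | | 2 | | 3 | | | | | | | |
| 3 | | 4 | 5 | | | 2 | 2 | | | | | | | |
| 4 | | 2 | 2 | | | | 1 | | | | | | | |
| 5 | | | 1 | | | | | | | | | | | |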
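A table with one row variable and two nested column variables, as in part d), is a three-way cross tabulation flattened into two dimensions. The question asks for Stata syntax; the base R sketch below only illustrates the structure, using `xtabs()` and `ftable()` on invented records. The column names (`sodc`, `type`, `mfr`) follow the table, but the data are placeholders.

```r
# Toy records; values are invented for illustration.
cereal <- data.frame(
  sodc = c(0, 0, 1, 1, 2, 0),
  type = c("C", "H", "C", "C", "C", "C"),
  mfr  = c("K", "Q", "G", "K", "G", "N")
)

# Fix the level order so that empty type/manufacturer combinations still appear.
cereal$type <- factor(cereal$type, levels = c("C", "H"))
cereal$mfr  <- factor(cereal$mfr, levels = c("A", "G", "K", "N", "P", "Q", "R"))

# Three-way counts, flattened: sodc in rows, type and mfr nested in columns.
counts <- xtabs(~ sodc + type + mfr, data = cereal)
tab    <- ftable(counts, row.vars = "sodc", col.vars = c("type", "mfr"))
tab
```

Declaring all seven manufacturer levels up front is what produces the empty columns (for example hot cereals from General Mills) rather than dropping them.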

e) Provide appropriate descriptive summaries of each variable (calories, protein, fat, sodium, fiber, carbo, sugars, and potass) for each cereal manufacturer. State in one sentence why you think your chosen summary measure is appropriate.

Note - Solve using Stata or R.
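As one possible approach to part e), the sketch below computes the median and inter-quartile range by manufacturer with base R's `aggregate()`. The data are invented and only two of the eight variables are shown. Median and IQR are used here on the assumption that some nutrient variables are skewed; mean and standard deviation would be the natural alternative if they are roughly symmetric.

```r
# Toy data; values are invented for illustration.
cereal <- data.frame(
  mfr      = c("G", "G", "G", "K", "K", "K"),
  calories = c(110, 100, 120, 70, 110, 150),
  sugars   = c(1, 10, 12, 5, 3, 13)
)

# Median and IQR of each variable within each manufacturer.
# Note: the formula method drops any row with a missing value in any listed variable.
med <- aggregate(cbind(calories, sugars) ~ mfr, data = cereal, FUN = median)
iqr <- aggregate(cbind(calories, sugars) ~ mfr, data = cereal, FUN = IQR)

med
iqr
```

If missing values differ across variables, summarising each variable separately avoids losing whole rows to the formula method's default handling.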

Attachment:- Assignment Files.rar
