Create a dummy variable based on the variable gender

Reference no: EM13799467

A problem of interest to health officials (and others) is to determine the effects of smoking during pregnancy on infant health. One measure of infant health is birth weight; a birth weight that is too low can put an infant at risk for contracting various illnesses. Since factors other than cigarette smoking that affect birth weight are likely to be correlated with smoking, we should take those factors into account.

1. Open the Excel file from Moodle. You should see a worksheet that prompts you for your name and CSUN ID number. Fill these items in before proceeding. On the second worksheet named Raw Data you will find the data for the assignment. Here are the variable names and their descriptions:

2. The data you see on the Raw Data worksheet is locked and can't be modified. Copy all of the data and paste it into the worksheet named Modified Data. If you mess up your data at some point, you can retrieve it from the Raw Data worksheet.

3. In the Modified Data worksheet create four dummy variables.

(a) Create a dummy variable based on the variable gender. The dummy should equal 1 for male children and 0 otherwise. Name this variable gender dum.

(b) Create a dummy variable based on the variable white. The dummy should equal 1 for white children, 0 otherwise. Name this variable white dum.

(c) Create a dummy variable based on the variable nutrition. The dummy should equal 1 for mothers who took a nutrition class, 0 otherwise. Name this variable nut dum.

(d) Create a dummy variable based on the variable married. The dummy should equal 1 for mothers who are married, 0 otherwise. Name this variable married dum.
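
For readers working outside Excel, the 0/1 coding in item 3 can be sketched in a few lines of Python. The raw codes used here ("male", "yes") and the two sample rows are assumptions, since the assignment does not show how the Raw Data columns are coded; adjust the target values to match the actual data.

```python
# Hypothetical rows: the real coding of each column is not given in the assignment.
rows = [
    {"gender": "male", "white": "yes", "nutrition": "no", "married": "yes"},
    {"gender": "female", "white": "no", "nutrition": "yes", "married": "no"},
]

# (new dummy name, source column, assumed raw value that maps to 1)
DUMMY_SPECS = [
    ("gender dum", "gender", "male"),
    ("white dum", "white", "yes"),
    ("nut dum", "nutrition", "yes"),
    ("married dum", "married", "yes"),
]


def make_dummy(value, target):
    """Return 1 when value matches target (ignoring case and padding), else 0."""
    return 1 if str(value).strip().lower() == target else 0


def add_dummies(row):
    """Return a copy of row with the four dummy variables appended."""
    out = dict(row)
    for new_name, source, target in DUMMY_SPECS:
        out[new_name] = make_dummy(row[source], target)
    return out


coded = [add_dummies(r) for r in rows]
for r in coded:
    print(r["gender dum"], r["white dum"], r["nut dum"], r["married dum"])
```

In Excel itself the same logic is a single IF formula per dummy column, filled down the worksheet.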

4. The values for family income are missing from the main data set but can be found on the Family Income worksheet. Match the family income data to the other data using the social security numbers on both worksheets. There are almost 1,400 observations, so you obviously can't match them one-by-one. But you can do the matching easily in Excel using techniques we learned in the computer lab.
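
The matching step is a keyed lookup, which is what a lookup formula such as VLOOKUP does in Excel. A minimal Python sketch of the same idea follows; the column names `ssn` and `faminc` and all values are hypothetical, not taken from the assignment file.

```python
# Hypothetical data: column names and values are illustrative only.
main = [
    {"ssn": "111-11-1111", "bweight": 3400},
    {"ssn": "222-22-2222", "bweight": 2900},
    {"ssn": "333-33-3333", "bweight": 3100},
]
income = [
    {"ssn": "222-22-2222", "faminc": 27.5},
    {"ssn": "111-11-1111", "faminc": 42.0},
]

# Build the lookup table once, keyed by social security number.
income_by_ssn = {r["ssn"]: r["faminc"] for r in income}


def attach_income(rows, lookup):
    """Return new rows with family income added; None marks an unmatched key."""
    return [{**r, "faminc": lookup.get(r["ssn"])} for r in rows]


merged = attach_income(main, income_by_ssn)
unmatched = [r["ssn"] for r in merged if r["faminc"] is None]
print(merged)
print("unmatched:", unmatched)
```

Checking the unmatched list before running any regression is a cheap way to catch mistyped or missing identifiers.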

5. Perform a simple regression using the following model:

bweight = α + β·cigs + ε

Name the worksheet with the regression output regression 1. Expand the columns as needed to make the results look nice.
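
As a cross-check on the Excel output, the slope and intercept of a simple regression can be computed directly from the least-squares formulas. The sketch below uses five made-up observations (not the assignment data) and assumes birth weight is measured in grams.

```python
def simple_ols(x, y):
    """Return (alpha, beta) for y = alpha + beta * x fitted by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                 # slope: covariance over variance of x
    alpha = mean_y - beta * mean_x   # the fitted line passes through the means
    return alpha, beta


# Illustrative values only.
cigs = [0, 0, 5, 10, 20]
bweight = [3500, 3400, 3300, 3200, 3000]

alpha, beta = simple_ols(cigs, bweight)
print(f"alpha = {alpha:.1f}, beta = {beta:.2f}")
```

With these numbers the slope is -22.5, meaning each additional cigarette is associated with a 22.5-unit lower predicted birth weight, and the intercept of 3437.5 is the predicted birth weight for a non-smoker.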

6. Fill in the values for the estimated coefficients and other statistics on the Answers worksheet.

You will need to copy and paste your results from regression 1 into the appropriate cells. Do not round your answers.

7. Fill in the answers to the following questions on the Answers worksheet.

(a) What is the meaning of the slope coefficient and the intercept?

(b) Explain the estimated effect of cigarette smoking on birth weight.

(c) Do the coefficients have the signs you would expect?

(d) Are the coefficients statistically significant at the 5% significance level (that is, at the 95% confidence level)?

(e) What does the R² value tell you?

Be sure to put your explanations of slope coefficients in terms of the original units of measure.
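
The statistics behind questions (d) and (e) can be reproduced by hand, which helps when interpreting the Excel output. The following sketch computes R², the standard error of the slope, and its t statistic for a simple regression; the data are made up for illustration and are not the assignment data.

```python
import math


def ols_summary(x, y):
    """Fit y = alpha + beta * x and return the fit statistics in a dict."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x

    sse = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - sse / sst                       # share of variation explained
    se_beta = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return {"alpha": alpha, "beta": beta, "r2": r2,
            "se_beta": se_beta, "t_beta": beta / se_beta}


# Illustrative values only.
cigs = [0, 0, 5, 10, 20]
bweight = [3500, 3400, 3300, 3200, 3000]

stats = ols_summary(cigs, bweight)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

A slope is significant at the 5% level when the absolute t statistic exceeds the critical value for n - 2 degrees of freedom, which is close to 1.96 for a sample of roughly 1,400 observations.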

8. Now examine the relationship between cigarette smoking and birth weight visually. Create a new worksheet tab named chart. On the new tab, create a scatter plot with trend line showing the linear relationship. The birthweight variable should be on the y-axis and the number of cigarettes smoked should be on the x-axis. Make the chart look pretty by removing the gridlines and labeling each axis.

9. Now perform a multiple regression, using the explanatory variables:

- gender dum
- white dum
- married dum
- nut dum
- moth hgt
- gest age

Expand the columns to make the results look nice. Name the worksheet with the new regression output regression 2.

(a) On the Answers worksheet, fill in the results from the regression 2 worksheet and respond to the questions below as in item 7 above.

(b) Compare your results from this regression to the previous one. Has the coefficient for cigs changed? If so, explain why. What can you say about the goodness of fit for regressions 1 and 2? Write your responses on the Answers worksheet.

(c) Does the second regression model violate the basic assumption that the explanatory variables must be uncorrelated with the error term? Explain. If the assumption was violated, provide a potential solution.
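
One reason the coefficient on cigs can change between the two regressions is omitted variable bias: a variable left out of the simple model that is correlated with smoking has its effect absorbed into the cigs coefficient. The sketch below illustrates this with made-up data in which birth weight is constructed as exactly -600 - 10·cigs + 100·gest age. The helper functions `solve` and `ols` are written for this example and solve the normal equations by Gaussian elimination; they are not how Excel computes its regression.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """Return [intercept, slope_1, ...] from the normal equations (X'X) b = X'y."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    return solve(xtx, xty)


# Made-up data: heavier smoking goes with shorter gestation.
cigs = [0, 0, 5, 10, 20, 20]
gest_age = [40, 39, 39, 38, 37, 38]
bweight = [-600 - 10 * c + 100 * g for c, g in zip(cigs, gest_age)]

simple = ols([cigs], bweight)
multiple = ols([cigs, gest_age], bweight)
print("cigs slope, simple regression:  ", round(simple[1], 2))
print("cigs slope, multiple regression:", round(multiple[1], 2))
```

In this constructed example the simple regression gives a cigs slope of about -20, twice the true value of -10, because it also picks up the effect of shorter gestation among heavier smokers. Adding gest age recovers the true slope.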

Attachment:- excel.xlsx
