Unless otherwise stated, you can use R for any of the calculations, but make sure you include your code. Your code should not be a copy of anyone else's! Any code you turn in should be well organized and commented so the grader can understand your answers.
All programming questions should be submitted to the dropbox on ANGEL for this assignment as a .pdf file using the naming convention HWNum_FirstInitialLastName.pdf. For example, John Doe would submit a file titled HW1_JDoe.pdf for the first assignment. Your answer to programming questions should include both code and a description of your result. I recommend using R-markdown for writing up your answers. A template for writing up an assignment in R-markdown can be found on ANGEL. R-markdown files can be compiled directly within R-Studio. Alternatively, answers may be written in a Word document or in LaTeX and converted into a .pdf file.
Non-coding questions can either be written and submitted in the same file as your coding questions using LaTeX typesetting (see https://latex-project.org/intro.html) or they may be handwritten and turned in separately during class.
1. Load the "wine.Rdata" dataset. This dataset contains chemical and physical attributes for 1,599 red wines, as well as a binary quality assessment (quality = 1: good, quality = 0: poor).
(a) Fit a logistic regression model with wine quality taken as the response, and the remaining variables as covariates.
(b) Clearly state the statistical model definition for this logistic regression model. Include any relevant assumptions.
(c) Interpret the model coefficients. What does this output indicate about the marginal association between each covariate and the mean response?
(d) Predict the probability that the response is of high quality under the following covariate settings.
(e) Use the predict() function to get predicted probabilities from the fitted model.
Using the following decision rule, transform the predicted probabilities into predicted response values. Create a confusion matrix for these predictions (i.e. true positives, false positives, true negatives, false negatives).
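The workflow in parts (a), (d), and (e) can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated rather than loaded from "wine.Rdata", the column names `alcohol` and `acidity` are hypothetical, and the 0.5 cutoff is an assumed decision rule standing in for the one given in the assignment.

```r
# Simulated stand-in for wine.Rdata; column names are hypothetical.
set.seed(1)
n <- 200
wine_sim <- data.frame(
  alcohol = rnorm(n, mean = 10, sd = 1),
  acidity = rnorm(n, mean = 0.5, sd = 0.15)
)

# Generate a binary quality label from a known logistic model.
eta <- -12 + 1.2 * wine_sim$alcohol - 1.5 * wine_sim$acidity
wine_sim$quality <- rbinom(n, size = 1, prob = plogis(eta))

# (a) Logistic regression with all remaining columns as covariates.
fit <- glm(quality ~ ., data = wine_sim, family = binomial)
summary(fit)

# (d) Predicted probability at one hypothetical covariate setting.
new_wine <- data.frame(alcohol = 11, acidity = 0.4)
p_new <- predict(fit, newdata = new_wine, type = "response")

# (e) Fitted probabilities for the observations used in the fit.
p_hat <- predict(fit, type = "response")

# Assumed decision rule: predict 1 when the probability exceeds 0.5.
y_hat <- as.integer(p_hat > 0.5)

# Confusion matrix: rows are predictions, columns are observed values.
conf_mat <- table(
  predicted = factor(y_hat, levels = c(0, 1)),
  observed  = factor(wine_sim$quality, levels = c(0, 1))
)
conf_mat
```

Note that `type = "response"` is what makes predict() return probabilities; the default for a glm fit is the linear predictor (log-odds) scale.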
2. The board of directors of a professional association conducted a random sample survey of 30 members to assess the effects of several possible amounts of changes in membership dues. The predictor X denotes, in dollars, the change in annual dues from the previous year posited in the survey interview, and the response is binary: Y = 1 if the interviewee indicated that the membership will NOT be renewed at that amount of change in dues and Y = 0 if the membership will be renewed. The output for fitting the logistic regression model is given below. Use this to answer the following questions.
(a) Write the estimated equation as a function of X for
i. The log-odds of not renewing a membership
ii. The odds of not renewing a membership
iii. The probability of not renewing a membership
(b) Estimate the probability that someone does NOT renew their membership when the annual dues increase by $32.
(c) Estimate the odds that someone does NOT renew their membership when the annual dues increase by $32.
(d) Estimate the probability and odds that someone DOES renew their membership when the annual dues increase by $5.
(e) Find the odds ratio of renewal for a scenario where the annual increase is $5 against one where the annual increase is $10.
(f) Estimate the increase in annual dues for which 75% of the members are expected to not renew their membership.
(g) Conduct a Wald test to determine whether dollar increase in dues is related to the probability of membership renewal. In your answer, state the null and alternative hypotheses, the test statistic, p-value, and conclusion based on (1) α = 0.05 and (2) α = 0.1.
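The quantities asked for in this question all follow from the fitted intercept, slope, and slope standard error. The R sketch below shows one way to organize the calculations. The values of `b0`, `b1`, and `se_b1` are placeholders chosen for illustration, not the estimates from the assignment's output, and the function names are hypothetical.

```r
# Placeholder values, NOT the estimates from the assignment's output.
b0    <- -2.0   # hypothetical intercept
b1    <- 0.10   # hypothetical slope for X (dollar change in dues)
se_b1 <- 0.05   # hypothetical standard error of the slope

# (a) Estimated equations for NOT renewing (Y = 1) as functions of X.
log_odds <- function(x) b0 + b1 * x            # i.   log-odds
odds     <- function(x) exp(log_odds(x))       # ii.  odds
prob     <- function(x) plogis(log_odds(x))    # iii. probability

# (b), (c) Probability and odds of not renewing at a $32 increase.
p_not_32    <- prob(32)
odds_not_32 <- odds(32)

# (d) Renewal (Y = 0) is the complement, so its odds are the reciprocal.
p_renew_5    <- 1 - prob(5)
odds_renew_5 <- 1 / odds(5)

# (e) Odds ratio of renewal, $5 increase versus $10 increase.
or_renew <- (1 / odds(5)) / (1 / odds(10))

# (f) Solve b0 + b1 * x = logit(0.75) for x.
x_75 <- (qlogis(0.75) - b0) / b1

# (g) Wald test of H0: beta1 = 0 against H1: beta1 != 0.
z       <- b1 / se_b1
p_value <- 2 * pnorm(-abs(z))
```

With the actual estimates substituted, `p_value` would be compared against each of the two significance levels in part (g).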