Reference no: EM132451686
Assignment -
This week we'll learn how to build a regression model whose dependent variable is a classification (yes/no, 1/0) variable. We will use a technique called "logistic regression".
Building a logistic regression model is similar to building a linear regression model, but choosing the best model and interpreting the parameter estimates are quite different.
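For reference, the standard logistic regression model can be written as below. The symbols (y for the 1/0 outcome, x_1 through x_k for the predictors, beta for the coefficients) are generic notation chosen for this sketch, not notation taken from the case.

```latex
\[
p = P(y = 1 \mid x_1, \dots, x_k), \qquad
\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\]
\[
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\]
```

The model is linear in the log-odds rather than in the probability itself. Holding the other predictors fixed, a one-unit increase in x_j multiplies the odds p/(1-p) by e^{beta_j}, which is why the estimates are read differently from linear regression slopes.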
As usual, you have a homework case. This week, we're back at Holmes University, working on a freshman retention task force. We will build a model to predict whether students will return for their sophomore year, and consider how to use the model to decide which students will receive a costly intervention.
We've collected the following variables:
GPA: The student's GPA in their freshman year
Athlete: =1 if the student is an athlete, =0 otherwise
Miles from home: Distance from campus to the student's home
College: College in which the student is enrolled: Education, Business, or Arts and Sciences
Accommodations: Home or Dorm
Work Hours: The number of hours the student reported working at a job during the last week. The possible answers were 0, 0-5, 5-10, 10-15, 15-20, or 20+; each answer has been coded as the midpoint of its range, with 20+ coded as 22.5. Not perfect, but it's the best we have.
Attends office hours: How often the student says they go to office hours: Never, Sometimes, or Regularly
HS GPA: The student's high school GPA
Return: Dependent variable; =1 if the student returned, =0 if the student did not return.
Your sample includes 500 students; of those, 395 returned and 105 did not.
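The variable coding described above can be sketched in a few lines of Python. This is only an illustration: the answer labels, the helper names `code_work_hours` and `dummy_code`, and the choice of Education as the baseline college are assumptions, and the actual data file may already be coded or may spell the categories differently.

```python
# Midpoint coding for Work Hours, as described in the case.
# The label strings are hypothetical; check them against the real data.
WORK_HOURS_MIDPOINT = {
    "0": 0.0,
    "0-5": 2.5,
    "5-10": 7.5,
    "10-15": 12.5,
    "15-20": 17.5,
    "20+": 22.5,  # open-ended top range, coded as 22.5 per the case
}

COLLEGES = ["Education", "Business", "Arts and Sciences"]


def code_work_hours(answer):
    """Return the numeric midpoint code for a work-hours survey answer."""
    return WORK_HOURS_MIDPOINT[answer]


def dummy_code(value, levels, baseline):
    """Return 0/1 indicators for every level except the baseline level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {level: int(value == level) for level in levels if level != baseline}


# Share of the sample that returned: 395 of 500 students.
return_rate = 395 / 500

print(code_work_hours("5-10"))
print(dummy_code("Business", COLLEGES, baseline="Education"))
print(return_rate)
```

A categorical variable with three levels needs only two indicator columns; the omitted level becomes the reference group against which the other coefficients are interpreted. The same helper would work for Accommodations and for Attends office hours.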
Tasks -
1. Build a logistic regression model to predict which students will return to or leave Holmes University for their sophomore year.
   a. In addition to the variables given, consider polynomial and cross-product terms.
   b. In particular, GPA, College, and Miles from home appear to be important variables; a polynomial or cross-product term involving those variables is useful.
2. Interpret the parameter estimates in your model, including numerical effects or a graphical display of effects.
   a. What generally makes students more or less likely to leave Holmes University?
3. The retention task force plans to use your model to identify students who are likely to leave. It will place them in a program that gives them access to additional services and possibly a small financial incentive to return. The program costs $1,000 for each student you identify as likely to leave. Every student whom you correctly identify as likely to leave becomes more likely to return, which gains the university $4,000 per correctly identified student.
   a. What cutoff probability should you use to identify likely leavers?
   b. How much net benefit will this program give the university, based on the 500 students in your sample?
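One way to approach the cutoff and net-benefit questions is sketched below. It assumes the $1,000 cost applies to every student flagged as a likely leaver, and the $4,000 gain applies only to flagged students who actually left (so the gain is not already net of the cost). Under that reading, flagging a student whose predicted probability of leaving is p has an expected value of 4000p - 1000, which is positive when p exceeds 0.25. Since the model predicts Return = 1, the probability of leaving is 1 minus the predicted probability of returning. The function names and the toy numbers are hypothetical, not results from the case data.

```python
COST_PER_FLAGGED = 1000      # program cost per flagged student (from the case)
GAIN_PER_TRUE_LEAVER = 4000  # gain per correctly flagged leaver (from the case)


def break_even_cutoff(cost=COST_PER_FLAGGED, gain=GAIN_PER_TRUE_LEAVER):
    """Leave probability at which the expected gain equals the cost."""
    return cost / gain


def net_benefit(p_leave, left, cutoff):
    """Net benefit of flagging every student with p_leave >= cutoff.

    p_leave: predicted probabilities of leaving (1 - predicted P(Return = 1))
    left:    1 if the student actually left, 0 if the student returned
    """
    flagged = [actual for p, actual in zip(p_leave, left) if p >= cutoff]
    true_leavers = sum(flagged)
    return GAIN_PER_TRUE_LEAVER * true_leavers - COST_PER_FLAGGED * len(flagged)


def best_cutoff(p_leave, left, candidates):
    """Candidate cutoff with the highest net benefit on this sample."""
    return max(candidates, key=lambda c: net_benefit(p_leave, left, c))


# Made-up toy data for illustration only, not the Holmes University sample.
toy_p_leave = [0.9, 0.6, 0.3, 0.2, 0.1]
toy_left = [1, 1, 0, 1, 0]

print(break_even_cutoff())
print(net_benefit(toy_p_leave, toy_left, cutoff=0.25))
print(best_cutoff(toy_p_leave, toy_left, [0.05, 0.15, 0.25, 0.5]))
```

On real data, `p_leave` would come from the fitted model and `left` from the Return column (left = 1 - Return). As a sanity check under the same assumptions, a perfect model that flags exactly the 105 leavers would yield 105 x ($4,000 - $1,000) = $315,000, while flagging all 500 students would yield 105 x $4,000 - 500 x $1,000 = -$80,000, so any reported net benefit should fall between those two figures.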
Write your findings in a case report as usual.
Attachment:- Report Template.rar