Build a logistic regression model that can predict

Assignment Help Other Subject
Reference no: EM133130703

Assessment Description

It is important to have experience implementing one of the most common applications of regression currently used in business, finance, and healthcare. Questions like should a loan be approved, is a driver entitled a discount, and will a patient survive are all answered with a form of logistic regression (i.e., with a Yes/No answer).

Using a dataset representing applications for a bank loan, the task will be to build a logistic regression model that can predict whether or not a loan will be approved.

Useful R functions for this assignment are:
1. Data explorations: na(), summary()
2. Split data into train/test: sample()
3. Build the model: glm(), summary()
4. Model performance evaluation: predict()
5. Model validation: library(gains), gains(), plot(), lines(), dim(), library(car), vif(), glm(), summary(), predict(), ifelse()
6. Validate prediction: table(), mean()
7. Results interpretation: library(ROCR), predict(), prediction(), performance()

For this activity, perform the following:
Load the "application_record.csv," located in the topic Resources,(Credit Card Approval Prediction | Kaggle) into a data frame and perform initial exploratory tasks:
1. Display representative portions of the data.
2. Check for missing values and clean the data.
3. Check for outliers and decide if and how to process them.
Formally state what your model will predict using the variables in the data.
Split the data into a training set and a testing set with a split ratio of 70:30.

Build the Predictive Model:
1. Define the formula for the glm().
2. Run the model.
3. Interpret the results, referring to the p-values.

Evaluate the Model Performance:
1. Compare the predicted versus actual values.
2. Search for any predictions that differ significantly from the actual values.

Validate the Model:
1. Produce a Gain and Lift chart and use it to describe the performance of the model.
2. Measure the Variation Inflation Factor (VIF) to test for multicollinearity. If changes are necessary to the model based in VIF, state and implement them.
3. Has the formula, as defined in the previous section, changed? Why or why not?
4. If changes to the model occurred, repeat the validation steps on the new model.

Make Predictions:
1. Demonstrate a few examples of predictions your model can make.
2. Validate the predictions by calculating the misclassification error.
3. Interpret the results.
State a few suggestions for improving the model.

Submit a professionally written and formatted R Markdown document knitted as a PDF. Make sure the documentation contains the R code, relevant plots, your analysis, and the appropriate citations and references.

While APA style is not required for the body of this assignment, solid academic writing is expected, and documentation of sources.

Attachment:- Assessment_Description.rar

Reference no: EM133130703

Questions Cloud

What is the equilibrium price if demand is reduced by half : 1. What is the equilibrium price if demand is reduced by half?
Develop a graph that forecasts profit : Develop a graph that forecasts profit for the next five years. Please note that Excel is a great tool for creating a graph for this project
Compute Takata Company accounts receivable turnover : Question - In 2021, Takata Company has net credit sales of $1207000 for the year. Compute Takata Company's accounts receivable turnover
Provide the journal entry : Provide the journal entry if Rogers sold 1,000 shares of treasury stock for $25 per share. The treasury stock was repurchased at $20 per share.
Build a logistic regression model that can predict : Compare the predicted versus actual values - Search for any predictions that differ significantly from the actual values - Produce a Gain and Lift chart
Innovative sanitation product internationally : Executive management team which is exploring the possibility of producing, marketing and distributing their new, innovative sanitation product internationally
Create the adjusted net income : The bonds have a face value of P250,000 and pay a stated interest rate of 6%. Create the adjusted 2019 net income
Business logistics-supply chain management process : Transportation plays a vital role in the business logistics/supply chain management process.
Determine the total cost of the Snow Man project : The cost of direct materials on the job was $|9,000 and the direct labor rate is $30 per hour. Determine the total cost of the Snow Man project

Reviews

Write a Review

Other Subject Questions & Answers

  Influence information processing

How might practice or a learning environment most effectively be structured to improve information processing? Why?

  What would be the benefits consequences turing red flag law

What would be the benefits and negative consequences of a "Turing Red Flag" law for society? Think critically with your answer.

  What are the implications associated with the alterations

Why do you see increases and decreases in the invasive species population? What are the implications associated with these alterations to the ecosystem as a whole?

  Discuss specific situations when you would use cidr

Discuss specific situations when you would use CIDR, summarization, and VLSM. Each method involves moving the network/node line to the right or to the left.

  Describe how reconstructing a crime scene can aid

Describe how reconstructing a crime scene can aid in understanding the fundamentals of criminal investigations.

  Populations served in the criminal justice profession

Identify and describe 3 needs of the various populations served in the criminal justice profession

  Differences between mid-latitude and tropical cyclones

Explain the main differences between mid-latitude cyclones and tropical cyclones. This should be at least six sentences long. Be as detailed as possible.

  Describe at least one piece of advice that the source states

Describe at least one piece of advice that this source states. What is your opinion about the advice? Do you believe it? Why or why not?

  Define concepts of health and illness using e-lab sources

Define concepts of Health and Illness using E-lab sources from different authors. Come prepared for class discussion to participate actively.

  Identify the phenomenon you would measure

The center point of research studies is the body of data collected to answer the research question. These data must be measured, which is the act of taking.

  Write a essay on the value of good time-management skills

In a minimum of 250 words, write a three-paragraph essay on the value of good time-management skills. Use at least one tip from this week's Read section.

  Define dementia of alzheimer type

Define dementia of Alzheimer type (DAT) and describe the pathophysiology, clinical manifestations, evaluation and treatment. Use at least one scholarly source.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd