Perform logistic regression on the three data subsets

Reference no: EM131461228

Assignment: Logistic regression (R Studio)

This is the question

Investigate the use of logistic regression on a subset of the Kaggle Credit Card Fraud data set (www.kaggle.com/dalpozz/creditcardfraud). Note that this data set is highly imbalanced: fraud records are far fewer than normal records.

Your first task is to construct subsets of the Kaggle data set. Construct three subsets of 100K, 20K, and 10K records, each containing both normal and fraud records (make sure you maximize the number of fraud records in each subset). From each subset, construct a training set and a testing set (80% of the data for the former, 20% for the latter) to build and test the logistic regression model.
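The subset construction and the 80/20 split can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed solution: the assignment itself targets R, and Python is used here only as a language-neutral sketch. The helper names `build_subset` and `split_80_20` are hypothetical, the label column is assumed to be named `Class` with 1 marking fraud, and the rows are synthetic stand-ins for the real file.

```python
import random

def build_subset(rows, size, seed=0):
    """Keep every fraud row, then fill up to `size` with randomly sampled normal rows."""
    rng = random.Random(seed)
    fraud = [r for r in rows if r["Class"] == 1]   # assumed label column: Class == 1 is fraud
    normal = [r for r in rows if r["Class"] == 0]
    fraud = fraud[:size]                           # only matters if fraud alone exceeds `size`
    n_normal = min(size - len(fraud), len(normal))
    subset = fraud + rng.sample(normal, n_normal)
    rng.shuffle(subset)                            # avoid having all fraud rows at the front
    return subset

def split_80_20(subset, seed=0):
    """Shuffle a copy of the subset and cut it into 80% training and 20% testing rows."""
    rng = random.Random(seed)
    shuffled = subset[:]
    rng.shuffle(shuffled)
    cut = (len(shuffled) * 8) // 10
    return shuffled[:cut], shuffled[cut:]

# Synthetic stand-in data: 5000 rows, the first 50 labelled as fraud.
rows = [{"id": i, "Class": 1 if i < 50 else 0} for i in range(5000)]

subset = build_subset(rows, 1000)
train, test = split_80_20(subset)

print(len(subset), sum(r["Class"] for r in subset))  # subset size and fraud count
print(len(train), len(test))                         # training and testing sizes
```

Because the split here is a plain shuffle, the share of fraud rows in the training and testing sets can differ from run to run; splitting the fraud and normal rows separately (a stratified split) is one way to keep the two shares equal, if that is desired.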

Tasks:

1. Perform Logistic Regression on the three data subsets (100K, 20K, 10K). Show your results using a cross-table. Discuss your results for each of the data sets.

2. Perform Ridge Logistic Regression and Lasso Logistic Regression on the three data subsets. Hint: https://ricardoscr.github.io/how-to-use-ridge-and-lasso-in-r.html. Show your results using a cross-table and discuss them in comparison with the results of task 1.
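The two tasks above can be illustrated with a small from-scratch sketch. It is an assumption-laden illustration, not the expected solution: in R, these steps are commonly done with `glm(..., family = binomial)` and with the `glmnet` package (`alpha = 0` for ridge, `alpha = 1` for lasso). The Python below only shows the mechanics. The names `fit`, `predict`, `cross_table`, and `soft_threshold` are hypothetical, the data is a toy set, and the optimizer is plain gradient descent (with a soft-thresholding step for the lasso), which is not what `glmnet` uses internally.

```python
import math
from collections import Counter

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Shrink v toward zero by t; anything within [-t, t] becomes exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit(X, y, penalty="none", lam=0.0, lr=0.1, epochs=1000):
    """Gradient descent on mean log-loss with an optional ridge or lasso penalty.

    The intercept b is never penalized. Returns (weights, intercept).
    """
    if penalty not in ("none", "ridge", "lasso"):
        raise ValueError("penalty must be 'none', 'ridge' or 'lasso'")
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        new_w = []
        for j in range(d):
            step = w[j] - lr * grad_w[j] / n
            if penalty == "ridge":
                step -= lr * lam * w[j]                  # gradient of (lam / 2) * w_j ** 2
            elif penalty == "lasso":
                step = soft_threshold(step, lr * lam)    # proximal step for lam * |w_j|
            new_w.append(step)
        w = new_w
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b, threshold=0.5):
    """Label a row 1 when its predicted probability reaches the threshold."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
            for xi in X]

def cross_table(actual, predicted):
    """Counts keyed by (actual, predicted) for the two classes 0 and 1."""
    counts = Counter(zip(actual, predicted))
    return {(a, p): counts[(a, p)] for a in (0, 1) for p in (0, 1)}

# Toy data: the first feature separates the classes, the second carries no signal.
X = [[-2.0, 1.0], [-1.5, -1.0], [-1.0, 1.0], [-0.5, -1.0],
     [0.5, -1.0], [1.0, 1.0], [1.5, -1.0], [2.0, 1.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w_plain, b_plain = fit(X, y)
w_ridge, _ = fit(X, y, "ridge", lam=0.1)
w_lasso, _ = fit(X, y, "lasso", lam=0.05)
w_heavy, _ = fit(X, y, "lasso", lam=1.0)   # penalty large enough to zero every weight

table = cross_table(y, predict(X, w_plain, b_plain))
print(table)
print(w_plain, w_ridge, w_lasso, w_heavy)
```

The sketch evaluates on the rows it was trained on purely to keep it short; for the assignment, the cross-table should be built from predictions on the 20% testing set. On data as imbalanced as the fraud set, the `(1, 0)` cell (fraud predicted as normal) is usually more informative than overall accuracy when comparing the plain, ridge, and lasso fits.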

Text Book: An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd