Develop a CART model using the test data set

Reference no: EM132993597

This assignment begins our study of predictive modeling techniques, starting with two variants of decision trees, CARTs and Random Forests. You will first work through the mechanical aspects of programming these methods, then in Part 2 you will examine some of the theoretical and mathematical bases behind them.

Part 1: Operational Tasks

For this assignment, work with the loans_training and loans_test data sets. Use Python to solve each problem.
To demonstrate completion of this assignment, create a Word document with your working code, screenshots of program results, and written answers to questions. Upload your final Jupyter notebook and Word document to the LMS when complete.

1. Create a CART model using the training data set that predicts Approval using Debt to Income Ratio, FICO Score, and Request Amount. Visualize the decision tree. Describe the first few splits in the decision tree.

2. Develop a CART model using the test data set that uses the same target and predictor variables. Visualize the decision tree. Investigate the splits in the decision tree. Does the tree built using the test data match the tree built using the training data?

3. Build a C5.0 model using the training data set that predicts Approval using Debt to Income Ratio, FICO Score, and Request Amount. Specify a minimum of 1,000 cases per terminal node. Visualize the decision tree. Describe the first few splits in the decision tree.

4. How does your C5.0 model compare to your CART model for the loans_training data? Describe the similarities and differences.

5. Create a C5.0 model using the test data set that utilizes the same target variable, predictor variables, and minimum cases criterion. Visualize the decision tree. Does the tree built using the test data match the tree built using the training data?

6. Use random forests on the training data set to obtain the predicted value of Approval using the same predictor variables as in the CART and C5.0 models.

7. Use random forests on the test data set to obtain the predicted value of Approval in the test data set. Build a table comparing the predictions from the training and test data sets. How do they compare?
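The operational tasks above can be sketched in Python with scikit-learn. This is a minimal sketch under stated assumptions, not a solution to the assignment: the real loans_training and loans_test files are not reproduced here, so the helper `make_placeholder_loans` (a hypothetical name) generates synthetic stand-in data with the column names used in the task text. scikit-learn has no C5.0 implementation, so a tree grown with the entropy criterion is used as an approximation. The `max_leaf_nodes=5` setting is also an assumption, chosen only to keep the printed trees small.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

PREDICTORS = ["Debt to Income Ratio", "FICO Score", "Request Amount"]
TARGET = "Approval"


def make_placeholder_loans(n_rows, seed):
    """Synthetic stand-in for the loans data (assumption; NOT the real files)."""
    rng = np.random.default_rng(seed)
    frame = pd.DataFrame({
        "Debt to Income Ratio": rng.uniform(0.0, 0.6, n_rows),
        "FICO Score": rng.integers(400, 851, n_rows),
        "Request Amount": rng.integers(1_000, 50_001, n_rows),
    })
    # Arbitrary labeling rule, used only so the trees have something to learn.
    frame[TARGET] = ((frame["FICO Score"] > 650)
                     & (frame["Debt to Income Ratio"] < 0.35)).astype(int)
    return frame


def fit_tree(frame, criterion, min_leaf=1):
    """Fit a small classification tree: 'gini' for CART, 'entropy' as a C5.0-like stand-in."""
    model = DecisionTreeClassifier(criterion=criterion, min_samples_leaf=min_leaf,
                                   max_leaf_nodes=5, random_state=0)
    return model.fit(frame[PREDICTORS], frame[TARGET])


# With the real data, replace these two lines with pd.read_csv(...) calls.
loans_train = make_placeholder_loans(5000, seed=1)
loans_test = make_placeholder_loans(5000, seed=2)

# Tasks 1-2: CART on each data set, printed as text so the splits can be compared.
cart_train = fit_tree(loans_train, "gini")
cart_test = fit_tree(loans_test, "gini")
print(export_text(cart_train, feature_names=PREDICTORS))
print(export_text(cart_test, feature_names=PREDICTORS))
root_train = PREDICTORS[cart_train.tree_.feature[0]]
root_test = PREDICTORS[cart_test.tree_.feature[0]]
print("Root split variable (training, test):", root_train, "|", root_test)

# Tasks 3 and 5: entropy-based tree with at least 1,000 cases per terminal node.
c50_like_train = fit_tree(loans_train, "entropy", min_leaf=1000)
is_leaf = c50_like_train.tree_.children_left == -1
smallest_leaf = int(c50_like_train.tree_.n_node_samples[is_leaf].min())

# Tasks 6-7: random forest fitted on the training data, then predictions on both sets.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(loans_train[PREDICTORS], loans_train[TARGET])
rf_train_pred = forest.predict(loans_train[PREDICTORS])
rf_test_pred = forest.predict(loans_test[PREDICTORS])

# Table comparing the share of each predicted class in the two data sets.
comparison = pd.DataFrame({
    "training": pd.Series(rf_train_pred).value_counts(normalize=True),
    "test": pd.Series(rf_test_pred).value_counts(normalize=True),
}).fillna(0.0)
print(comparison)
```

For a graphical tree rather than text output, `sklearn.tree.plot_tree` can be called on the same fitted models if matplotlib is installed. Whether Task 7 expects the forest to be refitted on the test data is left open by the wording; this sketch reuses the forest fitted on the training data.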

Part 2: Mathematical and Statistical Basis

1. Read Delen et al. (2013). Explain how the authors used principal component analysis (PCA) to decompose their data into linear components, similar to discriminant function analysis in MANOVA. How were the factors, eigenvectors, and the covariance matrix determined? What role did the squared error distance play in their conclusions?

2. Continuing with Delen et al. (2013), what role did the Chi-squared automatic interaction detector (CHAID) play in their decision tree algorithm? How does the CHAID technique compare to C5.0, CART (referred to as C&RT by the authors), and the quick, unbiased, efficient statistical tree (QUEST)? How do the accuracies of these techniques compare, and how were they measured by the authors?

3. Read Meena et al. (2019). Critique the authors' implementation of the CART algorithm, including their association rules and the set-theoretic justifications they present in Sections 4.4, 4.5, 5, and 6. Do these mathematical bases properly support the computational results presented in Section 7? Why or why not?
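As background for the PCA question in Part 2, the general relationship between the covariance matrix, its eigenvectors, and the component scores can be sketched as follows. This is a generic textbook-style illustration on toy data, not the procedure or data used by Delen et al. (2013); the function name `pca_from_covariance` and the toy matrix are assumptions introduced here.

```python
import numpy as np


def pca_from_covariance(data):
    """PCA by eigen-decomposition of the sample covariance matrix."""
    centered = data - data.mean(axis=0)            # remove column means
    covariance = np.cov(centered, rowvar=False)    # variables are in columns
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # symmetric matrix
    order = np.argsort(eigenvalues)[::-1]          # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = centered @ eigenvectors               # data in component coordinates
    explained = eigenvalues / eigenvalues.sum()    # share of total variance
    return eigenvalues, eigenvectors, scores, explained


# Toy data (assumption): two strongly correlated columns plus one independent column.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
toy = np.hstack([x,
                 2 * x + rng.normal(scale=0.1, size=(200, 1)),
                 rng.normal(size=(200, 1))])

eigenvalues, eigenvectors, scores, explained = pca_from_covariance(toy)
print("Explained variance ratios:", np.round(explained, 3))
```

The eigenvectors are the component directions, the eigenvalues are the variances along them, and the covariance matrix of the scores is diagonal, which is what makes the components uncorrelated.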

Include references to all theoretical concepts and works cited. Show all your steps with explanations. Explain major components of complex solutions, code, and any output. Include captions for tables, images, and diagrams. Use formal and detailed mathematical and scientific notation throughout the document.

While APA style is not required for the body of this assignment, solid academic writing is expected, and documentation of sources should be presented using APA formatting guidelines, which can be found in the APA Style Guide, located in the Student Success Center.

Attachment: Operational Task.rar

Reviews

len2993597

9/20/2021 4:25:50 AM

Can you look at the attached document and see if you can provide me with a solution in a Jupyter notebook? The attachment contains the assignment document and the data needed to answer the questions. The class textbook is too large to attach, but it is Data Science Using Python and R by Larose, C. D., & Larose, D. T. (2019).

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd