Develop a CART model using the test data set

Reference no: EM132993597

This assignment begins our study of predictive modeling techniques, starting with two variants of decision trees, CARTs and Random Forests. You will first work through the mechanical aspects of programming these methods, then in Part 2 you will examine some of the theoretical and mathematical bases behind them.

Part 1: Operational Tasks

For this assignment, work with the loans_training and loans_test data sets. Use Python to solve each problem.
To demonstrate completion of this assignment, create a Word document with your working code, screenshots of program results, and written answers to questions. Upload your final Jupyter notebook and Word document to the LMS when complete.

1. Create a CART model using the training data set that predicts Approval using Debt to Income Ratio, FICO Score, and Request Amount. Visualize the decision tree. Describe the first few splits in the decision tree.

2. Develop a CART model using the test data set that uses the same target and predictor variables. Visualize the decision tree. Investigate the splits in the decision tree. Does the tree built using the test data match the tree built using the training data?

3. Build a C5.0 model using the training data set that predicts Approval using Debt to Income Ratio, FICO Score, and Request Amount. Specify a minimum of 1,000 cases per terminal node. Visualize the decision tree. Describe the first few splits in the decision tree.

4. How does your C5.0 model compare to your CART model for the loans_training data? Describe the similarities and differences.

5. Create a C5.0 model using the test data set that utilizes the same target variable, predictor variables, and minimum cases criterion. Visualize the decision tree. Does the tree built using the test data match the tree built using the training data?

6. Use random forests on the training data set to obtain the predicted value of Approval using the same predictor variables as in the CART and C5.0 models.

7. Use random forests on the test data set to obtain the predicted value of Approval in the test data set. Build a table comparing the predictions from the training and test data sets. How do they compare?
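
When describing the splits in tasks 1 through 5, it can help to see how a single split on a numeric predictor is scored. The sketch below uses only the Python standard library and a small made-up sample (not the loans data). It scores candidate thresholds with Gini impurity, the criterion CART uses, and with entropy-based information gain, the kind of criterion C5.0 builds on. The function names (`gini`, `entropy`, `split_gain`, `best_threshold`) and the toy FICO values are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum of p * log2(p) over the classes."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gain(values, labels, threshold, impurity):
    """Impurity reduction from splitting into value <= threshold and value > threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(labels) - weighted


def best_threshold(values, labels, impurity):
    """Score midpoints between consecutive distinct values; return (threshold, gain)."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    scored = [(t, split_gain(values, labels, t, impurity)) for t in candidates]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy sample: FICO scores with approval labels (NOT the loans data).
fico = [580, 610, 640, 655, 670, 700, 720, 760, 790]
approval = ["F", "F", "F", "T", "F", "T", "T", "T", "T"]

gini_split = best_threshold(fico, approval, gini)
entropy_split = best_threshold(fico, approval, entropy)
print("Best split by Gini impurity:   ", gini_split)
print("Best split by information gain:", entropy_split)
```

On this toy sample both criteria select the same threshold (685.0). On real data the two criteria can rank splits differently, which is one reason a CART tree and a C5.0 tree built on the same data may not match.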

Part 2: Mathematical and Statistical Basis

1. Read Delen et al. (2013). Explain how the authors used principal component analysis (PCA) to decompose their data into linear components, similar to discriminant function analysis in MANOVA. How were the factors, eigenvectors, and the covariance matrix determined? What role did the squared error distance play in their conclusions?

2. Continuing with Delen et al. (2013), what role did the chi-squared automatic interaction detector (CHAID) play in their decision tree algorithm? How does the CHAID technique compare to C5.0, CART (referred to as C&RT by the authors), and the quick, unbiased, efficient statistical tree (QUEST)? How do the accuracies of these techniques compare, and how were they measured by the authors?

3. Read Meena et al. (2019). Critique the authors' implementation of the CART algorithm, including their association rules and the set-theoretic justifications they present in Sections 4.4, 4.5, 5, and 6. Do these mathematical bases properly support the computational results presented in Section 7? Why or why not?
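
As background for question 1, the standard textbook formulation of PCA is summarized below. This is the general definition, stated under the assumption of n observations x_i on p variables; it is not necessarily the exact procedure followed by Delen et al. (2013), which should be checked against the paper.

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i,
&
S &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}
&& \text{(sample mean and covariance matrix)} \\
S v_k &= \lambda_k v_k,
&
\lambda_1 &\ge \lambda_2 \ge \dots \ge \lambda_p \ge 0
&& \text{(eigenvectors and eigenvalues)} \\
z_{ik} &= v_k^{\top}(x_i - \bar{x}),
&
\frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} &= \text{share of variance of component } k
&& \text{(component scores)}
\end{align*}
```

Each eigenvector v_k defines one linear component, and its eigenvalue gives the variance that component captures.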

Include references to all theoretical concepts and works cited. Show all your steps with explanations. Explain major components of complex solutions, code, and any output. Include captions for tables, images, and diagrams. Use formal and detailed mathematical and scientific notation throughout the document.

While APA style is not required for the body of this assignment, solid academic writing is expected, and documentation of sources should be presented using APA formatting guidelines, which can be found in the APA Style Guide, located in the Student Success Center.

Attachment:- Operational Task.rar


Reviews

len2993597

9/20/2021 4:25:50 AM

Can you look at the attached document and see if you can provide me with a solution in a Jupyter notebook? The attachment contains the assignment document and the data needed to answer the questions. The class textbook is too large to attach, but it is Data Science Using Python and R by Larose, C. D., & Larose, D. T. (2019).


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd