How Cross Validation tests a model derived from training data

Reference no: EM132242428

Data Mining Assignment -

Suppose that the following table of instances (cases) was recorded for an insurance company's promotions for its life assurance product. The attributes are self-explanatory. The values in the two promotion attributes should be read as follows: Yes means that the individual was offered that promotion on condition that s/he took out the insurance; No means that the promotion was not offered.

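| ID | Income Range | Gender | Age Range | Holiday Promotion | Wine Promotion | Life Insurance Take-up |
|----|--------------|--------|-----------|-------------------|----------------|------------------------|
| 1  | 40-50K | Male   | 30-40 | No  | Yes | Yes |
| 2  | 30-40K | Female | 30-40 | No  | Yes | No  |
| 3  | 40-50K | Male   | 30-40 | No  | No  | No  |
| 4  | 30-40K | Male   | 30-40 | Yes | Yes | Yes |
| 5  | 50-60K | Female | 20-30 | No  | No  | No  |
| 6  | 20-30K | Female | 40-50 | No  | No  | No  |
| 7  | 30-40K | Male   | 20-30 | Yes | No  | No  |
| 8  | 20-30K | Male   | 20-30 | No  | Yes | Yes |
| 9  | 30-40K | Male   | 30-40 | No  | Yes | Yes |
| 10 | 30-40K | Female | 30-40 | No  | No  | Yes |
| 11 | 40-50K | Female | 30-40 | No  | No  | No  |
| 12 | 20-30K | Male   | 20-30 | No  | Yes | Yes |
| 13 | 50-60K | Female | 20-30 | No  | No  | No  |
| 14 | 40-50K | Male   | 40-50 | No  | Yes | No  |
| 15 | 20-30K | Female | 20-30 | Yes | Yes | No  |
| 16 | 40-50K | Female | 30-40 | No  | No  | No  |
| 17 | 50-60K | Male   | 40-50 | Yes | Yes | Yes |
| 18 | 20-30K | Female | 30-40 | No  | Yes | No  |
| 19 | 20-30K | Male   | 40-50 | Yes | Yes | Yes |
| 20 | 30-40K | Female | 20-30 | Yes | Yes | No  |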

Questions -

1. Use the ID3 decision tree induction method available in the Weka package (with the default settings) to derive a classifier (decision tree) from this set of data. The class attribute is Life Insurance Take-up.
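As a rough illustration of the choice ID3 makes at the root, the sketch below computes the information gain of each attribute over the 20 instances in the table. It is plain Python, not Weka; the names `ROWS`, `ATTRS`, `entropy`, and `info_gain` are invented for this example, and Weka's own implementation may differ in details such as tie-breaking.

```python
from collections import Counter
from math import log2

ATTRS = ["Income Range", "Gender", "Age Range", "Holiday Promotion", "Wine Promotion"]

# The 20 instances from the table; the last field is Life Insurance Take-up.
ROWS = [
    ("40-50K", "Male", "30-40", "No", "Yes", "Yes"),
    ("30-40K", "Female", "30-40", "No", "Yes", "No"),
    ("40-50K", "Male", "30-40", "No", "No", "No"),
    ("30-40K", "Male", "30-40", "Yes", "Yes", "Yes"),
    ("50-60K", "Female", "20-30", "No", "No", "No"),
    ("20-30K", "Female", "40-50", "No", "No", "No"),
    ("30-40K", "Male", "20-30", "Yes", "No", "No"),
    ("20-30K", "Male", "20-30", "No", "Yes", "Yes"),
    ("30-40K", "Male", "30-40", "No", "Yes", "Yes"),
    ("30-40K", "Female", "30-40", "No", "No", "Yes"),
    ("40-50K", "Female", "30-40", "No", "No", "No"),
    ("20-30K", "Male", "20-30", "No", "Yes", "Yes"),
    ("50-60K", "Female", "20-30", "No", "No", "No"),
    ("40-50K", "Male", "40-50", "No", "Yes", "No"),
    ("20-30K", "Female", "20-30", "Yes", "Yes", "No"),
    ("40-50K", "Female", "30-40", "No", "No", "No"),
    ("50-60K", "Male", "40-50", "Yes", "Yes", "Yes"),
    ("20-30K", "Female", "30-40", "No", "Yes", "No"),
    ("20-30K", "Male", "40-50", "Yes", "Yes", "Yes"),
    ("30-40K", "Female", "20-30", "Yes", "Yes", "No"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_gain(rows, col):
    """Class entropy minus the weighted class entropy after splitting on column `col`."""
    base = entropy([r[-1] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[col], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

gains = {name: info_gain(ROWS, i) for i, name in enumerate(ATTRS)}
best = max(gains, key=gains.get)  # attribute ID3 would test at the root

for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gain:.3f}")
```

On this table Gender has the highest gain (about 0.30), so a gain-based method would be expected to test it first; the same calculation is then repeated on each subset to grow the rest of the tree.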

2. What should be the class value for the following unseen case based on the derived tree? Justify your answer.

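| Income Range | Gender | Age Range | Holiday Promotion | Wine Promotion | Life Insurance Take-up |
|--------------|--------|-----------|-------------------|----------------|------------------------|
| 40-50K | Male | 20-30 | No | Yes | ? |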

How would you deal with such cases in general? Outline your solution algorithmically using the structure given below:

algorithm DT-based Classification
    # traversing the tree to reach a leaf node N
    if N's class value is null then
        :
        : write your pseudo code to implement your solution here
        :
    else
        return the class value
end
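One possible way to fill in the null branch, sketched here in Python under stated assumptions, is to fall back on the majority class recorded at the deepest internal node reached. This is only one of several reasonable strategies. The `Node` class, its `majority` field, and the tree below are all hypothetical: the tree is hand-built for illustration and is not the tree Weka derives from the table.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    attribute: Optional[str] = None  # attribute tested here (None for a leaf)
    label: Optional[str] = None      # class value at a leaf; None marks a null leaf
    majority: Optional[str] = None   # most common class among training rows at this node
    children: dict = field(default_factory=dict)

def classify(root, case):
    """Walk the tree; on a null leaf or an unseen value, return the last known majority."""
    node = root
    fallback = root.majority
    while node.attribute is not None:
        if node.majority is not None:
            fallback = node.majority
        child = node.children.get(case.get(node.attribute))
        if child is None:            # attribute value never seen on this branch
            return fallback
        node = child
    if node.label is None:           # null leaf: no training rows reached it
        return fallback
    return node.label

# Hypothetical toy tree (NOT the tree derived from the assignment data).
toy_tree = Node(attribute="Gender", majority="No", children={
    "Female": Node(label="No"),
    "Male": Node(attribute="Age Range", majority="Yes", children={
        "30-40": Node(label="Yes"),
        "40-50": Node(label="Yes"),
        "20-30": Node(label=None),   # null leaf
    }),
})

unseen = {"Income Range": "40-50K", "Gender": "Male", "Age Range": "20-30",
          "Holiday Promotion": "No", "Wine Promotion": "Yes"}
print(classify(toy_tree, unseen))
```

In this toy tree the unseen case reaches a null leaf under Male, so the sketch returns the majority class stored at that Male node. Storing `majority` at every internal node during induction is a design assumption of the sketch; the result on the real tree depends on the tree Weka actually produces.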

3. A decision tree derived from data can be used not only to predict class values for unseen cases, but also to summarize data for analysis. Based on the tree derived in 1), comment on whether the company has conducted its promotion effectively.

4. In the default setting in Weka, there is a setting of "Cross-Validation Folds 10" in the test options. Briefly explain how Cross Validation tests a model derived from training data and why we use it for testing.
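The mechanics of k-fold cross-validation can be sketched in a few lines of Python. The function names `k_fold_indices` and `cross_validate` and the majority-class baseline are assumptions made for this example; Weka's own procedure may differ in details such as how folds are drawn and stratified.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(rows, labels, k, train_fn, predict_fn, seed=0):
    """Fraction of instances predicted correctly by a model that never saw them."""
    folds = k_fold_indices(len(rows), k, seed)
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        # Train on the other k-1 folds only.
        model = train_fn([rows[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        # Test on the fold that was held out.
        correct += sum(predict_fn(model, rows[i]) == labels[i] for i in held_out)
    return correct / len(rows)

# Hypothetical baseline "model": always predict the most common training class.
def train_majority(rows, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, row):
    return model

# Class counts taken from the table (8 Yes, 12 No); attributes are ignored here.
labels = ["Yes"] * 8 + ["No"] * 12
rows = [None] * len(labels)
folds = k_fold_indices(len(labels), 10)
acc = cross_validate(rows, labels, 10, train_majority, predict_majority)
print(f"10-fold accuracy of the majority baseline: {acc:.2f}")
```

Every instance is used for testing exactly once and for training k-1 times, so the estimate is based on predictions for cases the model has not seen. The majority baseline scores 0.60 here because it always predicts No, which gives a floor against which a real classifier can be compared.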

5. Now perform the following tests: vary the number of folds from 2 to 10, run ID3, and record the classification accuracy for each setting. Then change the test option to "Use training set", run ID3 again, and record the classification accuracy. Present the results as a table or a bar chart. Comment on your test results: which method (cross validation or using the training set) is better for testing your derived tree, and why?
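The shape of this experiment can be sketched outside Weka. The Python below uses a deliberately simple stand-in classifier (it memorises training rows and otherwise predicts the majority class) instead of ID3, and deterministic unshuffled folds; both are assumptions made for illustration, so the numbers it prints will not match Weka's.

```python
from collections import Counter

# The 20 instances from the table: five attributes, then Life Insurance Take-up.
DATA = [
    ("40-50K", "Male", "30-40", "No", "Yes", "Yes"),
    ("30-40K", "Female", "30-40", "No", "Yes", "No"),
    ("40-50K", "Male", "30-40", "No", "No", "No"),
    ("30-40K", "Male", "30-40", "Yes", "Yes", "Yes"),
    ("50-60K", "Female", "20-30", "No", "No", "No"),
    ("20-30K", "Female", "40-50", "No", "No", "No"),
    ("30-40K", "Male", "20-30", "Yes", "No", "No"),
    ("20-30K", "Male", "20-30", "No", "Yes", "Yes"),
    ("30-40K", "Male", "30-40", "No", "Yes", "Yes"),
    ("30-40K", "Female", "30-40", "No", "No", "Yes"),
    ("40-50K", "Female", "30-40", "No", "No", "No"),
    ("20-30K", "Male", "20-30", "No", "Yes", "Yes"),
    ("50-60K", "Female", "20-30", "No", "No", "No"),
    ("40-50K", "Male", "40-50", "No", "Yes", "No"),
    ("20-30K", "Female", "20-30", "Yes", "Yes", "No"),
    ("40-50K", "Female", "30-40", "No", "No", "No"),
    ("50-60K", "Male", "40-50", "Yes", "Yes", "Yes"),
    ("20-30K", "Female", "30-40", "No", "Yes", "No"),
    ("20-30K", "Male", "40-50", "Yes", "Yes", "Yes"),
    ("30-40K", "Female", "20-30", "Yes", "Yes", "No"),
]

def train(rows):
    """Stand-in for ID3: memorise each training row, keep the majority class as fallback."""
    table = {}
    for *attrs, label in rows:
        table.setdefault(tuple(attrs), Counter())[label] += 1
    majority = Counter(r[-1] for r in rows).most_common(1)[0][0]
    return table, majority

def predict(model, attrs):
    table, majority = model
    seen = table.get(tuple(attrs))
    return seen.most_common(1)[0][0] if seen else majority

def accuracy(model, rows):
    return sum(predict(model, r[:-1]) == r[-1] for r in rows) / len(rows)

def cv_accuracy(rows, k):
    """k-fold cross-validation with unshuffled folds (row i goes to fold i % k)."""
    correct = 0
    for f in range(k):
        test = rows[f::k]
        training = [r for i, r in enumerate(rows) if i % k != f]
        model = train(training)
        correct += sum(predict(model, r[:-1]) == r[-1] for r in test)
    return correct / len(rows)

train_acc = accuracy(train(DATA), DATA)          # "Use training set"
cv_accs = {k: cv_accuracy(DATA, k) for k in range(2, 11)}

print(f"Use training set: {train_acc:.2f}")
for k, acc in cv_accs.items():
    print(f"{k}-fold CV:        {acc:.2f}")
```

Because the stand-in has already seen every row it is tested on, its training-set accuracy is perfect on this table, while the cross-validated figures are lower. That gap is the optimism the question asks you to comment on; whether ID3 in Weka shows the same pattern is for the actual experiment to establish.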

6. Use the JRip rule induction method available in the Weka package (with the default setting) to derive a classifier (classification rules) from this set of data.

7. What observations do you have on the two classifiers you have obtained in terms of using them for business analysis (as in 3) and for classification of an unseen case (as in 2)?



Reviews

len2242428

2/25/2019 9:37:09 PM

Criteria for assessment - Credit will be awarded against the following criteria:

The classifier derived using ID3 for Q1 [5 marks]

Convincing arguments and solution for Q2 [25 marks]

Valid analysis for Q3 [15 marks]

Clarity of explanation for Q4 [20 marks]

Experiment results and analysis for Q5 [10 marks]

The classifier derived using JRip for Q6 [5 marks]

Clarity of your observations for Q7 [20 marks]


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd