Reference no: EM131109300
Decision Tree
One major issue for any decision tree algorithm is how to choose the attribute on which to split the data set so that a well-balanced tree is created. The most traditional approach is the ID3 algorithm, proposed by Quinlan in 1986. The detailed ID3 algorithm is shown in the slides, and the textbook discusses it in Section 18.3. For this problem, follow the ID3 algorithm and manually calculate the values for a data set similar to (but not the same as) the one in the slides (p. 147). This exercise should give you deeper insight into how the ID3 algorithm executes. Note that the concepts discussed here (for example, entropy and information gain) are also very important in information theory and signal processing. The new data set is shown below. In this example, row 10 has been removed from the original set and all other rows remain the same.
Following the conventions used in the slides, show your manual working and calculate the following values: Entropy(S), Entropy(S_weather=sunny), Entropy(S_weather=windy), Entropy(S_weather=rainy), Gain(S, weather), Gain(S, parents), and Gain(S, money). Based on the last three values, which attribute should be chosen to split on?
Show in detail how you obtain each result.
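For reference, the requested quantities are usually defined as follows in common statements of ID3. This is a sketch under assumptions: the slides and their exact conventions are not reproduced here, so the base-2 logarithm and the symbols p_c (fraction of rows in S with decision c) and S_v (subset of S where attribute A takes value v) are assumed rather than taken from the source.

```latex
\mathrm{Entropy}(S) = -\sum_{c} p_c \log_2 p_c

\mathrm{Gain}(S, A) = \mathrm{Entropy}(S)
  - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)
```

A subset in which every row has the same decision has entropy 0 and contributes nothing to the sum.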
Weekend | Weather | Parents | Money | Decision (Category)
W1 | Sunny | Yes | Rich | Cinema
W2 | Sunny | No | Rich | Tennis
W3 | Windy | Yes | Rich | Cinema
W4 | Rainy | Yes | Poor | Cinema
W5 | Rainy | No | Rich | Stay in
W6 | Rainy | Yes | Poor | Cinema
W7 | Windy | No | Poor | Cinema
W8 | Windy | No | Rich | Shopping
W9 | Windy | Yes | Rich | Cinema
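The manual calculation can be cross-checked with a short script. The following is a minimal sketch, not the slides' method: it assumes the common base-2 definition of entropy and the usual definition of information gain (entropy of S minus the size-weighted entropy of each subset). The names `ROWS`, `ATTRS`, `entropy`, and `gain` are illustrative and do not come from the source.

```python
from collections import Counter
from math import log2

# Data set transcribed from the table above (rows W1..W9):
# (weather, parents, money, decision)
ROWS = [
    ("Sunny", "Yes", "Rich", "Cinema"),
    ("Sunny", "No", "Rich", "Tennis"),
    ("Windy", "Yes", "Rich", "Cinema"),
    ("Rainy", "Yes", "Poor", "Cinema"),
    ("Rainy", "No", "Rich", "Stay in"),
    ("Rainy", "Yes", "Poor", "Cinema"),
    ("Windy", "No", "Poor", "Cinema"),
    ("Windy", "No", "Rich", "Shopping"),
    ("Windy", "Yes", "Rich", "Cinema"),
]

# Column index of each attribute (hypothetical names, for illustration).
ATTRS = {"weather": 0, "parents": 1, "money": 2}


def entropy(labels):
    """Base-2 entropy of a list of decision labels (assumed convention)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain(rows, col):
    """Information gain from splitting `rows` on the attribute in column `col`."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


print("Entropy(S) =", round(entropy([r[-1] for r in ROWS]), 4))
for weather in ("Sunny", "Windy", "Rainy"):
    subset = [r[-1] for r in ROWS if r[0] == weather]
    print(f"Entropy(S_weather={weather.lower()}) =", round(entropy(subset), 4))
for name, col in ATTRS.items():
    print(f"Gain(S, {name}) =", round(gain(ROWS, col), 4))
```

Under these assumed definitions, the weighted subset entropy works out to 8/9 for both weather and parents, so their gains coincide and exceed the gain for money. The source does not say how ties are broken, so the script reports the gains rather than picking an attribute.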