QUESTION 1
1. In a decision tree, what is the difference between a node and a leaf?
2. A decision node is a decision point where a variable is tested, while a leaf node specifies the label for the classification.
3. A decision node is a decision point, with the exception of the root node, where a variable is tested, while a leaf node specifies the label for the classification.
4. Both nodes can contain either a test for a variable or a label for the classification output.
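The distinction tested above is easy to see by printing a fitted tree. A minimal sketch, assuming scikit-learn (the question itself names no library): decision nodes print as variable tests, leaves as class labels.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

# Decision nodes appear as "feature <= threshold" tests;
# leaf nodes appear as "class: ..." labels.
print(export_text(tree, feature_names=data.feature_names))
```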
QUESTION 2
1. Can you apply Decision Trees to a numeric variable?
2. Yes, a decision tree can handle numerical targets as long as they are continuous.
3. In general you cannot use numerical targets in a decision tree model. However, you can bin your numeric variables to create "categories" and then apply the decision tree model.
4. No, you cannot use numerical targets in a decision tree model.
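Both routes mentioned in the options can be sketched in a few lines. This assumes scikit-learn and pandas, neither of which the question names: a regression tree handles a continuous numeric target directly, while binning turns it into categories for a classification tree.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 1, size=100)  # continuous numeric target

# Route 1: a regression tree fits the numeric target directly.
reg = DecisionTreeRegressor(max_depth=3).fit(X, y)

# Route 2: bin the numeric target into "categories", then classify.
y_binned = pd.cut(y, bins=3, labels=["low", "mid", "high"]).astype(str)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y_binned)
```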
QUESTION 3
1. Given a dataset, is the decision tree associated with the dataset unique?
2. Yes, the decision tree generated from the same dataset is necessarily unique.
3. No, you can have different decision trees generated from the same dataset, but they are always binary.
4. No, you can have different decision trees generated from the same dataset with different "shapes".
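A quick way to convince yourself of the correct answer: fit the same data with different splitting criteria and compare the resulting structures. A sketch assuming scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
gini_tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# Same dataset, potentially different "shapes":
print(gini_tree.get_depth(), gini_tree.get_n_leaves())
print(entropy_tree.get_depth(), entropy_tree.get_n_leaves())
```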
QUESTION 4
1. Why does CART try to generate leaf nodes that are as pure as possible?
2. Even though the confidence is reduced in a pure leaf node, the ability to generalize is improved.
3. Having leaf nodes that are as pure as possible reduces the classification error.
4. A pure leaf node creates a balanced branch, with 50% of the nodes on the left side and 50% on the right.
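The correct option ties purity to classification error, which a one-line calculation makes concrete. A back-of-the-envelope sketch in plain Python (leaf_error is my own illustrative helper): a leaf predicts its majority class, so its error rate is the fraction of minority-class samples it holds.

```python
def leaf_error(class_counts):
    """Misclassification rate of a leaf that predicts its majority class."""
    total = sum(class_counts)
    return 1 - max(class_counts) / total

print(leaf_error([10, 0]))  # pure leaf   -> 0.0 error
print(leaf_error([6, 4]))   # impure leaf -> 0.4 error
```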
QUESTION 5
1. Why might CART create a node that is not pure?
2. This cannot happen because the split function only generates pure nodes.
3. This cannot happen by definition of the split function.
4. This happens when all the variables in a particular subset have the same values but the target still has several different outcomes. In this situation the algorithm cannot split the data any further.
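The situation described in the correct option is easy to reproduce: give the tree two rows with identical feature values but different labels. A minimal sketch assuming scikit-learn:

```python
from sklearn.tree import DecisionTreeClassifier

X = [[1.0], [1.0], [2.0]]  # the first two rows are indistinguishable
y = [0, 1, 1]              # ...but carry different labels

tree = DecisionTreeClassifier().fit(X, y)

# The leaf containing the duplicated rows cannot be pure:
print(tree.predict_proba([[1.0]]))  # [[0.5, 0.5]]
```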
QUESTION 6
1. Why do we need to prune a decision tree?
2. We prune a decision tree to simplify the rules generated by the algorithm.
3. In order to reduce complexity and avoid memorizing the training set, occasionally we might need to "prune" the decision tree by removing some branches.
4. When the tree is too "bushy" we need to prune it.
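In practice, pruning can be done with cost-complexity pruning. A minimal sketch assuming scikit-learn, where the ccp_alpha value of 0.02 is an arbitrary choice for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)

# Pruning removes branches, leaving a simpler tree:
print("leaves:", full.get_n_leaves(), "->", pruned.get_n_leaves())
```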
QUESTION 7
1. What are the main differences between CART and C4.5? (Select all the answers that apply)
2. C4.5 allows more than two branches, while CART does not.
3. C4.5 uses entropy as its splitting function, while CART uses a probability-based function.
4. C4.5 generates purer nodes compared to CART.
5. C4.5 produces less "bushy" trees compared to CART.
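For hands-on comparison, note that scikit-learn implements an optimized version of CART and does not ship C4.5; switching the criterion to entropy only mimics C4.5's information-based splitting, and multiway branches are not available there. A rough sketch under that caveat:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Classic CART choice: Gini impurity, binary splits only.
cart_style = DecisionTreeClassifier(criterion="gini").fit(X, y)

# Entropy-based splitting, loosely in the spirit of C4.5.
c45_style = DecisionTreeClassifier(criterion="entropy").fit(X, y)
```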