Reference no: EM132514428
Question 1. A classification model is applied to a dataset with 120 transaction records. Below is the confusion matrix for the model.
                         Predicted class
                     Class = 1   Class = 0   Total
Actual   Class = 1      40          25         ?
class    Class = 0      20          35         ?
         Total           ?           ?        120
1.1. Fill in the blanks marked with "?"
1.2. Calculate the following for this model:
• accuracy
• error rate
• sensitivity
• specificity
• precision
• recall
Write the formulas and show the calculation steps. Marks will be deducted if you do not write the formula and/or do not show the calculation steps.
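The required metrics can be cross-checked with a short Python sketch. The variable names (tp, fn, fp, tn) are illustrative, and Class = 1 is assumed to be the positive class:

```python
# Confusion-matrix cells from Question 1 (Class = 1 taken as positive).
tp, fn = 40, 25   # actual Class = 1: predicted 1, predicted 0
fp, tn = 20, 35   # actual Class = 0: predicted 1, predicted 0

total = tp + fn + fp + tn            # should equal 120

accuracy    = (tp + tn) / total      # (TP + TN) / N
error_rate  = (fp + fn) / total      # (FP + FN) / N, i.e. 1 - accuracy
sensitivity = tp / (tp + fn)         # TP / actual positives
specificity = tn / (tn + fp)         # TN / actual negatives
precision   = tp / (tp + fp)         # TP / predicted positives
recall      = sensitivity            # recall is another name for sensitivity

print(accuracy, error_rate, sensitivity, specificity, precision, recall)
```

Note that the sketch only verifies the arithmetic; the assignment still requires the formulas and hand-worked steps.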
Question 2. A classification model is applied to a dataset on tax fraud, where the success class is fraud = 'yes' and the non-success class is fraud = 'no'. The classifier classifies 98 records as fraud = 'yes', of which 41 are correctly classified, and 1042 records as fraud = 'no', of which 998 are correctly classified.
2.1. How many records are there in the data set in total? Draw the confusion matrix and complete it with appropriate values.
2.2. Calculate the following for this model:
• Number of actual positives
• Number of actual negatives
• Number of predicted positives
• True positive rate
• True negative rate
• Recall
• Precision
Write the formulas and show the calculation steps. Marks will be deducted if you do not write the formulas and/or do not show the calculation steps.
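The counts for Question 2 can likewise be derived in a short Python sketch. Only the four figures stated in the question (98, 41, 1042, 998) are taken as given; every other quantity is computed from them, with fraud = 'yes' as the positive class:

```python
# Stated in the question:
pred_pos, pred_neg = 98, 1042    # records predicted 'yes' / 'no'
tp, tn = 41, 998                 # correctly classified in each group

# Derived confusion-matrix cells:
fp = pred_pos - tp               # predicted 'yes' but actually 'no'
fn = pred_neg - tn               # predicted 'no' but actually 'yes'

total      = pred_pos + pred_neg # total records in the dataset
actual_pos = tp + fn             # number of actual positives
actual_neg = fp + tn             # number of actual negatives

tpr       = tp / actual_pos      # true positive rate = recall
tnr       = tn / actual_neg      # true negative rate
precision = tp / pred_pos        # TP / predicted positives
recall    = tpr

print(total, actual_pos, actual_neg, pred_pos, tpr, tnr, recall, precision)
```

Again, this only checks the arithmetic; the formulas and calculation steps must still be written out.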
Question 3. There are two classification models: C1 and C2. The ROC curves for these two classifiers are shown in the attached diagram. The dotted diagonal line is, as usual, the ROC curve for random classification.
Suppose the AUC for C1 is 0.785 and the AUC for C2 is 0.799.
3.1. Which classification model is better? C1 or C2? Justify your answer.
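The AUC comparison can be illustrated with a minimal sketch. The ROC points below are hypothetical (the actual curves are only in the attachment); the helper shows how AUC is the trapezoidal area under a piecewise-linear ROC curve, and why the random-classifier diagonal scores 0.5:

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve (list of (fpr, tpr) points)."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# The dotted diagonal (random classification) has AUC 0.5:
diagonal_auc = auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])

# With the AUC values stated in the question, the higher AUC is preferred:
auc_c1, auc_c2 = 0.785, 0.799
better = "C2" if auc_c2 > auc_c1 else "C1"
print(diagonal_auc, better)
```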
Attachment:- Predictive Analytics.rar