Supervised Learning - OneR models
Part 1. Because we all know that more data is better, I have merged your survey data with that of the previous DM_2018 cohort.
Using the HW5_survey1820.xls dataset:
a. Draw a 1R tree for each student Q-attribute (i.e., Q2 through Q8) to predict the student's rating of the 'ethical' descriptor as applied to the Wired article's subject. What is each tree's error rate?
b. Draw a 1R tree to predict each student Q-attribute from the answers for the descriptor 'deceitful'. What is each tree's error rate?
c. For each of approaches a and b above, which tree(s) seem to give the "best" (i.e., most trustworthy) results? What is the second-best tree for each? Why might you prefer the second-best tree's results to the "best" tree's results?
d. Work up and describe the general profile of the person who rates the 'deceitful' descriptor as a 1. As a 2. As a 3. How confident are you in the accuracy of these profiles, as applied to CS majors generally?
e. Which trees do you feel are the most accurate and best-predictive?
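For reference, the 1R procedure used throughout Part 1 can be sketched as follows: for each value of a single predictor attribute, predict the most frequent target class, then count the instances that rule gets wrong. This is a minimal sketch only. The function name `one_r`, the dict-per-row layout, and the toy rows are assumptions made for illustration; the rows are invented and are not taken from HW5_survey1820.xls.

```python
from collections import Counter, defaultdict


def one_r(rows, attribute, target):
    """Build a 1R rule for one attribute and return (rule, error_rate).

    rows is a list of dicts (an assumed layout, one dict per respondent).
    """
    # Count target classes separately for each value of the attribute.
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attribute]][row[target]] += 1

    # Each attribute value predicts its majority target class.
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}

    # Every instance outside its branch's majority class is an error.
    errors = sum(sum(c.values()) - c.most_common(1)[0][1]
                 for c in counts.values())
    return rule, errors / len(rows)


# Invented toy data: Q4 answers and 'ethical' ratings on the 1-3 scale.
toy_rows = [
    {"Q4": "yes", "ethical": 1},
    {"Q4": "yes", "ethical": 1},
    {"Q4": "yes", "ethical": 2},
    {"Q4": "no", "ethical": 3},
    {"Q4": "no", "ethical": 3},
    {"Q4": "no", "ethical": 1},
]

rule, error_rate = one_r(toy_rows, "Q4", "ethical")
print(rule, error_rate)
```

On the toy rows, each branch misclassifies one of its three instances, so the tree's error rate is 2/6. Repeating the call for each Q-attribute and comparing error rates gives the ranking asked for in item c.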
Part 2. Using the Iris data in HW3_train.csv, discretize each of the numeric ranges for attributes A through D. Draw the resulting 1R trees and report their accuracy on your training data.
a. Use 6 as your minimum-majority value, that is, the minimum size of the majority class in each discrete subset of your numeric data.
b. Determine the accuracy of each tree on the testing set HW3_test.csv.
c. Incorporate these results into a table that includes your accuracy results from HW3 (K-Means and Fuzzy Classification models). Discuss these results and how they compare.
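The minimum-majority discretization in Part 2 can be illustrated with a short sketch. It follows one common variant of 1R discretization, which may differ in detail from the version taught in class: sort the instances by value, grow a bucket until its majority class has at least the minimum count, keep extending while the next instance matches that class or repeats the last value, and finally merge adjacent buckets that predict the same class. The name `discretize_1r` and the toy numbers are assumptions for illustration; the data is invented and is not from HW3_train.csv.

```python
from collections import Counter


def majority(bucket):
    """Return (label, count) of the most frequent class in a bucket."""
    return Counter(label for _, label in bucket).most_common(1)[0]


def discretize_1r(values, labels, min_majority=6):
    """Return (cut_points, predicted_classes, training_error_rate)."""
    pairs = sorted(zip(values, labels))
    buckets, current, i = [], [], 0
    while i < len(pairs):
        current.append(pairs[i])
        i += 1
        label, count = majority(current)
        if count < min_majority:
            continue  # bucket is not yet "pure enough" to close
        # Extend while the next instance has the majority class
        # or shares its value with the last instance in the bucket.
        while i < len(pairs) and (pairs[i][1] == label
                                  or pairs[i][0] == current[-1][0]):
            current.append(pairs[i])
            i += 1
        buckets.append(current)
        current = []
    if current:
        buckets.append(current)  # leftover bucket may be undersized

    # Merge neighbouring buckets that predict the same class.
    merged = []
    for bucket in buckets:
        if merged and majority(merged[-1])[0] == majority(bucket)[0]:
            merged[-1] = merged[-1] + bucket
        else:
            merged.append(bucket)

    # Cut points sit midway between adjacent buckets.
    cuts = [(a[-1][0] + b[0][0]) / 2 for a, b in zip(merged, merged[1:])]
    classes = [majority(b)[0] for b in merged]
    errors = sum(len(b) - majority(b)[1] for b in merged)
    return cuts, classes, errors / len(pairs)


# Invented toy data; min_majority=2 keeps the example small.
toy_values = [1.0, 1.2, 1.4, 1.5, 4.0, 4.5, 4.7, 5.0]
toy_labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
cuts, classes, train_error = discretize_1r(toy_values, toy_labels,
                                           min_majority=2)
print(cuts, classes, train_error)
```

The toy call passes `min_majority=2` only because the sample is tiny; the assignment's value of 6 is the function's default. Accuracy on a test set can then be measured by assigning each test value to its interval using the cut points and comparing the predicted class with the true one.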
Part 3. Download the WEKA application to your computer. Use its Explorer module to select the full Iris dataset (provided by WEKA as an ARFF file). Use the entire dataset as a training-set; do not worry about using a test-file. Use WEKA's OneR classifier to determine the best single-attribute predictor for irises.
Add this result to your model accuracy-comparison table from Part 2 above, and discuss where it ranks among the previous three models.
Survey Questions
For your reference, these were the survey questions in the Qualtrics survey you took earlier this semester:
Question 1 Rate the following attributes on a scale of 1 (least applicable) to 3 (most applicable)
:
:
Question 2 Are you currently (or have you been) in a long-term relationship?
Question 3 What is your gender?
Question 4 Are you a CS major?
Question 5 Are you 22 years old (or older)?
Question 6 Is/Was your hometown community of population 20,000 or less?
Question 7 Is your most recent cumulative GPA 3.0 or above?
Question 8 Will you have graduated by Summer '20?