Describe the classification rule method

Assignment Help Advanced Statistics
Reference no: EM132515484

Data Mining: Basic Methods and Techniques

Laboratory Assignment:

Part 1. Describe the Classification Rule method.

Part 2. Use the classification rule method (Classify tab, rules folder, JRip) on the Weather.nominal data set. How many rules did it produce? Compare this to the decision tree produced on the same data. What is the difference between the two models?
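As a rough illustration of what a rule-based model looks like, the sketch below represents a classifier as an ordered list of if-then rules with a default class. The rules, the names RULES, DEFAULT_CLASS and classify, and the attribute values are hand-written assumptions in the style of the weather data. They are not the output of JRip, which you still need to generate in Weka.

```python
# Hypothetical, hand-written rule list; NOT the rules JRip produces.
# Each rule is (conditions, predicted class); rules are tried in order.
RULES = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "rainy", "windy": "TRUE"}, "no"),
]
DEFAULT_CLASS = "yes"  # used when no rule fires


def classify(instance, rules=RULES, default=DEFAULT_CLASS):
    """Return the class of the first rule whose conditions all match."""
    for conditions, label in rules:
        if all(instance.get(attr) == value for attr, value in conditions.items()):
            return label
    return default


print(classify({"outlook": "sunny", "humidity": "high", "windy": "FALSE"}))
print(classify({"outlook": "overcast", "humidity": "high", "windy": "TRUE"}))
```

A decision tree tests attributes in a fixed hierarchy starting from one root, whereas each rule in a list like this can test its own combination of attributes. That contrast is a useful starting point for the comparison asked for above.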

Part 3. Describe the K-nearest neighbor method.

Part 4. Produce a K-NN model (classifiers.lazy.IBk) for the Weather.numeric data set.
The standard K-nearest neighbor method can be found in the 'lazy' submenu of the list presented when you click 'Choose' in Explorer's Classify window. It is called 'IBk'. Select it, then click on IBk to modify its parameters. The default value of k is 1. Set it to 3 (or another value of your preference) and then click Start to run the classifier.
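To make the role of k concrete, here is a minimal sketch of k-nearest-neighbor classification in plain Python. It is not Weka's IBk implementation. The functions euclidean and knn_predict, and the six (temperature, humidity) points, are assumptions made up for illustration rather than rows taken from the Weather.numeric file.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Predict the majority class among the k training points nearest to query."""
    neighbours = sorted(train, key=lambda row: euclidean(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: ((temperature, humidity), play)
train = [
    ((85.0, 85.0), "no"),
    ((80.0, 90.0), "no"),
    ((83.0, 86.0), "yes"),
    ((70.0, 96.0), "yes"),
    ((68.0, 80.0), "yes"),
    ((65.0, 70.0), "no"),
]

query = (84.0, 87.0)
print("k=1:", knn_predict(train, query, k=1))
print("k=3:", knn_predict(train, query, k=3))
```

On this toy data the single nearest neighbor and the three nearest neighbors disagree, which is the kind of effect to look for when you change k in IBk.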

What is the output? How many instances did it classify correctly and how many incorrectly?
• Try changing the parameter k, the number of neighbors. Did that influence the model's performance?
• Try using different weighting schemes. Did this change influence the model's performance?
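One common weighting scheme is inverse-distance weighting, where each neighbor's vote counts as 1/distance, so closer neighbors matter more. The sketch below is an assumption-laden illustration in plain Python, not Weka's code. The function weighted_knn_predict, the small eps constant, and the toy data are all made up for this example.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Straight-line distance between two numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Vote among the k nearest points, weighting each vote by 1/distance."""
    neighbours = sorted(train, key=lambda row: euclidean(row[0], query))[:k]
    scores = defaultdict(float)
    for point, label in neighbours:
        # eps avoids division by zero when the query equals a training point
        scores[label] += 1.0 / (euclidean(point, query) + eps)
    return max(scores, key=scores.get)


# Hypothetical training data: ((temperature, humidity), play)
train = [
    ((85.0, 85.0), "no"),
    ((80.0, 90.0), "no"),
    ((83.0, 86.0), "yes"),
    ((70.0, 96.0), "yes"),
    ((68.0, 80.0), "yes"),
    ((65.0, 70.0), "no"),
]

print(weighted_knn_predict(train, (84.0, 87.0), k=3))
```

With k=3 an unweighted majority vote on these points would pick "no" (two votes to one), but the single very close "yes" neighbor outweighs the two more distant "no" neighbors once votes are weighted.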

Part 5. Load the soybean.arff data set. Before running Weka, it is worth having a brief look at the data file: under the Preprocess tab, click the Edit button. Alternatively, you can look at the data file using a text editor (Notepad or WordPad would work). Lines beginning with % are comments. Typically the beginning of the file provides background information on the data set, including details of the data itself and references to previous work using the data. The Soybean file contains 683 examples, each of which has 35 attributes plus the class attribute. The task is to assign examples to one of 19 disease classes. Apply the k-nearest neighbor classifier to the soybean data set.
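If you prefer to check the file's structure programmatically, the sketch below skips % comment lines, collects attribute names from @attribute lines, and counts the rows after @data. It is a simplified reader, not a full ARFF parser: it assumes unquoted attribute names without spaces and one instance per line. The function summarize_arff and the inline toy file are hypothetical and are not the real soybean.arff.

```python
def summarize_arff(text):
    """Return (attribute names, number of data rows) for simple ARFF text."""
    attributes, n_rows, in_data = [], 0, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank lines and % comments carry no data
        lower = line.lower()  # ARFF keywords are case-insensitive
        if in_data:
            n_rows += 1
        elif lower.startswith("@attribute"):
            attributes.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    return attributes, n_rows


# Hypothetical toy file for illustration only
sample = """% Toy example, not the real soybean.arff
@relation toy
@attribute outlook {sunny, overcast, rainy}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,FALSE,no
% a comment inside the data section
rainy,TRUE,no
overcast,FALSE,yes
"""

names, rows = summarize_arff(sample)
print(names, rows)
```

Run on the real file, the last attribute listed should be the class attribute, and the counts can be checked against the 35 attributes plus class and 683 examples stated above.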

What percentage of examples are correctly classified? Compare this result with that of the unpruned decision tree on the same data. Investigate the effect of repeating the run with different values of k. Compare and contrast the two methods and their outputs.
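The effect of k can also be explored outside Weka with a small experiment loop. The sketch below uses leave-one-out evaluation as a simple stand-in for the cross-validation you would normally run in the Explorer. The functions knn_predict and loo_accuracy and the one-dimensional toy data are assumptions for illustration and have nothing to do with the soybean results.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority class among the k training values nearest to query (1-D data)."""
    neighbours = sorted(train, key=lambda row: abs(row[0] - query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    correct = 0
    for i, (value, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, value, k) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy data: two well-separated clusters of three points each
data = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (10.0, "b"), (11.0, "b"), (12.0, "b")]

for k in (1, 3, 5):
    print("k =", k, "accuracy =", loo_accuracy(data, k))
```

On this toy data small values of k classify every point correctly, while k=5 forces each vote to include more points from the other cluster than from the point's own, so every prediction is wrong. Real data sets rarely behave this starkly, but the loop shows how to tabulate accuracy against k.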

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd