Data mining project

Assignment Help Other Subject
Reference no: EM132514918

DATA MINING PROJECT

Select the leaf data set for our data mining project

Business and data understanding

Leaf data set is multivariate data. It includes 40 different plant spices. It contains 340 instances and 16 attribute. Name of included attributes are given below
• Class(Species)
• Specimen Number
• Eccentricity
• Aspect Ratio
• Elongation
• Solidity
• Stochastic Convexity
• Isoperimetric Factor
• Maximal Indentation Depth
• Lobedness
• Average Intensity
• Average Contrast
• Smoothness
• Third moment
• Uniformity
• Entropy

In these attributes different plants are classified into 36 classes and specimen numbers represents that how many spacemen produced by the plant according to the class whereas rest of attributes explains the characteristics of plant. The data type of all attributes is real. All values of attributes are available in this data set. Associated task of given data set is classification. We classify the class of different plants according to the characteristics of leaf.

Modelling
According to the CRIS-DM model next phase is modelling. In this phase we select the operator that will be useful to do the classification of class attribute. We make an effort to apply the four different operators such as decision tree, Random Forest Tree, Deep Learning, and Naive Bayes. Then, Naive Bayes are unsuccessful for leaf data set because the data is in numerical form and Naive Bayes cannot handle the numerical data. As a result, we select decision tree, random forest tree, and deep learning operator to use in the process.

Next, we select the root mean squared error criteria to analyse the result. In addition to this, we select other useful operators to complete the process such as select attribute, set role, split data, apply model, and performance.

Select attribute operator: we used this operator to select the attributes for our process. This operator allows selecting which attributes should be a part of resulting.

Set role operator: This operator to set the role of attribute and target the role as label.
Split data: This operator split up the data into two or more examples sets or separates the data.
Apply model: This model is trained by another operator to the example set. The goal of this model is to predict or classify invisible data and transform the data by applying a pre-processing model.
Performance: This operator is used for statistical performance evaluation of classification tasks. This operator provides a list of performance criteria for the classification task.

Evaluation
This is the evaluation phase. In this phase, we use the operators in the Rapid miner software for the objective.

1. Decision tree: A decision tree is a set of trees, similar to set of nodes, designed to create decisions about classifying values into classes or estimating numerical target values. Each node represents a split rule for a particular attribute. This operator can handle an example set containing nominal and numeric properties. In this process, firstly import the data.

Attachment:- Data mining project.rar

Reference no: EM132514918

Questions Cloud

Random samples of the same sample size : Assume 50 random samples of the same sample size are taken from a population, and a 90% confidence interval is constructed from each sample.
Estimate the number of pistons in the sample : A machine that manufactures automobile pistons 1% is estimated to produce a defective piston of the time. Suppose that this estimate is correct
Examine pairwise correlations between variables : What R command accomplishes this simultaneously for all variables in your data frame called "mydata"?
Use of caffeine lowered birth weights : A researcher studied whether pregnant women who consumed more than 800 mg of caffeine per day had babies with a lower birth weight (in lbs).
Data mining project : Select the leaf data set for our data mining project - In these attributes different plants are classified into 36 classes and specimen numbers represents
What is the stocks value : Assume that the dividend growth rate estimate is increased to a constant 7 percent per year. What is the stock's value
ICT616 Data Resources Management Assignment : ICT616 Data Resources Management Assignment Help and Solution, Murdoch University - Assessment Writing Service
Government inspector tests the drinking water : A government inspector tests the drinking water at 10 locations around a certain city and finds the following levels of salinity (in parts per million of course
Prepare the entry Lucy should make on June : Lucy factored $2,500,000 of accounts receivable with Ethel on a without recourse basis on May 1. Prepare the entry Lucy should make on June

Reviews

Write a Review

Other Subject Questions & Answers

  The criminal justice system does the criminal justice agency

In what component of the criminal justice system does the criminal justice agency or individuals in question belong?

  Assume the population

Assume the population of the United States increases by 1%/yr. How many BTU of energy will have to be added to the national annual energy budget this year to maintain the same per capita expenditure? What is this in gallons of petroleumm if it all co..

  What do the plans require private employers to do

What do the plans require private employers to do

  Does the argument match with what you know of the topic

What suggestions do you have for the author to improve the argument?Did he/she forget to add any significant information (if so, what)?

  Discuss the terms endocrinology and neurology

By Wednesday, June 21, 2017, you will write the final 2 reports, referring to the departments of Endocrinology and Neurology and use them as your script.

  Describe some methods for motivating employees

Write a three to five (3-5) page paper in which you: Describe some methods for motivating employees and improving behaviors within the workplace

  What would be the expected outcomes

Choose an organization and identify an area in which it needs to communicate a change. What type of change agent and change model needs to be implemented?

  Which program would be most appropriate for john

Imagine that you are developing a prerelease plan for John Doe, who will be eligible for parole in nine months. John is 22 years old and dropped out of high school when he was 16 (a junior) to care for his sick mother and go to work to help his fa..

  Evaluate the strategies and outcomes of your department

Read the "Staying on Course with Strategic Metrics" article. As a future health care administrator, you will be required to evaluate the strategies, efficiency.

  Assignment of accident settlement

After being injured in a motorcycle accident, Gary retained the law firm of High & Low to represent him in suing the other driver. Later, in an unrelated incident, Gary dislocated his shoulder and required surgery.

  Interpret the role of a flat character

Interpret the role of a flat character in "Young Goodman Brown" or "Rip Van Winkle." In writing this interpretive essay, consider how your chosen flat character influences the action in your chosen short story.

  Write a software recommendations report

In the report, be sure to justify your recommendations by analyzing the pros and cons of each of the six software systems you evaluated.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd