Create a classification model for letter recognition

Reference no: EM131063551

Problem 1: Download the letter recognition data from: https://archive.ics.uci.edu/ml/datasets/Letter+Recognition

The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The character images were based on 20 different fonts, and each letter within these 20 fonts was randomly distorted to produce a file of 20,000 unique stimuli. Each stimulus was converted into 16 primitive numerical attributes (statistical moments and edge counts), which were then scaled to fit into a range of integer values from 0 through 15. The attribute information is listed below; more detail on the data and how it was used for data mining research can be found in the paper:

P. W. Frey and D. J. Slate, "Letter Recognition Using Holland-style Adaptive Classifiers," Machine Learning, vol. 6, no. 2, March 1991.

Attribute Information:

1. lettr: capital letter (26 values from A to Z)
2. x-box: horizontal position of box (integer)
3. y-box: vertical position of box (integer)
4. width: width of box (integer)
5. high: height of box (integer)
6. onpix: total # on pixels (integer)
7. x-bar: mean x of on pixels in box (integer)
8. y-bar: mean y of on pixels in box (integer)
9. x2bar: mean x variance (integer)
10. y2bar: mean y variance (integer)
11. xybar: mean x y correlation (integer)
12. x2ybr: mean of x * x * y (integer)
13. xy2br: mean of x * y * y (integer)
14. x-ege: mean edge count left to right (integer)
15. xegvy: correlation of x-ege with y (integer)
16. y-ege: mean edge count bottom to top (integer)
17. yegvx: correlation of y-ege with x (integer)
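As a starting point, the attribute list above can be turned into a small loader. This is only a sketch: it assumes the downloaded file is comma-separated, has no header row, and stores the letter in the first field followed by the 16 integer attributes. The function name `load_letters` and the two sample rows are made up for illustration and are not taken from the real file.

```python
import csv
import io

# Column names follow the attribute list; the first column is the class label.
COLUMNS = [
    "lettr", "x-box", "y-box", "width", "high", "onpix", "x-bar", "y-bar",
    "x2bar", "y2bar", "xybar", "x2ybr", "xy2br", "x-ege", "xegvy", "y-ege", "yegvx",
]

def load_letters(handle):
    """Read comma-separated rows from a file-like object into (labels, features)."""
    labels, features = [], []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        labels.append(row[0])
        features.append([int(value) for value in row[1:]])
    return labels, features

# Two made-up rows in the assumed layout (hypothetical values, not real data).
sample = io.StringIO(
    "A,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0\n"
    "B,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7\n"
)
labels, features = load_letters(sample)
print(labels, len(features[0]))
```

For the real data, the same function could be called on an opened file, for example `with open(path, newline="") as f: labels, features = load_letters(f)`, where `path` points to wherever the download was saved.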

Create a classification model for letter recognition using decision trees as a classification method with a holdout partitioning technique for splitting the data into training versus testing.
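A minimal sketch of this setup follows, using only the Python standard library and a small synthetic data set rather than the real letter data. It assumes that "number of cases per parent" means the minimum number of cases a node needs before it may be split, and "number of cases per leaf" means the minimum number of cases in each child; the names `holdout_split`, `build_tree`, `min_parent`, and `min_leaf` are hypothetical. The tree uses Gini impurity, which is one common choice and not necessarily what a given tool uses. In scikit-learn, these three limits correspond roughly to the `max_depth`, `min_samples_split`, and `min_samples_leaf` parameters of `DecisionTreeClassifier`.

```python
import random
from collections import Counter

def holdout_split(X, y, test_fraction=0.3, seed=0):
    """Shuffle the row indices once, then cut them into training and testing parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, max_depth, min_parent, min_leaf, depth=0):
    """Grow a binary tree greedily; the three limits act as stopping rules."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(y) < min_parent or len(set(y)) == 1:
        return {"leaf": majority}
    best = None  # (weighted impurity, feature index, threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # a child would hold too few cases
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return {"leaf": majority}
    _, f, t = best
    lo = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    hi = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    grow = lambda part: build_tree([r for r, _ in part], [l for _, l in part],
                                   max_depth, min_parent, min_leaf, depth + 1)
    return {"feature": f, "threshold": t, "left": grow(lo), "right": grow(hi)}

def predict(tree, row):
    """Walk from the root to a leaf and return its majority class."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

def accuracy(tree, X, y):
    return sum(predict(tree, row) == lab for row, lab in zip(X, y)) / len(y)

# Synthetic stand-in data: 4 integer attributes in 0..15 and 3 classes.
rng = random.Random(1)
X, y = [], []
for _ in range(300):
    row = [rng.randint(0, 15) for _ in range(4)]
    label = "A" if row[0] <= 5 else ("B" if row[1] <= 7 else "C")
    if rng.random() < 0.1:  # 10% label noise so that deep trees can overfit
        label = rng.choice("ABC")
    X.append(row)
    y.append(label)

X_tr, y_tr, X_te, y_te = holdout_split(X, y)

# Five illustrative configurations: (max_depth, min_parent, min_leaf).
configs = [(1, 2, 1), (3, 2, 1), (5, 10, 5), (8, 2, 1), (8, 40, 20)]
results = {}
for max_depth, min_parent, min_leaf in configs:
    tree = build_tree(X_tr, y_tr, max_depth, min_parent, min_leaf)
    results[(max_depth, min_parent, min_leaf)] = (
        accuracy(tree, X_tr, y_tr), accuracy(tree, X_te, y_te))
    print(max_depth, min_parent, min_leaf, results[(max_depth, min_parent, min_leaf)])
```

When comparing configurations, a large gap between training and testing accuracy suggests overfitting, so the configuration with the highest testing accuracy and a small gap is usually the safer choice.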

a. Changing the values for the depth, number of cases per parent and number of cases per leaf produces different tree configurations with different accuracies for training and testing. Choose at least five different configurations and report the accuracy for training and testing for each one of them.  Which configuration will you choose as the best model? Explain your answer.

b. For the best tree configuration, report the misclassification matrix and interpret it.  In your opinion, is accuracy a good way to interpret the performance of the model?  If not, suggest other measures.

c. What are the most important three attributes for recognizing the letters?
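For part b, the misclassification (confusion) matrix and some alternatives to plain accuracy can be computed as sketched below. The labels are a tiny made-up example, not results from the letter data, and the helper names `confusion_matrix` and `per_class_metrics` are hypothetical. Per-class precision, recall, and F1 are shown because overall accuracy can hide letters that are confused with each other systematically.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """matrix[a][p] counts cases whose true class is a and predicted class is p."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in classes} for a in classes}

def per_class_metrics(matrix, classes):
    """Return {class: (precision, recall, f1)} computed from the matrix."""
    metrics = {}
    for c in classes:
        tp = matrix[c][c]
        fn = sum(matrix[c][p] for p in classes) - tp  # true c, predicted as something else
        fp = sum(matrix[a][c] for a in classes) - tp  # predicted c, actually something else
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = (precision, recall, f1)
    return metrics

# Made-up true and predicted letters for eight test cases.
actual    = ["O", "O", "O", "Q", "Q", "D", "D", "D"]
predicted = ["O", "Q", "O", "Q", "O", "D", "D", "O"]
classes = sorted(set(actual) | set(predicted))

matrix = confusion_matrix(actual, predicted, classes)
metrics = per_class_metrics(matrix, classes)
overall_accuracy = sum(matrix[c][c] for c in classes) / len(actual)
macro_recall = sum(m[1] for m in metrics.values()) / len(classes)

print("rows = actual, columns = predicted:", classes)
for a in classes:
    print(a, [matrix[a][p] for p in classes])
print("accuracy:", overall_accuracy, "macro recall:", macro_recall)
```

Reading the matrix row by row shows which letters a given true letter is mistaken for; reading it column by column shows which letters are wrongly absorbed into a predicted class.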

Problem 2: On the same data from Problem 1, apply a K-nearest neighbor classifier to classify the data.  Report the following:

1. If you are doing any data transformation, explain the transformation and why it is needed.

2. Report the misclassification matrix and the appropriate performance metrics for different values of K (K=1, 3, 5, and 7). 

3. Interpret the results and also compare them with the ones obtained by using the decision trees.
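The K-nearest neighbor steps above can be sketched as follows, again with the standard library and a tiny made-up data set. The sketch assumes Euclidean distance and min-max scaling as the transformation; because the letter attributes are already scaled to the 0 through 15 range, rescaling may change little there, and that is worth checking rather than assuming. The names `min_max_scale` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def min_max_scale(train, test):
    """Rescale each column to [0, 1] using ranges learned from the training rows only."""
    cols = list(zip(*train))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1 for c in cols]  # avoid dividing by zero on constant columns
    scale = lambda rows: [[(v - l) / s for v, l, s in zip(r, lo, span)] for r in rows]
    return scale(train), scale(test)

def knn_predict(train_X, train_y, row, k):
    """Majority vote among the k training rows closest to `row` (Euclidean distance).

    On a tied vote, the label that appears first among the nearest neighbors wins.
    """
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], row))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up two-attribute data with two well-separated classes.
train_X = [[0, 0], [1, 1], [2, 1], [1, 3], [13, 14], [14, 15], [15, 13], [12, 12]]
train_y = ["A", "A", "A", "A", "B", "B", "B", "B"]
test_X = [[1, 0], [14, 14], [3, 2]]
test_y = ["A", "B", "A"]

train_scaled, test_scaled = min_max_scale(train_X, test_X)

accuracy_by_k = {}
for k in (1, 3, 5, 7):
    predictions = [knn_predict(train_scaled, train_y, row, k) for row in test_scaled]
    accuracy_by_k[k] = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
    print("K =", k, "predictions:", predictions, "accuracy:", accuracy_by_k[k])
```

The scaling ranges are learned from the training rows only and then applied to the test rows, so that no information from the test partition leaks into the model. This brute-force search compares each test row with every training row, which is fine for a sketch but slow on the full 20,000-row data set.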






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd