Naïve Bayes algorithm for text classification, Computer Engineering


Assignment 3: Naïve Bayes algorithm for text classification.

First Part:

In this assignment, we will redo the document classification task from assignment 2 using the same Reuters dataset. This time, however, you should implement the multinomial Naïve Bayes algorithm instead of k-NN. Naïve Bayes used to be the de facto method for text classification. Try various smoothing parameters for the Naïve Bayes learner. What is the accuracy of your learner? Which parameters work best?
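As a starting point, the learner described above could be sketched as follows. This is a minimal illustration rather than a reference solution: it assumes each document has already been tokenized into a list of words, that smoothing means additive (Laplace/Lidstone) smoothing with a parameter `alpha` greater than zero, and that the function names `train_multinomial_nb`, `predict`, and `accuracy` are hypothetical helpers chosen for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    docs   -- list of token lists (assumed pre-tokenized)
    labels -- list of class labels, one per document
    alpha  -- additive smoothing parameter (assumed > 0)
    """
    vocab = {tok for doc in docs for tok in doc}
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    # Log prior: fraction of training documents in each class.
    n_docs = len(docs)
    log_prior = {c: math.log(n / n_docs) for c, n in class_doc_counts.items()}

    # Smoothed log likelihood of each vocabulary token given the class.
    log_likelihood = {}
    for c in class_doc_counts:
        total = sum(token_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            t: math.log((token_counts[c][t] + alpha) / total) for t in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(model, doc):
    """Return the class with the highest posterior log score for one document."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        # Tokens never seen in training are skipped.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][t] for t in doc if t in vocab
        )
    return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(docs)
```

To explore smoothing, one option is to train with several values such as `alpha` in `[0.01, 0.1, 0.5, 1.0, 2.0]` and compare `accuracy` on a held-out portion of the dataset. These particular values are only illustrative.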

Second Part:

In this part, you will compare the performance of the k-NN classifier and the Naïve Bayes classifier for text classification. Follow the steps below:

1. Take the best classifier from your second assignment (k-NN). Choose the value of k and the distance/similarity measure that gave the best performance.

2. Compare the best k-NN classifier with the Bayesian classifier. Run both the k-NN and the Bayesian learner 50 times. Compute the mean and standard deviation of the results. Then compute the t-statistic and, at significance levels of 0.005, 0.01, and 0.05, determine which algorithm (k-NN or Bayesian) is better. Report the results in a paper and submit it.
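The statistics in step 2 can be sketched as follows. The assignment does not say which t-test to use, so this example assumes Welch's two-sample t-statistic, which does not require equal variances. If both learners are evaluated on the same 50 train/test splits, a paired t-test on the per-run differences may be more appropriate. The function names `summarize` and `welch_t` are hypothetical, and each input is assumed to be a list of 50 accuracy values.

```python
import math
import statistics


def summarize(scores):
    """Return the mean and sample standard deviation of a list of scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def welch_t(a, b):
    """Welch's t-statistic and approximate degrees of freedom for two samples.

    a, b -- lists of per-run accuracies (for example, k-NN and Naive Bayes)
    """
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard error of each sample mean.
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df
```

To decide which learner is better, compare the absolute value of `t` with the critical value from a t-distribution table for the computed degrees of freedom at each significance level (0.005, 0.01, and 0.05). When the difference is significant, the sign of `t` shows which learner has the higher mean.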

 

 



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd