Naïve Bayes algorithm for text classification, Computer Engineering


Assignment 3: Naïve Bayes algorithm for text classification.

First part:

In this assignment, we will redo the document classification task from Assignment 2 using the same Reuters dataset. This time, however, you should implement the multinomial Naïve Bayes algorithm instead of k-NN. Naïve Bayes used to be the de facto method for text classification. Try various smoothing parameters for the Naïve Bayes learner. What is the accuracy of your learner? Which parameters work best?
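As a starting point, here is a minimal sketch of a multinomial Naïve Bayes learner with additive (Lidstone) smoothing. It is not a reference solution: the function names, the `alpha` parameter, and the tiny toy corpus are all assumptions made for illustration, and documents are assumed to be already tokenized into lists of words. Loading and splitting the Reuters data is left out.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial NB model. docs: list of token lists; alpha > 0 is the smoothing parameter."""
    vocab = {tok for doc in docs for tok in doc}
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for c, n_c in class_doc_counts.items():
        # Prior: fraction of training documents in class c.
        model["log_prior"][c] = math.log(n_c / len(docs))
        # Smoothed likelihood: (count + alpha) / (total tokens in c + alpha * |V|).
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        model["log_likelihood"][c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return model

def predict(model, doc):
    """Return the class with the highest log posterior; tokens outside the vocabulary are skipped."""
    best_label, best_score = None, -math.inf
    for c, log_prior in model["log_prior"].items():
        score = log_prior + sum(
            model["log_likelihood"][c][t] for t in doc if t in model["vocab"]
        )
        if score > best_score:
            best_label, best_score = c, score
    return best_label

def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(docs)

# Hypothetical toy corpus standing in for the Reuters training and test splits.
train_docs = [["wheat", "export", "grain"], ["grain", "harvest", "wheat"],
              ["oil", "barrel", "price"], ["crude", "oil", "export"]]
train_labels = ["grain", "grain", "crude", "crude"]
test_docs = [["wheat", "harvest"], ["oil", "price"]]
test_labels = ["grain", "crude"]

# Try several smoothing values and report the accuracy for each.
for alpha in (0.01, 0.1, 1.0):
    model = train_multinomial_nb(train_docs, train_labels, alpha=alpha)
    print(alpha, accuracy(model, test_docs, test_labels))
```

Working in log space avoids numeric underflow on long documents. With `alpha = 0` an unseen word would give `log(0)`, which is why the sketch requires a strictly positive smoothing value.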

Second Part:

In this part, you will compare the performance of the k-NN classifier and the Naïve Bayes classifier for text classification. Follow the steps below:

1. Take the best classifier from your second assignment (k-NN). Choose the value of k and the distance/similarity measure that gave the best performance.

2. Compare the best k-NN classifier with the Naïve Bayes classifier. Run both the k-NN and the Naïve Bayes learner 50 times. Compute the mean and standard deviation of the results. Then compute the t-statistic and, at significance levels of 0.005, 0.01, and 0.05, determine which algorithm (k-NN or Naïve Bayes) is better. Report the results in a paper and submit it.
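The comparison in step 2 could be sketched as follows. The assignment does not say which t-test to use, so this example assumes Welch's two-sample t-test (unequal variances) on two lists of per-run accuracies. If both learners are evaluated on the same 50 train/test splits, a paired t-test on the per-run differences would be the more natural choice. The helper name `welch_t` and the short accuracy lists are hypothetical placeholders, not real results.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on samples a and b."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard errors, using the sample variance (n - 1 denominator).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Placeholder accuracies; in the assignment each list would hold 50 values,
# one per run of the corresponding learner.
nb_acc = [0.90, 0.92, 0.91, 0.93]
knn_acc = [0.85, 0.86, 0.84, 0.87]

print("NB  mean/std:", statistics.mean(nb_acc), statistics.stdev(nb_acc))
print("kNN mean/std:", statistics.mean(knn_acc), statistics.stdev(knn_acc))

t, df = welch_t(nb_acc, knn_acc)
print("t =", t, "df =", df)
```

To reach a decision, compare |t| against the critical value of the t-distribution with `df` degrees of freedom at each significance level (0.005, 0.01, 0.05), taken from a t-table or a statistics library. If SciPy is available, `scipy.stats.ttest_ind(nb_acc, knn_acc, equal_var=False)` returns the same statistic together with a p-value.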

 

 

