Naïve Bayes algorithm for text classification, Computer Engineering


Assignment 3: Naïve Bayes algorithm for text classification.

First part:

In this assignment, we will redo the document classification task from assignment 2 using the same Reuters dataset. This time, implement the multinomial naive Bayes algorithm instead of k-NN. Naive Bayes used to be the de facto method for text classification. Try various smoothing parameters for the naive Bayes learner. What is the accuracy of your learner? Which parameters work best?
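As a starting point, here is a minimal sketch of multinomial naive Bayes with additive smoothing. It assumes documents are already tokenized into lists of words. The function names (`train_multinomial_nb`, `predict`), the `alpha` parameter name, and the toy documents are illustrative assumptions, not part of the assignment or the Reuters data.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit class priors and smoothed per-class word likelihoods.

    docs   : list of token lists (assumed already tokenized)
    labels : list of class labels, one per document
    alpha  : additive smoothing parameter, must be > 0 (alpha=1 is Laplace)
    """
    vocab = {tok for doc in docs for tok in doc}
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    model = {"log_prior": {}, "log_likelihood": {}}
    for c, n_c in class_doc_counts.items():
        model["log_prior"][c] = math.log(n_c / len(docs))
        # Denominator: all tokens seen in class c plus alpha per vocabulary word.
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        model["log_likelihood"][c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    best_label, best_score = None, -math.inf
    for c, log_prior in model["log_prior"].items():
        ll = model["log_likelihood"][c]
        score = log_prior + sum(ll[t] for t in doc if t in ll)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Toy data, made up for illustration only.
train_docs = [
    ["wheat", "grain", "export"],
    ["grain", "harvest", "wheat"],
    ["oil", "crude", "barrel"],
    ["crude", "oil", "price"],
]
train_labels = ["grain", "grain", "oil", "oil"]

model = train_multinomial_nb(train_docs, train_labels, alpha=1.0)
print(predict(model, ["wheat", "price", "export"]))
```

To study the effect of smoothing, one option is to loop over several `alpha` values (for example 0.01, 0.1, 1, 10), retrain, and record accuracy on held-out documents for each value. Working in log space avoids numerical underflow when documents are long.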

Second part:

In this part, you will compare the performance of the k-NN classifier and the Naïve Bayes classifier for text classification. Follow the steps below:

1. Take the best classifier from your second assignment (k-NN). Choose the value of k and the distance/similarity measure that gave the best performance.

2. Compare the best k-NN classifier with the Bayesian classifier. Run both the k-NN and the Bayesian learner 50 times, then compute the mean and standard deviation of the results. Next, compute the t-statistic and, at significance levels of 0.005, 0.01, and 0.05, determine which algorithm (k-NN or Bayesian) is better. Report the results in a paper and submit it.
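The comparison in step 2 can be sketched as follows. The assignment does not say which t-test to use, so this sketch assumes Welch's two-sample t-test (unequal variances) on two independent lists of per-run accuracies. If both learners are evaluated on the same 50 train/test splits, a paired t-test on the per-run differences may be more appropriate. The function name `welch_t` and the accuracy values are hypothetical placeholders, not real results.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic and approximate degrees of freedom for samples a and b."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Placeholder accuracies; in the assignment each list would hold 50 run results.
nb_acc = [0.80, 0.82, 0.84]
knn_acc = [0.70, 0.72, 0.74]

t_stat, dof = welch_t(nb_acc, knn_acc)
print("NB  mean/std:", statistics.mean(nb_acc), statistics.stdev(nb_acc))
print("kNN mean/std:", statistics.mean(knn_acc), statistics.stdev(knn_acc))
print("t-statistic:", t_stat, "degrees of freedom:", dof)
```

The resulting t-statistic is then compared against the critical value from a t-distribution table for the computed degrees of freedom at each significance level (0.005, 0.01, and 0.05). If the absolute value of t exceeds the critical value, the difference in mean accuracy is significant at that level, and the sign of t indicates which learner performed better.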

 

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd