Naïve Bayes algorithm for text classification, Computer Engineering


Assignment 3: Naïve Bayes algorithm for text classification.

First part:

In this assignment, we will redo the document classification task from assignment 2 using the same Reuters dataset. This time, implement the multinomial Naïve Bayes algorithm instead of k-NN. Naïve Bayes used to be the de facto method for text classification. Try various smoothing parameters for the Naïve Bayes learner. What is the accuracy of your learner? Which parameters work best?
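As a starting point, here is a minimal sketch of a multinomial Naïve Bayes learner with additive smoothing. It assumes each document has already been tokenized into a list of strings. The function names (`train_multinomial_nb`, `predict`, `accuracy`) and the parameter name `alpha` are illustrative choices, not part of the assignment, and the sketch ignores tokens that never appeared in training.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    docs   -- list of token lists (assumed pre-tokenized)
    labels -- parallel list of class labels
    alpha  -- additive smoothing parameter, must be > 0 (alpha=1 is Laplace)
    """
    vocab = {tok for doc in docs for tok in doc}
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    n_docs = len(docs)
    # log P(c): fraction of training documents in each class
    log_prior = {c: math.log(n / n_docs) for c, n in class_doc_counts.items()}

    # log P(t | c) = log((count(t, c) + alpha) / (total(c) + alpha * |V|))
    log_likelihood = {}
    for c in class_doc_counts:
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(model, doc):
    """Return the class with the highest log posterior for one document."""
    log_prior, log_likelihood, vocab = model
    scores = {
        c: log_prior[c] + sum(log_likelihood[c][t] for t in doc if t in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(docs)
```

To explore smoothing, one option is to train with several values of `alpha` (for example 0.01, 0.1, 0.5, 1.0, 2.0) and report `accuracy` on a held-out split for each value.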

Second Part:

In this part, you will compare the performance of the k-NN classifier and the Naïve Bayes classifier for text classification. Follow the steps below:

1. Take the best classifier from your second assignment (k-NN). Choose the value of k and the distance/similarity measure that gave the best performance.

2. Compare the best k-NN classifier with the Naïve Bayes classifier. Run both the k-NN and the Naïve Bayes learner 50 times. Compute the mean and standard deviation of the results. Then compute the t-statistic and, at significance levels of 0.005, 0.01, and 0.05, determine which algorithm (k-NN or Naïve Bayes) is better. Report the results in a paper and submit it.
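The comparison in step 2 can be sketched as follows. The assignment does not say which t-test to use, so this example assumes the 50 runs of each learner are independent and applies Welch's unequal-variance t-test. If both learners are evaluated on the same 50 train/test splits, a paired t-test on the per-split differences would be the more natural choice. The function names and the `knn_acc` / `nb_acc` inputs are hypothetical.

```python
import math
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of accuracies."""
    return statistics.mean(scores), statistics.stdev(scores)


def welch_t(a, b):
    """Welch's t-statistic and approximate degrees of freedom.

    a, b -- lists of per-run accuracies for the two learners
            (assumed to come from independent runs)
    """
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard error of each mean, using the sample variance
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df
```

With `knn_acc` and `nb_acc` holding the 50 accuracies of each learner, `welch_t(knn_acc, nb_acc)` gives the statistic to compare against the critical t value for the returned degrees of freedom at each significance level (0.005, 0.01, 0.05). If SciPy is available, `scipy.stats.ttest_ind(knn_acc, nb_acc, equal_var=False)` returns the same statistic together with a p-value.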

 

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd