Naïve Bayes algorithm for text classification, Computer Engineering


Assignment 3: Naïve Bayes algorithm for text classification.

First part:

In this assignment, we will redo the document classification task from assignment 2, using the same Reuters dataset. This time, you should implement the multinomial Naïve Bayes algorithm instead of k-NN. Naïve Bayes used to be the de facto method for text classification. Try various smoothing parameters for the Naïve Bayes learner. What is the accuracy of your learner? Which parameter values work best?
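As a starting point, here is a minimal sketch of a multinomial Naïve Bayes learner with additive smoothing. It assumes documents are already tokenized into lists of words; the class name `MultinomialNB`, the parameter name `alpha`, the helper `accuracy`, and the toy documents are illustrative choices, not part of the assignment, and the Reuters loading and evaluation split are left to you.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes with additive (Laplace/Lidstone) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing parameter, assumed > 0

    def fit(self, docs, labels):
        # docs: list of token lists; labels: one class label per document
        self.vocab = {tok for doc in docs for tok in doc}
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc)
        self.total_tokens = {
            c: sum(self.token_counts[c].values()) for c in self.class_counts
        }
        self.n_docs = len(docs)
        return self

    def _log_posterior(self, doc, c):
        # log P(c) + sum over tokens of log P(token | c), up to a constant
        score = math.log(self.class_counts[c] / self.n_docs)
        denom = self.total_tokens[c] + self.alpha * len(self.vocab)
        for tok in doc:
            if tok in self.vocab:  # tokens never seen in training are skipped
                score += math.log((self.token_counts[c][tok] + self.alpha) / denom)
        return score

    def predict(self, doc):
        return max(self.class_counts, key=lambda c: self._log_posterior(doc, c))


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy data for illustration only; replace with the Reuters documents.
train_docs = [
    ["wheat", "grain", "export"],
    ["grain", "harvest", "wheat"],
    ["oil", "crude", "barrel"],
    ["crude", "oil", "price"],
]
train_labels = ["grain", "grain", "crude", "crude"]
test_docs = [["wheat", "price", "grain"], ["oil", "barrel"]]
test_labels = ["grain", "crude"]

# Try several smoothing values and report the accuracy for each.
for alpha in (0.01, 0.1, 1.0, 10.0):
    model = MultinomialNB(alpha=alpha).fit(train_docs, train_labels)
    print(f"alpha={alpha}: accuracy={accuracy(model, test_docs, test_labels):.2f}")
```

Working in log space avoids numeric underflow on long documents. Skipping tokens that never occur in training is one common convention; treating them as an extra vocabulary entry is another, and the choice should be stated in the report.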

Second Part:

In this part, you will compare the performance of the k-NN classifier and the Naïve Bayes classifier for text classification. Follow the steps below:

1. Take the best classifier from your second assignment (k-NN). Choose the value of k and the distance/similarity measure that gave the best performance.

2. Compare the best k-NN classifier with the Bayesian classifier. Run both the k-NN and the Bayesian learner 50 times. Compute the mean and standard deviation of the results. Then compute the t-statistic and, at significance levels of 0.005, 0.01, and 0.05, determine which algorithm (k-NN or Bayesian) is better. Report the results in a paper and submit it.
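The comparison in step 2 can be sketched as follows. The assignment does not say which t-test to use, so this sketch assumes the two sets of 50 runs are independent samples and computes Welch's t-statistic; if both learners are run on the same 50 train/test splits, a paired t-test on the per-run differences would be the more appropriate choice. The function name `welch_t` and the numbers in the usage lines are hypothetical placeholders, not real results.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, degrees_of_freedom) for two independent samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (divides by n - 1)
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Placeholder numbers only; substitute the 50 accuracies from each learner.
knn_runs = [1.0, 2.0, 3.0, 4.0, 5.0]
bayes_runs = [2.0, 3.0, 4.0, 5.0, 6.0]

print("k-NN  mean/std:", statistics.mean(knn_runs), statistics.stdev(knn_runs))
print("Bayes mean/std:", statistics.mean(bayes_runs), statistics.stdev(bayes_runs))
t, df = welch_t(knn_runs, bayes_runs)
print(f"t = {t:.3f}, df = {df:.1f}")
```

To decide significance, compare |t| with the critical value of the t-distribution for the computed degrees of freedom at each of the levels 0.005, 0.01, and 0.05, taken from a t-table. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the same statistic together with a p-value.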

 

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd