Naïve Bayes algorithm for text classification, Computer Engineering

Assignment 3: Naïve Bayes algorithm for text classification.

First part:

In this assignment, we will redo the document classification task from assignment 2 using the same Reuters dataset. This time, implement the multinomial Naïve Bayes algorithm instead of k-NN. Naïve Bayes used to be the de facto method for text classification. Try various smoothing parameters for the Naïve Bayes learner. What is the accuracy of your learner? Which parameters work best?
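As a starting point, here is a minimal sketch of a multinomial Naïve Bayes learner with additive smoothing. It assumes each document is already tokenized into a list of words. The function names (`train_multinomial_nb`, `predict`), the parameter name `alpha`, and the toy documents are illustrative only and are not part of the assignment or the Reuters data.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    docs   : list of token lists (hypothetical pre-tokenized documents)
    labels : list of class labels, one per document
    alpha  : additive smoothing parameter; must be > 0 so that no
             probability is zero before taking the logarithm
    """
    vocab = {tok for doc in docs for tok in doc}
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    n_docs = len(docs)
    # log P(c): fraction of training documents in each class
    log_prior = {c: math.log(n / n_docs) for c, n in class_doc_counts.items()}

    # log P(t | c) with additive smoothing over the whole vocabulary
    log_likelihood = {}
    for c in class_doc_counts:
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(model, doc):
    """Return the class with the highest log posterior score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        # Tokens never seen in training are skipped
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][t] for t in doc if t in vocab
        )
    return max(scores, key=scores.get)


# Toy data, invented for illustration only
train_docs = [
    ["wheat", "grain", "export"],
    ["grain", "harvest"],
    ["oil", "crude", "barrel"],
    ["crude", "oil", "price"],
]
train_labels = ["grain", "grain", "crude", "crude"]

model = train_multinomial_nb(train_docs, train_labels, alpha=1.0)
print(predict(model, ["oil", "price", "rise"]))
```

To explore the smoothing question, one option is to call `train_multinomial_nb` with several values of `alpha` (for example 0.01, 0.1, 1.0) and record the accuracy on held-out documents for each.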

Second Part:

In this part, you will compare the performance of the k-NN classifier and the Naïve Bayes classifier for text classification. Follow the steps below:

1. Take the best classifier from your second assignment (k-NN). Choose the value of k and the distance/similarity measure that gave the best performance.

2. Compare the best k-NN classifier with the Naïve Bayes classifier. Run both the k-NN and the Naïve Bayes learner 50 times. Compute the mean and standard deviation of the results. Then compute the t-statistic and, at significance levels of 0.005, 0.01, and 0.05, determine which algorithm (k-NN or Naïve Bayes) is better. Report the results in a paper and submit it.
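The comparison in step 2 can be sketched as follows. The assignment does not say which t-test to use, so this example assumes Welch's two-sample t-statistic (unequal variances). The function name `welch_t` and the accuracy values are made up for illustration; in the real experiment each list would hold 50 accuracies.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test.

    a, b : lists of per-run accuracies for the two learners.
    Using Welch's test is an assumption; the assignment only asks
    for "the t-statistic".
    """
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variance divided by sample size, per learner
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical accuracies, three runs each to keep the example short
nb_acc = [0.80, 0.82, 0.84]
knn_acc = [0.70, 0.72, 0.74]

t_stat, dof = welch_t(nb_acc, knn_acc)
print("mean NB:", statistics.mean(nb_acc), "std NB:", statistics.stdev(nb_acc))
print("mean k-NN:", statistics.mean(knn_acc), "std k-NN:", statistics.stdev(knn_acc))
print("t =", t_stat, "df =", dof)
```

The resulting t and degrees of freedom can then be compared against the critical values from a t-table at each significance level (0.005, 0.01, 0.05). If both learners are evaluated on the same 50 train/test splits, a paired t-test on the per-split differences may be the more appropriate choice.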

 

 



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd