Project Exercises
Part I. Construct training data files (ARFF files) using the training image data for three different bin numbers (i.e., number_of_bins = 8, 64 and 512). The number of training data files should be three.
Part II. Construct the five different classifier models using each training data file. The five classification methods are as follows:
1) Naïve Bayes Classifier
2) C4.5 Classifier
3) k-Nearest-Neighbor Classifiers
4) Multilayer Neural Network
5) Support Vector Classifier
Part III. Construct test data files (ARFF files) from the test image data of each category for the three bin numbers. With seven categories, the total number of test data files should be 21 (= 7 * 3).
Part IV. Compare the prediction accuracies of the five classifiers for each category.
Part V. Construct test data files (ARFF files) from all test image data for the three bin numbers. The total number of test data files should be three in this case. (You can easily construct these three test data files by combining the test data files constructed in Part III.)
Part VI. Compare the prediction accuracies of the five classifiers on the overall test data.
- You CAN compute color histograms and construct ARFF files
a) Manually by using MS-Excel and/or any text editor (wordpad, textpad, etc)
b) Automatically by developing your own program with any programming language such as C, C++, Java, etc.
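For option b), the histogram and ARFF steps might be sketched as follows in Python. This is only an illustration under stated assumptions, not the required method: it assumes each bin count is a perfect cube (8 = 2^3, 64 = 4^3, 512 = 8^3), so each RGB channel is quantized into 2, 4 or 8 levels, and it assumes pixels are already available as (R, G, B) tuples. The function names, the `bin0`, `bin1`, ... attribute names, and the category names in the usage note are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in 0..255


def color_histogram(pixels: Iterable[Pixel], number_of_bins: int) -> List[float]:
    """Return a normalized RGB histogram with number_of_bins cells.

    Assumption: number_of_bins is a perfect cube, so every channel is
    quantized into the same number of levels.
    """
    levels = round(number_of_bins ** (1 / 3))
    if levels ** 3 != number_of_bins:
        raise ValueError("number_of_bins must be a perfect cube")
    counts = [0] * number_of_bins
    total = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a level index in 0..levels-1.
        ri, gi, bi = (r * levels) // 256, (g * levels) // 256, (b * levels) // 256
        counts[(ri * levels + gi) * levels + bi] += 1
        total += 1
    return [c / total for c in counts] if total else [0.0] * number_of_bins


def to_arff(relation: str,
            rows: Sequence[Tuple[Sequence[float], str]],
            categories: Sequence[str]) -> str:
    """Build ARFF text: one numeric attribute per bin plus a nominal class."""
    n_bins = len(rows[0][0])
    lines = [f"@RELATION {relation}", ""]
    lines += [f"@ATTRIBUTE bin{i} NUMERIC" for i in range(n_bins)]
    lines.append("@ATTRIBUTE class {" + ",".join(categories) + "}")
    lines += ["", "@DATA"]
    for hist, label in rows:
        lines.append(",".join(f"{v:.6f}" for v in hist) + "," + label)
    return "\n".join(lines) + "\n"
```

A call such as `to_arff("train_8", [(color_histogram(pixels, 8), "sunset")], ["sunset", "beach"])` would produce the text of one training file; the category names here are placeholders, since the brief does not list the seven categories.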
Part VII. Project Submission.
1. A project report (a PDF file that observes CSC573 presentation standards) describing:
1. A comprehensive description of each classifier.
2. Accuracy comparison for each category performed in Part IV
3. Accuracy comparison for the overall test images performed in Part VI
4. Your conclusions based on your observations
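For the accuracy comparisons in Parts IV and VI, the per-category and overall figures could be computed from a classifier's predictions along these lines. This is a small sketch, assuming the actual and predicted class labels are available as two parallel lists; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def accuracy_by_category(actual: Sequence[str],
                         predicted: Sequence[str]) -> Tuple[Dict[str, float], float]:
    """Return (accuracy per actual category, overall accuracy)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for a, p in zip(actual, predicted):
        total[a] += 1
        correct[a] += int(a == p)  # count a hit only when the labels match
    per_category = {c: correct[c] / total[c] for c in total}
    overall = sum(correct.values()) / len(actual)
    return per_category, overall
```

Running this once per classifier and per bin number gives the numbers needed for a comparison table.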
Contents
Naïve Bayes Classifier
C4.5 Classifier
K-Nearest-Neighbor Classifiers
Multilayer Neural Network
Support Vector Classifier
Code
Comparison and Conclusion
The Naïve Bayes classifier is a probabilistic model that is widely used as a standard baseline in text retrieval and document categorization. It applies Bayes' theorem under the assumption that the attributes are conditionally independent given the class. The parameters of the probability model are estimated from the training data, and each instance is assigned to the class with the highest posterior probability. Because the output is a probability for each class, it also indicates how confident each prediction is.
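The estimation and classification steps can be illustrated with a short Python sketch. It assumes numeric attributes (such as histogram bins) modeled with one Gaussian per class and attribute. That modeling choice and the function names are assumptions made for illustration, not something the text specifies.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def fit_gaussian_nb(samples: Sequence[Tuple[Sequence[float], str]]):
    """Estimate a prior and per-attribute (mean, variance) for each class."""
    by_class: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(samples)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # A small floor keeps the variance positive for constant attributes.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def predict(model, features: Sequence[float]) -> str:
    """Return the class with the highest log posterior (up to a constant)."""
    def log_posterior(label: str) -> float:
        prior, stats = model[label]
        score = math.log(prior)
        for x, (mean, var) in zip(features, stats):
            # Log of the Gaussian density; independence lets the terms add up.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)
```

Working in log space avoids numeric underflow when many small likelihoods are multiplied, which matters with 512 attributes.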
The C4.5 classifier builds a decision tree and extends the earlier ID3 algorithm. It is a statistical classifier that uses information entropy to choose, at each node, the attribute that splits the training samples most effectively. The information gain of each candidate split is normalized, so that attributes with many distinct values are not unfairly favored. When all samples at a node belong to the same class, C4.5 creates a leaf node for that class. When no attribute provides any information gain, or a previously unseen class is encountered, it creates a decision node higher up the tree using the expected value of the class.
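The entropy-based splitting criterion can be made concrete with a small Python sketch. It assumes a binary split of one numeric attribute at a given threshold and computes the normalized information gain (commonly called the gain ratio). The function names and the threshold-based split are illustrative assumptions, not a full C4.5 implementation.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values: Sequence[float], labels: Sequence[str], threshold: float) -> float:
    """Information gain of the split `value <= threshold`, divided by split info."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    info_gain = entropy(labels) - remainder
    # Split info is the entropy of the partition sizes themselves.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return info_gain / split_info
```

A tree builder would evaluate this for every attribute and candidate threshold and split on the best one, stopping with a leaf when all labels at a node agree.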