Implement a decision tree and naïve Bayes classifier

Reference no: EM131672602

Assignment -

1. Machine learning has now permeated multiple disciplines, even politics. The current landscape in the US is rife with data scientists and other quantitative experts making predictions about ongoing and upcoming elections. Consider the Congressional Voting Records dataset from the UCI Machine Learning Repository.

The dataset contains two files: one with a ".names" suffix and one with a ".data" suffix. The actual data is in the ".data" file, and the ".names" file describes the metadata (i.e., what the different columns mean). Note that each row of the ".data" file contains one instance and includes both the features and the class label (please take care to note the order). The machine learning problem here is to take the votes of US congressmen/congresswomen as input and predict whether they are a Republican or a Democrat. In particular, our goal is to solve this problem using both decision trees and a naïve Bayes classifier.
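As a first look at the row layout described above, here is a minimal sketch that splits each row into a class label and its features. It assumes the class label is the first field and is followed by 16 votes coded as 'y', 'n', or '?' (check the ".names" file to confirm the order). The sample rows are illustrative stand-ins in that layout, and both the helper name load_votes and the file name in the comment are assumptions.

```python
import csv
import io

# Illustrative rows in the assumed layout: class label first, then 16 votes
# coded as 'y', 'n', or '?' (unknown). For the real file, pass an open file
# handle instead, e.g. open("house-votes-84.data") (file name assumed).
SAMPLE = """republican,n,y,n,y,y,y,n,n,n,y,?,y,y,y,n,y
democrat,?,y,y,?,y,y,n,n,n,n,y,n,y,y,n,n
democrat,n,y,y,n,y,y,n,n,n,n,n,n,y,y,y,y
"""


def load_votes(handle):
    """Return (features, labels) from a file-like object of comma-separated rows."""
    features, labels = [], []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        labels.append(row[0])     # first field: class label (assumed order)
        features.append(row[1:])  # remaining fields: the votes
    return features, labels


features, labels = load_votes(io.StringIO(SAMPLE))
print(len(features), "instances with", len(features[0]), "features each")
print("labels:", labels)
```

Printing the shape and the labels before doing anything else is a cheap way to catch a wrong assumption about the column order.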

First, spend some time understanding the structure of the dataset, how the instances are organized, how the features/class are organized, and so on. You need to "massage" this data into the form that scikit-learn requires before you can apply either a decision tree or a naïve Bayes classifier. So spend some time understanding and planning how you will do this massaging. You can do this in Python or in Excel or any way you choose. Note that this step is a natural part of the machine learning and knowledge discovery process. Data is rarely given in a form to which machine learning can be directly applied, so considerable effort goes into cleaning, manipulating, and massaging it. Do not apply scikit-learn before ensuring that the data is in the form required.
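One possible way to do this massaging in Python is sketched below with pandas, which the assignment does not require and is used here only as an assumption. The sketch turns the text values into a numeric feature table X and a numeric label vector y, the general shape scikit-learn estimators expect. The column names (party, vote_1 ... vote_16) and the numeric codes are hypothetical choices, and the sample rows are illustrative.

```python
import io

import pandas as pd

# Hypothetical short column names; the ".names" file lists the real ones.
COLUMNS = ["party"] + [f"vote_{i}" for i in range(1, 17)]

# Illustrative rows: class label first (assumed order), then 16 votes.
SAMPLE = """republican,n,y,n,y,y,y,n,n,n,y,?,y,y,y,n,y
democrat,?,y,y,?,y,y,n,n,n,n,y,n,y,y,n,n
democrat,n,y,y,n,y,y,n,n,n,n,n,n,y,y,y,y
"""

# na_values="?" makes pandas read unknown votes as NaN.
df = pd.read_csv(io.StringIO(SAMPLE), header=None, names=COLUMNS, na_values="?")

# Arbitrary but explicit numeric codes (an assumption of this sketch).
VOTE_CODES = {"y": 1.0, "n": 0.0}
PARTY_CODES = {"democrat": 0, "republican": 1}

# Map every vote column to 0/1; NaN stays NaN so it can be handled later.
X = df.drop(columns="party").apply(lambda col: col.map(VOTE_CODES))
y = df["party"].map(PARTY_CODES)

print(X.shape, y.tolist())
print("unknown votes:", int(X.isna().sum().sum()))
```

Leaving unknown votes as NaN at this stage keeps the decision about how to treat them separate from the encoding step.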

Just like the PlayTennis dataset, the features are binary-valued, but note that some features have missing values for some rows (instances). You need to decide how you will handle them. There are three possibilities here: i) discard instances that have missing feature values, ii) treat "missing" as if it is a value (and thus a binary feature becomes a ternary, or three-valued, feature), iii) impute missing values (i.e., for each feature, replace missing values with the most common value for that feature), so that they are no longer missing or unknown. If you read the ".names" file, it explains why some values are missing and what they mean.
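The three options can be sketched as small pandas helpers. This is an illustration on a tiny hand-made table, not the required implementation: the function names, the column names, and the choice of 2 as the code for "missing" are all assumptions.

```python
import numpy as np
import pandas as pd

# Tiny hand-made example: 1 = yea, 0 = nay, NaN = missing vote.
X = pd.DataFrame({
    "vote_1": [1, 0, np.nan, 1, 1],
    "vote_2": [0, 0, 0, np.nan, 1],
})
y = pd.Series([1, 0, 0, 1, 1], name="party")


def drop_missing(X, y):
    """(i) Discard every instance that has at least one missing feature."""
    keep = X.notna().all(axis=1)
    return X[keep], y[keep]


def missing_as_value(X, y, code=2):
    """(ii) Treat 'missing' as a third value by giving it its own code."""
    return X.fillna(code), y


def impute_mode(X, y):
    """(iii) Replace missing values with each feature's most common value."""
    most_common = X.mode().iloc[0]  # first mode per column, NaN ignored
    return X.fillna(most_common), y


for strategy in (drop_missing, missing_as_value, impute_mode):
    X_new, y_new = strategy(X, y)
    print(strategy.__name__, X_new.shape)
```

Two caveats are worth keeping in mind. Coding "missing" as 2 imposes an order (0 < 1 < 2) that a decision tree will use when it picks thresholds, so one-hot encoding is a reasonable alternative. And when imputing, computing the most common value on the training folds only avoids leaking information from the test fold.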

  • Implement a decision tree and a naïve Bayes classifier for classification, with each of the above three ways of dealing with missing values. So you are experimenting with six scenarios.
  • Perform 5-fold cross validation and report precision, recall, and F1-scores for each of the 6 scenarios.
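The six scenarios and the 5-fold evaluation can be organized as a double loop. The sketch below runs on synthetic stand-in data rather than the real votes, so its scores mean nothing for the assignment. The specific choices are assumptions: CategoricalNB as the naïve Bayes variant (with min_categories=3 so the "missing" code is allowed even if a training fold lacks it), SimpleImputer inside a pipeline so imputation is fitted per fold, and a shuffled StratifiedKFold.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import CategoricalNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded votes: 1 = yea, 0 = nay, NaN = missing.
rng = np.random.default_rng(0)
n_rows, n_votes = 300, 16
y = rng.integers(0, 2, size=n_rows)
p_yes = np.where(y[:, None] == 1, 0.8, 0.2)  # votes correlate with the class
X = (rng.random((n_rows, n_votes)) < p_yes).astype(float)
X[rng.random(X.shape) < 0.03] = np.nan       # knock out about 3% of the votes

complete = ~np.isnan(X).any(axis=1)

# Each strategy: (features, labels, optional preprocessing step).
missing_strategies = {
    "discard": (X[complete], y[complete], None),
    "ternary": (X, y, SimpleImputer(strategy="constant", fill_value=2)),
    "impute": (X, y, SimpleImputer(strategy="most_frequent")),
}
classifiers = {
    "decision tree": lambda: DecisionTreeClassifier(random_state=0),
    "naive Bayes": lambda: CategoricalNB(min_categories=3),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
metrics = ("precision", "recall", "f1")
results = {}

for s_name, (X_s, y_s, imputer) in missing_strategies.items():
    for c_name, make_clf in classifiers.items():
        clf = make_clf()
        model = clf if imputer is None else make_pipeline(imputer, clf)
        scores = cross_validate(model, X_s, y_s, cv=cv, scoring=list(metrics))
        results[(s_name, c_name)] = {
            m: float(scores[f"test_{m}"].mean()) for m in metrics
        }

for (s_name, c_name), r in results.items():
    print(f"{s_name:8s} {c_name:14s} "
          + "  ".join(f"{m}={r[m]:.3f}" for m in metrics))
```

With these scoring names, precision, recall, and F1 are computed for the class coded as 1, so it is worth stating in the report which party that is. Reporting the mean over the five folds (and optionally the spread) for each of the six rows gives the table the assignment asks for.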

2. For what type of dataset would you choose decision trees as a classifier over naïve Bayes? Vice versa?

Verified Expert

This assignment is based on machine learning concepts. The task is to compare two algorithms, a decision tree and naïve Bayes, on the given House voting dataset, using metrics such as precision, recall, and F-measure.




What to submit: Exactly one zipped file containing:

  • A PDF document summarizing answers to questions 1 and 2. Do not submit pages and pages of code. Instead, distill your lessons and experiences succinctly.
  • Either hyperlinks to, or actual attachments of, your data files and your IPython notebook(s).


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd