Plot six training points in a two-dimensional space

Reference no: EM13856373

Note: Please include your name and the "Certification of Authorship" (located on Blackboard) form in EVERY document you submit. Thanks.

Part 1. Text Reading:

Decision Trees (Chap. 18 Sec 18.3), Reasoning with uncertainty (Chap. 13, 14)

Course Slides

Part 2. Problems:

(Note: Please include any external reference materials other than the textbook. Use the APA format where appropriate.)

Problem 2.1: Decision Tree

For this question you need to refer to the decision tree section in the Course Slides (Module 2-2) posted on Blackboard.

One major issue for any decision tree algorithm is how to choose the attribute on which to split the data set so that a well-balanced tree results. The most traditional approach is the ID3 algorithm, proposed by Quinlan in 1986. The detailed ID3 algorithm is shown in the slides, and the textbook discusses it in Section 18.3.

For this problem, follow the ID3 algorithm and manually calculate the values for a data set similar to (but not the same as) the one in the course slides. This exercise should give you insight into how ID3 executes. Note that the concepts discussed here (for example, entropy and information gain) are very important in information theory and signal processing. The new data set is shown below. One row has been removed from the original set; all other rows remain the same.

Following the conventions used in the slides, please show a manual process and calculate the following values: Entropy(S), Entropy(S_weather=sunny), Entropy(S_weather=windy), Entropy(S_weather=rainy), Gain(S, weather), Gain(S, parents), and Gain(S, money). Based on the last three values, which attribute should be chosen to split on?

Please show in detail how you obtain the solutions.

Weekend   Weather   Parents   Money   Decision (Category)
W1        Sunny     Yes       Rich    Cinema
W2        Sunny     No        Rich    Tennis
W3        Windy     Yes       Rich    Cinema
W4        Rainy     Yes       Poor    Cinema
W5        Rainy     No        Rich    Stay in
W6        Rainy     Yes       Poor    Cinema
W7        Windy     No        Poor    Cinema
W8        Windy     No        Rich    Shopping
W9        Windy     Yes       Rich    Cinema
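
As a way to check the manual calculation, the entropy and information-gain formulas can be sketched in a few lines of Python. This is a minimal sketch, not the slides' implementation: it assumes the standard ID3 definitions (entropy in bits, gain as entropy minus the weighted entropy of the subsets), and the names ROWS, ATTRIBUTES, entropy, subset_labels, and gain are illustrative choices that do not come from the assignment.

```python
from collections import Counter
from math import log2

# Rows copied from the table above: (weather, parents, money, decision).
ROWS = [
    ("sunny", "yes", "rich", "cinema"),
    ("sunny", "no", "rich", "tennis"),
    ("windy", "yes", "rich", "cinema"),
    ("rainy", "yes", "poor", "cinema"),
    ("rainy", "no", "rich", "stay in"),
    ("rainy", "yes", "poor", "cinema"),
    ("windy", "no", "poor", "cinema"),
    ("windy", "no", "rich", "shopping"),
    ("windy", "yes", "rich", "cinema"),
]
ATTRIBUTES = {"weather": 0, "parents": 1, "money": 2}  # column index per attribute


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def subset_labels(rows, column, value):
    """Decisions of the rows whose attribute in `column` equals `value`."""
    return [row[-1] for row in rows if row[column] == value]


def gain(rows, column):
    """Information gain of splitting `rows` on the attribute in `column`."""
    remainder = 0.0
    for value in {row[column] for row in rows}:
        labels = subset_labels(rows, column, value)
        remainder += len(labels) / len(rows) * entropy(labels)
    return entropy([row[-1] for row in rows]) - remainder


if __name__ == "__main__":
    print(f"Entropy(S) = {entropy([row[-1] for row in ROWS]):.4f}")
    for value in ("sunny", "windy", "rainy"):
        subset = subset_labels(ROWS, ATTRIBUTES["weather"], value)
        print(f"Entropy(S_weather={value}) = {entropy(subset):.4f}")
    for name, column in ATTRIBUTES.items():
        print(f"Gain(S, {name}) = {gain(ROWS, column):.4f}")
```

On this table two of the three gains come out equal up to floating-point rounding, so compare them with a tolerance rather than a strict maximum, and state the tie-breaking rule you use.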

Problem 2.2: Decision Tree

The Decision Tree inductive learning algorithm may be used to generate "IF ... THEN" rules that are consistent with a set of given examples. Consider an example where 10 binary input variables X1, X2, ..., X10 are used to classify a binary output variable Y.

(i) At most how many examples do we need to exhaustively enumerate every possible combination of inputs?

(ii) At most how many leaf nodes can a decision tree have if it is consistent with a training set containing 100 examples?

Please show in detail how you obtain the solutions.
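
For part (i), the counting argument can be checked by brute force. The sketch below assumes each input takes the value 0 or 1 and simply lists every assignment; N_INPUTS and all_input_rows are hypothetical names used only for illustration. It does not address part (ii).

```python
from itertools import product

N_INPUTS = 10  # X1 ... X10, each taking the value 0 or 1


def all_input_rows(n_inputs):
    """Every distinct assignment of n binary inputs, as tuples of 0/1."""
    return list(product((0, 1), repeat=n_inputs))


if __name__ == "__main__":
    rows = all_input_rows(N_INPUTS)
    print(len(rows), "distinct input combinations")
```

Each additional binary input doubles the number of rows, which is why the count grows as a power of two.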

Problem 2.3: Bayes Theorem

A quality control manager has used the C4.5 algorithm to come up with rules that classify items based on several input factors. The output has two classes: Accept and Reject. Test results with the rule set indicate that 5% of the good items are classified as Reject and 2% of the bad items are classified as Accept.

Historical data suggests that two percent of the items are bad. Based on this information, what is the conditional probability that:

(i) An item classified as Reject is actually good?

(ii) An item classified as Accept is actually bad?

Please show in detail how you obtain the solutions.
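
Both parts are direct applications of Bayes' theorem, which can be sketched as a small helper for checking hand-computed values. The sketch assumes every item is classified as either Accept or Reject, so that P(Reject | bad) = 1 - P(Accept | bad) and P(Accept | good) = 1 - P(Reject | good). The function name posterior and the constant names are illustrative, not taken from the assignment.

```python
def posterior(prior_h, likelihood_h, likelihood_not_h):
    """P(H | E) from P(H), P(E | H), and P(E | not H) via Bayes' theorem."""
    evidence = likelihood_h * prior_h + likelihood_not_h * (1.0 - prior_h)
    return likelihood_h * prior_h / evidence


P_BAD = 0.02                # historical share of bad items
P_REJECT_GIVEN_GOOD = 0.05  # good items classified as Reject
P_ACCEPT_GIVEN_BAD = 0.02   # bad items classified as Accept

# (i) P(good | Reject): H = good, E = Reject
p_good_given_reject = posterior(
    1.0 - P_BAD, P_REJECT_GIVEN_GOOD, 1.0 - P_ACCEPT_GIVEN_BAD
)
# (ii) P(bad | Accept): H = bad, E = Accept
p_bad_given_accept = posterior(
    P_BAD, P_ACCEPT_GIVEN_BAD, 1.0 - P_REJECT_GIVEN_GOOD
)

if __name__ == "__main__":
    print(f"P(good | Reject) = {p_good_given_reject:.4f}")
    print(f"P(bad | Accept)  = {p_bad_given_accept:.6f}")
```

The denominator in posterior is the total probability of the observed classification, which is the step most often dropped in a manual solution.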

Problem 2.4: Support Vector Machine

Consider the following set of training data.

x1 x2 class
1 1 +
2 2 +
2 0 +
3 1 +
0 0 -
1 0 -
0 1 -
-1 1 -

(i) Plot these training points in a two-dimensional space (with x1 and x2 as the axes).

Are the classes {+, -} linearly separable? Why?

(ii) Construct the weight vector of the maximum margin hyperplane by inspection and identify the support vectors.

(iii) If you remove one of the support vectors, does the size of the optimal margin decrease, stay the same, or increase? Justify your answer.

(iv) Is your answer to (iii) also true for any dataset in a 2-dimensional space? Provide a counterexample if it is not true, or give a short proof if it is true. If the dataset lies in a space with more than two dimensions, is your answer the same? Justify.
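
A candidate separating line found by inspection can be checked numerically. The sketch below computes each point's geometric margin from a line w . x + b = 0 and lists the points closest to it. The candidate W, B is a hypothetical guess used for illustration, and passing this check shows only that the line separates the data; it does not prove that the margin is maximal. The names POINTS, geometric_margins, and closest_points are likewise illustrative.

```python
from math import hypot, isclose

# Training points copied from the table above: ((x1, x2), label).
POINTS = [
    ((1, 1), +1), ((2, 2), +1), ((2, 0), +1), ((3, 1), +1),
    ((0, 0), -1), ((1, 0), -1), ((0, 1), -1), ((-1, 1), -1),
]


def geometric_margins(w, b, points):
    """Signed distance y * (w . x + b) / ||w|| of each point from the line."""
    norm = hypot(*w)
    return [y * (w[0] * x[0] + w[1] * x[1] + b) / norm for x, y in points]


def closest_points(w, b, points):
    """Points whose distance to the line equals the smallest margin."""
    margins = geometric_margins(w, b, points)
    smallest = min(margins)
    return [x for (x, _), m in zip(points, margins) if isclose(m, smallest)]


# Hypothetical candidate found by inspection: the line x1 + x2 = 1.5.
W, B = (1.0, 1.0), -1.5

if __name__ == "__main__":
    margins = geometric_margins(W, B, POINTS)
    print("separates the classes:", all(m > 0 for m in margins))
    print("smallest margin:", min(margins))
    print("closest points:", closest_points(W, B, POINTS))
```

To explore part (iii), remove one point from POINTS and try other candidate lines to see whether a larger smallest margin becomes possible.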
