Plot six training points in a two-dimensional space
Course:- Data Structure & Algorithms
Reference No.:- EM13856373





Note: Please include your name and the "Certification of Authorship" (located on Blackboard) form in EVERY document you submit. Thanks.

Part 1. Text Reading:

Decision Trees (Chap. 18 Sec 18.3), Reasoning with uncertainty (Chap. 13, 14)

Course Slides

Part 2. Problems:

(Note: Please include any external reference materials other than the textbook. Use the APA format where appropriate.)

Problem 2.1: Decision Tree

For this question you need to refer to the decision tree section in the Course Slides (Module 2-2) posted on Blackboard.

One major issue for any decision tree algorithm is how to choose the attribute on which to split the data set so that a well-balanced tree results. The most traditional approach is the ID3 algorithm, proposed by Quinlan in 1986. The detailed ID3 algorithm is shown in the slides, and the textbook discusses the algorithm in Section 18.3. For this problem, please follow the ID3 algorithm and manually calculate the values for a data set similar to (but not the same as) the one in the course slides. This exercise should help you gain insight into how the ID3 algorithm executes. Please note that the concepts discussed here (for example, entropy and information gain) are very important in information theory and signal processing. The new data set is shown below. In this example one row has been removed from the original set and all other rows remain the same.

Following the conventions used in the slides, please show a manual process and calculate the following values: Entropy(S), Entropy(S_weather=sunny), Entropy(S_weather=windy), Entropy(S_weather=rainy), Gain(S, weather), Gain(S, parents), and Gain(S, money). Based on the last three values, which attribute should be chosen to split on?

Please show in detail how you obtain the solutions.

Weekend  Weather  Parents  Money  Decision (Category)
W1       Sunny    Yes      Rich   Cinema
W2       Sunny    No       Rich   Tennis
W3       Windy    Yes      Rich   Cinema
W4       Rainy    Yes      Poor   Cinema
W5       Rainy    No       Rich   Stay in
W6       Rainy    Yes      Poor   Cinema
W7       Windy    No       Poor   Cinema
W8       Windy    No       Rich   Shopping
W9       Windy    Yes      Rich   Cinema
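
A manual calculation of these quantities can be cross-checked with a short script. The following is a minimal sketch, assuming the usual definitions (entropy in bits over the Decision column, and gain as the entropy of S minus the size-weighted entropy of each subset). The names `ROWS`, `ATTRIBUTES`, `entropy`, `subset`, and `gain` are introduced here for illustration and are not taken from the slides; attribute values are written in lowercase.

```python
from collections import Counter
from math import log2

# Rows copied from the table: (weather, parents, money, decision).
ROWS = [
    ("sunny", "yes", "rich", "Cinema"),    # W1
    ("sunny", "no", "rich", "Tennis"),     # W2
    ("windy", "yes", "rich", "Cinema"),    # W3
    ("rainy", "yes", "poor", "Cinema"),    # W4
    ("rainy", "no", "rich", "Stay in"),    # W5
    ("rainy", "yes", "poor", "Cinema"),    # W6
    ("windy", "no", "poor", "Cinema"),     # W7
    ("windy", "no", "rich", "Shopping"),   # W8
    ("windy", "yes", "rich", "Cinema"),    # W9
]
ATTRIBUTES = {"weather": 0, "parents": 1, "money": 2}  # column index per attribute


def entropy(rows):
    """Entropy, in bits, of the decision column of `rows`."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def subset(rows, attribute, value):
    """Rows whose `attribute` equals `value`."""
    col = ATTRIBUTES[attribute]
    return [row for row in rows if row[col] == value]


def gain(rows, attribute):
    """Information gain: entropy of `rows` minus the weighted entropy of its subsets."""
    col = ATTRIBUTES[attribute]
    remainder = 0.0
    for value in {row[col] for row in rows}:
        part = subset(rows, attribute, value)
        remainder += len(part) / len(rows) * entropy(part)
    return entropy(rows) - remainder


if __name__ == "__main__":
    print(f"Entropy(S) = {entropy(ROWS):.4f}")
    for value in ("sunny", "windy", "rainy"):
        print(f"Entropy(S_weather={value}) = {entropy(subset(ROWS, 'weather', value)):.4f}")
    for attribute in ATTRIBUTES:
        print(f"Gain(S, {attribute}) = {gain(ROWS, attribute):.4f}")
```

The script only reports the numbers; the written solution still needs to show the intermediate fractions and logarithms by hand, and to state how a tie between attributes is resolved if one occurs.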

Problem 2.2: Decision Tree

The Decision Tree inductive learning algorithm may be used to generate "IF ... THEN" rules that are consistent with a set of given examples. Consider an example where 10 binary input variables X1, X2, ..., X10 are used to classify a binary output variable (Y).

(i) At most how many examples do we need to exhaustively enumerate every possible combination of inputs?

(ii) At most how many leaf nodes can a decision tree have if it is consistent with a training set containing 100 examples?

Please show in detail how you obtain the solutions.
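
For part (i), the count of distinct input combinations can be checked by brute force. This is a small sketch, assuming each of the ten inputs takes the values 0 or 1; the names `N_INPUTS` and `all_inputs` are introduced here for illustration.

```python
from itertools import product

N_INPUTS = 10  # X1 ... X10, each binary

# Every distinct assignment of 0/1 to the ten inputs.
all_inputs = list(product((0, 1), repeat=N_INPUTS))

print(len(all_inputs))   # number of distinct input combinations
print(2 ** N_INPUTS)     # the same count in closed form
```

The enumeration covers only the inputs; part (ii) is a separate argument about how many leaves a tree consistent with 100 training examples can have.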

Problem 2.3: Bayes Theorem

A quality control manager has used algorithm C4.5 to come up with rules that classify items based on several input factors. The output has two classes: Accept and Reject. Test results with the rule set indicate that 5% of the good items are classified as Reject and 2% of the bad items are classified as Accept.

Historical data suggests that two percent of the items are bad. Based on this information, what is the conditional probability that:

(i) An item classified as Reject is actually good?

(ii) An item classified as Accept is actually bad?

Please show in detail how you obtain the solutions.
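
The two conditional probabilities follow from Bayes' theorem together with the law of total probability. The sketch below is one way to check a hand calculation, using only the three figures stated in the problem; all variable names are introduced here for illustration.

```python
# Quantities stated in the problem.
p_bad = 0.02                 # prior: two percent of items are bad
p_reject_given_good = 0.05   # 5% of good items are classified as Reject
p_accept_given_bad = 0.02    # 2% of bad items are classified as Accept

# Complements.
p_good = 1 - p_bad
p_accept_given_good = 1 - p_reject_given_good
p_reject_given_bad = 1 - p_accept_given_bad

# Law of total probability for each classifier output.
p_reject = p_reject_given_good * p_good + p_reject_given_bad * p_bad
p_accept = p_accept_given_good * p_good + p_accept_given_bad * p_bad

# Bayes' theorem.
p_good_given_reject = p_reject_given_good * p_good / p_reject   # part (i)
p_bad_given_accept = p_accept_given_bad * p_bad / p_accept      # part (ii)

print(f"P(good | Reject) = {p_good_given_reject:.4f}")
print(f"P(bad | Accept)  = {p_bad_given_accept:.6f}")
```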

Problem 2.4 Support Vector Machine

Consider the following set of training data.

x1 x2 class
1 1 +
2 2 +
2 0 +
3 1 +
0 0 -
1 0 -
0 1 -
-1 1 -

(i) Plot these six training points in a two-dimensional space (with x1 and x2).

Are the classes {+, -} linearly separable? Why?

(ii) Construct the weight vector of the maximum margin hyperplane by inspection and identify the support vectors.

(iii) If you remove one of the support vectors, does the size of the optimal margin decrease, stay the same, or increase? Justify your answer.

(iv) Is your answer to (iii) also true for any dataset in a 2-dimensional space? Provide a counterexample if it is not true, or give a short proof if it is true. When we have another dataset in a space with more than two dimensions, do you have the same answer? Justify.
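
A hyperplane found by inspection for parts (i) and (ii) can be sanity-checked numerically. The sketch below assumes a candidate line x1 + x2 = 1.5; that choice is an assumption made here by looking at the table, not something given in the problem. The code confirms that the candidate separates the two classes and lists the points closest to it. It does not by itself prove that the margin is maximal. The names `POINTS`, `W`, `B`, and `signed_distance` are illustrative.

```python
from math import hypot

# Training data copied from the table: ((x1, x2), label).
POINTS = [
    ((1, 1), +1), ((2, 2), +1), ((2, 0), +1), ((3, 1), +1),
    ((0, 0), -1), ((1, 0), -1), ((0, 1), -1), ((-1, 1), -1),
]

# Candidate hyperplane w.x + b = 0, i.e. the line x1 + x2 = 1.5.
# ASSUMPTION: chosen by inspection for illustration.
W = (1.0, 1.0)
B = -1.5


def signed_distance(x, label):
    """Distance from x to the line, positive when x lies on its own class's side."""
    return label * (W[0] * x[0] + W[1] * x[1] + B) / hypot(W[0], W[1])


distances = {x: signed_distance(x, y) for x, y in POINTS}

# The line separates the classes if every point is on its own side.
separates = all(d > 0 for d in distances.values())

# Points at the smallest distance are the candidates for support vectors.
closest = min(distances.values())
nearest_points = sorted(x for x, d in distances.items() if abs(d - closest) < 1e-12)

print("separates:", separates)
print("smallest distance to the line:", round(closest, 4))
print("closest points:", nearest_points)
```

Removing a point from `POINTS` and trying other values of `W` and `B` is a quick way to experiment with part (iii) before writing the justification.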

Answered:-


Decision Trees: (Chap. 18 Sec 18.3)

A decision tree is a function that takes a vector of attribute values as input and returns a single output value, the decision. The input and output values can be discrete or continuous. In Boolean classification, an example is classified as true (positive) or false (negative). A decision tree reaches its decision by performing a sequence of tests. Each internal node of the tree corresponds to a test of the value of one input attribute Ai, and the branches leaving that node are labeled with the possible values of the attribute, Ai = vik. Each leaf node specifies the value that the function returns.
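
The node, branch, and leaf structure described above can be sketched as a small nested data structure. The tree below is hypothetical and made up purely for illustration; it is not the ID3 result for the weekend data. The names `TREE` and `classify` are likewise introduced here.

```python
# Internal node: (attribute, {value: subtree, ...}); leaf: a plain string.
# HYPOTHETICAL tree for illustration only; not derived by ID3.
TREE = ("parents", {
    "yes": "Cinema",
    "no": ("weather", {
        "sunny": "Tennis",
        "windy": "Shopping",
        "rainy": "Stay in",
    }),
})


def classify(tree, example):
    """Follow the branch Ai = vik at each internal node until a leaf is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree


print(classify(TREE, {"weather": "rainy", "parents": "no", "money": "rich"}))
```

Each pass through the loop performs one test on an input attribute, which mirrors the sequence of tests a decision tree uses to reach its decision.
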
Reasoning with uncertainty: (Chap. 13, 14)

Intelligent behavior requires handling the uncertainty associated with it. Different types of uncertainty can arise in a knowledge-based system because of problems with the data, such as missing or unavailable data, or data that is unreliable or inconsistent. Uncertainty can be handled in three ways: probabilistic reasoning, certainty factors, and Dempster-Shafer theory.



