Construct the weight vector of the maximum margin hyperplane

Reference no: EM13851882

Part 1. Text Reading:

Decision Trees; Reasoning with Uncertainty; Course Slides

Part 2. Problems:

(Note: Please cite any external reference materials you use other than the textbook. Use APA format where appropriate.)

Problem 2.1: Decision Tree

For this question you need to refer to the decision tree section in the Course Slides (Module 2-2) posted on Blackboard.

One major issue for any decision tree algorithm is how to choose the attribute on which to split the data set so that a well-balanced tree is created. The most traditional approach is the ID3 algorithm, proposed by Quinlan in 1986. The detailed ID3 algorithm is shown in the slides. The textbook discusses the algorithm in Section 18.3.

For this problem, please follow the ID3 algorithm and manually calculate the values for a data set similar to (but not the same as) the one in the course slides. This exercise should give you deeper insight into how the ID3 algorithm executes. Please note that the concepts discussed here (for example, entropy and information gain) are very important in information theory and signal processing. The new data set is shown below. In this example, one row has been removed from the original set and all other rows remain the same.

Following the conventions used in the slides, please calculate the following values by hand: Entropy(S_weather=sunny), Entropy(S_weather=windy), Entropy(S_weather=rainy), Gain(S, weather), Gain(S, parents), and Gain(S, money). Based on the last three values, which attribute should be chosen to split on?

Please show in detail how you obtain the solutions.

Weekend  Weather  Parents  Money  Decision (Category)
W1       Sunny    Yes      Rich   Cinema
W2       Sunny    No       Rich   Tennis
W3       Windy    Yes      Rich   Cinema
W4       Rainy    Yes      Poor   Cinema
W5       Rainy    No       Rich   Stay in
W6       Rainy    Yes      Poor   Cinema
W7       Windy    No       Poor   Cinema
W8       Windy    No       Rich   Shopping
W9       Windy    Yes      Rich   Cinema
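
The hand calculation can be cross-checked with a short script. The following is a minimal Python sketch, not the course's reference implementation: it assumes the usual ID3 definitions, Entropy(S) = -sum_i p_i log2(p_i) and Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) Entropy(S_v). The names ROWS, ATTRIBUTES, entropy, subset, and gain are illustrative choices and do not come from the slides.

```python
from collections import Counter
from math import log2

# Rows copied from the table above: (weather, parents, money, decision).
ROWS = [
    ("Sunny", "Yes", "Rich", "Cinema"),
    ("Sunny", "No", "Rich", "Tennis"),
    ("Windy", "Yes", "Rich", "Cinema"),
    ("Rainy", "Yes", "Poor", "Cinema"),
    ("Rainy", "No", "Rich", "Stay in"),
    ("Rainy", "Yes", "Poor", "Cinema"),
    ("Windy", "No", "Poor", "Cinema"),
    ("Windy", "No", "Rich", "Shopping"),
    ("Windy", "Yes", "Rich", "Cinema"),
]

# Column index of each attribute (hypothetical helper mapping).
ATTRIBUTES = {"weather": 0, "parents": 1, "money": 2}


def entropy(rows):
    """Entropy (in bits) of the decision column, the last field of each row."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return sum(-(c / total) * log2(c / total) for c in counts.values())


def subset(rows, attribute, value):
    """Rows whose given attribute equals the given value (S_v)."""
    col = ATTRIBUTES[attribute]
    return [row for row in rows if row[col] == value]


def gain(rows, attribute):
    """Information gain: Entropy(S) minus the weighted entropy of each S_v."""
    col = ATTRIBUTES[attribute]
    remainder = 0.0
    for value in {row[col] for row in rows}:
        part = subset(rows, attribute, value)
        remainder += len(part) / len(rows) * entropy(part)
    return entropy(rows) - remainder


if __name__ == "__main__":
    for value in ("Sunny", "Windy", "Rainy"):
        print(f"Entropy(S_weather={value.lower()}) =",
              round(entropy(subset(ROWS, "weather", value)), 4))
    for attribute in ATTRIBUTES:
        print(f"Gain(S, {attribute}) =", round(gain(ROWS, attribute), 4))
```

Running the script prints the six requested values, which you can compare against your manual working. If two attributes give the same gain, the tie-breaking rule is a convention you should state explicitly.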

Problem 2.2:

The Decision Tree inductive learning algorithm may be used to generate "IF ... THEN" rules that are consistent with a set of given examples. Consider an example where 10 binary input variables X1, X2, ..., X10 are used to classify a binary output variable (Y).

(i) At most how many examples do we need to exhaustively enumerate every possible combination of inputs?

(ii) At most how many leaf nodes can a decision tree have if it is consistent with a training set containing 100 examples?

Please show in detail how you obtain the solutions.
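
For part (i), the input space can be enumerated mechanically, which is a useful sanity check on a counting argument. The sketch below is only an illustration in Python; the function name enumerate_inputs is hypothetical, and the code does not address part (ii).

```python
from itertools import product


def enumerate_inputs(n):
    """Return every assignment of n binary variables as a tuple of 0s and 1s."""
    return list(product((0, 1), repeat=n))


if __name__ == "__main__":
    combos = enumerate_inputs(10)
    # Each tuple is one possible setting of (X1, ..., X10).
    print(len(combos), "distinct input combinations")
    print("first:", combos[0], "last:", combos[-1])
```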

Problem 2.3: Bayes' Theorem

A quality control manager has used the C4.5 algorithm to come up with rules that classify items based on several input factors. The output has two classes: Accept and Reject. Test results with the rule set indicate that 5% of the good items are classified as Reject and 2% of the bad items are classified as Accept.

Historical data suggests that two percent of the items are bad. Based on this information, what is the conditional probability that:

(i) An item classified as Reject is actually good?

(ii) An item classified as Accept is actually bad?

Please show in detail how you obtain the solutions.
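
A small script can confirm the arithmetic once the derivation is written out. This is a hedged Python sketch that assumes the standard form of Bayes' theorem, P(H | E) = P(E | H) P(H) / [P(E | H) P(H) + P(E | not H) P(not H)]; the function name posterior and the constant names are illustrative and not part of the assignment.

```python
def posterior(prior, likelihood, likelihood_given_complement):
    """P(H | E) by Bayes' theorem, expanding P(E) with total probability.

    prior                       = P(H)
    likelihood                  = P(E | H)
    likelihood_given_complement = P(E | not H)
    """
    evidence = likelihood * prior + likelihood_given_complement * (1 - prior)
    return likelihood * prior / evidence


# Figures stated in the problem.
P_BAD = 0.02                # historical share of bad items
P_REJECT_GIVEN_GOOD = 0.05  # good items classified as Reject
P_ACCEPT_GIVEN_BAD = 0.02   # bad items classified as Accept

# (i) H = good, E = Reject; P(Reject | bad) = 1 - P(Accept | bad).
p_good_given_reject = posterior(
    prior=1 - P_BAD,
    likelihood=P_REJECT_GIVEN_GOOD,
    likelihood_given_complement=1 - P_ACCEPT_GIVEN_BAD,
)

# (ii) H = bad, E = Accept; P(Accept | good) = 1 - P(Reject | good).
p_bad_given_accept = posterior(
    prior=P_BAD,
    likelihood=P_ACCEPT_GIVEN_BAD,
    likelihood_given_complement=1 - P_REJECT_GIVEN_GOOD,
)

if __name__ == "__main__":
    print("P(good | Reject) =", round(p_good_given_reject, 6))
    print("P(bad | Accept)  =", round(p_bad_given_accept, 6))
```

The same function handles both parts because each question only swaps which class plays the role of the hypothesis and which classification plays the role of the evidence.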

Problem 2.4: Support Vector Machine

Consider the following set of training data.

x1  x2  class 
1 1
2 2
2 0
3 1
0 0
1 0
0 1
-1 1

(i) Plot these six training points in a two-dimensional space (with x1 and x2). Are the classes {+, -} linearly separable? Why?

(ii) Construct the weight vector of the maximum margin hyperplane by inspection and identify the support vectors.

(iii) If you remove one of the support vectors, does the size of the optimal margin decrease, stay the same, or increase? Justify your answer.

(iv) Is your answer to (iii) also true for any dataset in a 2-dimensional space? Provide a counterexample if it is not true, or give a short proof if it is true. If the dataset lies in a space with more than two dimensions, does the same answer hold? Justify.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd