605-449 Introduction to Machine- Assignment Problem

Reference no: EM132390119

605.449 — Introduction to Machine Learning

Programming Project


The purpose of this assignment is to give you a chance to get some hands-on experience learning decision trees for classification and, for extra credit, regression. This time around, we are not going to use anything from the module on rule induction; however, you might want to examine the rules learned for your trees to see if they “make sense.”

Specifically, you will implement a standard univariate (i.e., axis-parallel) decision tree and compare the performance of trees grown to completion against trees that use either early stopping (for regression trees) or reduced-error pruning (for classification trees).
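As a rough illustration of reduced-error pruning, the sketch below works bottom-up and collapses a subtree into a leaf whenever doing so does not increase the error on a held-out validation set. The `Node` class, its fields, and the toy tree are hypothetical and only serve the example; handling unseen attribute values by falling back to the majority class is also an assumption, not something the assignment prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    majority: str                       # majority training class at this node
    feature: Optional[str] = None       # attribute tested here; None for a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return not self.children


def predict(node, x):
    """Follow the branches matching x until a leaf is reached."""
    while not node.is_leaf():
        child = node.children.get(x[node.feature])
        if child is None:               # unseen value: use the majority class
            return node.majority
        node = child
    return node.majority


def error_count(root, data):
    """Number of misclassified (example, label) pairs."""
    return sum(predict(root, x) != y for x, y in data)


def reduced_error_prune(root, node, validation):
    """Prune the children first, then try turning `node` into a leaf."""
    if node.is_leaf():
        return
    for child in list(node.children.values()):
        reduced_error_prune(root, child, validation)
    before = error_count(root, validation)
    saved = (node.feature, node.children)
    node.feature, node.children = None, {}          # tentatively collapse
    if error_count(root, validation) > before:      # pruning hurt: undo it
        node.feature, node.children = saved


# Toy tree: the "size" test under color=red overfits the training data.
root = Node("A", "color", {
    "red": Node("A", "size", {"small": Node("A"), "large": Node("B")}),
    "blue": Node("B"),
})
validation = [
    ({"color": "red", "size": "large"}, "A"),
    ({"color": "red", "size": "small"}, "A"),
    ({"color": "blue", "size": "small"}, "B"),
]
errors_before = error_count(root, validation)
reduced_error_prune(root, root, validation)
errors_after = error_count(root, validation)
```

In this toy case the `size` subtree is collapsed because it misclassifies a validation example, while the root split is kept because removing it would add an error.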

For decision trees, it should not matter whether you have categorical or numeric attributes, but you need to keep track of which is which. In addition, you need to implement the gain-ratio criterion for splitting in your classification trees. For the regression trees, all of the attributes will be numeric.
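One way to compute the gain ratio is sketched below: information gain divided by the split information, where both are measured from the class labels that fall into each branch. The helper names (`entropy`, `gain_ratio`, `split_categorical`, `split_numeric`) are illustrative, and the choice of a binary threshold test for numeric attributes is an assumption rather than a requirement stated in the assignment.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(partitions):
    """Gain ratio of a split, given the class labels in each branch."""
    parent = [y for part in partitions for y in part]
    n = len(parent)
    # Information gain: parent entropy minus weighted entropy of the branches.
    remainder = sum(len(p) / n * entropy(p) for p in partitions if p)
    gain = entropy(parent) - remainder
    # Split information: entropy of the branch sizes themselves.
    split_info = -sum(len(p) / n * math.log2(len(p) / n) for p in partitions if p)
    # A split that sends everything down one branch carries no information.
    return gain / split_info if split_info > 0 else 0.0


def split_categorical(values, labels):
    """One branch per distinct attribute value."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def split_numeric(values, labels, threshold):
    """Binary split: value <= threshold versus value > threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    return [left, right]


labels = ["a", "a", "b", "b"]
perfect = gain_ratio(split_categorical(["x", "x", "y", "y"], labels))
useless = gain_ratio(split_categorical(["x", "x", "x", "x"], labels))
numeric = gain_ratio(split_numeric([1.0, 2.0, 3.0, 4.0], labels, threshold=2.0))
```

Keeping the categorical and numeric cases in separate partitioning helpers is one simple way to "keep track of which is which" while sharing a single scoring function.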

For this assignment, you will use three classification data sets (plus three regression data sets for the extra credit) that you will download from the UCI Machine Learning Repository, namely:

1. Abalone — [Classification] Predicting the age of abalone from physical measurements.

2. Car Evaluation — [Classification] The data is on evaluations of car acceptability based on price, comfort, and technical specifications.

3. Image Segmentation — [Classification] The instances were drawn randomly from a database of 7 outdoor images. The images were hand segmented to create a classification for every pixel.

4. Computer Hardware — [Regression] The relative performance values were estimated by the authors using a linear regression method. This gives you a chance to see how well you can replicate the results with these two models.

5. Forest Fires — [Regression] This is a difficult regression task, where the aim is to predict the burned area of forest fires in the northeast region of Portugal by using meteorological and other data.

6. Wine Quality — [Regression] This contains two data sets, one for red wine and one for white. Either combine the data sets into a single set for the regression task or build separate regression trees. This is your choice; however, we expect the separate trees to be better. The objective is to learn a model to assess the quality of wine.

For this project, the following steps are required:

• Download the six (6) data sets from the UCI Machine Learning repository.

• Implement the ID3 algorithm for classification decision trees using gain-ratio as the splitting criterion.

• Implement reduced-error pruning to be used as an option with your implementation of ID3.

• Run your ID3 implementation on each of the three classification data sets, comparing both pruned and unpruned versions of the trees.

These runs should be done with 5-fold cross-validation so you can compare your results statistically. You should pull out 10% of the data to be used as a validation set and then use the remaining 90% for cross-validation. You should use classification error as your loss function.

• Write a very brief paper that incorporates the following elements, summarizing the results of your experiments.

1. Title and author name

2. A brief, one paragraph abstract summarizing the results of the experiments

3. Problem statement, including hypothesis, projecting how you expect each algorithm to perform

4. Brief description of algorithms implemented

5. Brief description of your experimental approach

6. Presentation of the results of your experiments

7. A discussion of the behavior of your algorithms, combined with any conclusions you can draw

8. Summary

9. References (you should have at least one reference related to each of the algorithms implemented, a reference to the data sources, and any other references you consider to be relevant)
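The experimental setup described in the steps above (a 10% validation hold-out, 5-fold cross-validation on the remaining 90%, and classification error as the loss) might be sketched as follows. The function names are hypothetical, the split is a plain shuffled split rather than a stratified one, and the fixed seed is only there to make the example repeatable; none of these details are specified by the assignment.

```python
import random


def holdout_and_folds(n_examples, k=5, validation_fraction=0.10, seed=0):
    """Return (validation indices, list of k folds over the remaining indices)."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_val = int(round(validation_fraction * n_examples))
    validation, rest = indices[:n_val], indices[n_val:]
    # Deal the remaining indices round-robin into k roughly equal folds.
    folds = [rest[i::k] for i in range(k)]
    return validation, folds


def cross_validation_splits(folds):
    """Yield (train indices, test indices), using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def classification_error(predicted, actual):
    """Fraction of predictions that differ from the true labels."""
    return sum(p != a for p, a in zip(predicted, actual)) / len(actual)


validation, folds = holdout_and_folds(100)
splits = list(cross_validation_splits(folds))
example_error = classification_error(["a", "b", "a", "b"], ["a", "a", "a", "b"])
```

With this layout the same validation indices can be reused in every fold, for example as the pruning set, so that the pruned and unpruned trees are compared on identical test folds.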

Attachment:- Data.zip


