What do the results mean for each decision tree

Reference no: EM132310841

Assignment

Learning outcomes assessed:

1. Research and analyse the general nature of artificial intelligence and the problems it solves.

2. Objectively compare the strengths and limitations of various artificial intelligence techniques.

3. Apply and evaluate artificial intelligence techniques for solving a variety of real world problems.

Part 1: Decision Tree Learning

The dataset to be used for this section is: white-clover.arff

It describes a set of 63 measurements taken in farm paddocks from 1991 to 1994. The measurements mainly consist of the % coverage of each plant in a paddock, together with where the measurements were taken. The dataset is unusual in that each data row contains measurements from multiple years. The class variable is the % of white clover found in 1994. This is supervised learning, so we are given the answer (the class variable) and can easily evaluate the success of our decision tree.

Your task is as follows:

A. Analysing the Data

Open the data file in a text editor and read the comments to understand the dataset.
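In ARFF files, comment lines begin with a % character. As a minimal sketch, assuming the file contents have already been read into a string, the comments can also be pulled out programmatically. The function name and the sample text below are hypothetical and are not taken from any of the assignment datasets.

```python
def arff_comments(text):
    """Return the comment lines of an ARFF document, without the leading '%'."""
    comments = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("%"):
            # Drop the comment marker(s) and surrounding whitespace.
            comments.append(stripped.lstrip("%").strip())
    return comments


# Hypothetical sample, for illustration only.
sample = """% Hypothetical header comment
% Another comment line
@relation example
@attribute height numeric
@data
1.5
"""

for comment in arff_comments(sample):
    print(comment)
```

To use this on a real file, read it first, for example with open("white-clover.arff").read(), and pass the result to arff_comments.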

Experiment with training several decision trees (start with J48 if unsure) with a variety of settings to see how well you can do at predicting the class variable on this dataset.

NB: Be careful of over-fitting. Think about how you might avoid overfitting the data.
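One common way to guard against overfitting is k-fold cross-validation, where every row is used for testing exactly once and never appears in the training set of its own fold. The sketch below only illustrates how the row indices could be split; it is not the procedure any particular tool uses internally. The function names, the seed, and the choice of 10 folds are assumptions, while the 63 rows match the dataset description above.

```python
import random


def k_fold_indices(n_rows, k=10, seed=1):
    """Split row indices 0..n_rows-1 into k shuffled folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_rows, k=10, seed=1):
    """Yield (train, test) index lists; each row is tested exactly once."""
    folds = k_fold_indices(n_rows, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 63 rows, as in the white-clover description; k=10 is an assumed setting.
splits = list(train_test_splits(63, k=10))
for train, test in splits:
    print(len(train), len(test))
```

A model that scores well on the training rows but poorly on the held-out test rows of each fold is likely overfitting.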

B. Describing the Method

Choose TWO of the decision trees you experimented with.

Describe the method you used to analyse the data, making sure to include the following:

1. What are these decision trees attempting to learn?

2. What settings did you find worked well and why do you think this was? (for each tree)

3. What settings did you find did not work well and why do you think this was? (for each tree)

4. What did you do to avoid overfitting the data?

5. Include a screenshot of your settings pages so I can replicate your results.

Don’t forget to include your Test options!

C. Discussing the Results

You should have generated TWO sets of results in part A, one for each algorithm.

You should evaluate and discuss these results and should include at a minimum:

1. A screenshot of your results for each tree.

2. What do the results mean for each decision tree? What facts can you deduce from this result?

3. Which were the most important features in predicting the class variable? How did you decide this?

4. A comparison: list the main differences between each algorithm in a table.

D. Coming to a Conclusion

Finally, describe any conclusions you think you can draw from these results, including:

1. Which tree was better? How did you make this decision?

2. In your opinion, is the result of the best performing tree practically useful? Why?

3. Anything else interesting you found during your experimentation.

Part 2: Clustering

The dataset is available on Moodle: pasture.arff

This dataset contains a range of measurements from agricultural pastures. There is a wide variety of measurements, including number of earthworm species, fertiliser & rainfall.

The class variable is a simple Lo/Med/Hi rating for how productive that pasture was.

Your task is as follows:

A. Analysing the Data

Open the data file in a text editor and read the comments to understand the dataset.

Experiment with training several clustering algorithms (start with SimpleKMeans & EM) with a variety of settings to see how well you can do at predicting the class variable on this dataset.

You may want to start by working out how many clusters we are looking for. You can also experiment with ignoring various attributes to see what effect that has.
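Because clustering does not use the class variable during training, judging how well clusters "predict" the class requires mapping each cluster to a class afterwards. The sketch below assumes a simple majority-vote mapping, which is a simplification and may differ from the mapping a given tool applies. The function name and the toy data are hypothetical; only the Lo/Med/Hi labels come from the dataset description.

```python
from collections import Counter, defaultdict


def majority_class_accuracy(cluster_ids, class_labels):
    """Map each cluster to its most frequent class and score the mapping."""
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        per_cluster[cluster][label] += 1
    # Each cluster is labelled with whichever class occurs most often inside it.
    mapping = {c: counts.most_common(1)[0][0] for c, counts in per_cluster.items()}
    correct = sum(mapping[c] == label for c, label in zip(cluster_ids, class_labels))
    return mapping, correct / len(class_labels)


# Hypothetical toy data: cluster assignments and true class labels for 8 rows.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["Lo", "Lo", "Med", "Med", "Med", "Hi", "Hi", "Lo"]

mapping, accuracy = majority_class_accuracy(clusters, labels)
print(mapping, accuracy)
```

With majority voting, two clusters can map to the same class, so a low accuracy may indicate either poor clusters or a cluster count that does not match the number of classes.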

B. Describing the Method

Choose TWO of the clustering algorithms you experimented with.

Describe the method you used to analyse the data, making sure to include the following:

1. What are these algorithms attempting to learn?

2. What settings did you find worked well and why do you think this was? (for each)

3. What settings did you find did not work well and why do you think this was? (for each)

4. Include a screenshot of your settings pages so I can replicate your results.

Don’t forget to include your list of ignored attributes (if you ignored any).

C. Discussing the Results

You should have generated TWO sets of results in part A, one for each algorithm.

You should evaluate and discuss these results and should include at a minimum:

1. A screenshot of your results for each algorithm.

2. What do the results mean for each algorithm? What facts can you deduce from this result?

3. A comparison: list the main differences between each algorithm in a table.

D. Coming to a Conclusion

Finally, describe any conclusions you think you can draw from these results, including:

1. Which algorithm was better? How did you make this decision?

2. In your opinion, is the result of the best performing algorithm practically useful? Why?

3. Anything else interesting you found during your experimentation.

Part 3: Bayesian Learning

The dataset is available on Moodle: squash-stored.arff

This dataset contains a range of measurements taken from squash fruit during maturation, ripening & storage. This dataset has excellent descriptions of its attributes.

The class variable is a simple measure (3 possible values) of the quality of the fruit on arrival in Japan.

Your task is as follows:

A. Analysing the Data

Open the data file in a text editor and read the comments to understand the dataset.

Experiment with training several Bayes classifiers (start with Naïve Bayes) with a variety of settings to see how well you can do at predicting the class variable on this dataset.

NB: Be careful of over-fitting. Think about how you might avoid overfitting the data.

B. Describing the Method

Choose TWO of the Bayes classifiers you experimented with.

Describe the method you used to analyse the data, making sure to include the following:

1. What are these classifiers attempting to learn?

2. What settings did you find worked well and why do you think this was? (for each)

3. What settings did you find did not work well and why do you think this was? (for each)

4. What did you do to avoid overfitting the data?

5. Include a screenshot of your settings pages so I can replicate your results.

Don’t forget to include your Test options!

C. Discussing the Results

You should have generated TWO sets of results in part A, one for each algorithm.

You should evaluate and discuss these results and should include at a minimum:

1. A screenshot of your results for each algorithm.

2. What do the results mean for each algorithm? What facts can you deduce from this result?

3. A comparison: list the main differences between each algorithm in a table.

D. Coming to a Conclusion

Finally, describe any conclusions you think you can draw from these results, including:

1. Which algorithm was better? How did you make this decision?

2. In your opinion, is the result of the best performing algorithm practically useful? Why (not)?

3. Anything else interesting you found during your experimentation.

Part 4: Multi-Layer Perceptron

The datasets are the same as we have used in the previous examples and are available on Moodle:

squash-stored.arff
pasture.arff
white-clover.arff

Your task is as follows:

A. Analysing the Data

Experiment with training a Multi-Layer Perceptron (there is only one) with a variety of settings to see how well you can do at predicting the class variable on each dataset. Make sure to keep track of the best settings for each dataset.

NB: Be careful of over-fitting. Think about how you might avoid overfitting the data.

B. Describing the Method

Describe the method you used to analyse the data, making sure to include the following:

1. What are the perceptrons attempting to learn?

2. What settings did you find worked well and why do you think this was?

3. What settings did you find did not work well and why do you think this was?

4. What did you do to avoid overfitting the data?

5. Include a screenshot of your settings pages for each dataset so I can replicate your results.

Don’t forget to include your Test options!

C. Discussing the Results

You should have generated THREE sets of results in part A, one for each data set.

You should evaluate and discuss these results and should include at a minimum:

1. A screenshot of your results for each dataset.

2. What do the results mean for each dataset? What facts can you deduce from this result?

3. Compare (in a table format) each of the three perceptron results with the best result obtained for that dataset in Parts 1, 2 & 3.

D. Coming to a Conclusion

Finally, describe any conclusions you think you can draw from these results, including:

1. In your opinion, is the perceptron result for each of the datasets practically useful? Why (not)?

2. Which algorithm was better for each of the three datasets? How did you make this decision?

3. Anything else interesting you found during your experimentation.

Part 5: Document Presentation

Your report should be presented professionally. This means:

Using an appropriate cover page listing your student details.

Having a table of contents page.

All images are appropriately captioned and resized without warping the original image.

Tables are used where appropriate and cleanly and clearly laid out.

Consistent use of styles throughout the entire document. (Hint: use Word’s built-in styles.)

APA referencing used where appropriate including in-text references.

Appropriate language use.
