What observations do you have on the two classifiers

Reference no: EM132242097

Data Mining

Coursework

Suppose that the following table of instances (cases) was recorded for an insurance company's promotions for its life assurance product. The attributes are self-explanatory. The values of the two promotion attributes should be read as follows: Yes means that the individual was offered that particular promotion on condition that s/he took out the insurance; No means that the promotion was not offered.

ID  Income Range  Gender  Age Range  Holiday Promotion  Wine Promotion  Life Insurance Take-up
1   40-50K        Male    30-40      No                 Yes             Yes
2   30-40K        Female  30-40      No                 Yes             No
3   40-50K        Male    30-40      No                 No              No
4   30-40K        Male    30-40      Yes                Yes             Yes
5   50-60K        Female  20-30      No                 No              No
6   20-30K        Female  40-50      No                 No              No
7   30-40K        Male    20-30      Yes                No              No
8   20-30K        Male    20-30      No                 Yes             Yes
9   30-40K        Male    30-40      No                 Yes             Yes
10  30-40K        Female  30-40      No                 No              Yes
11  40-50K        Female  30-40      No                 No              No
12  20-30K        Male    20-30      No                 Yes             Yes
13  50-60K        Female  20-30      No                 No              No
14  40-50K        Male    40-50      No                 Yes             No
15  20-30K        Female  20-30      Yes                Yes             No
16  40-50K        Female  30-40      No                 No              No
17  50-60K        Male    40-50      Yes                Yes             Yes
18  20-30K        Female  30-40      No                 Yes             No
19  20-30K        Male    40-50      Yes                Yes             Yes
20  30-40K        Female  20-30      Yes                Yes             No

Questions:

1. Use the ID3 decision tree induction method available in the Weka package (with the default settings) to derive a classifier (decision tree) from this set of data. The class attribute is Life Insurance Take-up.
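ID3 picks the attribute to split on by information gain. The sketch below is not Weka's implementation; it is a minimal hand calculation in Python, assuming the 20 rows of the table above and shortened attribute names (`Income`, `Gender`, `Age`, `Holiday`, `Wine`) chosen here for brevity. It prints the gain of each attribute at the root so the first split of the Weka tree can be cross-checked.

```python
from collections import Counter
from math import log2

# Shortened attribute names (an assumption made for this sketch).
ATTRIBUTES = ["Income", "Gender", "Age", "Holiday", "Wine"]

# Rows copied from the table: five attribute values followed by the class.
ROWS = [
    ("40-50K", "Male", "30-40", "No", "Yes", "Yes"),
    ("30-40K", "Female", "30-40", "No", "Yes", "No"),
    ("40-50K", "Male", "30-40", "No", "No", "No"),
    ("30-40K", "Male", "30-40", "Yes", "Yes", "Yes"),
    ("50-60K", "Female", "20-30", "No", "No", "No"),
    ("20-30K", "Female", "40-50", "No", "No", "No"),
    ("30-40K", "Male", "20-30", "Yes", "No", "No"),
    ("20-30K", "Male", "20-30", "No", "Yes", "Yes"),
    ("30-40K", "Male", "30-40", "No", "Yes", "Yes"),
    ("30-40K", "Female", "30-40", "No", "No", "Yes"),
    ("40-50K", "Female", "30-40", "No", "No", "No"),
    ("20-30K", "Male", "20-30", "No", "Yes", "Yes"),
    ("50-60K", "Female", "20-30", "No", "No", "No"),
    ("40-50K", "Male", "40-50", "No", "Yes", "No"),
    ("20-30K", "Female", "20-30", "Yes", "Yes", "No"),
    ("40-50K", "Female", "30-40", "No", "No", "No"),
    ("50-60K", "Male", "40-50", "Yes", "Yes", "Yes"),
    ("20-30K", "Female", "30-40", "No", "Yes", "No"),
    ("20-30K", "Male", "40-50", "Yes", "Yes", "Yes"),
    ("30-40K", "Female", "20-30", "Yes", "Yes", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the class minus the weighted entropy after splitting on one attribute."""
    labels = [r[-1] for r in rows]
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attr_index], []).append(r[-1])
    remainder = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

# Print attributes from highest to lowest gain.
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {gain:.4f}")
print("Highest gain at the root:", best)
```

Applying the same calculation recursively to each partition reproduces the rest of the ID3 procedure; this sketch stops at the root.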

2. What should be the class value for the following unseen case based on the derived tree? Justify your answer.

Income Range  Gender  Age Range  Holiday Promotion  Wine Promotion  Life Insurance Take-up
40-50K        Male    20-30      No                 Yes             ?

How would you deal with such cases in general? Outline your solution algorithmically using the structure given below:

algorithm DT-based Classification
# traversing the tree to reach a leaf node N

if N's class value is null then
:
: write your pseudo code to implement your solution here
:
else
return the class value
end
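One common way to handle a null leaf, offered here only as an illustration and not as the expected answer, is to fall back to the majority class of the nearest ancestor that saw training cases. The Python sketch below assumes a hypothetical `Node` structure and a toy tree over made-up attributes `A` and `B`; neither comes from Weka or from the tree derived in this coursework.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical tree node used only for this sketch."""
    attribute: Optional[str] = None   # attribute tested here; None for a leaf
    label: Optional[str] = None       # class value at a leaf; None means "null"
    majority: Optional[str] = None    # majority class of training cases reaching this node
    children: Dict[str, "Node"] = field(default_factory=dict)


def classify(root: Node, case: Dict[str, str]) -> Optional[str]:
    """Traverse the tree; on a null leaf or unknown branch, use the last known majority."""
    node = root
    fallback = root.majority
    while node.attribute is not None:
        if node.majority is not None:
            fallback = node.majority          # remember the nearest informed ancestor
        child = node.children.get(case[node.attribute])
        if child is None:                     # attribute value never seen on this path
            return fallback
        node = child
    return node.label if node.label is not None else fallback


# Toy tree (an assumption): the branch A=x, B=q received no training cases.
toy_tree = Node(attribute="A", majority="No", children={
    "x": Node(attribute="B", majority="Yes", children={
        "p": Node(label="Yes"),
        "q": Node(label=None),
    }),
    "y": Node(label="No"),
})

print(classify(toy_tree, {"A": "x", "B": "q"}))  # falls back to the parent's majority
```

Other fallbacks are possible, such as the overall majority class or a nearest-neighbour vote among the training cases; which one is appropriate is part of what the question asks you to argue.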

3. A decision tree derived from data can be used not only to predict class values for unseen cases, but also to summarize data for analysis. Based on the tree derived in Question 1, comment on whether the company has conducted its promotion effectively.

4. In the default setting in Weka, there is a setting of "Cross-Validation Folds 10" in the test options. Briefly explain how Cross Validation tests a model derived from training data and why we use it for testing.
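The mechanics of k-fold cross-validation can be sketched in a few lines of Python. This is not Weka's code: the function names are made up for this sketch, the folds are not stratified by class, and a trivial majority-class learner stands in for ID3 so that the example stays self-contained.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(rows, k, train, seed=0):
    """Each fold is held out once; the model is trained on the remaining k-1 folds."""
    correct = 0
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        training = [r for i, r in enumerate(rows) if i not in held]
        model = train(training)
        # Score only on cases the model did not see during training.
        correct += sum(model(rows[i][:-1]) == rows[i][-1] for i in held_out)
    return correct / len(rows)


def train_majority(training_rows):
    """Stand-in learner (an assumption): always predicts the majority training class."""
    majority = Counter(r[-1] for r in training_rows).most_common(1)[0][0]
    return lambda features: majority


# Toy data with the same class balance as the table (8 Yes, 12 No).
toy_rows = [(i, "Yes" if i < 8 else "No") for i in range(20)]
print(cross_validate(toy_rows, k=10, train=train_majority))
```

Because every case is used for testing exactly once and never by a model that was trained on it, the averaged accuracy estimates performance on unseen data rather than on memorised data.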

5. Now perform the following tests. Vary the number of folds from 2 to 10, run ID3, and record the classification accuracy for each setting. Then change the test option to "Use training set", run ID3 again, and record the classification accuracy. Present these test results as a table or a bar chart. Comment on your test results: which method (cross-validation or using the training set) is better for testing your derived tree, and why?

6. Use the JRip rule induction method available in the Weka package (with the default setting) to derive a classifier (classification rules) from this set of data.
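A rule-based classifier of the kind JRip produces is an ordered list of rules followed by a default rule: the first rule whose condition matches decides the class, and the default applies when none match. The Python sketch below shows only that evaluation order. The single rule in it is illustrative and is not JRip's output for this data; the names `classify_with_rules` and `ILLUSTRATIVE_RULES` are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Case = Dict[str, str]
Rule = Tuple[Callable[[Case], bool], str]   # (condition, class value)


def classify_with_rules(rules: List[Rule], default: str, case: Case) -> str:
    """Return the class of the first matching rule, or the default class."""
    for condition, label in rules:
        if condition(case):
            return label
    return default


# Illustrative rule only (an assumption, not derived by JRip).
ILLUSTRATIVE_RULES: List[Rule] = [
    (lambda c: c["Gender"] == "Male" and c["Wine Promotion"] == "Yes", "Yes"),
]

case = {"Gender": "Female", "Wine Promotion": "Yes"}
print(classify_with_rules(ILLUSTRATIVE_RULES, default="No", case=case))
```

Because the default rule always fires when nothing else does, a rule list returns a class for every unseen case, which is worth contrasting with a tree leaf that has no class value.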

7. What observations do you have on the two classifiers you have obtained in terms of using them for business analysis (as in 3) and for classification of an unseen case (as in 2)?

Attachment:- data.rar

Verified Expert

In this assignment we applied the ID3 algorithm and learned how to add new packages. We also studied other algorithms, including the decision tree algorithm, on the given dataset, and examined cross-validation with 2 to 10 folds and the resulting validation rate.


Reviews

len2242097

2/25/2019 1:46:25 AM

Criteria for assessment

Credit will be awarded against the following criteria.

1. The classifier derived using ID3 for Q1 [5 marks]
2. Convincing arguments and solution for Q2 [25 marks]
3. Valid analysis for Q3 [15 marks]
4. Clarity of explanation for Q4 [20 marks]
5. Experiment results and analysis for Q5 [10 marks]
6. The classifier derived using JRip for Q6 [5 marks]
7. Clarity of your observations for Q7 [20 marks]


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd