What observations do you have on the two classifiers

Reference no: EM132242097

Data Mining

Coursework

Suppose that the following table of instances (cases) was recorded for an insurance company's promotions for its life assurance product. The attributes are self-explanatory. The values in the two product promotion attributes should be read as follows: Yes means that the individual was offered that particular promotion on condition that s/he took out the insurance, and No means that the promotion was not offered.

ID  Income Range  Gender  Age Range  Holiday Promotion  Wine Promotion  Life Insurance Take-up
1   40-50K        Male    30-40      No                 Yes             Yes
2   30-40K        Female  30-40      No                 Yes             No
3   40-50K        Male    30-40      No                 No              No
4   30-40K        Male    30-40      Yes                Yes             Yes
5   50-60K        Female  20-30      No                 No              No
6   20-30K        Female  40-50      No                 No              No
7   30-40K        Male    20-30      Yes                No              No
8   20-30K        Male    20-30      No                 Yes             Yes
9   30-40K        Male    30-40      No                 Yes             Yes
10  30-40K        Female  30-40      No                 No              Yes
11  40-50K        Female  30-40      No                 No              No
12  20-30K        Male    20-30      No                 Yes             Yes
13  50-60K        Female  20-30      No                 No              No
14  40-50K        Male    40-50      No                 Yes             No
15  20-30K        Female  20-30      Yes                Yes             No
16  40-50K        Female  30-40      No                 No              No
17  50-60K        Male    40-50      Yes                Yes             Yes
18  20-30K        Female  30-40      No                 Yes             No
19  20-30K        Male    40-50      Yes                Yes             Yes
20  30-40K        Female  20-30      Yes                Yes             No

Questions:

1. Use the ID3 decision tree induction method available in the Weka package (with the default settings) to derive a classifier (decision tree) from this set of data. The class attribute is Life Insurance Take-up.
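As a cross-check on whatever tree Weka reports, the attribute ID3 places at the root can be verified by hand, since ID3 picks the attribute with the highest information gain. The following is a minimal sketch in plain Python, not Weka's implementation; the names `RAW`, `ATTRS`, `entropy` and `information_gain` are introduced here for illustration only, and the rows are copied from the table above with the ID column dropped.

```python
from collections import Counter
from math import log2

# Rows copied from the table above (ID column dropped).
# Column order: income, gender, age, holiday promo, wine promo, take-up (class).
RAW = """
40-50K Male 30-40 No Yes Yes
30-40K Female 30-40 No Yes No
40-50K Male 30-40 No No No
30-40K Male 30-40 Yes Yes Yes
50-60K Female 20-30 No No No
20-30K Female 40-50 No No No
30-40K Male 20-30 Yes No No
20-30K Male 20-30 No Yes Yes
30-40K Male 30-40 No Yes Yes
30-40K Female 30-40 No No Yes
40-50K Female 30-40 No No No
20-30K Male 20-30 No Yes Yes
50-60K Female 20-30 No No No
40-50K Male 40-50 No Yes No
20-30K Female 20-30 Yes Yes No
40-50K Female 30-40 No No No
50-60K Male 40-50 Yes Yes Yes
20-30K Female 30-40 No Yes No
20-30K Male 40-50 Yes Yes Yes
30-40K Female 20-30 Yes Yes No
"""
ATTRS = ["Income Range", "Gender", "Age Range",
         "Holiday Promotion", "Wine Promotion"]
DATA = [line.split() for line in RAW.strip().splitlines()]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, col):
    """Entropy of the class minus the weighted entropy after splitting on col."""
    base = entropy([r[-1] for r in rows])
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


gains = {name: information_gain(DATA, i) for i, name in enumerate(ATTRS)}
best = max(gains, key=gains.get)

# Print attributes from highest to lowest gain; the first is the root split.
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {gain:.3f}")
```

On this data the 20 cases contain 8 Yes and 12 No, and Gender comes out with the largest gain, so a tree rooted at any other attribute would suggest a data-entry mistake when loading the table into Weka.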

2. What should be the class value for the following unseen case based on the derived tree? Justify your answer.

Income Range  Gender  Age Range  Holiday Promotion  Wine Promotion  Life Insurance Take-up
40-50K        Male    20-30      No                 Yes             ?

How would you deal with such cases in general? Outline your solution algorithmically using the structure given below:

algorithm DT-based Classification
    # traversing the tree to reach a leaf node N

    if N's class value is null then
        :
        : write your pseudo code to implement your solution here
        :
    else
        return the class value
end
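One common way to handle a null leaf, shown here only as an illustration and not as the expected coursework answer, is to fall back to the majority class of the closest ancestor that saw training cases. The sketch below assumes a hypothetical `Node` structure that stores class counts at every node; Weka's own tree classes are organised differently, and the toy tree uses made-up attributes (`colour`, `size`) rather than the tree derived from the insurance data.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node (not Weka's data structure)."""
    attribute: Optional[str] = None   # attribute tested here; None at a leaf
    label: Optional[str] = None       # class value at a leaf; None = null leaf
    counts: Counter = field(default_factory=Counter)  # training class counts
    children: dict = field(default_factory=dict)      # value -> child Node


def classify(root, case):
    """Traverse the tree; on a null leaf or an unknown branch, return the
    majority class of the deepest ancestor that has training cases."""
    node, fallback = root, None
    while True:
        if node.counts:
            fallback = node.counts.most_common(1)[0][0]
        if node.attribute is None:              # reached a leaf
            return node.label if node.label is not None else fallback
        child = node.children.get(case.get(node.attribute))
        if child is None:                       # value never seen in training
            return fallback
        node = child


# Toy tree with invented attributes, purely to exercise the fallback.
toy = Node(
    attribute="colour",
    counts=Counter({"No": 5, "Yes": 3}),
    children={
        "blue": Node(label="No", counts=Counter({"No": 4})),
        "red": Node(
            attribute="size",
            counts=Counter({"Yes": 3, "No": 1}),
            children={
                "small": Node(label="Yes", counts=Counter({"Yes": 3})),
                "large": Node(label="No", counts=Counter({"No": 1})),
                "medium": Node(),               # null leaf: no training cases
            },
        ),
    },
)

print(classify(toy, {"colour": "red", "size": "medium"}))  # null leaf
print(classify(toy, {"colour": "green", "size": "small"}))  # unknown branch
```

The design choice here is that the parent's class distribution is the best local evidence available when a branch has no training cases. Other options, such as returning the overall majority class or refusing to classify, fit into the same structure by changing what `fallback` holds.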

3. A decision tree derived from data can be used not only to predict class values for unseen cases, but also to summarize data for analysis. Based on the tree derived in question 1, comment on whether the company has conducted its promotions effectively.

4. Among Weka's default test options there is a setting "Cross-validation Folds 10". Briefly explain how cross-validation tests a model derived from training data and why we use it for testing.
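The mechanics of k-fold cross-validation can be sketched in a few lines: the data is split into k disjoint folds, each fold is held out once as the test set while a model is trained on the remaining folds, and accuracy is computed over all held-out predictions. The code below is a simplified illustration, not Weka's implementation. The function names and the `train_majority` baseline are assumptions made for this example, and the split is a plain shuffle, whereas Weka, as far as I know, also stratifies the folds by class.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(rows, k, train, seed=0):
    """Each fold is the test set exactly once; the model is trained on the
    other k-1 folds. Rows are sequences whose last item is the class."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in held_out]
        model = train(train_rows)
        correct += sum(model(rows[i][:-1]) == rows[i][-1] for i in test_idx)
    return correct / len(rows)


def train_majority(train_rows):
    """Placeholder learner (an assumption, not ID3): always predicts the
    most frequent class seen in the training rows."""
    majority = Counter(r[-1] for r in train_rows).most_common(1)[0][0]
    return lambda features: majority


# Toy data: one feature plus a class label, 12 "No" and 8 "Yes".
toy_rows = [("x", "No")] * 12 + [("x", "Yes")] * 8

for k in range(2, 11):
    acc = cross_val_accuracy(toy_rows, k, train_majority)
    print(f"folds={k:2d}  accuracy={acc:.2f}")
```

Because every case is predicted by a model that never saw it during training, the resulting accuracy estimates performance on unseen data, which testing on the training set cannot do.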

5. Now perform the following tests: vary the number of folds from 2 to 10, run ID3 and record the classification accuracy for each setting. Then change the test option to "Use training set", run ID3 again and record the classification accuracy. Present these test results as a table or a bar chart. Comment on your test results: which method (cross-validation or using the training set) is better for testing your derived tree, and why?

6. Use the JRip rule induction method available in the Weka package (with the default settings) to derive a classifier (classification rules) from this set of data.
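JRip (Weka's implementation of RIPPER) produces an ordered list of rules ending in a default rule, so classification amounts to returning the label of the first rule whose conditions all match. The sketch below shows that mechanism only. The single rule in `rules` is hand-written for illustration and is an assumption, not output from JRip on this data; `classify_with_rules` is likewise a name introduced here.

```python
def classify_with_rules(rules, default, case):
    """Return the label of the first rule whose conditions all hold for the
    case; if no rule fires, return the default class."""
    for conditions, label in rules:
        if all(case.get(attr) == value for attr, value in conditions.items()):
            return label
    return default


# Hypothetical rule list (NOT derived by JRip): each entry is
# (conditions, class label); the default rule covers everything else.
rules = [
    ({"Gender": "Male", "Wine Promotion": "Yes"}, "Yes"),
]
default = "No"

unseen = {
    "Income Range": "40-50K",
    "Gender": "Male",
    "Age Range": "20-30",
    "Holiday Promotion": "No",
    "Wine Promotion": "Yes",
}
print(classify_with_rules(rules, default, unseen))
```

Because of the default rule, a rule list always returns some class value, which is one structural difference from a decision tree that can end in a null leaf.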

7. What observations do you have on the two classifiers you have obtained in terms of using them for business analysis (as in 3) and for classification of an unseen case (as in 2)?

Attachment:- data.rar

Verified Expert

In this assignment we apply the ID3 algorithm and study how to add new packages. We also study different algorithms, including the decision tree algorithm, on the given dataset, and examine cross-validation with 2 to 10 folds and the resulting validation rate.

Reviews

len2242097

2/25/2019 1:46:25 AM

Criteria for assessment

Credit will be awarded against the following criteria.

1. The classifier derived using ID3 for Q1 [5 marks]
2. Convincing arguments and solution for Q2 [25 marks]
3. Valid analysis for Q3 [15 marks]
4. Clarity of explanation for Q4 [20 marks]
5. Experiment results and analysis for Q5 [10 marks]
6. The classifier derived using JRip for Q6 [5 marks]
7. Clarity of your observations for Q7 [20 marks]

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd