Build a classification tree model

Assignment Help Basic Statistics
Reference no: EM13831777

Classification Tree (6%) Plumbing Inc. has been selling plumbing supplies for the last 20 years. The owner, Joe, decides that it is time to diversify by adding gardening tools to the products.

Having had success using customer data to build predictive models to guide direct mail campaigns for special plumbing offers, he considers that data mining could help him to identify a subset of customers who should be good prospects for his new set of products.

Is Joe ready to solve this as a supervised learning problem?

Explain your reasoning. If yes - what would you suggest as the target variable?

If no - why not? Information Gain Consider the example on page 23 in the lecture note for Session 3 (the same figure is on page 49 in the textbook).

We calculated information gain from splitting the population based on Body Shape. The procedure is demonstrated in the lecture note and in the textbook. As an assignment, you will calculate the information gain from Body Color.

You already know the entropy of the population. Split the set on the values of Body Color and calculate the entropy of each subset. Using the entropy values of population and children (subsets), calculate the information gain from Body Color.

QWE case - Classification Tree Read the case document.

1) What is a target variable? In one or two sentences, please provide a definition for the target variable. Be as precise as possible.

2) Build a classification tree model using rpart for the problem you suggested in 1). Include all other variables in the dataset as explanatory variables in the model. Set cp=0.004. Plot a classification tree. Save and submit the image.

3) How will you interpret the model result? Describe your observation.

Reference no: EM13831777

Questions Cloud

Logical structure of active directory : How would you design the logical structure of Active Directory for the Rough Country Miles of Alaska, and what domain naming structure would you suggest?
Calculate the total dollar amount of supervisions cost : Calculate the total dollar amount of Supervisions cost that would have been allocated to the Shapez product line, if activity-based costing had been used in the last quarter - Calculate the activity-cost driver rate for the Supervisions activity.
What are some techniques that the writers and producers use : What are some techniques that the writers and producers use to "sell" their product? - Be specific and detailed as you describe examples from the video
A dividend-paying stock : 22. A broker is considering buying a dividend-paying stock. The dividend will be paid at the end of the year. The analyst consensus is the stock will be worth $36 in one year. The company pays a $2.25 annual dividend (ex-dividend date is not a consid..
Build a classification tree model : What is a target variable and Build a classification tree model using part for the problem
What kind of business is batty chan and co : What kind of business is Batty, Chan & Co? Advise Wood Crafts Pty Ltd regarding their right to recover $9500 for the furniture from Batty, Chan & Co. Is there a binding contract? If so who is liable?
Case study of the premier products : Premier Products, Inc. manufactures tennis rackets. Premier Products has grown extensively over the past two years. While the company has been very profitable, President Mark Harrison is concerned with its ability to cost products accurately.
What causes the change in bill depth between 1976 and 1978 : What causes the change in bill depth between 1976 and 1978
Explain what is meant by the multiplier : Explain what is meant by the multiplier and explain what variable determines its size. Prove through use of algebra that in a two sector economy saving must be equal to planned investment at the equilibrium level of output.

Reviews

Write a Review

Basic Statistics Questions & Answers

  Three coins are tossed simultaneously

Three coins are tossed simultaneously, consider the event E "Three heads or three tails" F " At least two heads" and G "Atmost two heads " of the pairs (E,F), (E,G) and (F,G), which are independent? Which are dependent?

  What information would we need to test the claim

What information would we need to test the claim that the difference in monthly benefits between the two groups is greater than $30 at the 0.05 level of significance? Write out the hypotheses and explain the testing procedure.

  Difference in two population proportions and means

Which of the given signifies difference in two population proportions?

  Use scientific process for better understand personality

How have you used the scientific process (unsystematic observation, building theories, and evaluating propositions) in your life to better understand your personality?

  Explain these statistics results

Explain these results to a person who has never had a course in statistics - An experiment is conducted in which 60 participants each fill out a personality test,

  A skeptical paranormal researcher claims the proportion of

a skeptical paranormal researcher claims the proportion of americans that have seen a ufo p is less than 1 in every one

  An airline company sells 160 tickets for a plane with 150

an airline company sells 160 tickets for a plane with 150 seats knowing that the probability that a passenger will not

  A study of various news home pages reports that the mean

a study of various news home pages reports that the mean number of bad links per home page is 0.4. if the errors occur

  Suppose that the average life of a refrigerator before

suppose that the average life of a refrigerator before replacement is m 14 years with a 99.7 of data range from 8 to

  The operationalization of a construct means that there

the operationalization of a construct means that there must be looked for as many related dimensions and elements of

  Prediction interval for the regression lineusing the age

prediction interval for the regression line.using the age and sick days data from table below find the 98 prediction

  The mean and the standard deviation of the sample of 100

the mean and the standard deviation of the sample of 100 bank customer waiting times in table 1.8 are xbar5.49 and

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd