### Build a classification tree model

Assignment Help Basic Statistics
##### Reference no: EM13831777

Classification Tree (6%) Plumbing Inc. has been selling plumbing supplies for the last 20 years. The owner, Joe, decides that it is time to diversify by adding gardening tools to the products.

Having had success using customer data to build predictive models to guide direct mail campaigns for special plumbing offers, he considers that data mining could help him to identify a subset of customers who should be good prospects for his new set of products.

Is Joe ready to solve this as a supervised learning problem?

Explain your reasoning. If yes - what would you suggest as the target variable?

If no - why not? Information Gain Consider the example on page 23 in the lecture note for Session 3 (the same figure is on page 49 in the textbook).

We calculated information gain from splitting the population based on Body Shape. The procedure is demonstrated in the lecture note and in the textbook. As an assignment, you will calculate the information gain from Body Color.

You already know the entropy of the population. Split the set on the values of Body Color and calculate the entropy of each subset. Using the entropy values of population and children (subsets), calculate the information gain from Body Color.

QWE case - Classification Tree Read the case document.

1) What is a target variable? In one or two sentences, please provide a definition for the target variable. Be as precise as possible.

2) Build a classification tree model using rpart for the problem you suggested in 1). Include all other variables in the dataset as explanatory variables in the model. Set cp=0.004. Plot a classification tree. Save and submit the image.

3) How will you interpret the model result? Describe your observation.

#### Questions Cloud

 Logical structure of active directory : How would you design the logical structure of Active Directory for the Rough Country Miles of Alaska, and what domain naming structure would you suggest? Calculate the total dollar amount of supervisions cost : Calculate the total dollar amount of Supervisions cost that would have been allocated to the Shapez product line, if activity-based costing had been used in the last quarter - Calculate the activity-cost driver rate for the Supervisions activity. What are some techniques that the writers and producers use : What are some techniques that the writers and producers use to "sell" their product? - Be specific and detailed as you describe examples from the video A dividend-paying stock : 22. A broker is considering buying a dividend-paying stock. The dividend will be paid at the end of the year. The analyst consensus is the stock will be worth \$36 in one year. The company pays a \$2.25 annual dividend (ex-dividend date is not a consid.. Build a classification tree model : What is a target variable and Build a classification tree model using part for the problem What kind of business is batty chan and co : What kind of business is Batty, Chan & Co? Advise Wood Crafts Pty Ltd regarding their right to recover \$9500 for the furniture from Batty, Chan & Co. Is there a binding contract? If so who is liable? Case study of the premier products : Premier Products, Inc. manufactures tennis rackets. Premier Products has grown extensively over the past two years. While the company has been very profitable, President Mark Harrison is concerned with its ability to cost products accurately. What causes the change in bill depth between 1976 and 1978 : What causes the change in bill depth between 1976 and 1978 Explain what is meant by the multiplier : Explain what is meant by the multiplier and explain what variable determines its size. Prove through use of algebra that in a two sector economy saving must be equal to planned investment at the equilibrium level of output.

### Write a Review

#### One strategy for increased sat exam scores is to take the

one strategy for increased sat exam scores is to take the test study on weak areas then retake the test. based on the

#### In 2008 a university in the midwestern united states

in 2008 a university in the midwestern united states surveyed its full-time first-year students after they completed

#### Probability that cereal have high calorie and high fiber

What is the probability that a cereal would both high calorie and high fiber? In other words, what is P(high calorie and high fiber)?

#### Find average profits for order quantities of interest

Using a simulation of 25 random numbers. Find average profits for order quantities of interest. Investigate in intervals of 500.

#### Solving for distance

The function s(t) = ut + 1/2 gt^2 represents the distance traveled s meters in time t seconds when the initial velocity is u m/s and acceleration is g m/s^2.

#### According to experian the population of consumer credit

according to experian the population of consumer credit card holders carries an average credit card balance of 5500

#### Monte carlo sampling and its assumptions

Monte Carlo sampling makes the assumption that all parameters of an alternative are: A. Independent B. Dependent C. Equal to 1.0 D. Probability distribution

#### Test for differences in three treatments

For the preceding problem you should find that there are significant differences among the three treatments. One reason for the significance is that the sample variances are small.

#### Explain subscribers respond to the survey

On average, 8% of the subscribers respond to the survey. Of the 8% who respond, an average of 53% say they will renew. Sample and Population.

#### The diameter of bearing shafts that are manufactured in

1. the diameter of bearing shafts that are manufactured in your factory follow a normal distribution with a mean of 1

#### Do these results provide sufficient evidence assume a005

a multiple regression analysis has been done on 24 observations with the following partial anova table

#### The final score of games of a certain spot were compared

the final score of games of a certain spot were compared against the final point spreads established by oddsmakers. the