Create a decision tree which can predict class membership

Assignment Help Other Subject
Reference no: EM132265911

Data Science Assignments -

Assignment 1: Understanding Customer Churn at BondTelco

Congratulations, you are a newly employed data scientist at BondTelco. BondTelco is a retail provider of contract mobile phone services. Although quite a large company, BondTelco have not yet dabbled in Data Science, indeed, you are their first employee in this role.

Currently, management at BondTelco are concerned about the high rate of churn among their customers. To try and address this concern, the sales staff have been encouraged to ring up customers whose contracts are coming due, and offer them incentives to stay with the company. Unfortunately for management, this is an expensive process, as it involves offering incentives to all customers, whether they are likely to leave or not. What is really required is a way to predict whether a given customer is likely to leave or stay with the company. In this way, only customers who are likely to leave will be offered the incentives, thereby reducing costs.

The first question you have been asked to address is: 'Is there a way to determine in advance which customers are likely to leave when their contracts are up?'

The IT team provide you with access to the company database, which contains a table (BRUCEDBA.BondTelco_Customers) containing data on 20,000 previous customers, including whether they left or stayed with the company at the end of their contract period. The data in this table are:

COLLEGE : Is the customer college educated?

INCOME : Annual income

OVERAGE : Average overcharges per month

LEFTOVER : Average % leftover minutes per month

HOUSE : Value of dwelling (from census tract)

HANDSET_PRICE : Cost of phone

OVER_15MINS_CALLS_PER_MONTH : Average number of long (>15 mins) calls per month

AVERAGE_CALL_DURATION : Average call duration

REPORTED_SATISFACTION : Reported level of satisfaction

REPORTED_USAGE_LEVEL : Self-reported usage level

CONSIDERING_CHANGE_OF_PLAN : Was customer considering changing his/her plan?

LEAVE : Whether customer left or stayed

Your goal is to create a decision tree which can predict class membership of the LEAVE variable. The sales team will then use your tree's rules to determine whether a customer is likely to stay with or leave the company. Clearly, the better the tree prediction, the more retention costs can be reduced.

Assignment 2: Predicting the probability a Customer churns at BondTelco

Your previous work on Decision Trees impressed the management at BondTelco. Now that the management are talking an interest in Data Science, they have heard that there is a method of directly estimating the probability that a specific customer will churn.

The decision tree you created was certainly useful, but management would like to go a step further. Instead of just offering all customers who are likely to churn an incentive to stay, they would like to tier the levels of incentive. What is needed is an estimate of the probability that a given customer will leave.

Although they definitely don't want to interfere with your work, management have also heard about ideas like splitting the data into training and testing sets, and cross validation. They don't really know much about different techniques for doing this, and suggest that you should look into it, and see if it is a good approach for this particular problem.

You have access to the same data as Assignment 1.

Your goal is to create the best model you can to predict the probability a customer will LEAVE. You will need to be able to assess the models ability to predict on unseen data. The sales team will then use this model, and thus your probability estimate, to decide the kind of incentive the customer should be offered. Clearly, the better the prediction, the more the spending on retention costs can be optimized.

Assignment 3: Will a given customer churn at BondTelco?

Although your Decision Trees and Logistic Regression models have been making news around the company, as a solid Data Scientist, you know that to come up with a good model, you need to try a number of statistical learning techniques.

Your next job is to use knn classification to determine the probability that a given customer will churn.

You have access to the same data as Assignment 1.

Your goal is to create the best model you can to predict the probability a customer will LEAVE. You will need to be able to assess the models ability to predict on unseen data.

Attachment:- Assignment Files.rar

Reference no: EM132265911

Questions Cloud

The relationship between technology and criminal justice : Smartphone's, the Internet, and rapid advancement of technology have come to influence all aspects of people's daily lives.
Examine how would you handle the situation : You have been dispatched to assist the Sheriff's Department with a reporting shooting. You are told to stage two blocks west of the incident.
Inverse relationship between bonds : Thus, the required return is lowest for AAA-rated bonds, and required returns increase as the ratings get lower. Is this statement true or false?
What are the strengths and weaknesses in your writing : You have accomplished quite a bit this session. You have just about satisfied the writing requirement for this class in discussion posts, essays, and of course.
Create a decision tree which can predict class membership : INFT12-216: Data Science Assignment, Bond University, Australia. Create a decision tree which can predict class membership of the LEAVE variable
Investment in justus corporation : You are considering an investment in Justus Corporation's stock, which is expected to pay a dividend of $2.00 a share at the end of the year (D1 = $2.00)
Calculate the cost of equity for a corporation using capm : Calculate the cost of equity for a corporation using CAPM. Remember, if you are provided with a 6-month Treasury bill rate and a 20year Treasury bond rate
Disucss the properties of contemporary leadership model : How would you apply at least one contemporary leadership model from this chapter to a real (or hypothesized) health leadership situation or case?
Calculate the net present value of the new machine : The old equipment has a zero salvage value. Use straight line depreciation to calculate the depreciation. Calculate the net present value of the new machine?

Reviews

len2265911

3/25/2019 4:03:23 AM

Assignment 1: Deliverables: Your final deliverables will be 2 PDF files, both produced by the same .Rmd script (with different code chunk options). You must submit: PDF1 - all code and results shown (like you would share with a colleague on the Data Science team) PDF2 - only show those things necessary to help support management decision making (this is the one you send to management!) These will both be submitted online through iLearn. Note: Everything in R takes much longer than you think it will! My advice is to start promptly! I cannot give you an extension because you didn’t start early enough… its not fair to other students!

len2265911

3/25/2019 4:03:17 AM

Assignment 2: Deliverables: Your final deliverables will be 2 PDF files, both produced by the same .Rmd script (with different code chunk options). You must submit: PDF1 - all code and results shown (like you would share with a colleague on the Data Science team) PDF2 - only show those things necessary to help support management decision making (this is the one you send to management!) These will both be submitted online through iLearn. Document - Focus on a good document structure and layout (revisit week 2 on repeatable research if necessary) Hint: Think about the headings in the document you produce.

len2265911

3/25/2019 4:03:10 AM

Assignment 3: Deliverables: Your final deliverables will be 2 PDF files, both produced by the same .Rmd script (with different code chunk options). You must submit: PDF1 - all code and results shown (like you would share with a colleague on the Data Science team) PDF2 - only show those things necessary to help support management decision making (this is the one you send to management!) These will both be submitted online through iLearn. Focus on letting the visualizations do the talking. Only include explanatory text where it is really necessary… although you should remember that management do not really understand data science, so you will need to find a tradeoff between understandability and verbosity. Verbose assignments will be penalized. You will need to use visualizations. You will need to be able to show how good your model(s) is/are at classifying. You will need to show some example predictions from new data.

len2265911

3/25/2019 4:03:05 AM

Note: As is the case with all assignments I set, if you do the minimum (correctly), then you will receive half marks. Additional marks are awarded for those assignments where you have clearly put in additional thought, whether it be in visualization, modelling, succinctness, or coding elegance.

Write a Review

Other Subject Questions & Answers

  Cross-cultural opportunities and conflicts in canada

Short Paper on Cross-cultural Opportunities and Conflicts in Canada.

  Sociology theory questions

Sociology are very fundamental in nature. Role strain and role constraint speak about the duties and responsibilities of the roles of people in society or in a group. A short theory about Darwin and Moths is also answered.

  A book review on unfaithful angels

This review will help the reader understand the social work profession through different concepts giving the glimpse of why the social work profession might have drifted away from its original purpose of serving the poor.

  Disorder paper: schizophrenia

Schizophrenia does not really have just one single cause. It is a possibility that this disorder could be inherited but not all doctors are sure.

  Individual assignment: two models handout and rubric

Individual Assignment : Two Models Handout and Rubric,    This paper will allow you to understand and evaluate two vastly different organizational models and to effectively communicate their differences.

  Developing strategic intent for toyota

The following report includes the description about the organization, its strategies, industry analysis in which it operates and its position in the industry.

  Gasoline powered passenger vehicles

In this study, we examine how gasoline price volatility and income of the consumers impacts consumer's demand for gasoline.

  An aspect of poverty in canada

Economics thesis undergrad 4th year paper to write. it should be about 22 pages in length, literature review, economic analysis and then data or cost benefit analysis.

  Ngn customer satisfaction qos indicator for 3g services

The paper aims to highlight the global trends in countries and regions where 3G has already been introduced and propose an implementation plan to the telecom operators of developing countries.

  Prepare a power point presentation

Prepare the power point presentation for the case: Santa Fe Independent School District

  Information literacy is important in this environment

Information literacy is critically important in this contemporary environment

  Associative property of multiplication

Write a definition for associative property of multiplication.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd