Estimate probability of default for a credit application

Reference no: EM13498969

1) Label each case as describing either data mining (DM), or the use of the results of data mining (Use).
a) _____ Choose customers who are most likely to respond to an on-line ad.

b) _____ Discover rules that indicate when an account has been defrauded.

c) _____ Find patterns indicating what customer behavior is more likely to lead to response to an on-line ad.

d) _____ Estimate probability of default for a credit application.

e) _____ Predict whether a customer is pregnant.
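One way to picture the distinction in question 1 is to separate the step that learns a pattern from labeled historical data from the step that applies the learned pattern to a new case. The sketch below is only an illustration under stated assumptions: the income bands, the tiny loan history, and the function names mine_default_rates and score_application are all hypothetical and are not taken from the course material.

```python
from collections import defaultdict

def mine_default_rates(history):
    """Mining step: learn a default rate per income band from labeled past loans."""
    counts = defaultdict(lambda: [0, 0])  # band -> [number of defaults, number of loans]
    for band, defaulted in history:
        counts[band][0] += int(defaulted)
        counts[band][1] += 1
    return {band: d / n for band, (d, n) in counts.items()}

def score_application(model, band):
    """Use step: apply the already-learned rates to a new, unlabeled application."""
    return model[band]

# Hypothetical historical loans: (income band, did the borrower default?)
history = [
    ("low", True), ("low", False), ("low", True), ("low", False),
    ("high", False), ("high", False), ("high", False), ("high", True),
]

model = mine_default_rates(history)           # happens once, on historical data
estimate = score_application(model, "low")    # happens for every new application
print(estimate)
```

The learned table is the result of mining; looking up a rate for an incoming application is a use of that result.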

2) Plumbing Inc. has been selling plumbing supplies for the last 20 years. The owner, Joe, decides that next year it is finally time to diversify by adding gardening tools to his products. Having had success using customer data to build predictive models to guide direct mail campaigns for special plumbing offers, he considers that data mining could help him to identify a subset of customers who should be good prospects for his new set of products. Is Joe ready to solve this as a supervised learning problem?

If yes - what would you suggest as the target variable?


If no - why not? What would you recommend that Joe do to achieve his business goal?

3) Choose a problem from a past job, hobby, or interest that would make for a good predictive modeling classification application. Describe it in one page or less, using the relevant concepts introduced in classes 1 & 2 and Ch. 1 - 3 in the book. Your description should be as complete and precise as possible, referring to the concepts introduced in class/in the book. Please do not choose one of the applications we have discussed already (churn, targeted marketing, default prediction, pregnancy prediction).
Include answers to the following:

a) What exactly is the business decision you want to support with this solution? (Specifically, what is the business action you are considering? Discuss briefly the timing of the decision and the eventual outcome.)

b) Describe the use phase.

c) Why did you select this as a good predictive modeling problem?

d) How and where would you get the data?

e) Explain precisely why and how you expect doing the predictive modeling will add value.

f) What exactly is the quantity that you inherently do not know and need to predict?

g) Is this a classification, ranking, or probability estimation problem?

h) What are the features? Provide a list of at least 5 features that you think (a) you can get and (b) you think might be useful.

i) What exactly would be your training data?
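For item g, it may help to see how the three problem types relate: a probability estimate can be sorted to give a ranking, or compared with a threshold to give a classification. The following is a minimal sketch; the names, scores, and the 0.5 threshold are made up for illustration and would depend on the business problem.

```python
# Hypothetical probability estimates produced by some already-built model
scores = {"alice": 0.82, "bob": 0.35, "carol": 0.64, "dave": 0.10}

def rank(scores):
    """Ranking: order the cases from most to least likely, ignoring the exact values."""
    return sorted(scores, key=scores.get, reverse=True)

def classify(scores, threshold=0.5):
    """Classification: turn each estimate into a yes/no label using a cutoff."""
    return {name: p >= threshold for name, p in scores.items()}

ranking = rank(scores)
labels = classify(scores)
print(ranking)
print(labels)
```

Which of the three outputs is needed depends on the decision being supported: a fixed budget for contacting customers calls for a ranking, while an expected-value calculation needs the probability itself.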

4) Hands-on (WEKA version). This is a first simple hands-on modeling task using Weka. Your task is to experiment with the classification tree induction algorithm in Weka. The data is available on NYU Classes in the data section under Resources->Datasets->Mailing (mailing_train.arff and mailing_test.arff). Build a classification tree using the J48 algorithm. If our classroom Weka demonstration was not enough, please consult the Weka tutorial (available under Resources->Weka->Weka_tutorial). It is useful to try to figure things out on your own, but if you get frustrated trying to figure out how to do something, please post a question to the discussion forum.

HINTS: A quick guide to the required commands: start Weka; select Explorer; use 'Open file' to load a dataset; go to the Classify tab and use the Choose button to pick J48 from the trees. Scroll around in the 'Classifier output' and try to understand what you see there.

I) Explore the evaluation options (test options in the Classify tab, on the left under the Choose button). Understand what they do in light of Chapter 5 (it is fairly straightforward, but you can also consult the Weka documentation or Google). Build/evaluate a tree under each of the 4 options (use the default whenever there is a parameter). Report the "accuracy" for each option and write a sentence or two about your observations (look at the summary in the Classifier output and identify the accuracy as the percent 'Correctly Classified Instances'; you can ignore all the other stuff for now).
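As a rough illustration of what the accuracy figure in part I measures, the sketch below computes the percent of correctly classified instances and shows how a cross-validation scheme partitions the data so that every instance is tested exactly once. It is not Weka's implementation: the function names are hypothetical, the labels are invented, and the fold assignment here is a plain round-robin with no shuffling or stratification.

```python
def accuracy(actual, predicted):
    """Percent of instances whose predicted label equals the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)

def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; each index appears in exactly one test fold."""
    folds = [list(range(i, n, k)) for i in range(k)]  # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# Hypothetical actual and predicted class labels for four instances
actual = ["yes", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "yes"]
acc = accuracy(actual, predicted)
print(acc)

splits = list(kfold_indices(10, 5))
print(splits[0])
```

Evaluating on the training set scores the same instances the tree was built from, while a supplied test set, cross-validation, and a percentage split all score instances that were held out during tree building.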

II) Figure out how to get predictions out of Weka (try the "More Options" button in the Test options) and copy a dozen of them from the 'Classifier output' window here.

III) Identify the most INFORMATIVE attribute (according to the tree induction) and explain how you found it.
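For part III, tree induction picks the attribute for the root split by an informativeness measure. The sketch below computes plain information gain on a toy table. Two assumptions should be kept in mind: J48 is Weka's implementation of C4.5, which selects splits by gain ratio rather than raw information gain, so this is a simplification; and the attribute names and rows are hypothetical, not taken from the mailing dataset.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy data; "responded" is the target variable
rows = [
    {"prior_buyer": "yes", "region": "north", "responded": "yes"},
    {"prior_buyer": "yes", "region": "south", "responded": "yes"},
    {"prior_buyer": "no", "region": "north", "responded": "no"},
    {"prior_buyer": "no", "region": "south", "responded": "no"},
]

gains = {a: information_gain(rows, a, "responded") for a in ("prior_buyer", "region")}
best = max(gains, key=gains.get)
print(gains, best)
```

In this toy table the attribute that separates the classes perfectly has the highest gain, and it is the one a tree learner using this measure would place at the root.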

IV) Examine the parameters of the tree induction by clicking on J48 in the box just to the right of the 'Choose' button. Set "unpruned" to True. Now, try changing the values for 'minNumObj' and see (i) how it affects in-sample accuracy by evaluating on the training set, and (ii) how it affects the generalization accuracy using the test set. Explain the results. Use the concepts from the readings where appropriate.






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd