Reference no: EM132343848
Assignment
Answer the following questions. Show output from the various programs used to answer them. Answers should be uploaded in a neat, easy-to-read Word document. Move all graphs, charts, and tables into that single document. Do not upload spreadsheets. Be sure to read this week's written lecture for links and other helpful information. Necessary datasets are located in the content folder. These questions ask you to explain, describe, or outline something in addition to the program output. The essay parts count for about 50 percent of the points in your answer, so be sure to include well-considered, detailed explanation and discussion in your own words. Use APA-style references and citations if needed. Copying and pasting or similar plagiarism or cheating will result in zero points on the entire assignment. These questions are from Chapter 7 of Shmueli, Bruce, and Patel.
Universal Bank has a number of depositors and a small number of borrowers. The bank wants to expand the number of borrowers by selling personal loans to its depositors. The marketing department wants to devise a better target marketing campaign by using k-NN to predict whether a new customer will accept a loan offer.
The file UniversalBank (See attached) contains data on 5000 customers with demographic information, the customers' relationship with the bank, and the response to the last personal loan campaign. Of the customers, only 9.6 percent accepted the previously offered personal loan.
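The acceptance rate above can be turned into counts, which helps when reading any classification matrix later on. The sketch below uses only the two figures given in the text (5000 customers, 9.6 percent acceptance); the variable names are illustrative, and the "predict no loan for everyone" rule is shown only as a point of comparison, not as part of the assignment.

```python
n_customers = 5000
accept_per_thousand = 96  # 9.6 percent written as integer parts per thousand

n_accepted = n_customers * accept_per_thousand // 1000
n_declined = n_customers - n_accepted

# A rule that predicts "no loan" for every customer is right for every decliner,
# so its accuracy equals the share of decliners.
naive_accuracy = n_declined / n_customers

print(n_accepted, n_declined, naive_accuracy)  # 480 4520 0.904
```

With classes this unbalanced, overall accuracy alone can look high even when few acceptors are found, so the counts for the success class deserve separate attention.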
Be sure to transform categorical predictors with more than two categories into dummy variables and to partition the data into training and validation sets (60/40). It is not necessary to transform predictors using SPSS.
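The two preprocessing steps can be sketched in plain Python as follows. This is a minimal illustration, not the required workflow: it assumes Education is coded 1, 2, 3 (as the dummy names Education_1 to Education_3 in question 1 suggest), the "Personal Loan" column name and the helper names `one_hot_education` and `partition` are hypothetical, and the ten toy records are invented rather than taken from the UniversalBank file.

```python
import random

def one_hot_education(record):
    """Replace the 3-level Education code (assumed 1, 2, 3) with dummy variables."""
    out = {key: value for key, value in record.items() if key != "Education"}
    for level in (1, 2, 3):
        out[f"Education_{level}"] = 1 if record["Education"] == level else 0
    return out

def partition(rows, train_frac=0.6, seed=1):
    """Shuffle a copy of rows and split it into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Toy records, made up for illustration only.
rows = [{"Age": 25 + i, "Education": 1 + i % 3, "Personal Loan": i % 2}
        for i in range(10)]

encoded = [one_hot_education(r) for r in rows]
train, valid = partition(encoded)
print(len(train), len(valid))  # 6 4
```

Fixing the seed keeps the partition reproducible, so results reported in the write-up can be regenerated.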
1. Consider the following customer:
Age=40, Experience=10, Income=84, Family=2, CCAvg=2, Education_1=0, Education_2=1, Education_3=0, Mortgage=0, Securities Account=0, CD Account=0, Online=1, and Credit Card=1. Perform a k-NN classification with all predictors except ID and ZIP code using k=1. Specify the success class as 1 (loan acceptance), and use the default cutoff value of 0.5. How would this customer be classified?
2. What is a choice of k that balances between overfitting and ignoring the predictor information?
3. Show the classification matrix for the validation data that results from using the best k.
4. Consider the following customer:
Age=40, Experience=10, Income=84, Family=2, CCAvg=2, Education_1=0, Education_2=1, Education_3=0, Mortgage=0, Securities Account=0, CD Account=0, Online=1, and Credit Card=1. Classify the customer using the best k.
5. Repartition the data, this time into training, validation and test sets (50%, 30%, 20%). Apply the k-NN algorithm with the k chosen above. Compare the classification matrix of the test set with that of the training and validation sets. Comment on the differences and the reason for it.
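The workflow behind questions 1 to 5 can be sketched end to end in plain Python: partition the data, classify with k-NN at a 0.5 cutoff, pick k on the validation set, and compare classification matrices across partitions. This is a hedged illustration under stated assumptions, not the expected solution: the 200 records are synthetic, the two features are assumed to be already on a common scale, Euclidean distance is assumed, and all function names and the candidate list of k values are hypothetical. The numbers it prints say nothing about the UniversalBank data.

```python
import random

def three_way_split(rows, fractions=(0.5, 0.3, 0.2), seed=1):
    """Shuffle a copy of rows and cut it into training, validation, and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut1 = round(fractions[0] * len(shuffled))
    cut2 = cut1 + round(fractions[1] * len(shuffled))
    return shuffled[:cut1], shuffled[cut1:cut2], shuffled[cut2:]

def knn_predict(train, x, k, cutoff=0.5):
    """Return 1 if the share of class-1 labels among the k nearest rows reaches cutoff."""
    nearest = sorted(train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x)))[:k]
    return 1 if sum(label for _, label in nearest) / k >= cutoff else 0

def classification_matrix(train, rows, k):
    """Count actual-versus-predicted outcomes; the success class is 1."""
    m = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for x, y in rows:
        p = knn_predict(train, x, k)
        if y == 1:
            m["TP" if p == 1 else "FN"] += 1
        else:
            m["FP" if p == 1 else "TN"] += 1
    return m

def accuracy(m):
    return (m["TP"] + m["TN"]) / sum(m.values())

# Synthetic records: two features in [0, 1]; the label depends noisily on the first.
rng = random.Random(0)
data = []
for _ in range(200):
    x = [rng.random(), rng.random()]
    y = 1 if x[0] + rng.gauss(0, 0.1) > 0.8 else 0
    data.append((x, y))

train, valid, test = three_way_split(data)

# Choose k using the validation set only; the smallest k wins ties.
candidates = [1, 3, 5, 7, 9, 11]
scores = {k: accuracy(classification_matrix(train, valid, k)) for k in candidates}
best_k = max(candidates, key=lambda k: scores[k])

for name, part in (("training", train), ("validation", valid), ("test", test)):
    m = classification_matrix(train, part, best_k)
    print(name, m, round(accuracy(m), 3))

# Classifying one new, hypothetical record with the chosen k.
print("new record class:", knn_predict(train, [0.5, 0.5], best_k))
```

Two design points matter when interpreting the output. Each training record counts itself among its own neighbors, so the training matrix is optimistic (with k=1 it is perfect by construction). The validation set was used to choose k, so only the test set gives an estimate untouched by any tuning decision.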
Background: A relatively young bank is growing rapidly in terms of overall customer acquisition. The majority of these customers are liability customers with varying sizes of relationship with the bank. The base of asset customers is quite small, and the bank wants to grow this base rapidly to bring in more loan business.
Specifically, it wants to explore ways of converting its liability customers to Personal Loan customers.
A campaign the bank ran for liability customers last year showed a healthy conversion rate of over 9%. This has encouraged the Retail Marketing department to devise smarter campaigns with better target marketing.
Analytics Objectives:
1. While designing a new campaign, can we model the previous campaign's customer behavior to analyze which combination of parameters makes a customer more likely to accept a personal loan?
2. The bank offers several special products and facilities, such as CD and securities accounts, online services, and credit cards. Can we spot any association among these to find cross-selling opportunities?
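One simple way to probe the second objective is to compute the lift between pairs of product flags. The sketch below is only one possible approach, chosen here for illustration; the function name `lift` is hypothetical and the eight toy customers are invented, so the value it prints is not a finding about the bank's data.

```python
def lift(rows, a, b):
    """Lift of products a and b: P(a and b) / (P(a) * P(b)).

    A value near 1 suggests the two products are held independently;
    a value above 1 suggests they tend to be held together.
    Assumes both products are held by at least one customer.
    """
    n = len(rows)
    p_a = sum(r[a] for r in rows) / n
    p_b = sum(r[b] for r in rows) / n
    p_ab = sum(1 for r in rows if r[a] == 1 and r[b] == 1) / n
    return p_ab / (p_a * p_b)

# Toy 0/1 flags, made up for illustration only.
cd_flags = [1, 1, 0, 0, 0, 0, 0, 0]
online_flags = [1, 1, 1, 1, 0, 0, 0, 0]
customers = [{"CD Account": c, "Online": o} for c, o in zip(cd_flags, online_flags)]

print(lift(customers, "CD Account", "Online"))  # 2.0
```

In this toy sample every CD account holder also uses online services, which is twice the rate expected under independence, hence a lift of 2.0.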
Attachment: Assignment Files.rar