Reference no: EM132166493
Question One.
a. Create a single table that displays the mean interest rate charged for each combination of loan grade and loan status. Briefly describe any patterns that you observe from the table.
b. Create a table (or expand your table from part a.) so that the number of loans for each combination of loan grade and loan status is also displayed. Briefly describe any pattern(s) that you observe.
c. Construct an appropriate graph for displaying the percentage of loans that are "Charged Off" by loan grade.
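As a starting point for the tables in Question One, here is a minimal sketch of the aggregation using only the Python standard library. The rows in `loans` are made-up placeholders, not Lending Club data, and the helper names `summarize` and `charged_off_pct` are hypothetical. A spreadsheet pivot table produces the same numbers.

```python
from collections import defaultdict

# Hypothetical rows: (grade, loan_status, int_rate in percent).
# These values are invented for illustration only.
loans = [
    ("A", "Fully Paid", 6.0),
    ("A", "Fully Paid", 8.0),
    ("A", "Charged Off", 7.5),
    ("B", "Fully Paid", 10.0),
    ("B", "Charged Off", 11.0),
    ("B", "Charged Off", 12.0),
]

def summarize(rows):
    """Return {(grade, status): (mean interest rate, number of loans)}."""
    totals = defaultdict(lambda: [0.0, 0])  # [sum of rates, count]
    for grade, status, rate in rows:
        cell = totals[(grade, status)]
        cell[0] += rate
        cell[1] += 1
    return {key: (s / n, n) for key, (s, n) in totals.items()}

def charged_off_pct(rows):
    """Return {grade: percentage of that grade's loans that were charged off}."""
    counts = defaultdict(lambda: [0, 0])  # [charged off, total]
    for grade, status, _ in rows:
        counts[grade][1] += 1
        if status == "Charged Off":
            counts[grade][0] += 1
    return {grade: 100.0 * c / n for grade, (c, n) in counts.items()}

table = summarize(loans)
pct = charged_off_pct(loans)
for (grade, status), (mean_rate, n) in sorted(table.items()):
    print(f"{grade}  {status:<12} mean rate = {mean_rate:.2f}%  loans = {n}")
```

The percentages in `pct` are the values that the graph in part c. would plot, one bar per grade.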
Question Two.
a. Construct a graph that displays the number of loans made for each month from June 2007 through December 2011. What, if any, trend do you observe?
b. Modify your graph from part a. to determine if the trend over time is consistent across loan grade (i.e., do you see the same pattern for each grade). What do you find?
Question Three.
Consider the variable revol_util (revolving credit utilization). We would like to learn if this variable may be a good predictor of loan outcome. Construct a table or graph (choice is yours) that helps answer this question. What do you conclude?
Question Four.
Is income related to loan grade? First, filter the data so that only observations having incomes less than $250,000 appear. Use the filtered data to create a pivot table with income as the row variable and loan grade as the column variable. Group income into the following bins: $0 to under $50,000; $50,000 to under $100,000; $100,000 to under $150,000; $150,000 to under $200,000; $200,000 to under $250,000. Use your pivot table to answer the following questions:
a. What is P(Grade A, B, or C | income $0 to under $50,000)?
b. What is P(Grade A, B, or C | income $100,000 to under $150,000)?
c. Do these values appear to be significantly different? What does this say about the relationship between income and loan grade?
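The conditional probabilities in Question Four are row proportions: the number of A, B, or C loans in an income bin divided by all loans in that bin. The sketch below shows the arithmetic on invented (income, grade) pairs. The names `BINS`, `income_bin`, and `p_abc_given_bin` are hypothetical, and the real figures come from the pivot table built on the filtered data.

```python
# Income bins from the question: lower bound inclusive, upper bound exclusive.
BINS = [
    (0, 50_000),
    (50_000, 100_000),
    (100_000, 150_000),
    (150_000, 200_000),
    (200_000, 250_000),
]

def income_bin(income):
    """Return the (low, high) bin containing income, or None if outside all bins."""
    for low, high in BINS:
        if low <= income < high:
            return (low, high)
    return None  # incomes of $250,000 or more are filtered out

def p_abc_given_bin(rows, target_bin):
    """Estimate P(grade is A, B, or C | income falls in target_bin)."""
    grades = [grade for income, grade in rows if income_bin(income) == target_bin]
    if not grades:
        raise ValueError("no observations in this income bin")
    return sum(grade in {"A", "B", "C"} for grade in grades) / len(grades)

# Hypothetical (annual income, grade) pairs, invented for illustration only.
rows = [
    (30_000, "A"), (45_000, "D"), (20_000, "B"), (49_000, "E"),
    (120_000, "A"), (110_000, "B"), (140_000, "F"), (130_000, "C"),
    (300_000, "A"),  # excluded by the income filter
]

p_low = p_abc_given_bin(rows, (0, 50_000))
p_mid = p_abc_given_bin(rows, (100_000, 150_000))
print(p_low, p_mid)
```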
Question Five.
Construct a 95% confidence interval for the following:
a. Debt to income ratio for loans that were fully paid in 2010. Interpret this interval in the context of the problem.
b. The difference in the debt to income ratio in 2010 for loans that were fully paid versus those that were charged off. Interpret this interval in the context of the problem. We cannot assume that the variance in dti for fully paid loans is equal to the variance in dti for charged off loans.
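Because equal variances cannot be assumed for the difference in Question Five, one common choice is the unpooled (Welch) interval, with degrees of freedom from the Welch-Satterthwaite approximation. The sketch below is one way to compute it under that assumption. The function name `welch_interval` and the sample values are hypothetical. The standard library has no t distribution, so the critical value is passed in from a t table; when it is omitted, the code falls back to the normal quantile, which is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_interval(x, y, crit=None, level=0.95):
    """Unpooled CI for mean(x) - mean(y); returns (lower, upper, welch_df).

    crit is the critical value to use (for example, a t-table value at
    welch_df). If None, the standard normal quantile is used instead,
    which is a large-sample approximation.
    """
    n1, n2 = len(x), len(y)
    v1 = variance(x) / n1  # squared standard error of mean(x)
    v2 = variance(y) / n2  # squared standard error of mean(y)
    diff = mean(x) - mean(y)
    se = sqrt(v1 + v2)
    # Welch-Satterthwaite approximate degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    if crit is None:
        crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff - crit * se, diff + crit * se, df

# Hypothetical dti values, invented for illustration only.
fully_paid = [10, 12, 14, 16, 18]
charged_off = [15, 17, 19, 21, 23]

# First call finds the degrees of freedom (8 for this toy data);
# 2.306 is the t-table value for 97.5% at 8 degrees of freedom.
_, _, df = welch_interval(fully_paid, charged_off)
lower, upper, _ = welch_interval(fully_paid, charged_off, crit=2.306)
print(f"df = {df:.1f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

If the interval excludes zero, the data suggest the mean dti differs between the two groups. With the several thousand loans issued in 2010, the t and normal critical values will be nearly identical.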
Question Six.
Conduct a hypothesis test to determine if the proportion of loans that were charged off in 2008 is different from the proportion of loans that were charged off in 2011. Use α = .01. What is your conclusion? Given financial events during that time period, is this the result you expected?
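A two-sided, two-proportion z-test with a pooled standard error is one standard way to run the test in Question Six. The sketch below shows that calculation. The function name `two_proportion_z_test` and the counts are hypothetical placeholders, so the printed result says nothing about the real 2008 and 2011 loans.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2. Returns (z, p_value).

    x1, x2 are counts of charged-off loans; n1, n2 are total loans per year.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts, invented for illustration only.
z, p_value = two_proportion_z_test(x1=50, n1=200, x2=30, n2=200)
alpha = 0.01
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The null hypothesis is rejected at α = .01 only when the p-value falls below .01, or equivalently when |z| exceeds about 2.576.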
NOTE: Here is the list of lending club variables that you will need for this project. The first column contains the column heading, the second column provides a brief definition. For a more complete description of the variable, refer to the Lending Club data dictionary.
int_rate | interest rate
grade | loan grade
loan_status | loan status
issue_d | loan issue date--month and year
revol_util | revolving credit utilization
loan_status | loan status (outcome fully paid or charged off)
annual_inc | annual income
dti | debt to income ratio
Attachment:- home work.rar