Reference no: EM133757460
Data Acquisition and Management
Assessment - Practical SQL Coding and Report
Your Task
You are required to:
Complete each of the SQL queries, both provided and self-constructed.
Take a screenshot of each of the queries and results - copy and paste these into your report.
In your report, explain and summarise the business insights found from executing the SQL queries and the data visualisations.
Submit your query script file (.sql) with all your queries using the SQL link within the assessment time limit of 2 hours, i.e. In class.
Background
The SQL Server contains data on Peer-to-Peer Lending. Connection details will be provided on the day of the assessment.
Section #1: Server Connection and Simple SQL Queries
Part 1: Updating the table to add a column
Details will be provided in class.
Part 2: Creating a SELECT Query
Details will be provided in class.
Section #2: SQL for Business Insights
Part 1: Creating and analyzing SQL Queries relating to Loan and Funded Amounts
Take screenshots of the above queries, its result and paste same in your report for interpretation.
Part 2: Creating SQL Queries relating to Loan Terms
Take screenshots of the query and its result and paste that in your report for interpretation.
Part 3: Creating SQL Queries relating to Interest Rates
Create an SQL statement that answers the following business questions:
Take screenshots of the query and its result and paste that in your report for interpretation.
Part 4: Creating SQL Queries relating to Loan Statuses
Take screenshots of the query and its result and paste that in your report for interpretation.
Part 5: Creating SQL Queries relating to Loan Grades
Take screenshots of the query and its result and paste that in your report for interpretation.
Part 6: Creating SQL Queries relating to Loan Defaults/Delinquencies
Take screenshots of the query and its result and paste that in your report for interpretation.
Section #3: SQL and Data Visualization for Business Insights
Part 1: Data Acquisition and Data Visualisations
Create a SELECT statement that returns all rows and columns of data.
Export and save the results of the query as CSV file.
Import the CSV file in either PowerBI or Tableau.
Using the above CSV file, carefully examine the columns that offer potential for creating at least three distinct visualizations.
These visualizations should aim to uncover valuable business insights to assist the manager in evaluating and approving loan applications more effectively. For instance, one visualization could analyse customer demographics to identify those most prone to default on their loans.
Take screenshots of each of the visualisations and paste that in your report for interpretation.
The Lending Club Dataset
Hands-on SQL
• During our hands-on SQL workshops, we practiced on the Lending Club 5K dataset, that is, the data was for 5000 customers.
• For Assessment #2, your lecturer will provide you in class with a similar dataset in week 8. This contains data on 10,000 customers.
Lending Club Dataset for Assessment #2
• For the assessment, the bank loan data dataset will have 10000 customers (10K).This dataset will be given to you in class in week 8 by your facilitator.
Your SQL File to Upload
For submission, your SQL script must be labelled in the following manner:
your_name_studentID_A2.sql
You must also repeat this inside your SQL script as an SQL comment.
Inside your SQL script, write the relevant SQL queries under the proper headings in the script.
This script template will be emailed/givento you by your facilitator in Week 8 on commencement of class. Please rename the downloaded file as follows: your_name_studentID_A2.sql
Lending Club Data Dictionary
The Lending Club dataset has the following features (columns of data):
• ID (id): this is the ID of the loan
• Member ID (member_id): this is the ID of the customer (the member)
• Loan Amount (loan_amnt): the amount the customer applied for
• Funded Amount (funded_amnt): the amount funded to the customer
• Term (term): the term of the loan i.e., 36 months or 60 months
• Interest Rate (int_rate):the interest rate charged on the loan
• Instalment (installment): the amount the customer regularly pays
• Grade (grade): Lending Club classification of the class of loan (type)
• Sub-Grade (sub_grade): the sub-grade classification
• Employee Length (emp_length): length of time customer has been employed
• Home Ownership (home_ownership): indicates the living situation of the customer - RENT, MORTGAGE, OWN, OTHER, and NONE
• Annual Income (annual_inc): the annual income of the customer
• Verification Status (verification_status): indicates whether the income source has been verified
• Issue Date (issue_d): when the loan was issued to the customer
• Loan Status (loan_status):
1. Current - the loan is currently active
2. Charged Off - loans which have been written-off i.e., a loss to Lending Club
3. Issued - loan has been issued to the customer
4. Fully Paid - indicates that the customer has completely paid off their loan
5. In Grace Period - customer has been granted a grace period for repayment
6. Late (31-120 days) - indicates that the customer is late with repayments
7. Late (15-30 days) - indicates that the customer is late with repayments
8. Does not meet the credit policy. Status: Charged Off
9. Does not meet the credit policy. Status: Fully Paid
10. Default - indicates that the customer has defaulted on their loan
Notes: loan_status column is a text data type so be very careful how you use SQL to query for loan statuses.
• Purpose (purpose): the purpose of the loan i.e., the reason for the loan application
• Address State (addr_state): the American state where the customer lives
• DTI (dti): A ratio calculated using the borrower's total monthly debt payments on the total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower's self-reported monthly income
• Earliest Credit Line (earliest_cr_line): the month the customer earliest reported credit line was opened
• Outstanding Principal (out_prncp): remaining principal amount on the loan
• Total Payment (total_pymnt): payments received to date for the amount funded
• Recoveries (recoveries): post ‘Charged Off' amount recovered from customer
• Last Payment Date (last_paymnt_d): the last time the customer made a payment on the loan
• Last Payment Amount (last_payment_amnt): the amount the customer last paid on the loan
• Application Type (application_type): either INDIVIDUAL or JOINT
HINTS:
• For the SQL query, think carefully about the data type i.e., text or numerical?
• For visualisations: make sure that each column of data imported into PowerBI, or Tableau,have the correct data type.
• If the column data type is not correct, the visualisation may not be what you expected, and the results may influence the insights obtained. For example, the member_id is simply an ID, and so it makes no sense to add member ids together!
• For visualisations: carefully consider the chart type and combinations of columns of data to use for discovering the insights.
Attachment:- bank_loan_data.rar