Alternative data-powered machine learning modelling for digital lending
Published on February 8, 2022 by Mrinal Shankar
The two most common challenges banks face in acquiring new customers for lending products are the following:
- Finding potential borrowers for lending products at a lower marketing spend
- Rejecting loan applications accurately to reduce non-performing loans (NPLs)
Banks could reduce their marketing expenses through more focused customer targeting using machine-learning and deep-learning models. Most banks build the independent features for training these models from in-house data; only a few use alternative data sources to capture signals relating to the behavioural patterns of prospective customers.
Once a customer applies for a lending product, the application is screened by the bank’s underwriting process. Banks commonly use only bureau data to build the credit risk or probability of default (PD) models that underwrite these applications. Bureau data comprises information such as credit scores, lines of credit, utilisation, revolving balance and delinquency, which helps measure a customer’s creditworthiness. However, not all customers have sufficient credit history: some have never taken institutional credit, and others have only a thin credit file. Hence, most of them get rejected by traditional underwriting models. Alternative data sources help generate behavioural patterns for such customers and, in turn, provide more information on their creditworthiness.
Lending acquisition strategy
While banks commonly use their own data for customer relationship management (CRM) and other marketing services, significant opportunity still exists to leverage the large volumes and dimensions of alternative data sources. Artificial intelligence (AI) and natural language processing (NLP) are used to extract information and patterns from structured and unstructured alternative data sources, some of which are mentioned below.
The diagram below depicts the process of using these alternative data sources as inputs for developing predictive models to target potential borrowers and for developing credit underwriting models.
Creating intent from alternative data sources
Lending intent refers to customer needs for which they will most likely require credit; it could be generated from the signals extracted from alternative data sources. Banks could use such intent to target customers according to their need for a particular lending product and at the right time. Customers knowingly or unknowingly leave a digital footprint, and AI could be used to extract signals from it across alternative data sources.
Types of leads for a bank
The underlying features used to develop predictive models differ between two broad customer segments: existing bank customers and prospective customers. These segments can be further divided into sub-segments based on their lending product relationship with the bank and with other banks. The diagram below depicts the types of leads for a bank.
The propensity to take up a lending product, or to default on one, depends largely on a customer’s existing or past lending holdings. Hence, building separate models for the abovementioned lead categories helps increase model performance and also helps in planning targeting campaigns.
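As a rough illustration of the segmentation described above, the sketch below assigns each lead to a category so that a separate model could be trained per category. The field names and category labels are hypothetical, chosen only for illustration; the actual sub-segments are those shown in the diagram.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Lead:
    lead_id: str
    is_bank_customer: bool    # existing bank customer vs prospective customer
    has_loan_with_bank: bool  # holds (or held) a lending product with this bank
    has_loan_elsewhere: bool  # holds (or held) a lending product with another bank


def lead_category(lead: Lead) -> str:
    """Map a lead to a hypothetical category label of the form segment/holding."""
    if lead.is_bank_customer:
        if lead.has_loan_with_bank:
            return "existing/lending_with_bank"
        if lead.has_loan_elsewhere:
            return "existing/lending_elsewhere"
        return "existing/no_lending"
    # A prospect has no relationship with this bank, so only other banks matter.
    if lead.has_loan_elsewhere:
        return "prospect/lending_elsewhere"
    return "prospect/no_lending"


def group_by_category(leads):
    """Group lead IDs by category, giving one training population per model."""
    groups = defaultdict(list)
    for lead in leads:
        groups[lead_category(lead)].append(lead.lead_id)
    return dict(groups)
```

Each resulting group would then feed its own propensity or default model, since the useful features differ by lending history.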
Lending product uptake models
Propensity models for existing bank customers identify the customers most likely to take up lending products within a determined timeframe. In the absence of previous lending campaign data, the propensity model identifies the customers most likely to show interest in a particular lending product. Alternative data sources could be linked with bank data through deterministic and probabilistic ID matches or through third-party link providers. Deterministic matching [1] joins devices using personally identifiable information (PII) such as email, name and phone number. Probabilistic matching infers device relationships using a knowledge base of linkage data and predictive algorithms. The following diagram shows the type of propensity models that could be built for the leads:
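The two ID-matching approaches mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production linkage method: the record fields (`customer_id`, `device_id`, `email`, `phone`, `name`, `postcode`) are hypothetical, and the probabilistic score is a simple weighted string similarity standing in for the knowledge base and predictive algorithms a real provider would use.

```python
import hashlib
from difflib import SequenceMatcher


def normalise_email(email: str) -> str:
    return email.strip().lower()


def normalise_phone(phone: str) -> str:
    # Keep digits only so that formatting differences do not block a match.
    return "".join(ch for ch in phone if ch.isdigit())


def pii_key(email: str, phone: str) -> str:
    """Hash the normalised PII so raw values need not be stored in the index."""
    raw = f"{normalise_email(email)}|{normalise_phone(phone)}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def deterministic_match(bank_records, alt_records):
    """Exact join on hashed email and phone: returns {device_id: customer_id}."""
    index = {pii_key(r["email"], r["phone"]): r["customer_id"] for r in bank_records}
    matches = {}
    for r in alt_records:
        key = pii_key(r["email"], r["phone"])
        if key in index:
            matches[r["device_id"]] = index[key]
    return matches


def probabilistic_score(bank_record, alt_record) -> float:
    """Illustrative 0-1 link score; the 0.7 and 0.3 weights are assumptions."""
    name_similarity = SequenceMatcher(
        None, bank_record["name"].lower(), alt_record["name"].lower()
    ).ratio()
    same_postcode = 1.0 if bank_record["postcode"] == alt_record["postcode"] else 0.0
    return 0.7 * name_similarity + 0.3 * same_postcode
```

In practice, pairs left unmatched by the deterministic join would be scored probabilistically and linked only above a threshold tuned on known matches.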
More than 45m adults in the US (19%) lack traditional credit scores that lenders could use to qualify them for loans, according to the Consumer Financial Protection Bureau [2]. Over half of these adults are “invisible” because they have no credit score at all. Customers with thin credit histories include young customers who are new to credit and customers who paid off their debts early in life. Forty-two percent of adults in the US have low credit scores and are considered “non-prime” customers. A segment of these customers would have been financially better off in the past, and alternative data sources could be used to assign them a better credit risk rating than their credit scores suggest. This segment is categorised as “lendable with alternative data” in the diagram below. Similarly, some customers from the “Good customers (credit score >680)” segment could exhibit default behaviour according to alternative data sources.
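The way alternative data could shift customers between underwriting segments can be sketched as a simple rule. Only the 680 cut-off comes from the segment name above; the alternative-data score (assumed to lie between 0 and 1, with higher meaning more creditworthy), its 0.6 threshold and the labels for the remaining segments are hypothetical.

```python
from typing import Optional

GOOD_SCORE_CUTOFF = 680  # from the "Good customers (credit score >680)" segment
ALT_SCORE_CUTOFF = 0.6   # hypothetical threshold on a 0-1 alternative-data score


def underwriting_segment(credit_score: Optional[int], alt_score: float) -> str:
    """Assign an illustrative segment from a bureau score and an alt-data score.

    credit_score is None for customers with no bureau score at all.
    """
    alt_ok = alt_score >= ALT_SCORE_CUTOFF
    if credit_score is None:
        # No bureau score: alternative data is the only available signal.
        return "no score, supported by alternative data" if alt_ok else "no score, insufficient signal"
    if credit_score > GOOD_SCORE_CUTOFF:
        # Good bureau score, but alternative data may still flag default risk.
        return "good customer" if alt_ok else "good score, flagged by alternative data"
    # Non-prime bureau score: alternative data may make the customer lendable.
    return "lendable with alternative data" if alt_ok else "non-prime"
```

For example, `underwriting_segment(620, 0.8)` places a non-prime applicant with strong alternative-data signals in the “lendable with alternative data” segment, while `underwriting_segment(720, 0.2)` flags an applicant whose bureau score alone would have passed.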
About the Author
Mrinal has more than 11 years of experience in delivering data-driven insights to financial services clients using supervised and unsupervised modelling methodologies. Prior to joining Acuity, Mrinal worked with Accenture’s Applied Intelligence Group for Financial Services and with Citibank’s Advanced Analytics team.