Mobile Data + Machine Learning = Better Credit Scoring for the Underbanked

“Credit invisible” consumers are pervasive in both developing and mature markets. Including them in the financial system can bring major advantages for FinTechs and banks.

March 18, 2020

8 mins read

What would you do if someone asked you to prove your identity and you’d left your ID at home? You’d probably reach for your mobile phone and fire up Facebook, LinkedIn, or your online banking app.

Indeed, our phones now contain a huge part of our lives. Lenders could take advantage of all that mobile data to assign credit scores to people with little to no credit history.

How big is the lending gap on a global scale?

Roughly 1.2 billion people lack access to traditional financial services. But as we highlighted in our New Age in Banking Whitepaper, unbanked does not equal insolvent.

  • In the EU, 33% of unbanked consumers are employed full-time and 9% are students.
  • In Canada, among the top 60% of households, 3% of adults don’t have a traditional bank account.
  • In Egypt, where the penetration of traditional financial services is low, the informal economy is estimated at $90 billion.

Serving the unbanked demographic alone could help banks capture about $380 billion in annual revenue.

But in most cases, unbanked and underbanked also means “credit invisible.” And here, the figures get even more staggering:

  • One in ten US adults has no credit record with any of the three major credit reporting companies. That’s over 28 million customers with thin credit files and 25 million without one at all.
  • In the UK, almost 5.8 million people are off the traditional financial grid.

Many consumer advocates and legislators claim the current credit score system is broken. In order to receive a loan, you already need to have some positive credit history. But you can’t build that history without getting approved for that first loan or credit card.

At Intellias, we think that traditional credit scoring isn’t broken per se. It just needs a tech revamp to become more accurate and inclusive. Alternative credit data is the key.

How alternative data makes lending cost-effective and risk-free

Most banks still rely on traditional credit records to assess the creditworthiness of a potential borrower. As a result, they have a rather simple view of their customers: anyone who’s off the grid simply doesn’t exist.

But in today’s world of total connectivity, over-reliance on a single data source brings more limitations than opportunities.

With traditional credit scores, it is clear that you have to throw out a lot of likely goods in order to limit the number of bads you let in. Among the marginal turn-downs, there are many more goods than bads — they just can’t be distinguished using current scores based on current data sources and methodologies.

Oliver Wyman, Alternative Data and the Unbanked

Remember that recent Dolly Parton challenge? Every one of us has different sides we choose to share with the world via different mediums (such as social media). And every one of us carries a unique digital footprint that showcases who we are, what we do, and how we choose to earn and spend our money.

By learning to mine and operationalize alternative data sources with machine learning, banks can extend more lending offers to customers without taking on disproportionate additional risk.

Chinese super apps are a prime example of how the right technological infrastructure makes lending to credit-thin customers secure and effective.

WeBank offers personal and business microloans to consumers through Weilidai and lets WeChat Pay users apply for such loans directly from the WeChat Pay app. Most recently, the bank also launched a new lending product on Tencent video.

By leveraging the ABCD tech stack — AI, blockchain, cloud, and data — the three leading Chinese digital banks (WeBank, MYBank, and XWbank) now process over 10 million loan applications annually while having just 1,000 to 2,000 employees each. What’s more, their non-performing loan (NPL) ratio is 1% on average.

Customer data has become the central means of obtaining such low NPLs. WeBank relies on a mix of users’ payment records, calls, and social media messages to come up with an accurate score. MYBank uses e-commerce data for approving customers. XWBank leverages a mix of alternative credit data from other tech platforms.

If you plan to follow their lead, below is a quick list of data sources to consider for your product.

Alternative credit score data sources

  • Data from energy services providers can help prove a person’s residence and their ability to pay utility bills on time.
  • Rental payments also beef up customer profiles. A study by the NYC Comptroller showed that pairing Experian data with rental payment data helped 28% of New York City residents gain a credit score for the first time. In fact, they racked up 700 points on average.
  • Asset ownership and employment history are strong proxies for creditworthiness.
  • POS and transaction data can provide more context about a user’s spending behavior and money management practices. Uulala, a FinTech player from Latin America, is experimenting with this approach.
  • Self-reported bank data. UltraFICO (in the late pilot phase) will let consumers submit personal checking and savings information to raise their credit scores.
  • Data from FinTechs. P2P lending websites, investment and wealth management apps, and even personal finance management tools contain a host of accurate customer information. With Open Banking and a growing API ecosystem, gaining access to such data is simply a matter of technical expertise.
  • Telecom and mobile data have so far emerged as the strongest contenders. The modern phone hosts plenty of data points about its owner, and the savviest financial companies are already using those insights to build new lending products.

How to use mobile data for credit scoring

Source: KT – Credit scoring solution based on telecom data

1. Collect basic customer information

To be effective, every credit model needs some initial data points. In the case of data on a mobile device, these are:

  • Customer’s full name
  • Date of birth
  • Address associated with the phone number (if any)
  • Historical data on phone payments/top-ups and other bill payments
  • General call patterns and frequent callers

Tala, a microlending company serving customers in Kenya, Mexico, the Philippines, and India, asks every user to provide basic data first. Doing so allows the company to conduct a preliminary KYC check for unbanked customers and weed out fraudsters.

Later, the Tala app uses additional data points to estimate an applicant’s capacity and likelihood to repay:

  • Android device data — geolocation data, frequently used apps and services
  • Behavioral data — VOC patterns, overseas roaming, call and top-up patterns, etc.

Ultimately, their scoring model relies on 250 data points that establish a correlation between certain behaviors and good/bad credit performance.
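
To make the idea of behavioral data points more concrete, here is a minimal sketch of how a top-up history could be turned into a few numeric features. The `TopUp` record shape and the feature names are hypothetical and chosen for illustration; they are not taken from Tala’s model.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean, pstdev


@dataclass(frozen=True)
class TopUp:
    """One prepaid top-up event (hypothetical record shape)."""
    day: date
    amount: float


def top_up_features(top_ups: list[TopUp]) -> dict[str, float]:
    """Summarize a top-up history into a few numeric features."""
    ordered = sorted(top_ups, key=lambda t: t.day)
    # Days between consecutive top-ups; empty when there are fewer than two.
    gaps = [(b.day - a.day).days for a, b in zip(ordered, ordered[1:])]
    return {
        "count": float(len(ordered)),
        "avg_amount": mean(t.amount for t in ordered) if ordered else 0.0,
        "avg_gap_days": float(mean(gaps)) if gaps else 0.0,
        # A lower spread means more regular top-ups.
        "gap_stdev_days": pstdev(gaps) if gaps else 0.0,
    }


history = [
    TopUp(date(2020, 1, 1), 5.0),
    TopUp(date(2020, 1, 8), 5.0),
    TopUp(date(2020, 1, 15), 5.0),
]
features = top_up_features(history)
```

Whether regular top-ups actually correlate with good repayment is something a lender would need to confirm on its own loan data.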

2. Capture mobile payment data

Where traditional bank penetration rates are weak, mobile payments reign supreme. Mobile payment records are your second most important data source.

With a user’s permission, collect the following information from mobile wallets:

  • Spending and transaction data
  • Information on P2P transfers, top-ups, and cash pick-ups
  • Information on loyalty points and cards

Mobile money services (e.g. M-Pesa) are another rich source of financial data. In developing countries, such players have effectively replaced traditional banks. This means people often choose mobile money companies to receive their salary, pay bills, and settle other financial affairs. You may want to look into ways of accessing data from these services.
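
As a rough illustration, wallet or mobile money transactions could be rolled up into monthly cash-flow figures before they reach a scoring model. The `WalletTx` record, the sign convention for `amount`, and the `consent_given` flag are assumptions made for this sketch, not the format of any particular provider.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WalletTx:
    """One mobile-wallet transaction (hypothetical record shape)."""
    month: str     # e.g. "2020-01"
    amount: float  # assumption: positive = money in, negative = money out
    kind: str      # e.g. "p2p", "top_up", "cash_pickup", "purchase"


def monthly_cash_flow(
    txs: list[WalletTx], consent_given: bool
) -> dict[str, dict[str, float]]:
    """Return inflow, outflow, and net amount per month."""
    if not consent_given:
        # Wallet data is only processed with the user's permission.
        raise PermissionError("user has not consented to wallet data use")
    summary = defaultdict(lambda: {"inflow": 0.0, "outflow": 0.0, "net": 0.0})
    for tx in txs:
        bucket = summary[tx.month]
        if tx.amount >= 0:
            bucket["inflow"] += tx.amount
        else:
            bucket["outflow"] += -tx.amount
        bucket["net"] += tx.amount
    return dict(summary)


txs = [
    WalletTx("2020-01", 100.0, "p2p"),
    WalletTx("2020-01", -40.0, "purchase"),
    WalletTx("2020-02", -10.0, "top_up"),
]
flow = monthly_cash_flow(txs, consent_given=True)
```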

3. Request additional data from users

To enhance your credit scoring algorithm, allow users to self-report additional data. You can create personalized questionnaires or forms to collect those missing pieces of the puzzle.

Consumers are increasingly on board with this. According to Experian, 58% of US consumers say the ability to contribute their payment history to their credit file makes them feel more empowered.

In mature markets, potential borrowers are most willing to share:

  • utility payment history
  • paycheck stubs/income information
  • checking/savings transactions

In developing countries, you should look into data on:

  • remittances (domestic and foreign)
  • freelance/self-employment income
  • any past evidence of lending/credit history with non-traditional service providers

4. Develop a proprietary scoring algorithm powered by machine learning

Machine learning (ML) is well suited to risk scoring models. A recent Bank for International Settlements report that analyzed loan transaction data from a leading Chinese FinTech company concluded that:

  • the tested ML-based credit scoring model performed better than traditional credit scoring models (that use both traditional and alternative data) in predicting borrowers’ losses and defaults
  • alternative data boosts the predictive power of credit scoring models
  • ML-based models can better predict losses and defaults following a negative shock to the aggregate credit supply
  • over time, ML-based models improve their performance as new data becomes available

The key challenge of building such predictive credit scoring models is the sheer volume of mobile data. To effectively operationalize all those insights, you’ll need to establish a strong data governance process and a supporting data management platform.

A data management platform is a cloud storage service for all aggregated and self-reported customer data that is fed to algorithms for customized scoring. To be efficient, your data management platform should:

  • provide real-time and periodic synchronization of incoming data from multiple sources
  • have a resilient and optimized architecture to support fast processing of high volumes of raw data
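
One small piece of such a platform is reconciling readings that arrive from several sources. The sketch below keeps the most recent value per customer and attribute; the `SourceRecord` shape and the "latest wins" rule are assumptions for illustration, and a production platform would likely need richer conflict handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    """One incoming data point (hypothetical record shape)."""
    customer_id: str
    source: str       # e.g. "telecom", "wallet", "self_reported"
    attribute: str    # e.g. "avg_top_up"
    value: float
    updated_at: int   # Unix timestamp of the reading


def merge_latest(records: list[SourceRecord]) -> dict[str, dict[str, float]]:
    """Build one profile per customer, keeping the newest value per attribute."""
    latest: dict[tuple[str, str], SourceRecord] = {}
    for rec in records:
        key = (rec.customer_id, rec.attribute)
        if key not in latest or rec.updated_at > latest[key].updated_at:
            latest[key] = rec
    profiles: dict[str, dict[str, float]] = {}
    for (customer_id, attribute), rec in latest.items():
        profiles.setdefault(customer_id, {})[attribute] = rec.value
    return profiles


incoming = [
    SourceRecord("c1", "telecom", "avg_top_up", 4.0, 100),
    SourceRecord("c1", "self_reported", "avg_top_up", 6.0, 200),
    SourceRecord("c1", "wallet", "monthly_inflow", 120.0, 150),
    SourceRecord("c2", "telecom", "avg_top_up", 2.5, 90),
]
profiles = merge_latest(incoming)
```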

The next step is building a credit scoring algorithm. Its main goal is to assign a custom weight to each data point and calculate the total score for every consumer.
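
In its simplest form, that idea is a weighted sum. The sketch below assumes every feature has already been normalized to the 0 to 1 range; the feature names and weights are made up for illustration, whereas in practice the weights would be learned from repayment data.

```python
def weighted_score(
    features: dict[str, float], weights: dict[str, float], base: float = 0.0
) -> float:
    """Return base plus the sum of weight * feature value."""
    missing = set(weights) - set(features)
    if missing:
        # Refuse to score silently when an expected data point is absent.
        raise KeyError(f"missing features: {sorted(missing)}")
    return base + sum(weights[name] * features[name] for name in weights)


# Hypothetical weights and normalized feature values.
weights = {"top_up_regularity": 2.0, "bill_payment_rate": 4.0}
applicant = {"top_up_regularity": 1.0, "bill_payment_rate": 0.5}
score = weighted_score(applicant, weights, base=1.0)
```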

Commonly used methods and techniques for this task include:

  • classification
  • binary logistic regression
  • univariate analysis
  • tree-based algorithms (e.g. Random Forest)
  • support vector machines (SVM)
  • scorecard creation
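
To illustrate two items from this list, here is a minimal, dependency-free sketch of binary logistic regression trained by batch gradient descent, followed by a common scorecard scaling that maps good-to-bad odds to points. The learning rate, epoch count, and the scaling constants (600 points at 50:1 odds, 20 points to double the odds) are illustrative assumptions; a real project would more likely use an established ML library and calibrate these values on its own portfolio.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on log loss.

    labels: 1 = repaid ("good"), 0 = defaulted ("bad").
    """
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def prob_good(x, w, b) -> float:
    """Predicted probability that the applicant repays."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def to_points(p_good, base_score=600.0, base_odds=50.0, pdo=20.0) -> float:
    """Map good:bad odds to points: base_score at base_odds, +pdo per doubling."""
    p = min(max(p_good, 1e-9), 1.0 - 1e-9)  # avoid division by zero
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * math.log(p / (1.0 - p))


# Toy data: one normalized feature, higher values repaid more often.
rows = [[0.1], [0.2], [0.8], [0.9]]
labels = [0, 0, 1, 1]
w, b = fit_logistic(rows, labels)
p_low, p_high = prob_good([0.1], w, b), prob_good([0.9], w, b)
```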

Lastly, don’t forget to frequently validate your model by comparing actual loan repayment data with the initial scores assigned when customers applied.
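
One way to run such a check is to measure how well the original scores separate repaid loans from defaulted ones and to look at observed default rates per score band. The sketch below computes AUC by pairwise comparison; the band width and the sample data are illustrative assumptions.

```python
def auc(scores, repaid) -> float:
    """Chance that a random repaid loan outscored a random defaulted one.

    Ties count as half a win. 0.5 means no separation, 1.0 means perfect.
    """
    goods = [s for s, r in zip(scores, repaid) if r]
    bads = [s for s, r in zip(scores, repaid) if not r]
    if not goods or not bads:
        raise ValueError("need at least one repaid and one defaulted loan")
    wins = sum(
        1.0 if g > b else 0.5 if g == b else 0.0 for g in goods for b in bads
    )
    return wins / (len(goods) * len(bads))


def default_rate_by_band(scores, repaid, band_width=50) -> dict[int, float]:
    """Observed default rate per score band, keyed by the band's lower edge."""
    totals: dict[int, int] = {}
    defaults: dict[int, int] = {}
    for s, r in zip(scores, repaid):
        band = int(s // band_width) * band_width
        totals[band] = totals.get(band, 0) + 1
        defaults[band] = defaults.get(band, 0) + (0 if r else 1)
    return {band: defaults[band] / totals[band] for band in sorted(totals)}


# Scores assigned at application time and what actually happened.
scores = [520, 540, 610, 630, 660, 690]
repaid = [False, True, False, True, True, True]
separation = auc(scores, repaid)
rates = default_rate_by_band(scores, repaid)
```

If default rates stop falling as the score band rises, or the AUC drifts toward 0.5 over time, that is a signal to retrain or recalibrate the model.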

Extending value beyond lending

Lending can become the first step toward long-term relationships with previously ignored and underserved consumers. You can (and should!) provide additional value by:

  • teaching consumers to make better financial decisions with a personal finance management tool. Help new users climb out of debt and get better with their day-to-day spending.
  • pitching non-credit products that can help consumers improve their financial standing and transition from taking money from the bank to actively depositing it. Consider different savings accounts, insurance, micro-investing schemes, and other automated wealth management offers that appeal to lower-income account holders.

Invest in improving your users’ financial literacy and creditworthiness to benefit from higher loyalty and profitability.


Let’s fix the lending process together! Get in touch with the Intellias team to discuss the technical nuts and bolts of building machine learning–powered credit scoring algorithms.
