Should You Use ML To Predict Small Business Default Risk?

November 23, 2020 Explorium Data Science Team AI Education

Small businesses looking for loans today have more choices than ever. Online providers are lowering barriers to entry, making it easier and faster to apply for credit. The trouble is, this also means more credit reaches risky borrowers, and that risk is, of course, passed on to lenders.

But what’s a lender to do? Raising the bar across the board means shutting out potentially valuable customers who will take their business straight to your competitors. Failing to improve standards increases the chance that a significant proportion of these new customers will default on their loan repayments. 

Meanwhile, old-school indicators like lending histories and company financials only offer part of the picture. They don’t take into account domain knowledge. They don’t help you figure out where the gaps in your understanding are, or which datasets would fill those gaps and improve how you predict risk.

The answer isn’t over-caution; it’s to be smarter about how you calculate and predict risk in the first place. Risk modeling today is a data science problem. Basing your risk models on artificial intelligence — specifically, machine learning (ML) — offers a way to address these complexities, creating safer application and approval processes. What’s more, it does it in a way that doesn’t undermine the user experience for applicants.  

Why SMB lenders need to embrace machine learning

ML elevates your risk models in three key ways:

  • It reduces the costs associated with false positives and negatives

Making the right decisions means accessing the right data. With an ML risk model, you can look beyond your limited, internal, historical data and take into account any relevant, up-to-date external data sources that aid your decision-making and improve how you predict risk.  

This may sound very complex and sophisticated, but the beauty of ML is that it a) automates processes wherever possible and b) continually learns and improves its predictions as you feed it more data. In other words, you can scale up and incorporate as many new data streams as you like without adding significantly to your workload. What’s more, bigger datasets don’t overload the system: the more quality data you feed in, the better your model performs. A machine learning risk model, by its very nature, becomes more effective the more you use it and the more up-to-date, accurate data it receives.

Quality is the key element here, rather than quantity: you need to consistently feed your model with good data, rather than just shoveling in more and more. The latter can cause all kinds of problems, including overfitting and data leakage, which in turn produce false positives, bias, and misleading models. This is why the right data is vital. You might have troves of your own data, but if it all says the same thing, your models won’t evolve and get better at their tasks. Having better data lets you create complex models that weigh factors against each other and tease out connections and relationships. Rather than applying an overly rigid, rules-based system that creates too many false positives and negatives, you can build dynamic models that focus on the nuance and subtle patterns.

All of which boils down to this: using ML gives you fewer false positives and false negatives. Fewer loans granted to applicants who aren’t a good fit for credit. And more loans for those who may look risky at first glance but are actually financially sound. And in turn, a reduction in the costs associated with making mistakes.
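To make the cost argument concrete, here is a minimal Python sketch of how you might put a price on the two kinds of mistakes. Everything in it is an assumption for illustration: the six outcomes, both sets of predictions, and the per-error costs are made up, and "positive" is taken to mean "predicted to default". It shows how to compare two approaches on cost, not that one will always win.

```python
from dataclasses import dataclass


@dataclass
class ErrorCounts:
    false_positives: int  # flagged as likely to default, but actually repaid
    false_negatives: int  # approved as safe, but actually defaulted


def count_errors(predicted_default, actual_default):
    """Count both error types; 'positive' here means 'predicted to default'."""
    pairs = list(zip(predicted_default, actual_default))
    fp = sum(1 for predicted, actual in pairs if predicted and not actual)
    fn = sum(1 for predicted, actual in pairs if not predicted and actual)
    return ErrorCounts(false_positives=fp, false_negatives=fn)


def error_cost(counts, cost_per_false_positive, cost_per_false_negative):
    """Total cost of mistakes: lost good customers plus losses on bad loans."""
    return (counts.false_positives * cost_per_false_positive
            + counts.false_negatives * cost_per_false_negative)


# Hypothetical outcomes for six past applicants (True = defaulted).
actual_default = [True, False, False, True, False, False]
# Hypothetical predictions from a rules-based system and from an ML model.
rules_predicted = [False, True, True, True, False, False]
model_predicted = [True, False, True, True, False, False]

# Assumed costs: a wrongly declined customer vs. a loan that defaults.
COST_FALSE_POSITIVE = 1_000
COST_FALSE_NEGATIVE = 10_000

rules_cost = error_cost(count_errors(rules_predicted, actual_default),
                        COST_FALSE_POSITIVE, COST_FALSE_NEGATIVE)
model_cost = error_cost(count_errors(model_predicted, actual_default),
                        COST_FALSE_POSITIVE, COST_FALSE_NEGATIVE)
print(rules_cost, model_cost)
```

In practice you would replace the made-up lists with predictions on a held-out set of past loans, and set the two costs from your own margins and loss history.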

  • It adds nuance to your default risk models 

Using a rules-based system may help you learn from past mistakes and successes. And it may eventually, painstakingly lead you to the right decision. But this is far from an elegant, effective solution. Often the process is convoluted and completely lacks nuance or context. The system is also designed to spot red flags rather than subtle weaknesses; to reward solid credit histories rather than to predict future success.  

A small business that has only been operating for a few months may, for example, seem like a far riskier prospect than a larger one that has been in continuous operation for years, even if the new business produces a revolutionary product that seems set to make the old one obsolete. A rules-based system ignores many of the details that could point to a borrower’s true reliability and ability to repay.

A risk model driven by machine learning, on the other hand, means you can trawl external datasets for valuable insights and choose better signals to feed your model. This allows you to contextualize what you already know about the customer, gaining a complete and nuanced understanding of their history, behavior, and the risks associated with taking them on. 

This means that, when deciding whether to lend to a small business, you wouldn’t need to limit your risk modeling to the company’s financials and growth projections, its directors’ beneficial ownership details, and similar records. With a machine learning model connected to the right external datasets, you could, for example:

  • Take into account broader industry trends that indicate how a business like theirs will fare in the future. 
  • Explore sales patterns and demographic information in their specific area.
  • Use figures on business rates and average rental costs in their zip code to assess how well equipped the company will be to survive, should business slow down for a month or two. 
  • Enrich the data you have on their investors or funders to see the success rates of other startups in their portfolio. 

All of which would help you predict with far greater precision the chances of this small business defaulting on its loan repayments.
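As a rough illustration of what this kind of enrichment can look like, the Python sketch below joins an application to external signals by zip code and derives one extra feature from them. The field names (`avg_monthly_rent`, `industry_growth_pct`, `rent_to_revenue`), the lookup table, and all the values are hypothetical; a real pipeline would pull these from whichever external datasets you license.

```python
def enrich_application(application, signals_by_zip):
    """Return a copy of the application with external zip-code signals attached."""
    # Missing zip codes simply contribute no extra fields.
    external = signals_by_zip.get(application["zip_code"], {})
    enriched = {**application, **external}

    rent = external.get("avg_monthly_rent")
    revenue = application["monthly_revenue"]
    # Share of revenue that typical local rent would consume; None when unknown.
    if rent is not None and revenue > 0:
        enriched["rent_to_revenue"] = rent / revenue
    else:
        enriched["rent_to_revenue"] = None
    return enriched


# Hypothetical internal application records.
applications = [
    {"business_id": "A1", "zip_code": "10001", "monthly_revenue": 20_000},
    {"business_id": "B2", "zip_code": "99999", "monthly_revenue": 15_000},
]

# Hypothetical external signals keyed by zip code.
signals_by_zip = {
    "10001": {"avg_monthly_rent": 8_000, "industry_growth_pct": 3.5},
}

enriched = [enrich_application(app, signals_by_zip) for app in applications]
for row in enriched:
    print(row)
```

The enriched rows, rather than the raw applications, are what you would pass to the risk model as features. Records with no matching external data keep a `None` value, so the gap stays visible instead of being silently filled.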

  • It helps you create a fast, reliable approvals system 

As we’ve seen, the more quality data you feed into your machine learning model, the more accurate your risk predictions become, and the more reliable your decision-making process will be.

But there’s more. By adopting this approach, you not only get better results; you also get them much faster. You can make lending decisions quickly and with confidence. Best of all, you can pass on these benefits to your customers, offering them a frictionless application and onboarding process. You reduce your risk while minimizing their frustrations. It’s a win-win situation. 

Final thoughts: predicting risk in a chaotic world

With a global business landscape this unpredictable, the case for machine learning in your risk models has never been clearer. There is simply no way that either your internal historical data or an outdated rules-based system is equipped to make sense of risk in 2020 or beyond.

Rules are rules. They aren’t dynamic. They don’t evolve and adapt to the reality on the ground. To get the insights required to make accurate, reliable lending decisions, you don’t need rules-based systems. You need tools like augmented data discovery. These give you access to up-to-the-minute, relevant data on everything related to this small business, its product, the performance of the industry, and the mood and behaviors of its potential customers. You need models that combine signals and data from across the spectrum to tease out patterns and risk predictors you’d never have considered otherwise.

You need models that cut costs by reducing errors. That deliver predictions you can trust, at speed. That form the foundation of better, more streamlined customer experiences, allowing you to onboard new borrowers safely and efficiently. That reduce risk without eliminating opportunities. You need machine learning.
