When it comes to mitigating financial risks, organizations need information from outside their four walls to make accurate predictions. Internal data provides insights based on historical information from within the organization, but it does not account for important external factors. Risk officers and analysts across financial services companies and fintechs realize that risk modeling requires better data. The accuracy of sophisticated predictive analytics and machine learning algorithms depends on relevant external data.
Today, risk modeling faces many new challenges, especially for B2B financial institutions, fintechs, and online lenders. With the right data, organizations can train more accurate predictive models and reduce financial and credit risk. This blog article covers new trends in risk analysis and risk modeling, and explains how to increase model accuracy.
Trends in Financial Risk Management
In its Global Economic Crime and Fraud Survey, PwC highlighted that fraud rates remain at record highs, with more than half of all businesses reporting an accumulated total cost impact of $42 billion over the last two years.

Today, the prevailing types of risk for fintechs and B2B lenders are:
Default Risk: Fintechs and online lenders tend to rely on traditional credit scores and bank statements to determine the creditworthiness of potential borrowers. However, these are not the best indicators of whether a borrower will make timely repayments. Inaccurate risk assessments lead to a higher rate of loan defaults.
Fraud Risk: For online lenders in particular, the risk of fraudulent applications tends to be higher because the application process is online and remote, and thorough background checks are often not conducted. Basic metrics such as business and owner name, address, IP address, and business creation date are still used to make lending decisions, yet none of them adequately signals fraud risk on its own.
Incorporating relevant external data improves both the accuracy of loan default predictions and the detection of fraudulent activity on loan applications.
How to Measure Risk
Risk measurement is driven by historical relationships in data and reasonable assumptions. The key to making accurate predictions is having suitable datasets and understanding how they relate.
As internal data doesn’t provide the full picture, organizations are seeking out external data to increase predictive accuracy. Before purchasing any external data, there are some important questions to consider:
- What external sources, if any, are you already leveraging?
- Does your data distribution make sense?
- Is the data representative of a broader population?
- How do potential default rates look?
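The distribution and default-rate questions above can be answered with very little code before any purchase decision. The following is a minimal sketch, not a prescribed method: the record layout (an `industry` label and a `defaulted` flag), the function names, and the sample loans are all hypothetical and exist only for illustration.

```python
from collections import Counter


def default_rate(records):
    """Share of records flagged as defaulted (0.0 for an empty list)."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["defaulted"]) / len(records)


def segment_shares(records, key):
    """Fraction of records that fall into each value of `key`."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}


# Hypothetical sample: four loans with an industry label and an outcome flag.
sample = [
    {"industry": "retail", "defaulted": False},
    {"industry": "retail", "defaulted": True},
    {"industry": "construction", "defaulted": False},
    {"industry": "retail", "defaulted": False},
]

overall = default_rate(sample)              # one default out of four loans
shares = segment_shares(sample, "industry")  # how the sample splits by industry
```

Comparing `shares` for your own portfolio against the same breakdown for a broader reference population is one simple way to see whether the data is representative, or skewed toward a few segments.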
In data science, a single piece of data might be relevant but have no impact until it is combined with other data points. Relating it to other data reveals patterns that can drive better decision-making.
The goal of risk modeling is to prepare for any event or oddity that may affect how a business operates. Analysis based solely on historical data doesn’t prepare organizations for what could happen in the future. COVID-19 was a perfect example: the pandemic caused forecasting models trained on historical data to break down, leading companies to look outside their internal databases for COVID-19 related information with which to rebuild, retrain, and recalculate their risk models.
The processes of gathering, refining, and integrating external data (which typically take months) also had to be sped up so that risk models could be retrained and shifted towards real-time analytics and decision making.
Challenges of Mitigating Risk
Finding the right data: Many data points can help drive decision-making processes. There are thousands of data signals available, but not all of them are relevant to your business. You don’t want to waste time and money finding and purchasing irrelevant data. On the other hand, there is ample data you might not realize exists that could boost the accuracy of your predictive models.
When making lending decisions for small businesses, for example, data points such as online brand sentiment (ratings and reviews), media presence, and web traffic or foot traffic trends all add up to give you a better overall picture.
Data cleaning, preparation, and matching: Purchasing an external dataset does not mean that it is ready for use in existing data pipelines or predictive models. Chances are that it will need to be reformatted to match your existing data. This can be a time-consuming process.
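To make the matching step concrete, here is a minimal sketch of joining an external dataset onto internal records by a normalized business name. It assumes both sides carry a `name` field; the suffix list, the field names, and the sample rows are hypothetical, and real entity matching usually needs more than exact matching on cleaned names.

```python
import re

# Assumed list of legal suffixes to ignore when comparing names.
_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_name(name):
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in _SUFFIXES)


def enrich(internal, external):
    """Attach external fields to internal records that share a normalized name."""
    lookup = {normalize_name(row["name"]): row for row in external}
    enriched = []
    for record in internal:
        match = lookup.get(normalize_name(record["name"]), {})
        extra = {k: v for k, v in match.items() if k != "name"}
        # `matched` records whether any external row was found for this record.
        enriched.append({**record, **extra, "matched": bool(match)})
    return enriched


# Hypothetical rows: the same business spelled differently in each source.
internal = [
    {"name": "Acme Supplies, Inc.", "loan_id": 1},
    {"name": "Northside Bakery LLC", "loan_id": 2},
]
external = [{"name": "ACME SUPPLIES INC", "web_traffic_rank": 120000}]

result = enrich(internal, external)
```

Keeping an explicit `matched` flag makes the match rate easy to measure, which is often the first thing to check when judging whether an external dataset covers your customer base.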
Updating risk models frequently or in real-time: Risk models need to be agile. Accurate predictive modeling and comprehensive competitive analysis can be the difference between a company that differentiates itself and one that falls behind the leaders in the marketplace.
Small business fraud is on the rise. According to LexisNexis, SMB lenders have experienced a 7.3% increase in small business fraud over the last two years, equivalent to a total revenue loss of up to 6%. Given this trend, lenders need to stay ahead of the curve with agile and accurate predictive models.
Getting The Right Data
Even when companies know they need external data, how can they acquire and implement the right data?
Manually search for external datasets: There are many different data marketplaces where you can purchase external data. While this might be a good option on a small scale, many enterprise problems require more than one external dataset, making the data marketplace option time-consuming and expensive.
Get the right external data automatically: There are many different data categories available from multiple sources. The goal is to access multiple different external datasets, such as company profiles, employment information, web analytics, and market value in an automated fashion, combine them, and do something useful with them.
Derive the right features: Deriving the right features is just as crucial as having the correct data. For example, the first thing you may want to check is whether a company appears on a fraudulent IP list. Are there other red flags? When did they register their business? What sort of online presence do they have? What kind of previous claims have they made? All of these indicators help inform lending decisions.
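The questions above map naturally onto model features. The sketch below is illustrative only: the application fields (`ip_address`, `registered_on`, `website`, `claims`), the feature names, and the flagged-IP set are hypothetical, and the IP address comes from a range reserved for documentation.

```python
from datetime import date


def derive_features(application, flagged_ips, as_of):
    """Turn raw application fields into simple risk features."""
    registered = date.fromisoformat(application["registered_on"])
    return {
        # Is the applicant's IP on the (assumed) list of flagged addresses?
        "ip_flagged": application["ip_address"] in flagged_ips,
        # How long has the business existed at decision time?
        "business_age_days": (as_of - registered).days,
        # Does the business report any website at all?
        "has_website": bool(application.get("website")),
        # How many previous claims are on record?
        "prior_claims": len(application.get("claims", [])),
    }


# Hypothetical application from a recently registered business.
application = {
    "ip_address": "203.0.113.7",
    "registered_on": "2021-01-15",
    "website": "",
    "claims": [],
}

features = derive_features(
    application,
    flagged_ips={"203.0.113.7"},
    as_of=date(2021, 3, 1),
)
```

Passing `as_of` explicitly, rather than reading today's date inside the function, keeps the features reproducible when a model is retrained on past applications.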
Train and deploy a top-performing model: Building and deploying a model is not a one-time process. As more data is gathered from internal and external sources, you continuously tweak and tune your models, giving you the best-performing risk model for your organization.
The Impact of Using Explorium
BlueVine, a small business lender, uses Explorium’s external data platform to discover and enrich data used in its automated loan approval process, leading to rapid and accurate decisions. For BlueVine, the injection of external, enriched data enhances multiple areas of their business including operational efficiency, risk and fraud detection, and regulatory compliance.
They were looking to enhance and automate their loan approval process in order to grow their business and increase operational efficiency. To achieve this, they needed the most current and relevant data to drive accurate and automated decision-making processes. They use Explorium to continually enrich, expand and enhance their loan approval process. With Explorium, they are able to discover new signals and use them to regularly recalibrate their model.
Three types of signals have been shown to contribute significantly to the accuracy of automated loan approve/reject decisions: web presence, NAICS industry classification code, and geocoding. For example, information about an SMB’s website, including its ranking and traffic level, provides risk and fraud indicators for the BlueVine decision model. Knowing the industry in which the SMB operates and where it is located, together with geographically relevant data, also contributes to BlueVine’s ability to provide quick and accurate decisions, since these signals are associated with good credit risk in the company’s decision model.
With Explorium data, BlueVine maintains a continually optimized automated loan approval process, saving time and cost and facilitating business growth. More accurate loan approval decisions based on new data are also leading to lower risk and reduced fraud rates.
Unlock the power of external data
Automating the process of finding and incorporating external data makes risk management more accurate, scalable, and repeatable. Using an external data platform helps with every step of this process, from data discovery to acquisition, preparation, integration, model training, compliance, deployment, and retraining.
Organizations today understand the importance of data assets. However, leveraging the right data goes beyond simply purchasing a new dataset. Having one platform to assist every step of the data journey will help leverage the benefits of external data more efficiently and effectively.
The adoption of external data for in-depth analysis is growing rapidly. Chief data officers are leading the way, bringing external data on board as part of their companies’ overall data strategy.
To learn more, watch Explorium Head of Data Science and Data Evangelist Victor Ghadban’s most recent talk at the Big Bank Conference.