How Fintechs Can Use Alternative Data for Improved Predictive Modeling
The emergence of fintechs in recent years can largely be attributed to their increased flexibility, agility, and speed when compared to their traditional banking counterparts. However, fintechs must balance contrasting goals. They must continue to provide customer-centric solutions while safeguarding their organization from risk. They have to attract a constant stream of high-quality, low-risk leads, which is difficult to do while working with limited data.
Fintechs are beginning to realize the power of alternative data sources. The right types of alternative data can significantly improve predictive models for risk assessment, fraud detection, and lead scoring. Alternative data provides a more comprehensive picture of who prospects and potential borrowers are, which in turn helps fintechs make better decisions about whom to target, extend credit to, and retain over the long run.
What types of data should fintechs acquire?
With 2.5 quintillion bytes of data created every day, the issue isn’t a lack of data—it’s finding the right data. Alternative data in fintech might include geospatial data, person data, company data, or time-based data.
It is also important to consider how to combine these data sources to extract derivative signals for a complete picture of prospects.
For example, some unique attributes that fintechs can leverage using alternative data include:
- An economic stability score for small and medium-sized businesses (SMBs) based on assets, liabilities, credit ratings, and revenue trends
- SMB company data, encompassing years in operation, company type, and online search queries
- Economic risk relative to broader geographic and region data
- Location-specific income and financial stability indicators
- Payment history of previous loans
- Social media activity that helps legitimize a prospective borrower, such as presence, following, and engagement on relevant channels like LinkedIn
- Cost of maintenance vs. predicted revenue
- Alternative risk scores based on a variety of economic and financial health data that indicate fraud risk
- Internet and social media behavior that points to past fraudulent activity
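As an illustration of how several sources might be combined into one derived signal, the sketch below computes a simple economic stability score from assets, liabilities, a credit rating, and revenue history. The record layout, the 0-100 credit scale, the growth range, and the weights are all assumptions made for this example; a production score would be calibrated against observed outcomes.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SmbFinancials:
    """Hypothetical record combining several external data sources for one SMB."""
    assets: float
    liabilities: float
    credit_rating: int              # assumed to be on a 0-100 scale
    quarterly_revenue: list[float]  # oldest quarter first


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Restrict a value to the range [low, high]."""
    return max(low, min(high, value))


def revenue_trend(revenue: list[float]) -> float:
    """Mean quarter-over-quarter growth rate; 0.0 when there is too little history."""
    changes = [(curr - prev) / prev
               for prev, curr in zip(revenue, revenue[1:]) if prev > 0]
    return mean(changes) if changes else 0.0


def stability_score(f: SmbFinancials) -> float:
    """Blend solvency, credit rating, and revenue trend into a 0-100 score."""
    solvency = clamp(1 - f.liabilities / f.assets) if f.assets > 0 else 0.0
    credit = clamp(f.credit_rating / 100)
    # Map growth between -20% and +20% per quarter onto 0..1 (illustrative range).
    trend = clamp((revenue_trend(f.quarterly_revenue) + 0.2) / 0.4)
    # Illustrative weights, not calibrated on real data.
    return round(100 * (0.4 * solvency + 0.4 * credit + 0.2 * trend), 1)


bakery = SmbFinancials(assets=500_000, liabilities=200_000, credit_rating=70,
                       quarterly_revenue=[100_000, 110_000, 121_000])
print(stability_score(bakery))
```

Keeping each component on a 0..1 scale before weighting makes it easier to see which source drives the final score and to swap in additional signals later.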
Alternative data in fintech
Fintechs can leverage alternative data in several ways. However, we’ve typically seen it used for three primary use cases: risk modeling for B2B lending, fraud detection, and lead scoring.
Risk modeling for B2B lending
Fintechs are quickly becoming the go-to loan providers for SMBs, thanks to quicker approvals, less stringent background checks, and no upfront collateral requirements.
These lenient terms can also attract risky applicants, which makes accurate risk modeling essential.
Relying on limited data, such as banking or accounting statements, does not accurately predict a borrower’s creditworthiness or the likelihood that they will repay their loans on time. As a result, small business lenders are struggling to cope with increased defaults on the loans they have extended.
Alternative data can help enrich a fintech provider’s data, providing a more accurate analysis of the risk levels of loan applicants. Fintechs can use the following types of data to increase loan default risk model accuracy:
- Data on financial activity such as income, borrowing, payment history, assets, and liabilities.
- Company data, including years in operation, company type, and search trends.
- Web presence data like the domain creation date, domain expiry date, number of related links, and website global traffic rank.
- Company credit history.
- Internet and social media data such as online reviews and online ratings.
As a result, fintechs can more easily identify and exclude high-risk businesses, expand the data indicators of risk among SMB borrowers, identify businesses with low risk for immediate automatic loan pre-approval, and create alternative credit scoring models.
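A minimal sketch of this enrichment step follows: internal applications are joined with alternative data by business ID, and each enriched record is sorted into a pre-approval, manual review, or decline tier. The field names, sample values, point values, and thresholds are hypothetical and exist only to show the flow; they are not a recommended credit policy.

```python
from datetime import date

# Internal application data (hypothetical fields).
applications = [
    {"business_id": "b1", "requested_amount": 20_000},
    {"business_id": "b2", "requested_amount": 50_000},
    {"business_id": "b3", "requested_amount": 15_000},
]

# Alternative data keyed by business ID (hypothetical values from an external provider).
external = {
    "b1": {"years_in_operation": 8, "domain_created": date(2015, 3, 1),
           "late_payments": 0, "avg_review_rating": 4.6},
    "b2": {"years_in_operation": 1, "domain_created": date(2024, 11, 20),
           "late_payments": 3, "avg_review_rating": 2.4},
}


def enrich(application: dict, external_by_id: dict) -> dict:
    """Merge external attributes into an application and flag missing coverage."""
    extra = external_by_id.get(application["business_id"], {})
    return {**application, **extra, "has_external_data": bool(extra)}


def risk_tier(record: dict, today: date) -> str:
    """Assign a tier from simple risk points; thresholds are illustrative."""
    if not record["has_external_data"]:
        return "manual_review"  # not enough evidence either way
    points = 0
    if record["years_in_operation"] < 2:
        points += 2
    if (today - record["domain_created"]).days < 365:
        points += 2
    if record["late_payments"] > 1:
        points += 3
    if record["avg_review_rating"] < 3.0:
        points += 1
    if points == 0:
        return "pre_approve"
    return "decline" if points >= 5 else "manual_review"


today = date(2025, 1, 1)
tiers = {a["business_id"]: risk_tier(enrich(a, external), today)
         for a in applications}
print(tiers)
```

Routing applicants with no external coverage to manual review, rather than treating missing data as a good or bad sign, keeps the automatic tiers limited to cases where the enriched data actually supports a decision.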
Fraud detection
According to PwC’s Global Economic Crime and Fraud Survey 2020, nearly half of all businesses have experienced fraud in the past two years, with online lenders experiencing roughly twice as much fraud as banks.
Online lenders are targets for fraud by borrowers who pose as legitimate businesses. Unlike banks, which generally lend to known customers, online lenders process applications remotely and make decisions quickly. Some do not run thorough enough background checks and rely on basic information such as business and owner name, address, business IP address, and business creation date to make lending decisions. Fintechs need more relevant alternative data to better assess whether a business is real. By enriching their data and expanding the number of relevant data points used at the pre-qualification stage, lenders can build more accurate fraud scoring models that identify fraud dynamically.
A few examples of the types of alternative data that can help to improve loan application fraud models are:
- Web presence data, providing information on an applicant’s website such as its global traffic ranking and number of related links.
- Social media presence data, indicating whether the applicant has a LinkedIn account, how active they are on it, and how many followers they have.
- Owner-related data, such as a phone validation score.
- Web data indicating whether the applicant’s website has e-commerce functionality or is linked to services such as PayPal.
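To show how signals like these could feed a fraud score at the pre-qualification stage, here is a small logistic-style scoring function. The signal names, weights, and bias are hypothetical and hand-set for illustration; in a real model they would be fit on labeled application outcomes rather than chosen by hand.

```python
import math

# Hypothetical weights: negative values lower the fraud score because the
# signal suggests a legitimate business. Real weights would be learned from data.
BIAS = 1.0
WEIGHTS = {
    "has_linkedin": -1.0,            # 1 if a LinkedIn account was found
    "phone_validation_score": -2.0,  # assumed to range from 0 to 1
    "has_ecommerce": -0.8,           # 1 if the site has e-commerce or PayPal links
    "has_ranked_website": -1.2,      # 1 if the site appears in a global traffic ranking
}


def fraud_score(signals: dict) -> float:
    """Return a score in (0, 1); missing signals are treated as absent (0)."""
    z = BIAS + sum(weight * float(signals.get(name, 0))
                   for name, weight in WEIGHTS.items())
    return 1 / (1 + math.exp(-z))


established = {"has_linkedin": 1, "phone_validation_score": 1.0,
               "has_ecommerce": 1, "has_ranked_website": 1}
thin_profile = {"phone_validation_score": 0.2}

for label, signals in [("established", established), ("thin profile", thin_profile)]:
    print(f"{label}: {fraud_score(signals):.3f}")
```

An applicant with no verifiable footprint scores higher than one with a validated phone number and an active web presence, which is the behavior the list above describes.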
Alternative data powers more accurate fraud detection models. In turn, we see online lenders boost their fraud detection rate by up to 92% on applications before they even progress beyond the initial form submission. This dramatically cuts both lending costs and charge-offs.
Lead scoring
All companies, regardless of sector, want to target high-quality leads. However, internal data (e.g., form fills, website engagement, and pages visited) reveals little about prospects. For financial institutions and lenders to SMBs, poor lead qualification can have a severe negative impact. Fintechs need to target leads that are lower risk and more likely to pay back their loans on time. Otherwise, money could be wasted on recovering defaulted loans.
Enhancing existing datasets with real-time, relevant information enables the creation of new lead scoring models that more accurately qualify leads and identify those with low-risk profiles. We see our customers create lead scoring models with new signal categories by incorporating third-party financial data such as credit ratings, bankruptcy history, revenue trends, loan repayment history, and assets owned. The result is better targeting, improved conversions, streamlined operations, less time wasted pursuing bad leads, and less money spent on recovering defaulted loans that were extended to incorrectly qualified leads.
A few examples of the types of alternative data that can help improve lead scoring models are:
- Corporate data that includes reliability indicators such as assets owned, liabilities, whether a prospective B2B borrower pays loans and bills on time, and any history of bankruptcy.
- Business ratings and reviews on social media and other sites like Yelp.
- Website data such as traffic and global traffic rank.
- Social media presence and engagement data.
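One simple way to turn data like this into a ranked lead list is to normalize each feature across the current batch of leads and combine the results with weights. The sketch below assumes hypothetical lead records, feature names, and weights; negative weights mark features where a lower value is better, such as traffic rank or prior bankruptcies.

```python
def min_max(values: list[float]) -> list[float]:
    """Scale values to 0..1 across the batch; constant columns map to 0.5."""
    low, high = min(values), max(values)
    if high == low:
        return [0.5] * len(values)
    return [(v - low) / (high - low) for v in values]


def rank_leads(leads: list[dict], weights: dict) -> list[tuple[str, float]]:
    """Return (lead name, score) pairs sorted from best to worst."""
    scores = {lead["name"]: 0.0 for lead in leads}
    for feature, weight in weights.items():
        normalized = min_max([lead[feature] for lead in leads])
        for lead, value in zip(leads, normalized):
            scores[lead["name"]] += weight * value
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical enriched leads.
leads = [
    {"name": "Acme Plumbing", "on_time_payment_rate": 0.98, "review_rating": 4.5,
     "traffic_rank": 120_000, "bankruptcies": 0},
    {"name": "Bolt Couriers", "on_time_payment_rate": 0.80, "review_rating": 3.9,
     "traffic_rank": 450_000, "bankruptcies": 1},
    {"name": "Cedar Cafe", "on_time_payment_rate": 0.92, "review_rating": 4.8,
     "traffic_rank": 900_000, "bankruptcies": 0},
]

# Illustrative weights; a lower traffic rank number means a more visited site.
weights = {"on_time_payment_rate": 0.4, "review_rating": 0.2,
           "traffic_rank": -0.2, "bankruptcies": -0.2}

ranking = rank_leads(leads, weights)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Because the scores are relative to the batch, they are useful for prioritizing outreach within a campaign but should not be compared across batches without a fixed normalization range.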
It’s time to unlock data’s true value
Fintechs continue to struggle to build accurate predictive models when training on internal data alone.
To get full value from data analytics, they must enrich internal data with alternative data sources. Doing so enables more accurate strategic decisions and, in turn, better business outcomes.
To learn more about alternative data for fintech, download our comprehensive whitepaper, ‘The Definitive Guide to External Data for Fintech’.