Many organizations understand the value of external data, yet struggle with external data procurement. To remain competitive, organizations need a strategy for purchasing external data in order to derive the maximum value from it.
Our 2022 State of External Data report aimed to discover how organizations use external data today and understand the challenges they face when building their data strategy.
We asked respondents to report:
This blog post highlights the survey's key findings and the processes and skills organizations have to find, ingest, onboard, and integrate external data.
We surveyed 223 data leaders from companies across the United States in various industries about using external data and some of their challenges. Of these, over a third said they worked in analytics and BI (39%) and data management (36%).
Most respondents (64%) held either director or executive-level positions in their organization. Reported annual external data spend peaked in two bands: between $100,000 and $500,000 (18%), and over $1 million (13%).
The survey also revealed respondents primarily purchased company data (52%), with the next most common data types being demographic (44%) and financial data (41%). At the same time, 44% of respondents revealed they purchase external data from five or more paid or public sources.
A company's ability to compete is increasingly driven by how well it can leverage data and apply analytics. By enriching their internal data with external data, companies can build more accurate predictive models and improve their decision-making processes. Yet just 40% of survey respondents said they have a formal external data sourcing strategy in place.
While this is an improvement over last year's result of 28%, it's still clear that most organizations tend to find data on an ad hoc basis. They might have general guidelines with no consistent strategy, or outsource data procurement altogether.
With the volume of big data available, it is hard to know what to look for and which data sets will add value. Overlooking external data is a missed opportunity as it provides essential context not always captured in internal data, so successfully leveraging external data creates a competitive advantage.
Our survey respondents highlighted several challenges in sourcing external data, including regulatory constraints (34%) and difficulty understanding the ROI of some datasets (34%).
Additionally, the most common barrier to using external data is data prep and integration, cited by 63% of survey respondents. Lack of strategies to address usage and risk (34%) and lack of process for data usage (32%) round out the top three usage inhibitors.
Any datasets purchased from third-party vendors must be managed carefully to comply with data usage policies such as the EU's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Organizations must ensure that data vendors are providing compliant data. This becomes even more complicated when managing multiple external data sources and vendors.
Even though most companies understand that augmenting their internal data with alternative datasets will provide better insights, it can be hard to pin down exactly where the value comes from. It is tough to make the case for spending money on external data without knowing exactly what to purchase and how to use it.
As for why organizations purchase external data, findings from the survey revealed leading usages in business intelligence (69%), data science (56%), and visualization and reporting (55%).
The most common use cases for external data include financial risk analysis (47%), customer segmentation (43%), fraud detection (36%), and demand forecasting (35%). As the value of external data becomes more apparent and is used to create competitive advantages, we expect the number of use cases to increase significantly as organizations seek out more ways to leverage it across their key business initiatives.
Fifty-two percent of the respondents said they need to easily match and integrate internal and external data to maximize the value. They also need to ensure data quality and accuracy (49%) and a better way to locate the right type of data (46%).
Most of the time, data is collected as a by-product of operations. As a result, it is used for little beyond validating decisions already made or confirming that "they're going in the right direction." Data leaders need to find a way to prove, tangibly, that embracing data and investing in better ways to use it has true value.
The key takeaway here is that organizations need to seek new technologies to address the challenges with data preparation and integration and expand their use of external data.
One way to do that would be to leverage an external data platform. External data platforms work by automating your connections to thousands of pre-vetted data sources. Not only have the datasets been curated for quality and reliability, they form a single, collective catalog, so you don't have to pay for access to each one separately.
The datasets are inter-compatible, meaning that you can treat them as a single resource, lifting out just the details you need to enhance and augment your existing datasets or combining them into brand new datasets.
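To make the idea of lifting out just the details you need more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the API of any particular external data platform: the function names (`normalize_key`, `enrich`), the `company` join key, the suffix list, and the sample records are all assumptions made up for this example. It matches internal records to an external dataset on a normalized company name, copies over only the requested fields, and reports the match rate as a rough data quality check.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical list of legal suffixes to ignore when matching company names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_key(name: str) -> str:
    """Lowercase a company name, drop punctuation and common legal suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def enrich(
    internal: List[Dict],
    external: List[Dict],
    fields: List[str],
) -> Tuple[List[Dict], float]:
    """Left-join external fields onto internal rows; return rows and match rate."""
    # Index the external dataset by its normalized company name.
    lookup = {normalize_key(row["company"]): row for row in external}

    enriched = []
    matched = 0
    for row in internal:
        match: Optional[Dict] = lookup.get(normalize_key(row["company"]))
        if match is not None:
            matched += 1
        # Copy only the requested fields; unmatched rows get None.
        extra = {f: (match.get(f) if match else None) for f in fields}
        enriched.append({**row, **extra})

    rate = matched / len(internal) if internal else 0.0
    return enriched, rate


# Made-up sample data for illustration.
internal_rows = [
    {"company": "Acme, Inc.", "churned": 0},
    {"company": "Globex LLC", "churned": 1},
    {"company": "Initech", "churned": 0},
]
external_rows = [
    {"company": "ACME Inc", "employees": 250, "industry": "Manufacturing"},
    {"company": "globex", "employees": 1200, "industry": "Energy"},
]

enriched_rows, match_rate = enrich(internal_rows, external_rows, ["employees", "industry"])
print(f"Matched {match_rate:.0%} of internal records")
```

In practice, matching is rarely this clean: exact matching on a normalized name will miss abbreviations and misspellings, which is one reason data prep and integration ranks as the top barrier in the survey. The match rate gives a quick signal of how much of an internal dataset a given external source can actually enrich.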
An external data platform should remove as many of the roadblocks associated with finding and purchasing data sources as possible. It should also simplify your data pipelines and get your predictive model development off to a great start. That's where the true value lies.
To learn more, download the full 2022 State of External Data report here.
Explorium offers a first-of-its-kind, end-to-end automated external data platform for advanced analytics and machine learning. Our unique all-in-one platform automatically connects and matches internal enterprise data with thousands of relevant external data sets to accelerate your ML investment and ROI and to help solve complex problems. The Explorium platform empowers data scientists and business leaders to drive decision-making by eliminating the barriers to acquiring and integrating the right external data and by dramatically decreasing the time to superior predictive power. Visit explorium.ai to learn more.