Data marketplaces provide access to data from a variety of different sources and providers. This article will go over some of the benefits and limitations of data marketplaces, and how an external data platform can help address these limitations.
What is a Data Marketplace?
A data marketplace is exactly what it sounds like: an online store that facilitates the buying and selling of datasets from several different sources. It is a data exchange, a platform that connects data buyers with data sellers (providers). Data marketplaces offer individuals and businesses the ability to upload or access data, generally in the cloud. More cloud-based marketplaces are appearing as companies develop a greater understanding of the value of enriching their internal data with third-party data. They provide a more public (and often commercialized) form of data sharing, and they reduce the effort involved in locating required datasets. As businesses generate more data internally, or collect external data through activities like web scraping, data marketplaces also enable them to monetize it. For example, an AI software platform could purchase data from a marketplace to train its AI-based models, which it then sells.
Why use a data marketplace?
Organizations today are aware of the competitive edge they can gain by incorporating external data into their data strategy and business models. Data marketplaces provide access to a variety of big data sources. Verified data marketplaces require all users to meet relevant KYC (know your customer) and security requirements, and they also support financial security when facilitating high-value transactions. Commercial datasets can cost hundreds of thousands of dollars, so businesses need to ensure that the platform is trustworthy. Data marketplaces also host multiple vendors and buyers on one platform, so the range of options is vast.
What kind of data can you get from a data marketplace?
Some examples of the types of data for sale in a data marketplace:
- business intelligence (BI)
- market research
- public data
There are different types of data marketplaces that sell different types of data:
Personal Data Marketplace
Individuals can monetize their own data by selling it to a personal data marketplace. They can share demographic data, shopping preferences, location, and more. Personal data marketplaces are built around consent, since people are sharing their data willingly. Consent is one of the lawful bases for processing under GDPR, although consent alone does not guarantee full compliance.
B2B Data Marketplace
This type of data marketplace collects and stores company data from a variety of data providers onto one platform. It allows companies to access an aggregate of pre-curated information, which they can use for marketing, sales, and BI.
Sensor/IoT Data Marketplace
It is possible to sell IoT data to third parties. In this type of marketplace, organizations can buy or sell real-time data that is collected from IoT devices, which helps them understand consumer behavior, improve sales, and build better marketing strategies.
Data marketplaces aim to build an ecosystem of data providers and data consumers. As you can see, the range of data in data marketplaces is broad; you can buy many types of data, such as consumer behavior, financial, geospatial, and technology stack data, depending on the marketplace you choose. Data types can be mixed and structured in a variety of ways. There are many different data providers selling data in marketplaces. Providers such as Dun & Bradstreet and ZoomInfo offer behavioral information collected from online activities, which helps organizations with segmentation and targeting. ZoomInfo also provides employee contact information. Data providers such as Crunchbase and DueDil provide company financial data. Datanyze and G2 Stack provide technographic data, which helps SaaS providers know which tech stacks their prospects are using and, in turn, helps with targeting. Foursquare and Factual provide location data, which marketers can use for geotargeting.
What are the limitations of data marketplaces?
Timely, high-quality data is elusive. A Forrester Consulting study found that 99% of firms surveyed faced issues with customer data, and 96% indicated that timeliness and accuracy issues with customer data acquisition were big problems. Figuring out what data is needed before purchasing it can also be difficult.
Data marketplaces provide data access: they are where a company can purchase a dataset to incorporate into its business models. However, the ability to purchase data is only one part of a data strategy, and the purchase of a dataset doesn’t guarantee the desired business outcome. Data marketplaces provide the platform that facilitates the purchase, but they do not help with the steps required before and after it. These steps are equally, if not more, important when it comes to incorporating datasets into business models, especially considering the pricing of the datasets. In most cases, organizations need more than just the data; they also need the right tools to help with the rest of the process.
The main limitations of data marketplaces are:
1) Integration with internal data
One major consideration when purchasing external data is how to integrate it with internal data in order to build more accurate predictive models. This can be a challenging process and will require separate tools, platforms, or data science teams to handle. The cost and time of integration can add up.
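As a rough illustration of what this integration step involves, here is a minimal Python sketch that joins purchased records onto internal records and reports how many internal rows found a match. It assumes both datasets share a company domain as the join key; the field names (`domain`, `deal_size`, `employee_count`) and the sample records are hypothetical, and real integrations usually need fuzzier matching than this.

```python
def normalize_domain(value):
    """Lowercase a domain and strip a leading 'www.' so keys compare equal."""
    value = value.strip().lower()
    return value[4:] if value.startswith("www.") else value


def enrich(internal_rows, external_rows, key="domain"):
    """Left-join external attributes onto internal rows and report the match rate."""
    lookup = {normalize_domain(row[key]): row for row in external_rows}
    enriched, matched = [], 0
    for row in internal_rows:
        extra = lookup.get(normalize_domain(row[key]))
        if extra is not None:
            matched += 1
        merged = dict(row)  # keep every internal row, matched or not
        for field, value in (extra or {}).items():
            if field != key:
                merged[field] = value
        enriched.append(merged)
    match_rate = matched / len(internal_rows) if internal_rows else 0.0
    return enriched, match_rate


# Hypothetical sample data, for illustration only.
internal = [
    {"domain": "www.Acme.com", "deal_size": 12000},
    {"domain": "globex.io", "deal_size": 8000},
]
external = [{"domain": "acme.com", "employee_count": 250}]

rows, match_rate = enrich(internal, external)
print(rows)
print(f"match rate: {match_rate:.0%}")
```

A low match rate is an early warning that much of the purchased data will never reach the model, whatever its quality.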
2) No guaranteed ROI
There is no way to tell what value the purchase of a dataset will bring to an organization prior to purchasing it. There is a high cost of purchasing, formatting, handling, managing, and integrating external data with internal data. The cost is even higher if after all that, there is no ROI or boost to predictive models.
3) Finding the most relevant data isn’t obvious
There is a vast amount of data available for purchase. How can organizations determine which will be the most relevant to their business problem or use case without testing it out first? There are no “free trials” in data marketplaces, meaning that you can’t try out a dataset on your model to see if it works prior to making the purchase. Vast options aren’t always helpful for those who don’t know what they are looking for. Options are beneficial when organizations understand the data they need. However, many organizations do not, and they could end up spending money on a dataset that turns out to be of no use to them.
4) Purchasing more than one dataset can get very expensive
Purchasing commercial datasets from data marketplaces can cost hundreds of thousands of dollars. Organizations might need to purchase more than one dataset for different models and use cases. The costs add up quickly, in addition to whatever data marketplace administration or subscription fee is required to facilitate the transaction.
5) Incompatible data formats
The data that an organization purchases from a data marketplace might not always be in a usable format. It might not match the format of the organization’s own data and will need to be reformatted, which is another extra cost in time and money.
6) Security and Compliance
Not all data marketplaces offer data that is secure and compliant with regulations such as GDPR and CCPA.
What is an external data platform and how can it help?
Understanding the data that you need, and being able to integrate it with the data you have, is where a data science platform comes into play. Specifically, this means an end-to-end external data platform: one platform for accessing all of the relevant external data sources (available across different data marketplaces), understanding their impact on data analytics and predictive models, integrating them with internal data, and deploying more accurate predictive models. It also provides an understanding of which data signals you need and the ROI they will drive.
The data acquisition and predictive model deployment process can be laborious when you are using several different data providers, tools, and platforms. An external data platform aims to resolve this.
When considering the pricing of datasets in data marketplaces, why not also consider getting the data access and the data science platform all in one place?
The data acquisition process:
Step 1: Define the business problem.
Before purchasing data, it is important to start by defining the problems that need to be solved. Is there a specific question that you need to answer (for which you need new, better, or complementary data) or are you struggling to optimize an existing solution? The first step is to look for relevant data that helps you uncover insights. External data such as foot traffic, pricing, firmographics, technographics, and other marketing and financial attributes can improve predictive models and business outcomes. Defining the problem you want to solve will help you better understand what type of data you need.
Step 2: Find relevant data.
The next step is finding the right data. There are hundreds of thousands of premier and public data sources, so searching and evaluating data for your use cases can take time, and timely access to the data is key to organizational agility. After finding the source that has the data you want to purchase, evaluate it for data quality, coverage, gaps, recency, frequency of updates, risks, and, most importantly, relevance to your business context.
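Some of these evaluation criteria can be measured on a sample file before committing to a purchase. The sketch below is one possible way to do that in Python. It computes coverage (the share of your own entities that appear in the sample), per-field fill rate, and the median record age. The field names, the sample rows, and the availability of a sample at all are assumptions made for illustration.

```python
from datetime import date
from statistics import median


def profile_sample(sample, our_keys, key, date_field, today):
    """Summarize coverage, per-field fill rate, and record age for a vendor sample."""
    vendor_keys = {row[key] for row in sample}
    coverage = len(vendor_keys & set(our_keys)) / len(our_keys)

    # Fill rate: share of rows where each non-key field is actually populated.
    fields = sorted({f for row in sample for f in row} - {key})
    fill_rate = {
        f: sum(1 for row in sample if row.get(f) not in (None, "")) / len(sample)
        for f in fields
    }

    # Recency: days since each record was last updated (ISO dates assumed).
    ages = [
        (today - date.fromisoformat(row[date_field])).days
        for row in sample
        if row.get(date_field)
    ]
    return {
        "coverage": coverage,
        "fill_rate": fill_rate,
        "median_age_days": median(ages) if ages else None,
    }


# Hypothetical vendor sample and internal key list.
sample = [
    {"company_id": "c1", "revenue": 5.2, "last_updated": "2024-01-10"},
    {"company_id": "c2", "revenue": None, "last_updated": "2023-07-01"},
    {"company_id": "c9", "revenue": 1.1, "last_updated": "2024-02-01"},
]
report = profile_sample(
    sample,
    our_keys=["c1", "c2", "c3", "c4"],
    key="company_id",
    date_field="last_updated",
    today=date(2024, 3, 1),
)
print(report)
```

Numbers like these do not measure relevance to the business problem, but they make it easier to compare candidate sources on the same terms.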
Step 3: Determine ROI prior to purchase.
Another key step in the pre-purchasing process is to determine the ROI. Depending on the use case, the expected outcomes, and the cost of acquiring the data, it is important to understand what the return will be. In many cases, you cannot determine the value of a data source until you use it in a real-life scenario. Ensure there is a way to evaluate the impact of the data and gain an understanding of the uplift in machine learning models and feature explainability, before deploying predictive processes.
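The underlying arithmetic is simple once the inputs are estimated: ROI is the incremental value minus the total cost, divided by the total cost. The Python sketch below works through one example. Every figure in it is hypothetical, and in practice the uplift estimate is the hard part, since it has to come from testing the data against your own model.

```python
def data_roi(baseline_value, uplift_pct, license_cost, integration_cost):
    """Return (incremental value, total cost, ROI) for one dataset purchase.

    uplift_pct is the relative improvement measured in a trial, e.g. 0.04 for 4%.
    """
    incremental_value = baseline_value * uplift_pct
    total_cost = license_cost + integration_cost
    roi = (incremental_value - total_cost) / total_cost
    return incremental_value, total_cost, roi


# Hypothetical figures, for illustration only.
value, cost, roi = data_roi(
    baseline_value=5_000_000,  # annual revenue influenced by the model
    uplift_pct=0.04,           # uplift observed when testing the new signals
    license_cost=120_000,      # price of the dataset
    integration_cost=30_000,   # formatting, pipelines, and data science time
)
print(f"incremental value: {value:,.0f}, total cost: {cost:,.0f}, ROI: {roi:.0%}")
```

Including integration cost alongside the license fee matters, because a dataset that looks profitable on its price alone can turn negative once preparation work is counted.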
Step 4: Data formatting.
Newly acquired data is often not consumption-ready. The data preparation process may include data cleaning, data transformations, and configuring data pipelines for data consumption.
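A small Python sketch of the cleaning and transformation part of this step is shown below. It maps one raw vendor row onto an internal schema by tidying whitespace, parsing a currency string, and converting dates to ISO 8601. The column names, the target schema, and the list of date formats are all assumptions for illustration.

```python
from datetime import datetime

# Assumed vendor date formats; day-first order for slashed dates is an assumption.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format and return an ISO 8601 date string, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_record(raw):
    """Map a raw vendor row onto a (hypothetical) internal schema."""
    revenue = raw.get("Revenue (USD)", "").replace(",", "").replace("$", "").strip()
    return {
        "company_name": " ".join(raw["Company"].split()),  # collapse extra spaces
        "revenue_usd": float(revenue) if revenue else None,
        "founded": parse_date(raw.get("Founded", "")),
    }


raw = {"Company": "  Acme   Corp ", "Revenue (USD)": "$1,250,000", "Founded": "03/02/2015"}
print(clean_record(raw))
```

Returning `None` for values that cannot be parsed, rather than guessing, keeps formatting problems visible to whoever reviews the pipeline.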
Step 5: Integration.
After acquiring and preparing the external data, the next step is to evaluate how you are going to consume the data within your existing analytics or machine learning platforms. There are various connection options to check for. See whether you can access the data via export or API, and whether it can connect to storage such as S3 or Snowflake. Also plan for connections to platforms or applications such as Google BigQuery, Databricks, and Salesforce, according to your business needs.
Step 6: Model deployment and monitoring.
Once the new model is deployed, model performance and drift should be continuously monitored to see whether the model is still performing according to business expectations. Make sure you have the right data onboarding frequency to keep your data current, accurate, and relevant. In case of any signal loss (such as the loss of third-party cookies), you should immediately seek alternative data sources to maintain business continuity.
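One common way to quantify drift in a single feature or model score is the Population Stability Index (PSI), which compares the distribution seen at training time with a recent one. The Python sketch below is a minimal version using equal-width bins. The bin count, the sample data, and the choice of PSI itself are assumptions made here, not something a particular platform prescribes.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a recent one."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width if all values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical scores: a reference window and a recent window that has shifted upward.
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

print(f"PSI, same data:    {psi(reference, reference):.3f}")
print(f"PSI, shifted data: {psi(reference, shifted):.3f}")
```

A frequently quoted rule of thumb treats values below about 0.1 as stable and values above about 0.25 as a significant shift, but the thresholds that matter should be set against your own business expectations.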
Ongoing: Privacy and Compliance
Privacy regulations such as GDPR and CCPA make external data management harder than ever. Check external data for compliance, review safeguarding policies, and incorporate best practices to take the risks out of putting data to work.
An external data platform helps with every step of this process from data discovery, to acquisition, preparation, integration, model training, compliance, deployment, and model retraining. Organizations today understand the importance of data assets. However, leveraging the right data goes beyond simply purchasing a new dataset. Having one platform to assist every step of the data journey will help leverage the benefits of external data more efficiently and effectively.
Explorium provides the first External Data Platform to improve Analytics and Machine Learning. Explorium enables organizations to automatically discover and use thousands of relevant data signals to improve predictions and ML model performance. Explorium External Data Platform empowers data scientists and analysts to acquire and integrate third-party data efficiently, cost-effectively, and in compliance with regulations. With faster, better insights from their models, organizations across fintech, insurance, consumer goods, retail, and e-commerce can increase revenue, streamline operations and reduce risks. Learn more at www.explorium.ai