The Guide to External Data and Machine Learning for Consumer Insights
How the right external data can fuel machine learning for consumer insights.
The consumer goods industry has changed significantly in the past few decades, fueled by increasing technological development and constantly changing consumer habits. The pandemic disrupted supply chains and upended the retail market, forcing many smaller shops and convenience stores to close, move, or change ownership. In the meantime, enterprising new stores have launched, and consumer buying preferences have shifted to online channels.
Not only do consumer goods companies need to maximize opportunities with existing B2B channels such as retailers, independent grocery stores, convenience stores, and corner shops, they also need to build and grow new direct-to-consumer (D2C) channels as consumer buying behavior shifts increasingly online. Getting access to the right data and customer insights is critical to creating the best customer experience. Consumer goods companies can no longer rely solely on their own historical customer data.
To get valuable consumer insights, understand the market, and make important business decisions, marketers rely on the information that is available to them (ad clicks, website views, chatbot interactions, newsletter sign-ups). As traditional retail channels change, the customer journey is shifting as well. With these rapid changes, most marketers are left operating with outdated analytics models based on the limited information they have on hand.
Many organizations are going through a digital transformation, including shifting to artificial intelligence and machine learning to help them build more accurate predictive models. The quality of a predictive model depends on the data it is trained on – garbage in, garbage out. If a model cannot provide accurate, actionable insights, then the time and money it took to build is wasted. Most organizations rely on their internal data to build predictive models, but is it enough? Consumer goods brands need to bring in external data to refresh and augment their datasets and to renew their understanding of market conditions. However, with the amount of big data available, it can be difficult for organizations to understand which datasets will be relevant to help them solve their business problems. This article will examine how consumer goods companies can leverage external data to build better models, successfully navigate the rapidly evolving industry, and achieve their business goals.
What is external data?
Thorough market research is done outside of the four walls of the organization. Of course, internally generated data matters, but to gain a real understanding of the industry and market segments, organizations must look externally. Understanding customer behavior is an essential component of the decision-making process. Using external data can help companies get a better understanding of customers and prospects outside of their interactions with the brand.
Internal data comes from a company’s own sources, such as its Google Analytics account, CRM, or accounting software. External data, on the other hand, means “… any type of data that has been captured, processed, and provided from outside the company”. Typically, external data is obtained from data marketplaces or by partnering with an external data provider.
Some examples of external data include business ratings and reviews, social media data such as LinkedIn followers and engagement, geospatial/location data (zip code population, nearby competitors, nearby vacancies), and point of interest data. Leveraging external data helps organizations build better customer profiles, which in turn feed into better sales strategies and marketing campaigns.
What kind of external data do you need?
In the post-pandemic world, where the historical consumer data fueling lead generation and scoring models is no longer effective, organizations are increasingly powering machine learning and analytics models with external data to open up new doors and help bring in new customers. External data such as foot traffic, nearby points of interest, reviews, ratings, business filings, social media profiles, median incomes in relevant zip codes, and website traffic offer additional data signals that can help consumer goods companies identify the right prospects and prioritize them for their sales force.
By integrating internal data with external data, companies can enrich their existing insights, fine-tune their operations, and unlock further growth.
What kind of external data do consumer goods companies need? Here are a few examples.
Demographic Information
- Age distribution
- Average income
- Population density
Purchasing and Spending Behavior
- Likelihood to buy certain products
- Spending patterns
- Spending information by industry: music, sports and leisure, travel, beauty, apparel and more
- Online spending habits
- Credit card usage
Location Based Information
- Number of businesses nearby by category
- Foot traffic information: number of visits in the area, places people visited on the same day, etc.
- Tourist attributes: hotel and attraction characteristics (number of hotel rooms in an area, etc.)
Review Based Information
- Business ratings and reviews
- Health score
- Average pricing of businesses in the area
External data can be used for understanding risk, lead scoring, customer segmentation and personalization, creating tailored messaging, and more. Ultimately, the type of external data you need will depend on your use case.
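As a rough illustration of integrating internal data with external data, the sketch below joins a handful of internal lead records to external signals by zip code. The record layout, the field names (`median_income`, `nearby_competitors`), and all values are assumptions made up for this example; a real pipeline would depend on the vendor's schema.

```python
from typing import Dict, List

# Hypothetical internal records: often little more than a name and a zip code.
internal_leads: List[dict] = [
    {"name": "Corner Mart", "zip": "10001"},
    {"name": "Sunrise Cafe", "zip": "94103"},
    {"name": "Lakeside Liquor", "zip": "60601"},
]

# Hypothetical external signals keyed by zip code (all values are invented).
external_by_zip: Dict[str, dict] = {
    "10001": {"median_income": 92000, "nearby_competitors": 14},
    "94103": {"median_income": 105000, "nearby_competitors": 9},
}


def enrich(leads: List[dict], signals: Dict[str, dict]) -> List[dict]:
    """Left-join external signals onto internal leads by zip code."""
    enriched = []
    for lead in leads:
        extra = signals.get(lead["zip"])
        row = dict(lead)  # copy so the internal record is left untouched
        row["median_income"] = extra["median_income"] if extra else None
        row["nearby_competitors"] = extra["nearby_competitors"] if extra else None
        row["matched"] = extra is not None  # flag leads with no external coverage
        enriched.append(row)
    return enriched


enriched_leads = enrich(internal_leads, external_by_zip)
for row in enriched_leads:
    print(row)
```

Keeping unmatched leads (with a `matched` flag) rather than dropping them makes gaps in external coverage visible, which is useful when comparing data vendors.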
Using external data to build better predictive models
Lead Data Enrichment
A major challenge for consumer goods companies is identifying and prioritizing new customers, given the huge number of potential targets: independent grocery stores, convenience stores, corner shops, beer/wine/liquor stores, restaurants, cafes, bars, and more. In fact, there are over 1 million restaurants, 150,000 convenience stores, and 45,000 beer/liquor/wine stores in the USA alone. In many cases, CPG firms have very little guidance on which potential customers they should target. The internal data captured, if available at all, is often just the business name and address.
This is where external data comes into play. A lead scoring model analyzes the attributes of each new lead in relation to the chances of that lead actually becoming a customer, then uses that analysis to score and rank all of the potential customers. Such models need more information to accurately determine which leads are of higher quality and more likely to become profitable customers. External data such as demographic and socioeconomic information by zip code, number of nearby competitors, foot traffic, and business reviews offers important context and provides more accurate insights. Training a lead scoring model on external data helps companies more accurately predict how likely a prospect is to convert to a paying customer, helping field sales teams understand which targets are worth pursuing. This accelerates the sales process, enables sellers to build better relationships with customers, and ultimately increases revenue.
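To make the idea of lead scoring concrete, here is a minimal sketch that trains a small logistic regression with plain gradient descent and uses it to rank new leads. Logistic regression is one common choice, not a method named in this article, and the features (`foot_traffic`, `avg_rating`, `nearby_competitors`), the training rows, and the lead names are all invented for illustration.

```python
import math

# Hypothetical training data. Each row is
# (foot_traffic, avg_rating, nearby_competitors), already scaled to 0..1.
# Labels mark whether a past lead converted (1) or not (0). Values are invented.
train_rows = [
    (0.9, 0.8, 0.2), (0.8, 0.9, 0.3), (0.7, 0.7, 0.1),
    (0.2, 0.4, 0.8), (0.3, 0.3, 0.9), (0.1, 0.5, 0.7),
]
train_labels = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Step against the mean gradient of the log loss.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def score(features, weights, bias):
    """Return the estimated conversion probability for one lead."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, features)) + bias)


weights, bias = train_logistic(train_rows, train_labels)

# Hypothetical new leads, enriched with the same external signals.
new_leads = {
    "Lead A": (0.85, 0.9, 0.2),
    "Lead B": (0.15, 0.4, 0.85),
}
ranked = sorted(
    ((name, score(feats, weights, bias)) for name, feats in new_leads.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, lead_score in ranked:
    print(f"{name}: {lead_score:.2f}")
```

The ranked list is what a field sales team would work from: the highest-scoring leads are contacted first. In practice a team would more likely use an established library and far more training data than this toy example.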
Customer Segmentation
As more consumer goods companies establish online marketplaces, retailers are facing more competition and pressure to provide seamless customer experiences. One of the biggest challenges they face is understanding who their customers are and why they buy. Data scientists and market researchers use data to look for patterns, which can help inform new segmentation strategies based on predicted behaviors.
Using external data in conjunction with internal data and website analytics provides a deeper understanding of customers’ needs. This enables you to communicate with, and target, both existing and new customers more effectively. Augmenting your internal data with external data (such as demographic and socioeconomic information by zip code, number of nearby competitors, foot traffic, business reviews, and number of nearby vacancies) provides more accurate insights into who your customers are and what captures their attention. More comprehensive customer profiles, built on more accurate segmentation, can in turn facilitate personalization (providing customers with customized experiences).
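One way to look for segments in enriched customer data is clustering. The sketch below runs a bare-bones k-means on two hypothetical, pre-scaled signals per customer (neighborhood median income and local foot traffic). K-means is used here only as a familiar example and is not a method prescribed by this article; the data points and the choice of two segments are assumptions.

```python
# Hypothetical customers described by (median_income, foot_traffic), scaled to 0..1.
customers = [
    (0.90, 0.80), (0.20, 0.10), (0.85, 0.90),
    (0.15, 0.20), (0.80, 0.75), (0.25, 0.15),
]


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iterations=10):
    """Very small k-means: the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


segments, centers = kmeans(customers, k=2)
for customer, segment in zip(customers, segments):
    print(customer, "-> segment", segment)
```

Each resulting segment can then be profiled (for example, higher-income, high-traffic locations versus the rest) and given its own messaging. Seeding centroids from the first points keeps the example reproducible; production code would normally use a library implementation with better initialization.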
Read our customer’s case study to learn more about using external data for customer segmentation models.
Enhance pricing and promotion strategies
The consumer goods industry is highly competitive, with retailers competing not only with brick-and-mortar stores but with online stores as well. In such a crowded space, consumer goods companies can stand out by launching dynamic pricing and promotions for their products. However, pricing and promotion strategies are not universal: what works with one demographic might fail miserably with another. It is easy to spend time and money launching campaigns that don’t convert.
To improve the impact of its promotions, one of our customers created a promotion scheduling model that incorporated external data. With this new enriched data, the company could identify a variety of key indicators, including:
- Use of the word “Coupon” in search engine results based on specific products or search terms
- The percentage of married couples making purchases at specific stores
- The number of stores in the same segment within a defined area, to understand the level of competition
- The median income in a specific neighborhood, which can help identify the right price point for specific promotions and product types
By enhancing pricing and promotion strategies with external data sources, consumer goods companies can create targeted strategies with a higher likelihood of success.
The value of external data in creating more accurate machine learning models
Organizations today understand the impact external data has on machine learning and predictive analytics models. Consumer goods companies that leverage external data can optimize their lead scoring, promotion, and audience segmentation models. Using internal data alone can only provide a limited view of potential customers and their habits.
However, securing relevant external data to fuel predictive models can be a challenge. Typically, companies acquire external data from data marketplaces. But data acquisition is a multi-step process, and data marketplaces only help with the data access step. With the amount of data available for purchase, companies might have a hard time discerning what data they actually need. Once they know what data types their use cases require, they need to assess the different vendors and the quality of the data. After purchase, the data typically needs to be cleaned, prepared, and reformatted before it is ready for consumption in machine learning models. This whole process has a high cost in time and money. There are several questions that companies need to ask before buying external data.
There are solutions that help with the arduous process of external data acquisition. For example, an External Data Platform should help with every step of the process, from discovery, access, preparation, and integration with internal data to model training and deployment.
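To give a sense of what the cleaning and preparation step can involve, the sketch below normalizes a few raw vendor rows: it trims names, standardizes zip codes to five digits, converts ratings to numbers, and drops duplicates and unusable rows. The column names and the sample rows are hypothetical, and real vendor data will need rules of its own.

```python
# Hypothetical raw rows as they might arrive from a data vendor.
raw_rows = [
    {"business": " Corner Mart ", "zip": "2134", "rating": "4.5"},
    {"business": "corner mart", "zip": "02134", "rating": "4.5"},  # duplicate
    {"business": "Sunrise Cafe", "zip": "94103-1234", "rating": "n/a"},
    {"business": "", "zip": "60601", "rating": "3.9"},  # missing name
]


def clean_zip(value):
    """Return a 5-digit zip code, or None if the value is unusable."""
    digits = value.strip().split("-")[0]  # drop any ZIP+4 suffix
    if digits.isdigit() and len(digits) <= 5:
        return digits.zfill(5)  # restore leading zeros lost in export
    return None


def clean_rating(value):
    """Return the rating as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except ValueError:
        return None


def clean_rows(rows):
    seen = set()
    cleaned = []
    for row in rows:
        name = " ".join(row["business"].split())  # collapse stray whitespace
        zip_code = clean_zip(row["zip"])
        if not name or zip_code is None:
            continue  # skip rows that cannot be matched to a business
        key = (name.lower(), zip_code)
        if key in seen:
            continue  # skip duplicates that differ only in formatting
        seen.add(key)
        cleaned.append(
            {"business": name, "zip": zip_code, "rating": clean_rating(row["rating"])}
        )
    return cleaned


cleaned_rows = clean_rows(raw_rows)
for row in cleaned_rows:
    print(row)
```

Even this small example shows why preparation takes time: each field needs its own rule, and decisions such as whether to keep rows with missing ratings affect what the downstream model sees.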
Consumer goods companies can leverage predictive models trained on relevant external data to gain a competitive edge. To learn more, read our white paper, “External Data for Consumer Goods: The Definitive Playbook”.
Explorium provides the first External Data Platform to improve analytics and machine learning. Explorium enables organizations to automatically discover and use thousands of relevant data signals to improve predictions and ML model performance. Explorium External Data Platform empowers data scientists and analysts to acquire and integrate third-party data efficiently, cost-effectively and in compliance with regulations. With faster, better insights from their models, organizations across consumer goods, fintech, insurance, retail and e-commerce can increase revenue, streamline operations and reduce risks.
Learn more at explorium.ai.