Data science’s commercialization over the last decade emphasized the science more than the data. Even as data scientists spent most of their time wrangling and cleaning data, it was their algorithms that gave their organizations a competitive advantage. This industry model had a significant flaw that ultimately caught up with commercial data science, though. Algorithms are just math – incredibly clever and complex math, but still math.
This commoditized the algorithms. A market-leading company’s black box algorithm full of rainbows and unicorns is – or soon will be – fundamentally the same “magic” that their competitors are using.
This opens a significant opportunity for any company that shifts the focus of their data science initiatives from the science to the data.
Explorium’s CEO and co-founder Maor Shlomo highlighted one such company at the Gartner Data & Analytics Conference. “PepsiCo takes a different view and a different approach when looking at analytical challenges. A lot of organizations and data practitioners these days are putting a spotlight on the algorithm. But Michael is saying, ‘What data can we find now that will change the outcome? What are the internal and external data to feed into our models that will make our business decisions better?’”
Michael Cleavinger, Global Commercial Data Science Lead for PepsiCo, spoke alongside Shlomo at the conference.
PepsiCo interacts with over one billion customers every day. Think about how much data is therefore streaming into PepsiCo 24/7. Yet the company still seeks and uses multiple sources of external data.
This gives Cleavinger a unique perspective on why your value, your differentiation, comes from your strategy and from the data you feed into your models.
“Everything we do starts and continues with data. If we don’t have enough data, it fails. If we have too much data, it can also fail.
“One of my pet peeves is when you’re talking to senior leadership and they ask you ‘How big is your database? How many petabytes do you have?’ We get excited about how many petabytes of data we have, and we start focusing on the wrong things, like how do we just grow this information?”
Excess data does more than bury the essential insights in petabytes of noise. It can make it impossible to even find the path that would lead the company to those insights. Cleavinger draws an analogy between data and trees. If the trees in a forest grow densely and chaotically, getting through the forest will require a lot of time, effort and luck, with no guarantee that you’re going to emerge from the forest in the place you want to be. But with the appropriate data, particularly from external data providers, companies can select their trees and plant the forest in a way that allows paths – and the insights along the way – to emerge.
Neither a cluttered forest nor a purpose-built one is guaranteed to have everything the pathfinder needs to succeed, not even at PepsiCo’s scale of data acquisition.
“We know we’re missing information because we see it in different places,” Cleavinger acknowledged. “Explorium has a large amount of data, and with that large amount of data we can start looking at what’s useful. Instead of going through all that data we can take our data, match it up with Explorium’s data and then bring across just those features that matter.
“We’re looking across different data sets and then we bring in the data that is most effective and most applicable to the problems we’re trying to solve.
“It takes this huge forest and reduces it down to a trail.”
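The match-then-filter idea Cleavinger describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PepsiCo’s or Explorium’s actual pipeline: the store records, the column names (`store_id`, `weekly_sales`, `foot_traffic` and so on), the helper `select_external_features` and the correlation threshold are all hypothetical, and a simple Pearson correlation stands in for whatever relevance measure a real system would use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_external_features(internal, external, key, target, threshold=0.5):
    """Match internal rows to external rows on `key`, then keep only the
    external features whose correlation with `target` reaches `threshold`."""
    external_by_key = {row[key]: row for row in external}
    # Match step: keep internal rows that have an external counterpart.
    matched = [
        (row, external_by_key[row[key]])
        for row in internal
        if row[key] in external_by_key
    ]
    if len(matched) < 2:
        return {}
    targets = [row[target] for row, _ in matched]
    # Filter step: score every external column except the join key.
    selected = {}
    for name in external[0]:
        if name == key:
            continue
        values = [ext[name] for _, ext in matched]
        score = pearson(values, targets)
        if abs(score) >= threshold:
            selected[name] = score
    return selected


# Made-up toy data: four internal stores, five external records.
internal = [
    {"store_id": 1, "weekly_sales": 100},
    {"store_id": 2, "weekly_sales": 200},
    {"store_id": 3, "weekly_sales": 300},
    {"store_id": 4, "weekly_sales": 400},
]
external = [
    {"store_id": 1, "foot_traffic": 10, "parking_spots": 5, "store_age": 7},
    {"store_id": 2, "foot_traffic": 20, "parking_spots": 5, "store_age": 3},
    {"store_id": 3, "foot_traffic": 30, "parking_spots": 5, "store_age": 3},
    {"store_id": 4, "foot_traffic": 40, "parking_spots": 5, "store_age": 7},
    {"store_id": 5, "foot_traffic": 99, "parking_spots": 9, "store_age": 1},
]

selected = select_external_features(internal, external, "store_id", "weekly_sales")
print(sorted(selected))
```

In this toy data only `foot_traffic` tracks sales, so it is the one external feature that survives the filter; the constant `parking_spots` column and the uncorrelated `store_age` column are dropped, along with the unmatched fifth store.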
Explorium has solved the “more is not better” data problem for CPGs, fintech startups and FSIs, insurance companies, media agencies, retailers and many more. This range of industries and company sizes – PepsiCo employs over 250,000 people worldwide – allows Maor Shlomo to identify the trends affecting all of them.
“By now, we’ve seen different stages of enterprises going through this cycle. First, a couple of initiatives around AI and modeling. ‘Let’s throw models at the problem and it’ll probably work.’ Then they get to where they need a really good understanding of the business problem. Once they can check that, what is the data strategy?
“Let’s say an enterprise is looking to predict who is the next best customer. There can be so many data points that could answer that question or that we could fit into the model: data about the stores, on the demographics, on the owners, on the visitors. There’s so much information out there that could be impactful. But if you’re just trying to search for it out of the blue, if you’re reaching out to different vendors, going through legal and procurement, it gets to be an expensive, long and painful process.
“What PepsiCo is doing fantastically well is building a strategy around the data problem.”
PepsiCo recognized the inflection point that is at the core of Explorium’s business: finding the right data is a more important differentiator than the amount of data or the sophistication of the algorithm. Or, as Michael Cleavinger might say, walking through a forest built for paths is better than hacking your way through the trees and hoping to come out the other side.
Maor Shlomo challenged the audience at Gartner D&A to confront how well their data strategy and business strategy support each other.
“If you want to be successful like PepsiCo,” Shlomo concluded (and what business doesn’t?), “you have to get your hands on the right data and experiment and iterate on all that data.”