By now, we’re all getting used to the new (less than ideal) normal in sports: watching our favorite athletes play in empty arenas as the coronavirus pandemic continues. However, it’s not hard to think back to the recent past, when we were lining up for tickets to see our favorite teams play in person. While this was great, it wasn’t cheap.
The price of sporting event tickets has been rising for years. Between inflation and the massive cost of the stadiums and arenas where teams play, tickets are increasingly pricing out fans who would love to watch games live. The result is an unsurprising drop in live attendance: most regular customers don’t have the disposable income to spend $400 taking their family to a single game, and that is before paying for parking, snacks, souvenirs, and more.
For teams, this presents both a major problem and an opportunity. On the one hand, sports organizations run nearly billion-dollar budgets that depend on revenues that largely come from ticket sales and concessions sold at their venues, so losing sales to untenable prices is bad business, plain and simple. On the other hand, there’s a clear opportunity to find a better, more dynamic way to price tickets. But that involves too many variables to sort out, right? You can’t really find all the different factors and include them in your existing models if you’re doing this by hand. Let’s tackle this problem using Explorium to see how enriching your data could offer a quick and easy solution.
The question here is about finding the optimal price, but not just based on average ticket prices around the league (let’s assume, for this exercise, that we’re focusing on the NBA). It’s about understanding how much customers are willing and able to pay, and finding a price point that meets those conditions while still earning your organization a profit on each ticket sold. It’s about using data to boost your ROI.
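To make the idea of an "optimal price" concrete, here is a toy sketch in Python. It is not how Explorium prices tickets; it simply assumes you already have an estimate of what each customer is willing to pay and searches a list of candidate prices for the one with the highest total profit. The function name best_price, the willingness-to-pay values, and the per-ticket cost are all made up for illustration.

```python
def best_price(willingness_to_pay, cost_per_ticket, candidate_prices):
    """Return the (price, total_profit) pair with the highest total profit.

    A customer is assumed to buy one ticket if the price does not exceed
    what they are willing to pay. Ties keep the lowest candidate price.
    """
    best = None
    for price in candidate_prices:
        buyers = sum(1 for w in willingness_to_pay if w >= price)
        profit = buyers * (price - cost_per_ticket)
        if best is None or profit > best[1]:
            best = (price, profit)
    return best


# Hypothetical willingness-to-pay estimates, in dollars (illustrative only).
wtp = [40, 55, 60, 75, 90, 120, 150]

price, profit = best_price(
    wtp,
    cost_per_ticket=20,                 # assumed cost of serving one seat
    candidate_prices=range(30, 160, 5)  # try $30, $35, ..., $155
)
print(f"Best candidate price: ${price}, total profit: ${profit}")
```

In practice the hard part is the input: estimating willingness to pay is exactly where a predictive model and richer data come in.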
The goal: find the features that result in the best predictive pricing models
Before going further, let’s take stock of the data we had to begin with (going by the columns in the initial database):
Based on this, you could assume you already have a fairly robust internal dataset. However, this data can’t really account for several key variables that might give your pricing model a much-needed uplift.
So before we even build a model, let’s simply focus on getting the data you need into your dataset. This enrichment takes a few minutes on the Explorium platform and requires your internal data and a single click.
Let’s see what new data Explorium added to enrich your initial dataset.
Once we’ve enriched your data, Explorium starts doing the hard work of actively boosting your ML models.
The first part of this goes beyond adding sources. Once the dataset is matched with as many sources as possible, Explorium proceeds to find those most relevant to you and rank them before continuing to the feature engineering process. On the platform, you’ll be able to see the most relevant sources available, which in this case include:
From here, you can choose which sources to include or exclude, or even let Explorium choose for you. In the case of your hypothetical NBA franchise, here are some of the most relevant:
Once a feature list is built, optimized, and ranked, it’s time to train your models to see how well the dataset performs. In our example, the model trained on the enriched data achieved an R² score of 68.21 (reported here as a percentage).
Compared with your internal dataset, which scored just 4.77, Explorium gave you an impressive 1,330% uplift (it’s worth noting that, while this is undoubtedly a great result, uplifts are usually more modest).
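For readers who want to check the arithmetic, the uplift figure is the relative improvement of the enriched score over the internal-only score. A minimal Python sketch, using the two scores quoted above (the function name relative_uplift is ours, not part of any platform):

```python
def relative_uplift(baseline: float, enriched: float) -> float:
    """Return the relative improvement of `enriched` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (enriched - baseline) / baseline * 100.0


internal_r2 = 4.77   # score on the internal dataset alone (from the article)
enriched_r2 = 68.21  # score after enrichment (from the article)

uplift = relative_uplift(internal_r2, enriched_r2)
print(f"Uplift: {uplift:.0f}%")  # prints "Uplift: 1330%"
```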
More importantly, however, you can see how a variety of models perform instead of testing each one at a time. The best results come from a standard XGBoost model, which used a few unique features, including:
Next, a random forest model (with eight iterations) scored an R² of 66.23, using unique features that include:
From here, the next step is to see the insights you can glean from the training and test models. Explorium lets you view an insight tree to see which combinations of features lead to the most accurate predictions, as well as see which features contribute the most to your predictive pricing models. Additionally, you can identify meaningful combinations of variables, and even compare different models to determine which is best for your specific need.
Once you’re satisfied with your insights, you can go on to run predictions with the models you choose, and can even schedule predictions. Using your XGBoost model from above, let’s try to run a prediction.
The results are even better than we could have imagined, with an R² score of 68.21. You can also check your model’s performance in terms of absolute percentage errors: the mean, median, and symmetric mean absolute percentage error (MAPE, MdAPE, and SMAPE). From there, all that’s left is to deploy your model and start making predictions.
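If these metrics are unfamiliar, the sketch below shows one common way to compute each of them in plain Python. The ticket prices and predictions are invented for illustration, and SMAPE has several definitions in use; this version divides by the sum of the absolute actual and predicted values, which may differ from what the platform reports.

```python
from statistics import mean, median


def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def _abs_pct_errors(actual, predicted):
    # Undefined when an actual value is zero; ticket prices are assumed positive.
    return [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * mean(_abs_pct_errors(actual, predicted))


def mdape(actual, predicted):
    """Median absolute percentage error, in percent (less sensitive to outliers)."""
    return 100 * median(_abs_pct_errors(actual, predicted))


def smape(actual, predicted):
    """Symmetric MAPE, in percent, using the 2|a - p| / (|a| + |p|) form."""
    terms = [2 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)]
    return 100 * mean(terms)


# Hypothetical ticket prices and model predictions, in dollars.
actual = [100, 200, 50, 80]
predicted = [110, 180, 50, 100]

print(f"R^2:   {r_squared(actual, predicted):.3f}")
print(f"MAPE:  {mape(actual, predicted):.2f}%")
print(f"MdAPE: {mdape(actual, predicted):.2f}%")
print(f"SMAPE: {smape(actual, predicted):.2f}%")
```

Note that r_squared here returns a value on the usual scale where 1 is a perfect fit; multiply by 100 to compare it with scores quoted as percentages.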
If you’re keeping score (pun intended), Explorium just let you build a model to predict ticket prices in under 15 minutes, including connecting you to thousands of external data sources, refining your datasets, and even building you hundreds of possible features to use in a variety of models (which it also tested in parallel).
Even better, this is only a small sample of what Explorium can do. Thanks to our powerful AI-driven data enrichment engine and feature generation tools, Explorium can build models for a variety of predictive questions and use cases.