The art of feature engineering

Many compare feature engineering to an art and, as a result, almost no tools or processes have been developed to streamline it. But this art is more like an impossible task. The world of potential datasets and new features is so vast that data scientists can’t fully explore the range of possibilities, which means losing out on potentially huge improvements to models.

Augmented feature generation with Explorium:
Discover the unknown drivers of your business

Feature engineering in Explorium includes automatic feature generation that explores multiple data sources and the complex relationships between them. Once your dataset is enriched with data from the Explorium external data gallery, the platform automatically generates a myriad of candidate variables across a wide range of datasets, then tests and ranks them to offer you the most impactful features for maximum predictive power.
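
To make the generate, test, and rank idea concrete, here is a minimal sketch in plain Python. It is not Explorium's implementation: the column names, the toy data, the pairwise-product candidates, and the use of absolute Pearson correlation as the ranking score are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / (std_x * std_y)

def generate_candidates(columns):
    """Keep the raw columns and add one product feature per column pair."""
    candidates = dict(columns)
    for a, b in combinations(sorted(columns), 2):
        candidates[f"{a}*{b}"] = [x * y for x, y in zip(columns[a], columns[b])]
    return candidates

def rank_features(candidates, target, top_k=3):
    """Score each candidate by |correlation with target|, best first."""
    scored = [(name, abs(pearson(values, target)))
              for name, values in candidates.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]

# Hypothetical toy data: the target depends on an interaction of two columns.
columns = {
    "visits": [1, 2, 3, 4, 5],
    "price": [5, 3, 4, 1, 2],
}
target = [v * p for v, p in zip(columns["visits"], columns["price"])]

top_features = rank_features(generate_candidates(columns), target)
for name, score in top_features:
    print(f"{name}: {score:.3f}")
```

In this toy case the generated interaction feature ranks above both raw columns, which is the kind of signal automatic generation is meant to surface. A real pipeline would score candidates on held-out data with a proper model rather than a single correlation.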

Now you can automatically generate thousands of new features and immediately distill the top performers to drive ROI and impact your business. The platform uses various techniques to generate features, including:

Text analysis and NLP

The platform extracts insights and builds features from plain text, including identifying the subject of the text, text summarization, sentiment analysis, and more.

Time series

Our time series feature generation engine analyzes your data over a sliding window and extracts features such as seasonality (for example, weekday vs. weekend), trend change points, and holiday impact.
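
As a rough illustration of sliding-window features, the sketch below derives a weekend flag and a trailing rolling mean from a daily series using only the standard library. The function name, window size, and sample values are assumptions for the example, and it does not attempt trend change points or holiday effects.

```python
from datetime import date, timedelta

def time_series_features(dates, values, window=3):
    """Build per-day features from a trailing window of `window` observations."""
    rows = []
    for i, (day, value) in enumerate(zip(dates, values)):
        recent = values[max(0, i - window + 1): i + 1]  # trailing window, inclusive
        rolling_mean = sum(recent) / len(recent)
        rows.append({
            "date": day,
            "is_weekend": day.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
            "rolling_mean": rolling_mean,
            "delta_vs_rolling_mean": value - rolling_mean,
        })
    return rows

# Hypothetical daily values for one week starting Monday, 1 January 2024.
start = date(2024, 1, 1)
dates = [start + timedelta(days=i) for i in range(7)]
values = [10, 12, 11, 13, 12, 20, 22]

features = time_series_features(dates, values, window=3)
for row in features:
    print(row)
```

The weekend flag captures a simple form of seasonality, while the rolling mean and its delta give a model a sense of the recent level and how far the current value departs from it.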

Geospatial

Longitude and latitude coordinates and their relation to other data points can uncover a clear story. Our geospatial feature generation accounts for the curvature of the Earth for far more accurate predictions, and also factors in other effects, including population density, local property attributes, and footfall data.
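
One common way to account for the Earth's curvature is the haversine great-circle distance. The sketch below uses it to build a "distance to nearest point of interest" feature. This is an illustrative choice, not necessarily the method the platform uses, and the helper names and sample coordinates are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    a = min(1.0, a)  # guard against floating-point overshoot before asin
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_to_nearest_km(point, places):
    """Feature: distance from `point` to the closest entry in `places`."""
    return min(haversine_km(point[0], point[1], lat, lon) for lat, lon in places)

# Approximate city-center coordinates, used here only as sample data.
paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)
madrid = (40.4168, -3.7038)

nearest = distance_to_nearest_km(paris, [london, madrid])
print(f"Nearest point of interest is about {nearest:.0f} km away")
```

Treating latitude and longitude as flat x/y coordinates distorts distances more the farther you are from the equator, which is why a spherical formula is the usual baseline.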

Unsupervised learning

Our engine can efficiently run algorithms such as DBSCAN, K-Means, and other unsupervised machine learning techniques to engineer high-level features from the raw data.
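
To show how clustering can turn raw values into higher-level features, here is a small from-scratch K-Means sketch that emits a cluster ID and a distance-to-centroid for each row. It is a simplified stand-in with naive initialization (the first k points) and made-up sample points, not the platform's engine. In practice a library implementation would be the usual choice.

```python
import math

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iterations=10):
    """Plain K-Means; assumes the first k points are acceptable starting centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assign(points, centroids), centroids

def cluster_features(points, k):
    """Per-row features: cluster ID and distance to that cluster's centroid."""
    labels, centroids = kmeans(points, k)
    return [{"cluster_id": label,
             "distance_to_centroid": math.dist(p, centroids[label])}
            for p, label in zip(points, labels)]

# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
rows = cluster_features(points, k=2)
for row in rows:
    print(row)
```

The cluster ID works as a categorical feature, and the distance to the centroid can flag rows that are unusual for their group.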

Get creative with your own code

Feature discovery with Explorium is completely extensible. You have the freedom to write your own code (what we call “Creatives”) on top of our rich data gallery and through our feature discovery and generation engine. This way you can add your domain knowledge to the mix so that no good idea goes unexamined.

Think outside of the box

Searching for the right data to generate features from is hard, which is why most data scientists only look at attributes of the data they already have. Explorium lets you think outside the proverbial box by cutting out the entire data acquisition and integration process, so you can generate features you never thought would impact your model from data you don’t have internally.

Predictable returns

There is always a tradeoff between the time it takes to integrate and test a new data source and the potential improvement its features will offer to your model’s accuracy. The time investment and unpredictable returns are no longer a deterrent when you’re able to see the impact of new features in minutes rather than months.

Make a measurable impact

With infinite possibilities out there, the sheer task of finding new data to improve your model usually becomes a nonstarter. We don’t have to tell you (but we will anyway): you’re missing out on huge potential. Stop expecting tweaks to your algorithm to give you the improvement you need to show real business value. Instead, allow Explorium to be your secret weapon to improve your models and drive a competitive advantage for your business.

Discover new features from external data sources you never knew existed.

