Explorium's Data Insights Blog
Where data, marketing, and sales professionals come together
How I Deduped Companies in 7 Lines of Python
If you’re dealing with data, you know that data quality is key to any successful project. Data deduplication is one of the most essential steps in ensuring data quality. In this blog post, I’ll show you how I used Explorium’s API to deduplicate company names in 7 lines of Python code. Explorium’s API returns a […]
How Explorium can help businesses find their best customers
In today’s ultra-competitive marketplace, companies are searching for ways to quickly grow their businesses. Many organizations adopt a data-driven approach that attempts to extract maximum value from their data resources. A problem they often face is obtaining viable external data that can be used productively to further business objectives. An illustrative example of Explorium’s power […]
Unlock the World of External Data with Explorium’s API
Leveraging external data can be a complex, time-consuming and costly endeavor. Finding the right data sources, understanding the attributes that matter, cleaning, matching and integrating it with your internal data all require significant effort that can be hard to justify without the assurance of a tangible return on investment. Introducing Explorium’s new Data Bundle […]
Explorium Recognized in Four Gartner® Hype Cycle™ Reports and Two Gartner® Emerging Tech Impact Radar Reports
Today, we are proud to share some exciting news. Explorium has been identified as a Sample Vendor in four 2022 Gartner Hype Cycle and two Emerging Tech Impact Radar reports. Explorium was also mentioned in the Emerging Technologies: How Intelligent Applications Are Using Alternative Data and Algorithms, 2022 and The Future of Data and Analytics […]
Explorium updates – December 2022
The holiday season is almost here, but the pace at Explorium continues to be high. Here are the latest product and data developments designed to make your work smoother and more focused. Explorium’s HubSpot connector: Enrich leads and develop new prospects from ICPs. The Explorium HubSpot connector integrates the data you’ve already collected in […]
Running dbt in production with Python UDFs
dbt and Databricks regularly prove a potent combination of SQL transformation tools for the Spark warehouse. Add in the advantages of Delta Lake tables, such as atomic writes, and the “3D” combination of dbt, Databricks and Delta emerges as a robust tool for data engineering and a core part of a modern data stack. […]
Data standardization lets datasets and users speak the same language
“Data standardization” means different things in different branches of the machine learning and data engineering world. We define data standardization as the process of transforming different representations of the same data into a single representation. For instance, let’s imagine a customer’s dataset about various companies, and the dataset includes information about which country each company […]
Explorium’s HubSpot connector: Enrich leads and develop new prospects from ICPs
Explorium’s HubSpot connector brings the external data cloud within a few clicks of any HubSpot activity. Explorium’s latest connector lets users access the external data from within HubSpot, returning enriched objects and new records from the external data cloud to the CRM. The Explorium HubSpot connector integrates the data you’ve already collected in your HubSpot […]
Optimizing slow Group By aggregations in Spark: From 20 Hours to 40 minutes
Apache Spark is a very popular engine for running complex distributed data pipelines. Sometimes when using Spark, we need to tune our logic in order to get the best performance. That process sometimes reveals Spark’s “inner workings.” At Explorium, we learned about Spark’s EXPAND command while investigating a query over 1 billion records that failed […]
Explorium’s External Data Cloud now available on Salesforce AppExchange
Salesforce’s customers can now feed unique, powerful data signals, enrich accounts and contacts, and give the sales team the best ammunition as they target accounts, all without ever leaving their Salesforce dashboard. External data should enhance and refine what’s already in your CRM. But at the same time you want to make sure that only […]
Product Release Notes – Nov 2022
Read about our newest platform enhancements in this blog post which reviews recent changes to the way you work with the Explorium platform. You’ll be pleased with new options and features that give you more control in defining your data needs. The Generate flow Improved Generate experience: A newly interactive filter panel allows users to […]
Data Release Notes – November 2022
Leaves are falling, the clock’s turned back, but the pace at Explorium is upward and onward. Here are the latest data developments designed to make your work smoother and more focused. Firmographic Data Explorium’s company identifiers – BETA For many use cases you need to verify that the company you’re looking at is the right […]
Feature Engineering – A Complete Introduction
What is Feature Engineering? Feature engineering is the process of improving a model’s accuracy by using domain knowledge to select and transform raw data’s most relevant variables into features of predictive models that better represent the underlying problem. Feature engineering and selection aim to improve the way statistical models and machine learning (ML) algorithms […]
What is External Data
External data (also known as “alternative” or “third-party” data) is any data that an entity acquires beyond its four walls. Most companies are aware of the value they can gain from their own internal data such as consumer demographic information and purchase history. However, no company possesses the perfect, entirely comprehensive dataset that contains all […]
Demographic Data
What is demographic data? Demographic data provides statistical information about a specific population, including age, gender, race, and location. Companies leverage it to learn about demographic trends and build demographic profiles that help create more accurate customer segmentation, driving better decision making around lead generation and sales campaigns. Where does the data come from? […]
PepsiCo and Explorium: Accelerating Insights with External Data
Data science’s commercialization over the last decade emphasized the science more than the data. Even as data scientists spent most of their time wrangling and cleaning data, it was their algorithms that gave their organizations a competitive advantage. This industry model had a significant flaw that ultimately caught up with commercial data science, though. Algorithms […]
Technographic Data
What is technographic data? Technographic data (also referred to as technographic segmentation) is the analysis of the technology stack a company uses, including hardware, software, and applications. This data also may include information about when the companies purchased their specific technologies (tools and applications). Companies use this data for market segmentation to identify target markets […]
Website Traffic Data
What is website traffic data? Website traffic data provides comprehensive information and metrics on websites and their number of visitors, number of users, number of clicks, duration of the visits, how they reached the sites, their search intent, bounce rates, conversion rates, and the trends derived from all of this information. Measuring site traffic, which […]