August 2022 Data Release Notes
Whew, what a quarter! With all this heat, we decided to stay indoors, crank up the AC, and work on expanding our data catalog. We’ve made some big upgrades to some of our most-used datasets and built a few new ones as well. The details are below (and we find they pair well with a nice cold beverage). Now, on to the good stuff.
New Dataset Releases
Contacts for professional individuals
This premium dataset provides contact information for individuals living in the US, such as email addresses, phone numbers, social platform URLs, and more. It is particularly useful for generating up-to-date marketing leads when a business’s contact information is that of an individual, as is the case with sole proprietorships and many SMBs. The data can also be used for direct-to-consumer marketing.
- This data is sourced from several leading data vendors that sell cutting-edge, reliable data. Proprietary signals have been created from these sources.
- Freshness: dataset is updated monthly
- Contains records on over 300M individuals
- The dataset can be queried with any combination of personal identifiers associated with the individual, such as email, phone, or name + location
Business contact directory
The goal of this enrichment is to provide the contact information of a business’s employees for marketing user flows. It returns the top 1,000 contacts associated with the input company, scored by job title level and the presence of contact information (one-to-many).
- Supported data classification: URL
- Freshness: updates quarterly
- Coverage: companies worldwide
- Contact information is available for people residing in the US
- Includes records for 170M employees across over 2.6M companies
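The one-to-many ranking described above can be pictured with a minimal sketch. The seniority ranks, field names, and scoring rule below are hypothetical and are not the enrichment's actual scoring; the sketch only shows the idea of ordering a company's contacts by title level, then by how much contact information is present, and keeping at most 1,000.

```python
import heapq

# Hypothetical seniority ranks; the real job-title levels are an assumption here.
TITLE_LEVEL = {"c-level": 3, "director": 2, "manager": 1}


def score(contact):
    """Rank by title level first, then by number of contact fields present."""
    level = TITLE_LEVEL.get(contact.get("title_level"), 0)
    contact_fields = sum(1 for field in ("email", "phone") if contact.get(field))
    return (level, contact_fields)


def top_contacts(contacts, limit=1000):
    """Return up to `limit` contacts for one company, best score first."""
    return heapq.nlargest(limit, contacts, key=score)
```

Calling top_contacts with a company's full contact list returns one row per contact, which is why a single input company can expand into many output rows.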
Company funding and acquisitions
This dataset highlights information regarding the funding, acquisitions, and IPOs of roughly 300K private and public companies.
For every company, up to 10 of the most recent transactions for each transaction type are displayed in separate rows (one-to-many). When the enriched data is grouped by the provided ‘Transaction date’ column, it provides a chronological snapshot of the company’s capital. Each transaction type includes additional information regarding the transaction size, names of the second parties involved, and more.
For example, for a ‘Funding’ transaction type, a row will display the company’s funding round, the names of investors, the amount raised in US dollars, and more.
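As a rough illustration of working with the one-row-per-transaction output, the sketch below orders a company's rows by transaction date and totals funding per year. The row layout and the field names (company, transaction_type, transaction_date, amount_usd) are assumptions made for the example, not the dataset's actual schema.

```python
from datetime import date

# Hypothetical enriched rows: one row per transaction (one-to-many).
rows = [
    {"company": "example.com", "transaction_type": "Funding",
     "transaction_date": date(2021, 3, 1), "amount_usd": 5_000_000},
    {"company": "example.com", "transaction_type": "Acquisition",
     "transaction_date": date(2022, 1, 15), "amount_usd": None},
    {"company": "example.com", "transaction_type": "Funding",
     "transaction_date": date(2019, 6, 10), "amount_usd": 1_000_000},
]


def timeline(rows, company):
    """Return one company's transactions ordered oldest to newest."""
    own = [r for r in rows if r["company"] == company]
    return sorted(own, key=lambda r: r["transaction_date"])


def total_funding_by_year(rows, company):
    """Sum the funding amounts per calendar year, skipping rows without an amount."""
    totals = {}
    for r in timeline(rows, company):
        if r["transaction_type"] == "Funding" and r["amount_usd"] is not None:
            year = r["transaction_date"].year
            totals[year] = totals.get(year, 0) + r["amount_usd"]
    return totals
```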
- Supported data classification: Domain, URL, Date, and more.
- Freshness: updates quarterly
- Coverage: companies worldwide
Distribution by company size:
- 1-10 employees: 44%
- 11-50 employees: 31%
- 51-100 employees: 7%
- 100+ employees: 18%
Companies by industry:
- Software: 196,897 companies
- Information Technology: 180,631 companies
- E-Commerce: 105,122 companies
This email verification dataset indicates an email address’s validity and how risky the address is deemed to be. It is mostly used to keep email lists reliable and up to date, and it helps determine whether an email address is worth mailing.
- Sourced from an email validation industry leader.
- API covers emails globally
- Top recommended signal: email’s validation ‘Status’
- Used for lead generation in marketing use-cases
The key result of using this dataset is reduced email bounce rates.
Foot traffic analytics
Foot traffic provides visitation data of mobile users to H3-9 locations (H3 hexagonal cells at resolution 9), each covering roughly 0.105 square kilometers. This data is used for various use cases, such as site selection, trade area analysis, investment research, SMB analysis, lead scoring, and more. A mobile user is counted once per day: even if the user leaves the premises and returns, their visit to the area is counted once. The most recent data available is from 7 days before the current date. If no date is selected for signals that provide information on specific dates, the service returns only the latest trend features.
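The once-per-day counting rule can be sketched as follows. The ping tuples, device IDs, and cell IDs are hypothetical placeholders, and this is not the vendor's pipeline; it only shows that repeat visits by the same device to the same cell on the same calendar day collapse into a single visit.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw pings: (device_id, h3_cell, timestamp).
pings = [
    ("device-a", "cell-1", datetime(2022, 8, 1, 9, 0)),
    ("device-a", "cell-1", datetime(2022, 8, 1, 17, 30)),  # same-day return, not recounted
    ("device-a", "cell-1", datetime(2022, 8, 2, 9, 0)),
    ("device-b", "cell-1", datetime(2022, 8, 1, 12, 0)),
]


def daily_visitors(pings):
    """Count each device at most once per cell per calendar day."""
    unique = {(device, cell, ts.date()) for device, cell, ts in pings}
    return Counter((cell, day) for _device, cell, day in unique)
```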
- Sourced from an industry leader in location analytics.
- Data is updated weekly
- Covers normalized foot traffic data in the US and raw global data
- Enrichment is possible with or without an input date
- Top recommended signals: Average visitors per week, Yearly visitors trend, Monthly visitors trend, Total visits over 2 weeks
Real estate essential data
This dataset contains key insights into property valuations, tax assessments, transactions, ownership, characteristics, and much more on 130M US properties. From lot size to owner status, this data can enhance marketing capabilities or provide a more accurate risk assessment of owners or potential buyers.
- Sourced from an industry leader in real estate data.
- Matching is affected by the input dataset’s address, e.g., whether the address contains a unit or house number. To understand why, read ‘Coverage vs. Accuracy – Entity Resolution on Companies’.
- Geocoding: city, state, zip code, lat, long, etc.
- Ownership: owner name, entity type, whether the owner occupies the property, whether it is owned by a company (and the company name), most recent sale date and amount, and more.
- Valuation & Tax: market value assessment, the market value of land or improvements, tax assessment, and more.
- Property: building year, property type (single house, condominium…), lot size, number of rooms, and more.
The Firmographics dataset provides essential information on global companies, including revenue range, company size, industry codes, and more. This data source combines and cross-checks information between multiple sources of data to provide as much coverage and accuracy as possible.
- New signal: Company’s number of locations is now available in the dataset. This signal indicates how many physical locations exist under the same company, not under its ultimate parent (e.g., Google and YouTube are treated as separate companies).
- New signal: Google business category is now available in the dataset. The signal provides the classified business category assigned by a business page owner on Google.
- Coverage boost: upgrades were made to the industry classification signals, including the NAICS, SIC, and LinkedIn category signals. Most entities that have one of the three signals will have an auto-completed value for the other two.
- Coverage boost: 8.5M new business entities were added to the dataset, which now has a total of ~130M entities. Customers should expect higher coverage on SMBs in the US, and higher fill rates on NAICS codes & URLs.
- Matching improvement: the mechanism for parsing addresses inserted as free text to the ‘Address’ ontology was upgraded. Addresses that are broken down into columns by street, city, and state will still yield optimal results.
Generate studio additions:
An internal source unifying contact data from a variety of sources is now available in the Generate studio and has increased coverage by 20M records. This addition immediately improves the availability of contact info for lead generation, for both individuals and businesses.
- This data is limited to people living in the US
- Freshness: data is updated quarterly
Improvements were made to the dataset’s stability and performance, raising coverage to 87%-92%.
Consumer habits by location:
Datasets from the ‘Consumer habits by location’ series were improved and fixed, following feedback on zip code location matching. Improved datasets include:
- Consumer habits by location: credit and bank cards
- Consumer habits by location: interests
- Consumer habits by location: investments and assets
- Consumer habits by location: product purchases
Explorium’s data onboarding process [Whitepaper]
This white paper aims to answer frequently asked questions regarding the data onboarding process: how Explorium vets sources, what our due diligence process is, at what stages Explorium validates data, and much more. From research, evaluation, and onboarding to development, delivery, and monitoring, the diagram provides an overview of the many stages involved in productizing data. It also highlights the many areas of expertise involved in procuring third-party data, areas in which Explorium excels. Find it here.