Data Wrangling

Data wrangling, also known as "data preparation" in self-service tools, is the process of discovering, structuring, cleaning, enriching, validating, and publishing raw data in a format suitable for analysis.
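To make the steps in this definition concrete, here is a minimal sketch in Python using only the standard library. The sample data, the column names, the `REGION_BY_COUNTRY` lookup, and the `wrangle` and `publish` helpers are all hypothetical and chosen for illustration; real wrangling pipelines typically rely on dedicated tools and far richer validation rules.

```python
import csv
import io

# Hypothetical raw export: stray spaces, inconsistent casing,
# one duplicate row, and one row with a missing amount.
RAW = """customer,country,amount
 Alice ,fr,10.5
BOB,US,20
 Alice ,fr,10.5
carol,de,
"""

# Hypothetical lookup table used for the enrichment step.
REGION_BY_COUNTRY = {"FR": "EMEA", "DE": "EMEA", "US": "AMER"}

FIELDS = ["customer", "country", "amount", "region"]


def wrangle(raw_text):
    """Structure, clean, enrich, and validate the raw rows."""
    # Discover and structure: parse the text into dicts keyed by column name.
    rows = list(csv.DictReader(io.StringIO(raw_text)))

    records = []
    seen = set()
    for row in rows:
        # Clean: normalize whitespace and casing.
        customer = row["customer"].strip().title()
        country = row["country"].strip().upper()
        amount_text = row["amount"].strip()

        # Clean: drop rows with a missing amount and exact duplicates.
        if not amount_text:
            continue
        key = (customer, country, amount_text)
        if key in seen:
            continue
        seen.add(key)

        # Enrich: attach a region from the lookup table.
        region = REGION_BY_COUNTRY.get(country, "UNKNOWN")
        records.append(
            {
                "customer": customer,
                "country": country,
                "amount": float(amount_text),
                "region": region,
            }
        )

    # Validate: enforce a simple business rule before publishing.
    for record in records:
        if record["amount"] < 0:
            raise ValueError(f"negative amount for {record['customer']}")
    return records


def publish(records):
    """Publish: serialize the cleaned records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = wrangle(RAW)
print(publish(records))
```

In this sketch the four raw rows shrink to two clean records, because the duplicate and the incomplete row are dropped during cleaning.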

Why is data wrangling important?

Since a "wrangler" is essentially a cowboy, the "data wrangler" can be seen as the cowboy of data, collecting scattered data in the same way that a cowboy gathers or sorts cattle. The term "wrangling" suggests that the activity is laborious, unpleasant, and tiring, but that it must be done for the job to be completed properly.

After wrangling, data volumes can be significantly smaller. The task is commonly estimated to account for around 80% of the time that IT teams, business analysts, and data scientists spend manipulating, transforming, and preparing data.

 
