Amazon Introduces SageMaker Data Wrangler

SageMaker Data Wrangler simplifies data preparation and feature engineering for machine learning and lets you complete each step of the data preparation workflow from a single visual interface.

At its re:Invent conference, Amazon recently introduced Amazon SageMaker Data Wrangler, a new service that makes it easier to prepare data for machine learning training. The company also unveiled SageMaker Feature Store, a new service available in SageMaker Studio that makes it easier to name, organize, find, and share machine learning features.

Alongside these, the company also launched SageMaker Pipelines, a new CI/CD service for ML that lets you create and automate workflows and provides an audit trail for model components such as training data and configurations.

Source: AWS

According to Amazon, SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning from weeks to minutes. It simplifies data preparation and feature engineering and lets you complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization, from a single visual interface.

Data Wrangler includes more than 300 built-in data transformations to help you quickly normalize, transform, and combine features without writing any code. You can quickly preview and inspect your selected transformations in Amazon SageMaker Studio, the first fully integrated development environment (IDE) for ML. When your data is prepared, you can build fully automated ML workflows with Amazon SageMaker Pipelines and save them for reuse in the Amazon SageMaker Feature Store.
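
To give a sense of the kind of work these built-in transformations replace, here is a minimal hand-written sketch in plain Python. It is not the Data Wrangler API; the toy dataset, column names, and helper functions are all hypothetical and only illustrate what normalizing a numeric feature and combining it with an encoded categorical feature might look like in code.

```python
def min_max_normalize(values):
    """Scale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical toy dataset: one numeric and one categorical column.
rows = {
    "age": [20, 30, 40],
    "plan": ["basic", "pro", "basic"],
}

# Combine the scaled numeric column and the encoded categorical
# columns into a single feature table.
features = {"age_scaled": min_max_normalize(rows["age"])}
for category, column in one_hot(rows["plan"]).items():
    features[f"plan_{category}"] = column
```

Each of these steps is the sort of operation the article says Data Wrangler offers as a built-in transformation, so that it can be applied without writing code like this by hand.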