Learn About Azure Data Factory

Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows in the cloud for orchestrating and automating data movement and data transformation.

Introduction

Using Azure Data Factory, you can perform the following tasks:

  • Create and schedule data-driven workflows (called pipelines) that can ingest data from disparate data stores.
  • Process or transform the data by using compute services such as Azure HDInsight (Hadoop), Spark, Azure Data Lake Analytics, and Azure Machine Learning.
  • Publish output data to data stores such as Azure SQL Data Warehouse for business intelligence (BI) applications to consume.

Ultimately, through Azure Data Factory, raw data can be organized into meaningful data stores and data lakes for better business decisions.
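As a sketch of what such a data-driven workflow looks like, a minimal V2 pipeline definition with a single copy activity might resemble the following. The pipeline, activity, and dataset names here are hypothetical placeholders, and the referenced datasets would need to be defined separately:

```json
{
  "name": "CopySalesDataPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopyFromBlobToSql",
        "type": "Copy",
        "inputs": [
          { "referenceName": "BlobSalesDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "SqlSalesDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "BlobSource" },
          "sink": { "type": "SqlSink" }
        }
      }
    ]
  }
}
```

Here the copy activity ingests data from a Blob storage dataset and publishes it to a SQL dataset, mirroring the ingest-process-publish flow described above.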

This article explains the steps to create a Data Factory, which can then be supplied with input data through pipelines and publish the output to data stores.

Data Factory Creation

Log in to the Azure Portal and navigate to Data Factories.

You can reach this page by searching for “Data Factories” in “All Services”.

  • Click “Add” in the Data Factories tab.
  • Specify the necessary details.
  • Click Create.

    The latest Data Factory version is V2, which is currently in the Preview state.
  • Azure validates the specified details.
  • The creation of the Azure Data Factory then proceeds.

Once the Azure Data Factory is created successfully, you can verify it in the Data Factories tab of the Azure Portal.
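For repeatable deployments, the same factory can also be created outside the portal. As a rough sketch, an ARM template resource for a V2 factory might look like the following, where the factory name and region are placeholder values you would substitute with your own:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.DataFactory/factories",
      "apiVersion": "2018-06-01",
      "name": "MyDataFactoryV2",
      "location": "East US",
      "identity": { "type": "SystemAssigned" },
      "properties": {}
    }
  ]
}
```

Deploying this template to a resource group should yield the same result as the portal steps above.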

Summary

In this article, I have provided the simple steps to create an Azure Data Factory. After creating the Azure Data Factory, the input pipelines and the output data stores should be configured for full-fledged usage of the Data Factory.