Copying Data from Azure Blob to Microsoft Lakehouse

Introduction

Microsoft Fabric is an all-in-one analytics platform. It brings together data movement, data lakes, data engineering, data integration, data science, real-time analytics, and business intelligence, all on a shared foundation that keeps your data secure, governed, and compliant. In this article, we will learn how to create a data pipeline that copies data from Azure Blob Storage and ingests it into a Lakehouse in Microsoft Fabric. Let’s get started.

Azure Blob to Lakehouse Data Pipeline in Microsoft Fabric

The first thing we want to do is create a workspace, which is essentially a container or organizing structure that lets us collaborate on and manage content such as reports, dashboards, datasets, and more.

To create the workspace, click Workspaces and then click New Workspace.

Provide a name for the workspace. In this article, we use DataPipelineFromAzureBlob.

  • Next, in the Data Factory experience, select Data pipeline.
  • In the New pipeline box, we provided the name Azure Blob Data.
  • Click Create.
  • Under Start building your data pipeline, select Add pipeline activity and then select Copy data.
  • In the Name box on the General tab, we provided Customer Data.
  • On the Source tab, select External for the Data store type.
  • Select New to create a new connection.
  • In the New connection box, search for and select Azure Blob Storage.
  • Click Continue.

In the Account name or URL box under Connection settings, we provided the following.

https://azuresynapsestorage.blob.core.windows.net/sampledata/

On the Connection credentials tab, select Create a new connection in the Connection dropdown.

  • We provided Wide World Importers Public Sample as the Connection name.
  • The Authentication kind is set to Anonymous; a quick way to verify this anonymous access outside of Fabric is sketched right after these steps.
  • At the bottom left, click Create.
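
As an optional sanity check outside of Fabric, the short Python sketch below (using the azure-storage-blob package) lists the public sample files anonymously. The container name (sampledata) and folder prefix are taken from the connection details above; because the connection uses Anonymous authentication, no credential is passed.

    from azure.storage.blob import ContainerClient

    # Public storage account from the connection settings above; anonymous
    # access, so no credential is supplied.
    container = ContainerClient(
        account_url="https://azuresynapsestorage.blob.core.windows.net",
        container_name="sampledata",
    )

    # List the parquet files under the WideWorldImportersDW tables folder.
    for blob in container.list_blobs(name_starts_with="WideWorldImportersDW/tables/"):
        print(blob.name, blob.size)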

We want to copy the dimension_customer.parquet file from the public sample container at https://azuresynapsestorage.blob.core.windows.net/sampledata/. In the File path text boxes, we provided the following.

  • Container: sampledata.
  • File path - Directory: WideWorldImportersDW/tables.
  • File path - File name: dimension_customer.parquet.
  • In the File format dropdown, choose Parquet.
  • Select Preview data next to the File path setting to confirm the file can be read (a local way to preview the same file is sketched right after this list).
  • Click Cancel to close the Preview data window.
  • On the Destination tab, select Workspace for the Data store type.
  • In the Workspace data store type dropdown, select Lakehouse.
  • For the Lakehouse option, select New and provide the name CustomerData.
  • Click Create.
  • The Root folder should be set to Tables.
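
If you prefer to preview the source file outside of the pipeline editor, the short Python sketch below downloads the same parquet file and loads it with pandas. The URL is assembled from the container, directory, and file name entered above, and the snippet assumes the requests and pandas packages (with a parquet engine such as pyarrow) are installed; it is an optional check, not part of the pipeline itself.

    import io

    import pandas as pd  # needs pyarrow (or fastparquet) for read_parquet
    import requests

    # URL built from the container, directory, and file name used above.
    URL = (
        "https://azuresynapsestorage.blob.core.windows.net/sampledata/"
        "WideWorldImportersDW/tables/dimension_customer.parquet"
    )

    # The container allows anonymous reads, so a plain GET is enough.
    response = requests.get(URL, timeout=60)
    response.raise_for_status()

    df = pd.read_parquet(io.BytesIO(response.content))
    print(df.shape)   # number of rows and columns
    print(df.head())  # first few customer records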

In Table Name, select New and provide BlogCustomerInformation (you can choose whatever name you want).

  • Click Create.
  • Click Run at the top of the pipeline tab. You will be prompted to save your changes; go ahead and save and run the pipeline.

After the run completes, we can see from the pipeline output that the data pipeline run was successful.

Next, we need to check the data in the DataPipelineFromAzureBlob workspace we created initially. Click on the workspace.

Inside the workspace, we now have the CustomerData Lakehouse along with its SQL endpoint.
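
One quick way to confirm the rows landed is to open a notebook with the CustomerData Lakehouse attached as its default lakehouse and run a short PySpark cell. The sketch below assumes the table name used in this article (BlogCustomerInformation) and relies on the spark session that Fabric notebooks create automatically.

    # Runs inside a Microsoft Fabric notebook, where `spark` is pre-created.
    # Assumes CustomerData is attached as the default lakehouse.
    df = spark.read.table("BlogCustomerInformation")

    print(df.count())   # total rows copied by the pipeline
    df.show(5)          # peek at the first few customer rows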


When we open CustomerData through the SQL endpoint, we can begin to write queries against the data.

We executed a simple query against the BlogCustomerInformation table, and everything is working fine.
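
The same kind of check can also be scripted from outside Fabric through the SQL endpoint. The sketch below is a hedged example using pyodbc with Microsoft's ODBC Driver 18 and interactive Azure AD sign-in; the server name is a placeholder you would copy from the SQL endpoint's connection settings, and the dbo schema and table name follow this article's example.

    import pyodbc

    # Placeholder: copy the real server name from the SQL endpoint's
    # connection settings in the Fabric portal.
    SERVER = "<your-sql-endpoint>.datawarehouse.fabric.microsoft.com"

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server={SERVER};"
        "Database=CustomerData;"
        "Authentication=ActiveDirectoryInteractive;"
        "Encrypt=yes;"
    )

    cursor = conn.cursor()
    cursor.execute("SELECT TOP 5 * FROM dbo.BlogCustomerInformation;")
    for row in cursor.fetchall():
        print(row)
    conn.close()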
