Introduction To Delta Sharing For Secure Data Sharing

Image Credits: https://www.techrepublic.com/

Introduction

Enterprise organizations today rely on data to derive business insights and grow their business. Data is often called the new oil, and many organizations are focusing on collecting data from different sources to drive data-driven projects. Once the data is collected, it becomes important for organizations to define a governed and secure approach to sharing it.

Databricks Delta Sharing for Secure Data Sharing

Delta Sharing is an open-source standard for secure data sharing. It makes it simple for data-driven organizations to share data easily and efficiently.

The features of Delta Sharing are as follows:

  • Live Data Sharing
    Delta Sharing lets data-driven projects share existing data as well as live data stored in Delta Lake without physically copying it to another system.
     
  • Support for multiple data consumers
    Data consumers can read shared data directly with pandas, Apache Spark, and other systems, without first copying it to another cloud or on-prem platform. This gives end users the flexibility to consume the data faster.
     
  • Increased governance and security
    With Delta Sharing, enterprise organizations can govern their data and track and audit how it is shared between different teams.
     
  • Scalability
    With Delta Sharing, teams can share large datasets efficiently using cloud storage providers such as Azure Data Lake Storage Gen2, AWS S3, and Google Cloud Storage.

Clients using Delta Sharing for Delta Lake

Many enterprise organizations and tools have already started using Delta Lake for data sharing. The tools and vendors highlighted in the image below already use Delta Sharing to share data.

Working with Delta Sharing and Delta Lake

Delta Sharing with Delta Lake is based on a simple REST protocol for securely sharing and accessing data in cloud data sources.

The two main entities involved in Delta Sharing with Delta Lake are as follows:

  1. Data Providers
    Data providers can share an existing table or partitioned table in the Delta Lake format. A Delta Lake table is a collection of Parquet files plus a transaction log, so it is easy to bring existing Parquet tables into Delta Lake.
     
  2. Recipients
    Recipients can consume the data using open-source connectors for pandas, Apache Spark, Python, and other tools.
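
As a minimal sketch of the recipient side, the Python below writes a profile file and builds the table address that the open-source delta-sharing connector expects. The endpoint, token, share, schema, and table names are hypothetical placeholders, and the final load step assumes the `delta-sharing` package is installed and a sharing server is reachable, so it is wrapped in a function and not called here.

```python
import json
import tempfile
from pathlib import Path


def write_profile(directory, endpoint, bearer_token):
    """Write a Delta Sharing profile file and return its path."""
    profile = {
        "shareCredentialsVersion": 1,
        "endpoint": endpoint,
        "bearerToken": bearer_token,
    }
    path = Path(directory) / "config.share"
    path.write_text(json.dumps(profile, indent=2))
    return path


def table_url(profile_path, share, schema, table):
    """Build the '<profile>#<share>.<schema>.<table>' address used by the connector."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_table(url):
    """Load a shared table into pandas (needs `pip install delta-sharing` and a live server)."""
    import delta_sharing  # imported lazily so the helpers above work without it

    return delta_sharing.load_as_pandas(url)


# Hypothetical endpoint, token, and table names, for illustration only.
workdir = tempfile.mkdtemp()
profile_path = write_profile(
    workdir, "https://sharing.example.com/delta-sharing/", "<token>"
)
url = table_url(profile_path, "sales_share", "default", "orders")
print(url)
```

In practice the data provider issues the profile file, and the recipient only needs to pass the resulting address to `load_table` (or to the connector's Spark equivalent).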

Data sharing with Delta Lake and Delta Sharing follows the protocol below:

  • The client authenticates with a bearer token and sends a query against the table.
  • When the client request reaches the server, the server verifies the request and looks up the requested data in cloud or on-prem storage.
  • The server generates pre-signed URLs that allow the client to read the Parquet files directly from cloud storage, so the data is transferred at the storage service's bandwidth without passing through the sharing server.
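
The three steps above can be sketched from the client's point of view. This is an illustration under assumptions, not a reference client: the query path and the newline-delimited JSON response shape follow my reading of the open Delta Sharing protocol and should be checked against the specification, and the endpoint, token, table names, and sample response are made up. The request is only constructed, never sent.

```python
import json
import urllib.request


def build_query_request(endpoint, token, share, schema, table, limit_hint=None):
    """Step 1: build an authenticated table query (bearer token in the header)."""
    body = {}
    if limit_hint is not None:
        body["limitHint"] = limit_hint
    url = f"{endpoint.rstrip('/')}/shares/{share}/schemas/{schema}/tables/{table}/query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def presigned_urls(response_text):
    """Step 3: collect the pre-signed file URLs from a newline-delimited JSON response."""
    urls = []
    for line in response_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if "file" in record:
            urls.append(record["file"]["url"])
    return urls


# Hypothetical values, for illustration only.
request = build_query_request(
    "https://sharing.example.com/delta-sharing/",
    "<token>",
    "sales_share",
    "default",
    "orders",
    limit_hint=1000,
)

# Made-up response standing in for step 2 (the server's verified answer).
sample_response = "\n".join(
    [
        json.dumps({"protocol": {"minReaderVersion": 1}}),
        json.dumps({"metaData": {"id": "t1", "format": {"provider": "parquet"}}}),
        json.dumps(
            {
                "file": {
                    "url": "https://storage.example.com/part-0.parquet?sig=abc",
                    "id": "f0",
                    "size": 1024,
                }
            }
        ),
    ]
)
print(request.get_method(), request.full_url)
print(presigned_urls(sample_response))
```

Each returned URL points straight at a Parquet file in cloud storage, which is why the client can download the files without going back through the sharing server.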

Delta Sharing with Delta Lake supports many of the tools available in the market, which reduces the complexity of the overall architecture and ecosystem.

