Integrate Databricks Data Lineage With Azure Purview

Create service principal for Azure

  • Go to “Azure Active Directory”, then “App Registration”, and then “New Registration”.
  • Give your service principal a name and click “Register”.
  • Note down the tenant ID and client ID, then create a client secret and note it down as well.
  • Go to “Azure Purview”, then “Access control (IAM)”. Under “Add role assignments”, add the “Data Curator” role to the purview_api service principal.
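
Once the service principal exists, its tenant ID, client ID and secret can be exchanged for an access token. Below is a minimal Python sketch of that exchange. It assumes the standard Azure AD v2.0 client-credentials endpoint and the scope https://purview.azure.net/.default. The function names build_token_request and fetch_token are hypothetical helpers, not part of the original walkthrough.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: standard Azure AD v2.0 token endpoint and the Purview resource scope.
AUTHORITY = "https://login.microsoftonline.com"
PURVIEW_SCOPE = "https://purview.azure.net/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Return (url, form-encoded body) for a client-credentials token request."""
    url = f"{AUTHORITY}/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": PURVIEW_SCOPE,
    }).encode("utf-8")
    return url, body


def fetch_token(tenant_id, client_id, client_secret):
    """POST the request and return the access token (needs network access)."""
    url, body = build_token_request(tenant_id, client_id, client_secret)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

Calling fetch_token with the three values noted above should return a bearer token, provided the role assignment has taken effect. Keep the secret out of notebooks and source control, for example in a secret store.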

Create a Databricks cluster with Spline

  • Open Azure Databricks and create a new cluster. Configure it to suit your needs, but it must use the 6.4 runtime version. This is a limitation of Spline, which does not yet support newer runtimes.
  • Then install the Spline packages from Maven.
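
The same cluster setup can also be scripted instead of clicked through. Below is a rough Python sketch of the JSON payloads for the Databricks REST API (POST /api/2.0/clusters/create and POST /api/2.0/libraries/install). It assumes that "6.4.x-scala2.11" is the runtime key for the 6.4 runtime. The helper names, node type, and Maven coordinates are placeholders to replace with your own values, since the walkthrough does not name the exact Spline packages.

```python
import json

# Assumption: "6.4.x-scala2.11" is the runtime key for Databricks Runtime 6.4.
SPLINE_RUNTIME = "6.4.x-scala2.11"


def cluster_spec(name, node_type_id, num_workers):
    """Payload for POST /api/2.0/clusters/create, pinned to the 6.4 runtime."""
    return {
        "cluster_name": name,
        "spark_version": SPLINE_RUNTIME,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }


def maven_install_spec(cluster_id, coordinates):
    """Payload for POST /api/2.0/libraries/install with Maven libraries."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"maven": {"coordinates": c}} for c in coordinates],
    }


def to_json(payload):
    """Serialize a payload with stable key order for sending or logging."""
    return json.dumps(payload, indent=2, sort_keys=True)
```

For example, cluster_spec("spline-lineage", "<node-type>", 2) produces the cluster definition, and maven_install_spec("<cluster-id>", ["<group>:<artifact>:<version>"]) lists the Spline packages to install once the cluster ID is known.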

Initialize Spline

  • Upload “Spark Lineage Harvest Init.ipynb” to your Databricks environment.
  • In the notebook you want to track, run the initialization notebook with the code shown.

Conclusion

In this blog, we explored how to integrate Databricks with Azure Purview to capture data lineage from Databricks notebooks using Spline.