Microsoft Fabric Architecture Explained

This article provides an intermediate-level overview of Microsoft Fabric's architecture, focusing on key components: OneLake, Fabric capacity, workspaces, and security boundaries. It also explains how Microsoft Fabric integrates with Power BI, Synapse, and Data Factory, so that you understand how the pieces of this unified data analytics platform fit together.

Core Components of Microsoft Fabric Architecture

As explained in the last article, Microsoft Fabric is designed as an end-to-end, unified analytics platform that simplifies data integration, data engineering, data warehousing, data science, real-time analytics, and business intelligence. Its architecture revolves around four key components:

  1. OneLake

  2. Fabric capacity

  3. Workspaces

  4. Security Boundaries

OneLake: The OneDrive for Data

At the heart of Fabric lies OneLake, a single, unified, logical data lake for the entire organization. The emphasis is on "organization": the entire organization has exactly one OneLake. Think of it as "OneDrive for data." Instead of multiple disconnected data silos, OneLake provides a single place to store all your structured, semi-structured, and unstructured data.

Key features of OneLake

  • Hierarchical Namespace: Unlike flat blob storage, OneLake uses a hierarchical file system, similar to Azure Data Lake Storage Gen2 (ADLS Gen2), allowing you to organize your data into folders and subfolders.

  • Shortcuts: OneLake supports shortcuts, which are pointers to data stored in other locations, such as ADLS Gen2, Amazon S3, or other locations within OneLake. This lets you virtually integrate data from other sources without physically copying it into OneLake.

  • Delta Parquet Format: OneLake natively supports the Delta format (Parquet data files plus a transaction log), which provides ACID (Atomicity, Consistency, Isolation, Durability) transactions and schema evolution for your data lake.

  • Open Access: OneLake is built on open standards and supports open-source APIs, making it easy to access and process data using various tools and languages.
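
To make the hierarchical namespace concrete, here is a minimal sketch of how a OneLake path can be assembled. It assumes the ADLS Gen2-style URI layout (workspace, then item and item type, then folders) and names without special characters; the helper `onelake_uri` and the workspace and lakehouse names are hypothetical, chosen only for illustration.

```python
# Host name used by OneLake's ADLS Gen2-compatible endpoint.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, *path_parts: str) -> str:
    """Build an abfss:// URI for a folder or file inside a OneLake item.

    Hypothetical helper: assumes workspace and item names need no escaping.
    """
    # Drop empty segments and stray slashes so the parts join cleanly.
    parts = [p.strip("/") for p in path_parts if p.strip("/")]
    base = f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}"
    return "/".join([base, *parts])


# Example names ("Sales", "Orders") are made up for this sketch.
print(onelake_uri("Sales", "Orders", "Lakehouse", "Tables", "orders_2024"))
```

Because every workspace and item hangs off the same host, a path like this identifies data anywhere in the organization's OneLake, which is what lets different engines address the same files.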

Fabric Capacity: The Compute Engine

Fabric capacity is the pool of compute resources allocated to your Fabric environment. It's the engine that powers all Fabric workloads, such as data integration, data engineering, data warehousing, and data science.

Key aspects of Fabric capacity

  • Capacity Units (CU): Fabric capacity is measured in Capacity Units (CUs). The number of CUs you need depends on the size and complexity of the workloads you want to run.

  • Scalability: Fabric capacity is scalable, allowing you to increase or decrease the number of CUs as demand changes.

  • Billing: Fabric capacity can be billed on a pay-as-you-go basis. You pay for the capacity you provision while it is running, and you can pause or scale it down when you don't need it.

  • Workload Isolation: Workloads that run on the same capacity share its CUs. To ensure that a critical workload has the resources it needs to perform optimally, you can assign its workspaces to a separate capacity.
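
The sizing arithmetic behind CUs can be sketched as follows. This assumes the F SKU naming in which the number after "F" is the CU count (F2, F4, and so on up to F2048); the function names and the example estimate of 40 CUs are hypothetical.

```python
# F SKU sizes; the number in the SKU name is its Capacity Unit count.
F_SKU_SIZES = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]


def smallest_sku(required_cus: float) -> str:
    """Return the smallest F SKU whose CU count covers the estimate."""
    for size in F_SKU_SIZES:
        if size >= required_cus:
            return f"F{size}"
    raise ValueError(f"No single F SKU provides {required_cus} CUs")


def cu_seconds(cus: int, seconds: int) -> int:
    """Total compute available from a capacity over a time window."""
    return cus * seconds


# A made-up estimate: the workloads need about 40 CUs at peak.
print(smallest_sku(40))          # F64
print(cu_seconds(64, 24 * 3600)) # 5529600 CU-seconds per day on an F64
```

Thinking in CU-seconds is useful because consumption is accounted over time, so a short burst above the average can still fit within the capacity's overall budget.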

Workspaces: Collaboration Hubs

Within Fabric, workspaces are collaboration hubs where teams can work together on data projects. Workspaces provide a secure and organized environment for storing and managing data, reports, and other artifacts.

Key features of Workspaces

  • Role-Based Access Control (RBAC): Workspaces support RBAC, allowing you to control who has access to your data and resources.

  • Version Control: Workspaces provide version control through Git integration, allowing you to track changes to the definitions of your reports and other items.

  • Collaboration: Workspaces facilitate collaboration by allowing multiple users to work on the same project simultaneously.

  • Lifecycle Management: Workspaces support lifecycle management through deployment pipelines, allowing you to promote your reports and other items from development to test to production environments.
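
The RBAC idea can be sketched with the four workspace roles (Admin, Member, Contributor, Viewer). The mapping of actions to minimum roles below is a deliberate simplification for illustration, and the action names are hypothetical; the real permission matrix is more fine-grained.

```python
from enum import IntEnum


class Role(IntEnum):
    """Workspace roles, ordered from least to most privileged."""
    VIEWER = 1
    CONTRIBUTOR = 2
    MEMBER = 3
    ADMIN = 4


# Simplified, assumed mapping of actions to the lowest role allowed to do them.
MIN_ROLE = {
    "view_item": Role.VIEWER,
    "edit_item": Role.CONTRIBUTOR,
    "share_item": Role.MEMBER,
    "delete_workspace": Role.ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """A role may perform an action if it is at least the minimum role."""
    return role >= MIN_ROLE[action]


print(is_allowed(Role.CONTRIBUTOR, "edit_item"))   # True
print(is_allowed(Role.CONTRIBUTOR, "share_item"))  # False
```

Modeling the roles as an ordered scale captures the main property of workspace RBAC: each role includes everything the roles below it can do.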

Security Boundaries

Security is a critical aspect of Fabric architecture. Fabric provides several security features to protect your data and resources.

Key security boundaries

  • Tenant Level: At the tenant level, you can configure security policies that apply to all the Fabric resources within your organization.

  • Workspace Level: At the workspace level, you can control who has access to your data and reports using RBAC.

  • Data Level: At the data level, you can use row-level security (RLS) to restrict access to specific rows and object-level security (OLS) to restrict access to specific tables or columns.

  • Network Isolation: Fabric supports network isolation, allowing you to restrict access so that your Fabric resources can be reached only from specific networks.
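
The effect of data-level security can be illustrated with a small sketch. It is not how Fabric implements RLS or OLS; it only shows the outcome, filtering rows by a per-user rule and hiding restricted columns. The function `apply_security` and the sample rows are made up for this example.

```python
def apply_security(rows, allowed_regions, hidden_columns):
    """Return only the rows and columns a user is permitted to see.

    Row filter (RLS-like): keep rows whose region the user may access.
    Column filter (OLS-like): drop columns hidden from the user.
    """
    visible = []
    for row in rows:
        if row["region"] not in allowed_regions:
            continue  # the row is filtered out for this user
        visible.append({k: v for k, v in row.items() if k not in hidden_columns})
    return visible


# Made-up sample data for illustration only.
rows = [
    {"region": "EMEA", "customer": "Contoso", "revenue": 1200, "margin": 0.31},
    {"region": "APAC", "customer": "Fabrikam", "revenue": 800, "margin": 0.27},
]

print(apply_security(rows, allowed_regions={"EMEA"}, hidden_columns={"margin"}))
```

Two users querying the same table can therefore get different results without the data being copied or split into separate tables.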

Integration with Power BI, Synapse, and Data Factory

Fabric seamlessly integrates with Power BI, Synapse, and Data Factory, providing a unified experience for data analytics.

Power BI Integration

Power BI is deeply integrated within Fabric, allowing you to easily create reports and dashboards from data stored in OneLake.

Key integration points

  • Direct Lake Mode: Power BI supports Direct Lake mode, which allows you to directly query data stored in OneLake without importing it into Power BI. This mode provides faster performance and reduces data duplication.

  • Power BI Datasets: You can create Power BI datasets (now called semantic models) from data stored in OneLake, allowing you to build complex data models and calculations.

  • Power BI Reports: You can create Power BI reports from Power BI datasets, allowing you to visualize your data and gain insights.

Synapse Integration

Synapse provides a comprehensive set of tools for data warehousing, data engineering, and data science. Fabric integrates with Synapse, allowing you to leverage these tools to process and analyze data stored in OneLake.

Key integration points

  • Synapse Data Engineering: You can use Synapse Data Engineering to build data pipelines that ingest, transform, and load data into OneLake.

  • Synapse Data Warehousing: You can use Synapse Data Warehousing to build a data warehouse on top of OneLake, allowing you to perform complex analytical queries.

  • Synapse Data Science: You can use Synapse Data Science to build machine learning models that analyze data stored in OneLake.

Data Factory Integration

Data Factory is a cloud-based data integration service that allows you to build data pipelines that move and transform data. Fabric integrates with Data Factory, allowing you to use Data Factory to ingest data into OneLake from various sources.

Key integration points

  • Data Factory Pipelines: You can use Data Factory pipelines to ingest data from various sources, such as on-premises databases, cloud storage, and SaaS applications, into OneLake.

  • Data Factory Data Flows: You can use Data Factory data flows to transform data as it is ingested into OneLake.
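
The division of labor between a pipeline (moving rows) and a data flow (reshaping them on the way) can be sketched in plain Python. This is a conceptual illustration only and does not use any Data Factory API; `copy_rows`, `normalize_order`, and the sample rows are hypothetical, and a plain list stands in for a OneLake table.

```python
from typing import Callable, Iterable

Row = dict[str, object]


def copy_rows(source: Iterable[Row], transform: Callable[[Row], Row], sink: list[Row]) -> int:
    """Copy every source row into the sink, transforming each one on the way."""
    copied = 0
    for row in source:
        sink.append(transform(row))
        copied += 1
    return copied


def normalize_order(row: Row) -> Row:
    """Example transformation: cast the id and clean up the country code."""
    return {"order_id": int(row["id"]), "country": str(row["country"]).strip().upper()}


# Made-up source rows; the list below stands in for a destination table.
source_rows = [{"id": "1", "country": " de "}, {"id": "2", "country": "us"}]
onelake_table: list[Row] = []

rows_copied = copy_rows(source_rows, normalize_order, onelake_table)
print(rows_copied, onelake_table)
```

Keeping the copy step and the transformation separate mirrors how the same ingestion logic can be reused with different transformations for different sources.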

In conclusion, Microsoft Fabric offers a comprehensive and integrated architecture for data analytics. By understanding the core components and integration points, you can effectively leverage Fabric to build powerful data solutions.