What Is Azure Active Directory

In this article, we’ll learn mainly about Azure Active Directory (AD), Azure AD Fundamentals, the use cases of Azure AD, and how Azure AD can be implemented for Azure Data Services. We’ll also get introduced to various Data Services in Azure such as the Azure Data Lake, Azure Synapse Analytics, Azure SQL, Azure Databricks, and more.

Azure Active Directory (AD) 

Azure Active Directory is Microsoft's cloud-based identity and access management service. It lets users sign in to and access resources in services such as the Microsoft 365 ecosystem, the Azure Portal, and numerous SaaS applications. It can also be used to access internal resources, such as applications on the corporate network and intranet.

Active Directory and Azure Active Directory

First and foremost, it is important to realize that Active Directory and Azure Active Directory aren’t the same thing. Active Directory was first introduced in Windows 2000 to give organizations the capability to manage multiple on-premises infrastructure systems and components using one identity per user. Over time, Microsoft introduced Azure AD, which takes this approach a level up by providing organizations with an Identity as a Service (IDaaS) solution that caters to all their apps, in the cloud as well as on-premises. There are numerous differences between Active Directory and Azure Active Directory, such as the additional features in Azure Active Directory: Azure AD Connect, which syncs identities to the cloud, Azure AD B2B, which manages the link to external user identities to confirm their validity, and more.

Azure Data Services 

While working with Azure Data Services, it is next to impossible not to bump into Azure AD. For services such as Azure Data Lake, Azure SQL, Azure Databricks, Azure Data Factory, and Azure Synapse Analytics, Azure AD plays a vital role in identity, authentication, and authorization. Let us learn about them in brief.

Azure Data Lake

Azure Data Lake, as the name suggests, stores data in a scalable way and offers analytics services that come in handy for developers, analysts, and data scientists. It takes care of the complexity of ingesting and storing data, and makes batch, streaming, and interactive analytics faster.



Azure SQL

Azure SQL can be understood as a fully managed, intelligent SQL database service provided as part of Microsoft Azure. Azure SQL is built on the SQL Server engine and is highly secure. From deployment options to control requirements for migration and modernization initiatives, Azure SQL supports all of it. Also, feel free to learn more about the implementation of GDPR in Azure SQL from the previous article, Implementation of GDPR with SQL Server and Azure SQL Database.

Azure Synapse Analytics

When Azure Synapse Analytics comes to mind, limitless is the word that describes it. Formerly known as Azure SQL Data Warehouse, it isn't just a data warehouse tool anymore. Azure Synapse Analytics now combines enterprise data warehousing with big data analytics, which makes it extremely powerful. Synapse Studio provides a workspace for data preparation, data management, data exploration, big data, data warehousing, and even Artificial Intelligence tasks.



Azure Databricks

Azure Databricks is designed specifically for data engineering and data science work and provides easy and fast big data analytics services based on Apache Spark. It can be understood as a data analytics platform optimized for Azure services.



Azure Data Factory

Azure Data Factory (ADF) is a cloud ETL service provided by Azure that can scale out data transformation and data integration. Its code-free UI supports intuitive authoring, single-pane-of-glass management, and monitoring. Existing SSIS packages can be moved to Azure and run in ADF with full compatibility.

Azure AD Fundamentals 

Azure AD is at the core of Microsoft's cloud services. Office 365, Power BI, and Azure all use Azure AD: the users of these services are stored in Azure AD itself, and the services require Azure AD to function. This explains the importance of Azure AD to Microsoft and also gives a sense of its capability, from security to reliability. Thus, it is the central hub for all of the Microsoft Cloud services. Users, groups for teams, applications, and device identities can all benefit from Azure AD.

Synchronization

As noted above, Active Directory and Azure Active Directory aren’t the same thing, and it's important to distinguish between them. Azure AD is built for the cloud and uses modern protocols, whereas Active Directory is mostly for on-premises environments. Active Directory uses protocols such as Kerberos, while Azure Active Directory uses ones such as OAuth 2.0. When an organization authenticates against Active Directory on-premises and uses the cloud services of the same program at the same time, synchronizing identities between the two is essential. Moreover, guests can be invited to your application with these services, and with synchronization, users and programs can be swiftly authenticated and authorized.
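
To make the protocol side a little more concrete, here is a minimal sketch of what an OAuth 2.0 client-credentials token request to Azure AD could look like. It only builds the URL and form body and does not send anything. The tenant, client ID, and secret are placeholders, and the function name build_token_request is hypothetical.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id, client_id, client_secret, scope):
    """Return (url, form_body) for an OAuth 2.0 client-credentials request."""
    # Token endpoint of the tenant (v2.0 endpoint of the identity platform).
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    # The body is sent as application/x-www-form-urlencoded in a POST.
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body


url, body = build_token_request(
    tenant_id="contoso.onmicrosoft.com",                # placeholder tenant
    client_id="00000000-0000-0000-0000-000000000000",   # placeholder app ID
    client_secret="not-a-real-secret",                  # placeholder secret
    scope="https://database.windows.net/.default",      # Azure SQL as the target
)
print(url)
print(body)
```

In practice, a library such as MSAL would normally handle this exchange, and the secret would not be hard-coded.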

Azure AD Use Cases
 

1. Create and Manage Azure Resources

Azure AD plays an important role in creating and managing Azure resources. Whether you are using Azure SQL or Azure Data Factory, to create and configure users and then give them access, the users must exist in Azure AD. The creation and management of Azure resources is controlled with Azure RBAC rules. Here, Azure AD does the authentication, from passwords to multi-factor authentication, while Azure RBAC roles such as Owner, Contributor, and Reader determine what an authenticated identity may do. Besides, Infrastructure as Code can also be set up through tools such as Bicep, Terraform, and Pulumi.
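
The difference between the Owner, Contributor, and Reader roles can be sketched with a toy model. This is a simplified illustration, not Azure's actual evaluation logic: the role definitions below are trimmed-down versions of the built-in ones, and the names ROLES and is_allowed are hypothetical.

```python
from fnmatch import fnmatchcase

# Simplified role definitions, loosely modeled on the built-in roles.
# Real definitions contain more entries (this is an assumption-laden sketch).
ROLES = {
    "Owner": {"actions": ["*"], "not_actions": []},
    "Contributor": {
        "actions": ["*"],
        # Contributors manage resources but cannot change access to them.
        "not_actions": [
            "Microsoft.Authorization/*/Write",
            "Microsoft.Authorization/*/Delete",
        ],
    },
    "Reader": {"actions": ["*/read"], "not_actions": []},
}


def _matches(patterns, operation):
    """True if the operation matches any wildcard pattern, ignoring case."""
    op = operation.lower()
    return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)


def is_allowed(role, operation):
    """An operation is allowed if it matches actions and no not_actions."""
    definition = ROLES[role]
    return _matches(definition["actions"], operation) and not _matches(
        definition["not_actions"], operation
    )


print(is_allowed("Reader", "Microsoft.Sql/servers/databases/read"))
print(is_allowed("Reader", "Microsoft.Sql/servers/databases/write"))
print(is_allowed("Contributor", "Microsoft.Authorization/roleAssignments/write"))
```

The point of the model is the split of duties: a Contributor can create and change resources, but only an Owner can also hand out access to them.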

2. Working with Data 

Azure AD is also used when you have to work with data. Take the example of reading and writing data in an Azure SQL Database. When we create an Azure SQL database, it asks for the SQL authentication credentials of a SQL admin. This doesn’t necessarily have anything to do with Azure Active Directory, but once the Azure SQL resource is created, we can find Azure Active Directory in its settings and specify an Azure Active Directory administrator there. With this, we can switch from requiring SQL authentication to Azure AD authentication. This user, however, must already exist in Azure AD. Thereafter, new users can be created to access the SQL database, access to different services can be set up, and many more functionalities can be explored with ease.
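
From the client's point of view, the switch from SQL authentication to Azure AD authentication mostly shows up in the connection string. The sketch below assumes the Microsoft ODBC driver is used as the client. The server, database, and user names are hypothetical, and the two helper functions are only for illustration.

```python
def sql_auth_connection_string(server, database, user, password):
    """SQL authentication: the database itself checks a user name and password."""
    return (
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server=tcp:{server},1433;Database={database};"
        f"Uid={user};Pwd={password};Encrypt=yes;"
    )


def azure_ad_connection_string(server, database, user):
    """Azure AD authentication: no password here, sign-in happens interactively."""
    return (
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server=tcp:{server},1433;Database={database};"
        f"Uid={user};Authentication=ActiveDirectoryInteractive;Encrypt=yes;"
    )


# Hypothetical server, database, and user names.
print(sql_auth_connection_string(
    "myserver.database.windows.net", "salesdb", "sqladmin", "not-a-real-password"))
print(azure_ad_connection_string(
    "myserver.database.windows.net", "salesdb", "ana@contoso.com"))
```

With the second form, the password never appears in the application's configuration, since the sign-in is handled by Azure AD.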


Azure AD for the Azure Data Services

Now, let us have an overview of the entire landscape of Azure Data Services and see where Azure AD fits in.

Azure SQL

For Azure SQL, there are two ways to authenticate. The first is the default method, SQL authentication, and the second is Azure AD authentication. Azure SQL was built on SQL Server, which existed before Azure. As a result, users are created and given the required permissions inside the Azure SQL database itself, not in the Azure Portal. Hence, the Azure management layer has no option for managing user access rights within the database, which makes Azure SQL a bit different from other resources.
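
Creating a user inside the database comes down to a couple of T-SQL statements that are run against the database itself. The sketch below only generates those statements as strings. The user name is hypothetical, db_datareader is one of the built-in database roles, and the helper names are made up for this example.

```python
def quote_identifier(name):
    """Wrap a name in brackets, doubling any closing bracket, as T-SQL expects."""
    return "[" + name.replace("]", "]]") + "]"


def grant_azure_ad_user(principal_name, database_role="db_datareader"):
    """Return T-SQL that maps an Azure AD identity to a database user and role."""
    user = quote_identifier(principal_name)
    role = quote_identifier(database_role)
    return [
        # FROM EXTERNAL PROVIDER tells the database the identity lives in Azure AD.
        f"CREATE USER {user} FROM EXTERNAL PROVIDER;",
        f"ALTER ROLE {role} ADD MEMBER {user};",
    ]


for statement in grant_azure_ad_user("ana@contoso.com"):  # hypothetical user
    print(statement)
```

These statements would be executed by someone connected as the Azure AD administrator of the server, which is why that administrator has to be configured first.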

Azure Data Lake 

Azure Data Lake is slightly different. Whereas Azure SQL was built on the SQL Server engine that pre-existed it, Azure Data Lake was built natively on Azure itself. So, in contrast to SQL authentication in Azure SQL, Azure Data Lake provides a few options of its own. One is through Access Keys. However, using them for authentication is discouraged, as a key gives access to everything in your account. A more restricted option is Shared Access Signatures. A Shared Access Signature is a temporary key for the data lake storage account and can be managed programmatically. A better third option is Azure RBAC, which can be added later on. A data reader role is available, and other roles such as contributor can be specified, which is pretty neat.
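
The idea behind a Shared Access Signature, a key that is limited in scope and time but derived from the account key, can be illustrated with a small signing sketch. This is not Azure's actual SAS format or string-to-sign. It is a generic HMAC-based model, and every name in it is hypothetical.

```python
import base64
import hashlib
import hmac
import time


def _signature(account_key, path, permissions, expires_at):
    """HMAC-SHA256 over the path, permissions, and expiry time."""
    message = f"{path}\n{permissions}\n{expires_at}".encode()
    digest = hmac.new(account_key, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def sign_token(account_key, path, permissions, expires_at):
    """Issue a token granting `permissions` on `path` until `expires_at`."""
    signature = _signature(account_key, path, permissions, expires_at)
    return f"{permissions}|{expires_at}|{signature}"


def verify_token(account_key, path, token, now=None):
    """Accept the token only if it is untampered and not yet expired."""
    now = int(time.time()) if now is None else now
    try:
        permissions, expires_raw, signature = token.split("|")
        expires_at = int(expires_raw)
    except ValueError:
        return False  # malformed token
    expected = _signature(account_key, path, permissions, expires_at)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False  # path, permissions, or expiry were changed
    return now < expires_at


key = b"demo-account-key"  # stands in for the storage account's access key
token = sign_token(key, "/raw/sales.csv", "r", expires_at=int(time.time()) + 3600)
print(verify_token(key, "/raw/sales.csv", token))
print(verify_token(key, "/raw/other.csv", token))
```

The holder of the token never sees the account key, and the token stops working on its own once it expires, which is what makes this option more restricted than handing out Access Keys.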

Cosmos DB 

Cosmos DB has some similarities with Azure Data Lake.

The first option is through the Primary Keys, either read/write or read-only, which are analogous to the access keys in Data Lake. The second option is Resource Tokens, which are a little hard to manage but offer fine-grained limits on access and are not tied to a particular individual. The third option is Cosmos DB RBAC, which is newly added and provides role-based access control. With it, roles can be specified and assigned with ease.

Azure Databricks

In Azure Databricks, a Service Principal and an App Registration need to be created in Azure AD to allow Azure Databricks to connect to storage in Data Lake.
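
Once the Service Principal exists, its details are typically passed to Spark as configuration so that the cluster can request tokens from Azure AD. The sketch below builds that configuration as a plain dictionary, assuming an Azure Data Lake Storage Gen2 account accessed through the ABFS driver. The storage account, tenant, and client values are placeholders, and the function name is hypothetical.

```python
def adls_oauth_spark_conf(storage_account, tenant_id, client_id, client_secret):
    """Spark settings for service-principal (OAuth) access to a Gen2 account."""
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


conf = adls_oauth_spark_conf(
    storage_account="mydatalake",                       # placeholder account
    tenant_id="11111111-1111-1111-1111-111111111111",   # placeholder tenant
    client_id="00000000-0000-0000-0000-000000000000",   # placeholder app ID
    client_secret="not-a-real-secret",                  # placeholder secret
)
# In a Databricks notebook, each pair would be applied with spark.conf.set(key, value).
for key, value in conf.items():
    print(key, "=", value)
```

In a real workspace the client secret would usually be read from a secret scope rather than written into the notebook, and the Service Principal would still need a suitable role on the storage account before any read succeeds.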

Azure Data Factory

In Azure Data Factory, managed identities are used for authentication and authorization, with access granted through the roles assigned to them.

Azure Synapse Analytics 

Azure Synapse Analytics is a collection of services, so it uses a mix of approaches. SQL pools are handled much as in Azure SQL, while pipelines are handled much as in Azure Data Factory. As a result, different bits and pieces of these approaches make up the authentication methods in Azure Synapse Analytics.

Why should we use Azure AD? 

We’ve learned how Azure AD can transform the way we work and make it convenient to authenticate, authorize, and access different data services. Each user and application has its own identity in Azure AD, which makes it easier to assign roles and limit access. Moreover, Azure AD capabilities can be used with whichever of these services one wants to use.

Conclusion

Thus, in this article, we learned about Azure Active Directory. We looked at the fundamentals of Azure AD, its various use cases, and how it can be implemented in Azure Data Services.