Introduction To Azure Data Lake

Background

In this post, we will walk through the Azure Data Lake Store feature. Azure Data Lake Store has been generally available since November 2016 and is among the fastest-emerging Azure services.

Introduction

In my previous Azure articles, we learned how to create virtual machines, a Data Warehouse, and an Azure App Service under a platform-as-a-service (PaaS) subscription from Microsoft Azure. If you have not had a chance to walk through them, please read those articles first.

Since Azure Data Lake Store became publicly available, its adoption has grown considerably because of what it offers.

What is Azure Data Lake?

Azure Data Lake is not limited by data size, type, or platform. It was introduced with support for batch, streaming, and interactive analytics, among other features. Data scientists, data analysts, and data developers can now store data of any size and shape and work with it at considerably higher speed. Azure Data Lake (ADL) not only makes it possible to store Big Data but also includes the services listed below.

  • Azure Data Lake Store
  • Azure Data Lake Analytics
  • Azure HDInsight

Azure Data Lake Store (DLS)

DLS is a no-limits data lake built to power intelligent action. With DLS, we can store trillions of files. This makes DLS well suited to workloads such as security server data, large audio/video collections (e.g., YouTube), and the security insurance data of a whole country.

Data Lake Store scales to data of any size and provides massive throughput for analytic jobs in which thousands of concurrent users read and write hundreds of terabytes of data efficiently.

Data Lake Store protects data by always encrypting it, in line with security and regulatory compliance requirements. For more details of Azure Security, read more at.

Now, log in to your Azure account at https://portal.azure.com/

In the Azure portal, click More Services -> Data Storage, and then Data Lake Store.

Furthermore, Microsoft Azure Data Lake Store supports any application that uses the open Apache Hadoop Distributed File System (HDFS) standard. Because it supports HDFS, you can easily migrate your existing Hadoop and Spark data to the cloud.
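
To make the HDFS compatibility concrete, here is a minimal Python sketch of how an existing hdfs:// path could be mapped to the equivalent Data Lake Store path. It assumes the adl:// URI scheme and the azuredatalakestore.net host suffix used by Hadoop tools for Data Lake Store accounts. The account name "mylake" and the helper names adl_uri and hdfs_to_adl are hypothetical and only for illustration. The sketch rewrites paths and does not copy any data.

```python
from urllib.parse import quote, urlparse

# Assumption: Data Lake Store accounts are addressed from Hadoop tools
# as adl://<account>.azuredatalakestore.net/<path>.
ADL_HOST_SUFFIX = "azuredatalakestore.net"


def adl_uri(account: str, path: str) -> str:
    """Build an adl:// URI for a path in a (hypothetical) Data Lake Store account."""
    # Drop empty segments so leading or duplicate slashes do not leak into the URI.
    clean_path = "/".join(part for part in path.split("/") if part)
    return f"adl://{account}.{ADL_HOST_SUFFIX}/{quote(clean_path)}"


def hdfs_to_adl(hdfs_uri: str, account: str) -> str:
    """Map hdfs://namenode:port/path to the same path under adl://."""
    parsed = urlparse(hdfs_uri)
    if parsed.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got: {hdfs_uri}")
    return adl_uri(account, parsed.path)


if __name__ == "__main__":
    # "mylake" is a placeholder account name, not a real account.
    print(hdfs_to_adl("hdfs://namenode:8020/user/hive/warehouse", "mylake"))
```

A job that previously read from the hdfs:// location would then be pointed at the adl:// location once the data has been copied with a tool of your choice.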

Azure Data Lake Analytics (DLA)

DLA is an on-demand analytics job service built to power intelligent action. With DLA, scaling becomes far easier: capacity can be increased within minutes, and we pay only for what we use.
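
The pay-for-what-you-use idea can be illustrated with a little arithmetic. The sketch below assumes a simple model in which cost is the number of compute units multiplied by the hours they run and by a per-unit hourly rate. The function name job_cost and every number in the example are hypothetical placeholders, not actual Azure prices.

```python
def job_cost(units: int, hours: float, rate_per_unit_hour: float) -> float:
    """Cost of running `units` compute units for `hours` at a given hourly rate.

    Assumption: billing is linear in units and time; real pricing may differ.
    """
    if units < 0 or hours < 0 or rate_per_unit_hour < 0:
        raise ValueError("units, hours and rate must be non-negative")
    return units * hours * rate_per_unit_hour


if __name__ == "__main__":
    RATE = 2.0  # hypothetical price per unit-hour, for illustration only

    # On demand: 10 units for a 30-minute job.
    on_demand = job_cost(units=10, hours=0.5, rate_per_unit_hour=RATE)

    # Always on: the same 10 units kept running for a full day.
    always_on = job_cost(units=10, hours=24, rate_per_unit_hour=RATE)

    print(f"on-demand job: {on_demand:.2f}")
    print(f"always-on day: {always_on:.2f}")
```

Under these made-up numbers, the short on-demand job costs a small fraction of keeping the same capacity running all day, which is the point of paying only for what is used.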

DLA is backed by a massively parallel system, so petabytes of data can be processed easily across different categories of workload. In addition, we can count on enhanced single sign-on (SSO) and multifactor authentication.

Azure HDInsight

HDInsight offers a Hadoop service for the enterprise. It provides reliable open source analytics, an architecture designed for full redundancy, and data geo-replication. HDInsight also supports SSO, to name one feature among many. The choice of Azure machine types lets us match resources to the workload, and we pay only for the compute and storage we use.

In addition, HDInsight integrates well with independent software vendors (ISVs).
A recent study showed that HDInsight delivered 63% lower TCO than deploying Hadoop on-premises, along with an industry-best 99.9% SLA and 24/7 support. (The above figures are borrowed from the Microsoft Azure site.)

Conclusion

Nowadays, it is imperative to take the cloud seriously. Azure Data Lake is among the fastest-evolving services and will be a contributing factor in any cloud-based enterprise solution.
IotCoast2Coast