This write-up continues my previous one. Here, we will sketch how to get started with Azure Machine Learning and take a brief look at Azure Storage.
Key points of this article, in short
- Kickstart Azure Machine Learning.
- Overview of Azure Storage.
- Creating an experiment in Azure ML Studio.
Get started with Azure Machine Learning Studio
To get started with Azure Machine Learning Studio, you need the following.
- Microsoft account – an Outlook or Hotmail email account.
Sign up for Azure Machine Learning Studio at http://studio.azureml.net/. Once signed in, you will land on the Studio home page.
The home page of Machine Learning Studio has several menus and options on it.
“Microsoft Azure Machine Learning Studio” takes you back to the home page;
“Projects” holds collections of experiments, datasets, notebooks and other resources that together represent a single project;
“Experiments” lists the experiments that we have created and run;
“Web Services” lists the experiments that we have deployed as web services;
“Notebooks” lets us work with Jupyter notebooks;
“Datasets” lists the datasets that we have uploaded to Azure Machine Learning Studio;
“Trained Models” lists the models that we have trained in experiments and saved in Azure ML Studio;
“Settings” holds the settings that we can use to configure our account.
An experiment normally holds at least one dataset, which provides the data for the analytical modules that we connect together to construct a predictive model. In an experiment, the datasets are connected to modules in a workflow, and each module has its required parameters set. Azure ML Studio also provides many sample experiments.
Datasets are used in the modeling process. Many sample datasets are available in Azure ML Studio. Datasets can be of any kind, such as diagnostic measurements for cancer, credit card fraud transaction details, etc.
Modules are algorithms that operate on the data provided. They cover the training, scoring, and validation steps. After completing all these steps, we can deploy the Azure ML experiment as a web service.
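Studio itself is drag-and-drop, so no code is needed there. Purely as an illustration of what the training, scoring, and validation modules do to a dataset, here is a minimal Python sketch. The nearest-centroid "model", the function names and the toy data are all assumptions made for this example, not anything taken from Studio.

```python
# Toy stand-in for the train -> score -> evaluate steps of an experiment.
# The data and the nearest-centroid "model" are illustrative assumptions only.

def train(rows):
    """Train: compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def score(model, features):
    """Score: predict the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def evaluate(model, rows):
    """Evaluate: fraction of held-out rows that are predicted correctly."""
    hits = sum(1 for features, label in rows if score(model, features) == label)
    return hits / len(rows)


# Hypothetical dataset, split into training rows and held-out test rows.
training_rows = [([1.0, 1.0], "low"), ([1.5, 0.5], "low"),
                 ([8.0, 9.0], "high"), ([9.0, 8.0], "high")]
test_rows = [([0.5, 1.5], "low"), ([8.5, 8.5], "high")]

model = train(training_rows)
accuracy = evaluate(model, test_rows)
print(accuracy)
```

In Studio, each of these three functions roughly corresponds to a module on the canvas, and the variables passed between them correspond to the connections drawn between modules.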
Brief introduction of Azure Storage
Azure Storage is a highly available, multi-tenant service designed for flexible storage options, and it consists of four services: blobs, files, tables, and queues. It offers scalability, durability, high availability, and cost-effectiveness. Multi-tenant refers to an architecture in which a single instance of software runs on a server and serves multiple tenants (groups of users who share the same resources).
Each storage service has its own HTTP or HTTPS endpoint. For example, a storage account named najumamahamuth has the following endpoints:
- Blob – https://najumamahamuth.blob.core.windows.net
- Table – https://najumamahamuth.table.core.windows.net
- Queue – https://najumamahamuth.queue.core.windows.net
- Files – https://najumamahamuth.file.core.windows.net
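The endpoints follow one pattern: the account name, then the service name, then core.windows.net. A minimal sketch that derives them from an account name is below. The function name is hypothetical, and note that the Files service uses the `file` subdomain.

```python
# Derive the per-service endpoints of a storage account from its name.
# storage_endpoints is a hypothetical helper, not part of any Azure SDK.

SERVICES = ("blob", "table", "queue", "file")


def storage_endpoints(account_name, scheme="https"):
    """Return a dict mapping each service name to its endpoint URL."""
    return {
        service: f"{scheme}://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


endpoints = storage_endpoints("najumamahamuth")
for service, url in endpoints.items():
    print(f"{service:5} {url}")
```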
A storage account can support up to 500 TB of capacity with 20,000 IOPS, and up to 20 Gbps in and 30 Gbps out for LRS, or up to 10 Gbps in and 20 Gbps out for GRS. Access is secured through a management certificate, RBAC, or the account name and authentication key.
Azure Storage comes with four types of replication -
- RA-GRS – Read-access Geo-redundant Storage – replicates data to a secondary location as GRS does, and additionally allows read access to the copy in the secondary location.
- GRS – Geo-redundant Storage.
- ZRS – Zone-redundant storage which is available only for Block blobs.
- LRS – Locally-redundant Storage.
Blob Storage – used for storing files such as VHDs, MP4s, PNGs, .txt files, etc.
A blob is addressed by a URL of the form https://<storage account>.blob.core.windows.net/<container>/<blob name>.
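As a rough sketch, the snippet below builds such a URL from the account's blob endpoint, a container name and a blob name, and then splits it back into its parts. The helper names, the container `images` and the blob name are hypothetical values chosen for the example.

```python
from urllib.parse import quote, unquote, urlsplit


def blob_url(account, container, blob_name):
    """Build a blob URL: <blob endpoint>/<container>/<blob name>."""
    # Blob names may contain "/" to mimic folders, so it is left unescaped.
    encoded_name = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded_name}"


def parse_blob_url(url):
    """Split a blob URL back into (account, container, blob name)."""
    parts = urlsplit(url)
    account = parts.netloc.split(".")[0]
    container, _, blob_name = parts.path.lstrip("/").partition("/")
    return account, container, unquote(blob_name)


url = blob_url("najumamahamuth", "images", "digits/sample 1.png")
print(url)
print(parse_blob_url(url))
```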
Blobs come in two types - Block Blobs and Page Blobs.
- Block Blob – supports up to 200 GB and is designed for uploading and downloading in blocks; recommended for storing movies, images, text files, etc., which are commonly downloaded.
- Page Blob – supports up to 1 TB and is designed for reading and writing in 512-byte pages; recommended for applications that need seek and random read/write access, such as virtual hard disks.
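The difference between the two access patterns can be sketched in a few lines of Python. This is an in-memory illustration only, not the Azure SDK: the helper names and the 4,096-byte block size are assumptions picked for the example, while the 512-byte page size comes from the description above.

```python
PAGE_SIZE = 512  # page blobs are read and written in 512-byte pages


def split_into_blocks(data, block_size):
    """Split a payload into consecutive blocks, as a block-wise upload would."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def page_range(offset, length):
    """Return (start, end) of the whole pages covering a write; end is exclusive."""
    start = (offset // PAGE_SIZE) * PAGE_SIZE
    end = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE  # round up to a page boundary
    return start, end


payload = bytes(10_000)                                 # hypothetical 10,000-byte file
blocks = split_into_blocks(payload, block_size=4_096)   # illustrative block size
print([len(block) for block in blocks])                 # sizes of the uploaded blocks
print(page_range(offset=700, length=1_000))             # pages touched by a random write
```

A block blob is reassembled from its blocks in order, which suits sequential upload and download. A page blob write only touches the pages that cover the byte range, which is why it suits random access such as a virtual hard disk.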
Table is another highly scalable service, for non-relational structured data.
Tables offer a NoSQL store with massive auto-scaling. They suit user, device and service metadata and other structured data. Entities are schema-less and strongly consistent, and there are no limits on the number of table rows or on table size. The service balances load dynamically across table regions and is best for key/value lookups on partition key and row key.
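To make the partition key and row key idea concrete, here is a small in-memory sketch of a table. It is not the Azure SDK; the class `ToyTable`, its methods and the sample entities are hypothetical and only model how an entity is found by its two keys and how entities in one table can carry different properties.

```python
class ToyTable:
    """In-memory stand-in for a table: entities keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._partitions = {}

    def upsert(self, partition_key, row_key, **properties):
        # Entities are schema-less: each one can carry different properties.
        self._partitions.setdefault(partition_key, {})[row_key] = dict(properties)

    def get(self, partition_key, row_key):
        # Point lookup: both keys are known, so no scan is needed.
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        # All entities in one partition, ordered by row key.
        rows = self._partitions.get(partition_key, {})
        return [(row_key, rows[row_key]) for row_key in sorted(rows)]


table = ToyTable()
table.upsert("device-eu", "sensor-002", model="T100", firmware="1.4")
table.upsert("device-eu", "sensor-001", model="T90")
table.upsert("device-us", "sensor-001", model="T100", battery=87)

print(table.get("device-eu", "sensor-002"))
print(table.query_partition("device-eu"))
```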
Queues – for low-latency message processing.
Queues are a reliable messaging system at scale for cloud services. They are used for decoupling components and scaling them independently, scheduling asynchronous tasks, and building processes and workflows. There are no limits on the number of queues or messages, a message visibility timeout protects against component failures, and a message can be updated to checkpoint progress part way through processing.
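The visibility timeout and the checkpoint update are easier to see in a small simulation. The sketch below is an in-memory model, not the Azure SDK: the class `ToyQueue`, its method names and the explicit `now` clock argument are assumptions made so the example stays deterministic.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self):
        self._messages = {}            # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body, now=0.0):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, now]
        return msg_id

    def receive(self, now, visibility_timeout):
        # Hand out the oldest visible message and hide it for the timeout.
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + visibility_timeout
                return msg_id, entry[0]
        return None

    def update(self, msg_id, body, now, visibility_timeout):
        # Checkpoint progress in the message and extend the hidden window.
        self._messages[msg_id] = [body, now + visibility_timeout]

    def delete(self, msg_id):
        # Called by the consumer once processing has succeeded.
        del self._messages[msg_id]


queue = ToyQueue()
queue.put("resize image 42", now=0)

first = queue.receive(now=1, visibility_timeout=30)     # worker A takes the message
queue.update(first[0], "resize image 42 (step 2 of 3)", now=5, visibility_timeout=30)
hidden = queue.receive(now=10, visibility_timeout=30)   # still hidden from worker B
retry = queue.receive(now=35, visibility_timeout=30)    # A never deleted it, so it reappears
queue.delete(retry[0])
empty = queue.receive(now=100, visibility_timeout=30)
print(first, hidden, retry, empty)
```

If worker A crashes before deleting the message, the message becomes visible again after the timeout and another worker picks it up, including the checkpointed body, so work is not lost.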
Files – file shares over SMB (Server Message Block), a common network file sharing protocol used for sharing files, printers, and ports, and for communication between nodes of a network. It helps us lift and shift on-premises applications; it is natively supported by OS APIs, libraries and tools; it is built on SMB 3, which works with Windows and Linux; and there is no limit on the number of shares.
Read more about all these storage services in the Azure Storage documentation.
Working with sample datasets
To work with the sample datasets in Azure Machine Learning Studio, we should first create an experiment. So, let's start by creating one.
Go to the Azure Machine Learning Studio page at http://studio.azureml.net/
Click on "Sign in" and sign in with your Microsoft account credentials, or sign up if you do not have an account yet. It is free of cost.
You can reach Machine Learning Studio either this way or through the Azure portal at https://portal.azure.com, where you choose:
New - AI + Cognitive Services - Machine Learning Experimentation (preview)
Now, move back to Azure Machine Learning Studio and click on New - Experiment - Blank Experiment.
Rename the experiment in the workspace. Here, I have renamed it Digit Recognition.
Now, the experiment has been created, and the next step is to upload the datasets.
Follow my next article to upload datasets on Azure ML Studio and to work with a scenario for this experiment.