Azure Virtual Machine Scale Sets

You can create and manage a group of identical, load-balanced virtual machines (VMs) using Azure Virtual Machine Scale Sets. The number of VM instances can increase or decrease automatically in response to demand or a defined schedule.

Scale sets provide high availability for your applications and let you centrally manage, configure, and update a large number of VMs.

Why use it?

Use scale sets to add redundancy and improve performance by distributing your application across multiple instances. A load balancer distributes incoming user requests across the application instances.

If you need to update your application or perform maintenance, users can be redirected to another available instance. To keep up with high demand, you may also need to increase the number of instances that run your application.
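The redirect behaviour described above can be sketched with a toy balancer. This is only an illustration: the class name, the instance names, and the round-robin policy are assumptions made for the example, and round-robin is a simplification rather than a description of how Azure Load Balancer selects an instance.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy balancer (hypothetical): hands each request to the next healthy instance."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._order = cycle(self.instances)

    def mark_down(self, name):
        # Take an instance out of rotation, e.g. for an update or maintenance.
        self.healthy.discard(name)

    def mark_up(self, name):
        # Put a known instance back into rotation.
        if name in self.instances:
            self.healthy.add(name)

    def route(self):
        # Walk the rotation at most once, skipping instances that are down.
        for _ in range(len(self.instances)):
            candidate = next(self._order)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["vm-0", "vm-1", "vm-2"])
balancer.mark_down("vm-1")  # vm-1 is being updated
routed = [balancer.route() for _ in range(4)]
print(routed)  # ['vm-0', 'vm-2', 'vm-0', 'vm-2']
```

While vm-1 is marked down, every request lands on one of the remaining instances, so users are not affected by the maintenance.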


All VM instances are created from the same base OS image and configuration. Because of this, you can easily manage any number of VMs without additional configuration tasks or network management.

Scale sets support the use of Azure Load Balancer for basic layer-4 traffic distribution and Azure Application Gateway for more advanced layer-7 traffic distribution and SSL termination.

For additional availability, you can use Availability Zones to automatically distribute the VM instances in a scale set either within a single datacenter or across multiple datacenters.
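As a rough picture of what distributing instances across zones means, the sketch below spreads instances evenly over a list of zones. The function name, the zone labels, and the strict round-robin placement are assumptions for illustration only; the platform's actual placement logic is not described in this article.

```python
def spread_across_zones(instance_count, zones):
    """Assign instances to zones in turn so that no zone holds more than one extra instance."""
    placement = {zone: [] for zone in zones}
    for index in range(instance_count):
        zone = zones[index % len(zones)]
        placement[zone].append(f"vm-{index}")
    return placement


placement = spread_across_zones(5, ["1", "2", "3"])
print(placement)
# {'1': ['vm-0', 'vm-3'], '2': ['vm-1', 'vm-4'], '3': ['vm-2']}
```

With this layout, losing any single zone still leaves at least three of the five instances running.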

Scale sets support up to 1000 VM instances. If you create and upload your own custom VM images, the limit is 300 VM instances.

Scale-Out and Scale-In (available when autoscaling is enabled)

Suppose you start with two VM instances, set the scale-out CPU threshold to 60%, and allow up to 10 additional instances. When average CPU usage across the running instances exceeds 60%, a new VM instance is created, giving you three instances in total. If CPU usage crosses 60% again with three instances running, a fourth instance is created, and so on until the limit of 10 additional instances is reached. This is called scaling out.

For scaling in, suppose you set the threshold to 25%. When CPU usage drops below 25%, one of the added instances is removed, which saves you money. In this way, you run only the capacity you actually need.
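The scale-out and scale-in thresholds above can be expressed as a small decision function. This is a minimal sketch, not Azure's implementation: the class and function names are hypothetical, and it assumes the rule is evaluated against average CPU, changes the count by one instance per evaluation, and never drops below the two initial instances.

```python
from dataclasses import dataclass


@dataclass
class AutoscaleRule:
    """Hypothetical rule settings mirroring the example in the text."""
    min_instances: int = 2       # the two initial instances
    max_instances: int = 12      # two initial plus up to ten more
    scale_out_cpu: float = 60.0  # add an instance above this CPU percentage
    scale_in_cpu: float = 25.0   # remove an instance below this CPU percentage


def next_instance_count(current, avg_cpu, rule):
    """Return the instance count after one evaluation of the rule."""
    if avg_cpu > rule.scale_out_cpu:
        return min(current + 1, rule.max_instances)
    if avg_cpu < rule.scale_in_cpu:
        return max(current - 1, rule.min_instances)
    return current  # between the thresholds: leave the count unchanged


rule = AutoscaleRule()
count = rule.min_instances
history = []
for cpu in [70.0, 65.0, 40.0, 20.0, 10.0, 10.0]:
    count = next_instance_count(count, cpu, rule)
    history.append(count)
print(history)  # [3, 4, 4, 3, 2, 2]
```

The gap between the two thresholds (25% to 60%) is what keeps the instance count stable under moderate load instead of flapping up and down.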

| Manual Group of VMs | Virtual Machine Scale Set |
| --- | --- |
| You need to create and configure the VMs manually | Automatic creation from the central configuration |
| You need to create and configure the Load Balancer and Application Gateway manually | Automatic creation and integration of the Load Balancer and Application Gateway |
| You need to create Availability Sets or Zones manually | Automatic creation of Availability Sets or Zones |
| You need to monitor the VMs manually | Autoscale based on host metrics, in-guest metrics, Application Insights, or a schedule |

Another benefit is that scale sets themselves carry no additional cost. You pay only for the underlying compute resources, such as the VM instances, load balancers, and managed disk storage.