Service Fabric Deallocation Using PowerShell To Save Costs

While implementing Internet of Things (IoT) scenarios, customers are often confused about the costs associated with their resources and need professional help to reduce their total monthly Azure subscription costs.

One scenario I want to discuss is an environment created by the customer (say, UAT) that is rarely used. The customer has deployed multiple services in a resource group, including App Services and a Service Fabric cluster with multiple applications. For some time, the customer does not notice the costs associated with Service Fabric clusters (Microsoft provides a pricing calculator for each resource you create). Eventually, they realize the cluster is costing more than they would like, and they want to get rid of this overhead because they are not using all the environments all the time. At the same time, we should not delete and recreate the SF cluster; we should reuse the deployed infrastructure. Let's discuss the options I came across while trying to implement this.

The first thing I looked for was a one-click option to shut down a Service Fabric cluster. Unfortunately, there is none. That is a pity, I think, since there is an option to start/stop App Services, but probably because multiple applications are deployed to a cluster, Microsoft wanted this to be done through Service Fabric Explorer only... Yes, you heard it right: you can do this from Service Fabric Explorer. But we are discussing a way to automate it here, so hang on.
Anyway, after trying for some time, I found some useful information on MSDN and the Internet: the cluster as a whole cannot be shut down; instead, you need to start/stop the cluster nodes individually. That gave me some direction, and I started looking into options to implement this.

The first set of PowerShell commands I worked with was Start-ClusterNode and Stop-ClusterNode. To use these, you first need to follow some steps.
Then, eventually, you can stop the nodes using the commands given in the links above. This works, but there were a couple of problems with this approach. First, it was cumbersome to do all the preliminary operations before finally starting/stopping the nodes. Second, when I did this, I was unable to connect to the cluster anymore. That left our system unstable (probably because I shut down all the nodes, including the primary node where the major services required to run the Service Fabric cluster were running).
So, instead of finding and fixing the issues with the above commands, I started looking for an alternative. (Also, these commands, Start-ClusterNode and Stop-ClusterNode, are now OBSOLETE.)

While looking for an alternative, I came across another command, Start-ServiceFabricNodeTransition, which is similar to Start-ClusterNode and Stop-ClusterNode. Of course, I still had to do all the prerequisite work mentioned above. Moreover, this command has a major drawback for my requirement: a stopped node comes back up automatically after at most 14,400 seconds (4 hours), because -StopDurationInSeconds is a required parameter when stopping a node and it only accepts values between 600 and 14,400 seconds. So I cannot use it to shut down the nodes permanently either.
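
For reference, a stop request with this cmdlet looks roughly like the sketch below. It assumes you are already connected to the cluster with Connect-ServiceFabricCluster, and the node name "_Node_0" is only a placeholder, so treat it as an illustration of the time limit rather than a ready-made script.

```powershell
# Sketch only: assumes an existing Connect-ServiceFabricCluster session.
# "_Node_0" is a placeholder node name; replace it with one of your own.
$nodeName = "_Node_0"

# The node instance ID is a required argument and can be read from the node itself.
$node = Get-ServiceFabricNode -NodeName $nodeName

# Each transition needs a unique operation ID so its progress can be queried later.
$operationId = [guid]::NewGuid()

# 14400 seconds (4 hours) is the maximum; the node comes back up on its own afterwards.
Start-ServiceFabricNodeTransition -Stop `
    -OperationId $operationId `
    -NodeName $nodeName `
    -NodeInstanceId $node.NodeInstanceId `
    -StopDurationInSeconds 14400

# Check how far the stop operation has progressed.
Get-ServiceFabricNodeTransitionProgress -OperationId $operationId
```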

Finally, I came across a way to deallocate a node, which saves the cost and keeps the node down until I ask for it to come back... Awesome, that is what I was looking for... :) Let's see how to do this.

If you open the Azure portal and open your resource group, you can search for the Azure Service Fabric cluster.
If you look closely while searching for the cluster, you will see that a Service Fabric cluster deploys multiple other resources along with it, such as public IPs, load balancers, a Virtual Machine Scale Set, etc.
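
If you prefer to find the scale set from PowerShell rather than the portal, something like the following sketch should list the scale sets in the resource group. It assumes you are already logged in with the AzureRM module, and "ContosoGroup" is a placeholder resource group name.

```powershell
# Sketch only: assumes you are logged in (Login-AzureRmAccount) and the
# right subscription is selected. "ContosoGroup" is a placeholder name.
Get-AzureRmVmss -ResourceGroupName "ContosoGroup" |
    Select-Object Name, Location, @{ Name = "Instances"; Expression = { $_.Sku.Capacity } }
```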

If you open the Virtual Machine Scale Set, you can navigate to the Instances tab, select the nodes, deallocate them, and later start (reallocate) them again.
Hurray!!! This is what we wanted to do. Once you deallocate the nodes, you are no longer charged for their compute (other resources in the group, such as storage, are still billed). The next time you need the cluster, reallocate the nodes.

But be careful: once you deallocate, you will lose all your deployed applications. So redeploy them using Release Management or whatever tool/methodology you use in your project.

The above is the manual way to do it. To do the same from PowerShell, you can use the Stop-AzureRmVmss command.
Creating a script to automate the process comes with a couple of benefits. One, you do not need to connect to the cluster to do this; you only need to be connected to the subscription, and that's all... Two, once you deallocate the nodes, you can still connect to the cluster.
Here is the command:

Stop-AzureRmVmss -ResourceGroupName "ContosoGroup" -VMScaleSetName "ContosoVMSS"
(Taken from the link above; there are two variants of it.)
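
To turn this into a reusable script, a minimal sketch could look like the one below. The parameter names and the Stop/Start switch are my own choices for illustration, not something the AzureRM module defines, and you would want to add error handling before relying on it.

```powershell
# Sketch only: the parameter names and the $Action switch are hypothetical,
# chosen for this example; they are not part of the AzureRM module.
param(
    [Parameter(Mandatory = $true)] [string] $SubscriptionName,
    [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
    [Parameter(Mandatory = $true)] [string] $VMScaleSetName,
    [Parameter(Mandatory = $true)] [ValidateSet("Stop", "Start")] [string] $Action
)

# Sign in and pick the subscription that holds the cluster.
Login-AzureRmAccount | Out-Null
Select-AzureRmSubscription -SubscriptionName $SubscriptionName | Out-Null

if ($Action -eq "Stop") {
    # Stops and deallocates every instance in the scale set;
    # -Force skips the confirmation prompt so the script can run unattended.
    Stop-AzureRmVmss -ResourceGroupName $ResourceGroupName -VMScaleSetName $VMScaleSetName -Force
}
else {
    # Starts (reallocates) the instances again.
    Start-AzureRmVmss -ResourceGroupName $ResourceGroupName -VMScaleSetName $VMScaleSetName
}
```

Saved under a name of your choice, it would be called with -Action Stop when the environment is not needed and with -Action Start before the next use. Note that Stop-AzureRmVmss also has a -StayProvisioned switch, which stops the instances without deallocating them; since that keeps them billed, the sketch deliberately leaves it out.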

So, using the above command, you can deallocate your Service Fabric resources and save money for yourself and your customers.


I have not explained the commands, as they are very well documented in the links given above. But let me know if you want to discuss this; I'll be happy to help. Please feel free to add to or correct my observations.
