Application Scaling In Azure Kubernetes Service

What we will cover:
  • Overview of Application Scaling in Azure Kubernetes Service
  • Implementing the Scaling of the Guestbook Application
  • Scaling Frontend Components of the Guestbook Application

Overview of Application Scaling in Azure Kubernetes Service

When you run an application, it will eventually need scaling and upgrading, and both are critical stages in its life cycle. Scaling is required to handle additional load on the application, while upgrading keeps the application up to date and lets you introduce new functionality. One of the main advantages of cloud-native applications is on-demand scaling, which helps optimize resource usage. For example, if an application consists of a frontend and a backend and only the frontend encounters heavy load, you can scale the frontend alone while the backend instances stay the same. You can also increase or reduce the number and size of the virtual machines (VMs) backing your cluster depending on your workload and peak demand hours.
There are two scale dimensions for applications running on top of Azure Kubernetes Service (AKS). The first scale dimension is the number of Pods a deployment has, while the second is the number of nodes in the cluster.
By adding additional Pods to a deployment, also known as scaling out, you add additional compute power to the deployed application. You can either scale out your applications manually or have Kubernetes take care of this automatically via the Horizontal Pod Autoscaler (HPA). The HPA will watch metrics such as CPU to determine whether Pods need to be added to your deployment.
The second scale dimension in AKS is the number of nodes in the cluster. The number of nodes in a cluster defines how much CPU and memory are available for all the applications running on that cluster. You can scale your cluster either manually by changing the number of nodes, or you can use the cluster autoscaler to automatically scale out your cluster. The cluster autoscaler will watch the cluster for Pods that cannot be scheduled due to resource constraints. If Pods cannot be scheduled, it will add nodes to the cluster to ensure that your applications can run.
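Both dimensions can also be expressed declaratively. As a sketch, assuming the guestbook frontend deployment used later in this article and illustrative thresholds (the 50% CPU target and 3–10 replica range are made-up values, not recommendations), a HorizontalPodAutoscaler manifest might look like this:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend
spec:
  # Which deployment this autoscaler controls
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 3
  maxReplicas: 10
  metrics:
    # Add Pods when average CPU utilization across Pods exceeds 50%
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```

Applying this manifest with kubectl apply -f would hand the first scale dimension over to the HPA; the second dimension (node count) is handled separately by the cluster autoscaler configured on the AKS cluster itself.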
In this article, you will learn how you can scale your application. First, you will scale your application manually, and in the second part, you will scale your application automatically.

Implementing the Scaling of the Guestbook Application

To demonstrate manual scaling, let's use the guestbook example from the previous articles. Follow these steps to implement manual scaling:
  • Open the Azure Cloud Shell.
  • Clone the GitHub repository in which I have placed all the files by running the following commands:
    git clone
    cd Azure-K8s/Scale-Upgrade/
  • Install the guestbook by running the kubectl create command in the Azure command line:
    kubectl create -f guestbook-all-in-one.yaml
  • After you have entered the preceding command, you should see output similar to that shown in the screenshot:
  • Right now, none of the services are publicly accessible. We can verify this by running the following command:
    kubectl get svc
  • As shown in the screenshot, none of the services have an external IP yet.
  • To test out our application, we will expose it publicly. For this, we will introduce a new command that allows you to edit a service in Kubernetes without having to change the file on your file system. To start the edit, execute the following command:
    kubectl edit service frontend
  • This will open a vi environment. Navigate to the line that says type: ClusterIP (line 27) and change it to type: LoadBalancer, as shown in the screenshot below. To make the change, press I, type your changes, press Esc, type :wq!, and then press Enter to save the changes:
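If you prefer not to use an interactive editor, the same change can be made non-interactively with kubectl patch. This is a sketch of an equivalent command, assuming the frontend service created by guestbook-all-in-one.yaml:

```shell
# Switch the frontend service from ClusterIP to LoadBalancer
# without opening an editor (equivalent to the vi edit above)
kubectl patch service frontend -p '{"spec":{"type":"LoadBalancer"}}'
```

This approach is also easier to use in scripts and CI pipelines, since it does not depend on a terminal editor session.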
  • Once the changes are saved, you can watch the service object until the public IP becomes available. To do this, type the following:
    kubectl get svc -w
  • It will take a couple of minutes to show the updated IP. Once you see the correct public IP, you can exit the watch command by pressing Ctrl + C (Cmd + C on Mac):
  • Type the IP address from the preceding output into your browser's navigation bar as follows: http://<EXTERNAL-IP>/. The result is shown in the screenshot.
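Instead of reading the IP off the table by eye, you can also script the lookup. The sketch below parses sample kubectl get svc output (the column layout is simplified and the IP shown is made up; in practice you would capture the real output with svc_output=$(kubectl get svc)):

```shell
# Sample (made-up) output from `kubectl get svc`
svc_output='NAME       TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)
frontend   LoadBalancer   10.0.1.5     20.50.1.10    80:32111/TCP'

# Print column 4 (EXTERNAL-IP) of the row whose first column is "frontend"
external_ip=$(printf '%s\n' "$svc_output" | awk '$1 == "frontend" {print $4}')
echo "$external_ip"
```

This makes the address easy to feed into follow-up steps, such as a curl smoke test against http://$external_ip/.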
The familiar guestbook sample should be visible. This shows that you have successfully publicly accessed the guestbook. Now that you have the guestbook application deployed, you can start scaling the different components of the application.

Scaling Frontend Components of the Guestbook Application

Kubernetes gives us the ability to scale each component of an application independently. As discussed above, you can scale the frontend and backend of the application separately; this is the beauty of Kubernetes. I will show you how to scale the frontend of the guestbook application, which will cause Kubernetes to add additional Pods to the deployment:
    kubectl scale deployment/frontend --replicas=6
You can set the number of replicas you want, and Kubernetes takes care of the rest. You can even scale it down to zero (one of the tricks used to reload configuration when an application doesn't support dynamic configuration reloading). To verify that the scaling worked correctly, run the following command:
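As a sketch of that configuration-reload trick (scaling to zero and back up), the commands would look like this. Note that this briefly takes the frontend offline, so it is not something to do against a production workload:

```shell
# Scale to zero: all frontend Pods are terminated
kubectl scale deployment/frontend --replicas=0

# Scale back up: fresh Pods start and re-read their configuration
kubectl scale deployment/frontend --replicas=6
```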
    kubectl get pods
This should give you an output as shown in the screenshot,
As you can see, the frontend deployment scaled to six Pods. Kubernetes also spread these Pods across multiple nodes in the cluster. You can see which nodes the Pods are running on with the following command:
    kubectl get pods -o wide
This will generate an output as follows,
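To see the spread at a glance, you can count Pods per node. The sketch below runs the count over sample output (node names are made up and the columns are simplified; with real kubectl get pods -o wide output, the node name is in the NODE column):

```shell
# Sample (made-up) output in simplified columns: NAME STATUS NODE
pods_output='NAME        STATUS    NODE
frontend-a  Running   aks-nodepool1-0
frontend-b  Running   aks-nodepool1-1
frontend-c  Running   aks-nodepool1-0'

# Skip the header row, count rows per node (last column),
# then print sorted "node count" pairs
per_node=$(printf '%s\n' "$pods_output" \
  | awk 'NR > 1 { count[$NF]++ } END { for (n in count) print n, count[n] }' \
  | sort)
echo "$per_node"
```

A roughly even count per node indicates that the scheduler has spread the replicas, which is what gives you resilience if a single node fails.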
In this part, you have seen how easy it is to scale Pods with Kubernetes. This capability is a powerful tool, not only for dynamically adjusting your application components but also for building resilient applications, with failover provided by running multiple instances of a component at the same time. However, you won't always want to scale your application manually. In the next part, you will learn how to scale your application automatically.