Azure Logic Apps Standard (Let's Dig Down Deep)

Hi Folks!

Let’s dive into the internal workings of the newly released (GA) Azure Logic Apps Standard runtime and try to understand how Microsoft manages the show behind the scenes.

By now you must be aware that it offers 2 types of workflows, stateful and stateless, and that it is built on top of an Azure Functions extension. This means a Logic App Standard app can be deployed and executed anywhere a Function App can run. The way I see it, the future is going to be multi-cloud/on-premises. The GA release already supports local execution of workflows, but only in VS Code.

I have been working with it for the past 8 months, since it was in public preview. In this article, I am sharing my understanding based on that experience and on knowledge gained from the Microsoft documentation.

What you will learn from this article:

  • Logic App Standard runtime
  • ASP plans (WS1, WS2 & WS3) and what to consider for throughput
  • Important runtime configuration settings for high throughput in host.json

Logic App Standard runtime

When you create a Logic App Standard resource, you are also asked to associate a storage account, either by using an existing one or by creating a new one (similar to Function App creation).

Below is a diagrammatic representation of the Logic Apps Standard runtime; we will see how it works internally. Note that the runtime uses the storage account (queue, table & blob) for stateful workflows only.

[Figure: Logic App Standard runtime]

Let’s explore it point by point, as marked in the figure above:

# 1) In the new Logic Apps Standard resource, everything defined within a workflow (its actions) executes as jobs.

The new runtime converts each workflow into a DAG (directed acyclic graph) of jobs using the job sequencer; the complexity of the DAG varies with the steps defined within the workflow. Please note that this happens only when there is a request/event for the workflow trigger.

Suppose you have used 3 workflows to develop and deploy a business process, as shown in the example below:

WF = workflow

Business process = WF1 + WF2 + WF3

WF1 (Parent) --(nested call)--> WF2 (Child 1) --(nested call)--> WF3 (Child 2)

Now the runtime will create a DAG of jobs for each workflow's actions using the workflow job sequencer, in the following order (a sketch of the nested call follows the list):

  • Create the DAG of jobs for WF1 and execute them (we will look at job execution in more detail in the next step).
  • At the end, WF1 invokes the trigger of the next workflow, WF2 (Child 1).
  • Create the DAG of jobs for WF2 and execute them.
  • At the end, WF2 invokes the trigger of the next workflow, WF3 (Child 2).
  • Create the DAG of jobs for WF3 and execute them.
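To make the nested call concrete, here is a minimal sketch of what the last action in WF1's workflow.json could look like when it invokes WF2 through the built-in Workflow action. The action name Invoke_WF2 and the body expression are my own illustrative assumptions, not taken from any real project:

```json
{
  "Invoke_WF2": {
    "type": "Workflow",
    "inputs": {
      "host": {
        "workflow": {
          "id": "WF2"
        }
      },
      "body": "@triggerBody()"
    },
    "runAfter": {}
  }
}
```

WF2 would declare its own trigger (for example, a Request trigger) and would, in turn, end with a similar action that invokes WF3.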

# 2) Orchestration engine (job dispatcher)

For stateful workflows, the orchestration engine schedules the jobs produced by the job sequencer by placing them in storage queues. Within the orchestration engine, multiple job-dispatcher worker instances run at the same time, and these dispatchers can further be spread across multiple compute nodes (when the Logic App Standard resource runs on more than one instance).

The default settings for the orchestration engine on the storage account are:

  • a single message queue
  • a single partition

Please note that for stateless workflows, the orchestration engine manages job execution in memory instead of using a storage queue. That is why stateless workflows perform better; the trade-off is resilience: if the Logic Apps Standard runtime gets restarted for any reason, you lose the state of any running instances of stateless workflows.
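Which mode a workflow uses is declared per workflow in its workflow.json via the kind property. Below is a minimal skeleton, assuming the standard workflow definition schema; the empty triggers/actions/outputs objects are placeholders:

```json
{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "contentVersion": "1.0.0.0",
    "triggers": {},
    "actions": {},
    "outputs": {}
  },
  "kind": "Stateless"
}
```

Switching "kind" to "Stateful" is all it takes to opt back into checkpointed, queue-backed execution.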

# 3) Let’s see what exactly is happening inside the storage account

queue

In point #2 above, we saw how the Logic App Standard orchestration engine uses the queue to schedule the workflow DAG jobs created by the job sequencer. To increase the efficiency of the job dispatchers, the number of job queues can be increased in host.json (see the snippet below). Consider the latency of the workflow's job executions, CPU usage %, and memory utilization % before increasing the queue count, or it may actually slow down overall processing time.
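As a sketch, assuming the documented host.json layout for Logic Apps Standard (runtime settings live under extensions.workflow.settings, with string values), raising the job-triggers queue partition count could look like this; the value 2 is purely illustrative, not a recommendation:

```json
{
  "extensions": {
    "workflow": {
      "settings": {
        "Jobs.BackgroundJobs.NumPartitionsInJobTriggersQueue": "2"
      }
    }
  }
}
```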

table

The Logic App Standard runtime uses table storage to store the workflow definitions along with the host.json file. In addition, it stores each job's checkpoint state after every run, which supports the retry policy at the action level and ensures "at least once" execution for each action. Along with a job's state, its inputs and outputs are also stored in the table.

blob

When a job's inputs and outputs are large, the Logic App Standard runtime stores them in blob storage instead of the table.

Below is a screenshot of the storage account I created along with the Logic Apps Standard resource. You can clearly see that Azure has created one default queue and a job-definitions table to support job execution by the Logic Apps Standard runtime.

[Screenshot: the default queue and job-definitions table in the storage account]

# 4) The orchestration engine uses the underlying Azure Functions runtime environment to execute the jobs. While creating the Logic App Standard resource, you can choose either Workflow or Docker container as the underlying Function App runtime hosting environment.

[Screenshot: hosting options while creating the resource]
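Under the hood, the deployed resource is an App Service site marked as both a function app and a workflow app. Here is a rough ARM-template sketch of that shape; the resource name, API version, and the single app setting shown are assumptions for illustration, not a complete or authoritative template:

```json
{
  "type": "Microsoft.Web/sites",
  "apiVersion": "2021-02-01",
  "name": "my-logicapp-standard",
  "kind": "functionapp,workflowapp",
  "properties": {
    "siteConfig": {
      "appSettings": [
        { "name": "APP_KIND", "value": "workflowApp" }
      ]
    }
  }
}
```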

Available hosting ASP plans

Currently, it comes with 3 production-level ASP (App Service Plan) tiers: WS1, WS2 & WS3, offering 1, 2, and 4 vCPUs with 3.5, 7, and 14 GB of memory per instance, respectively.

[Screenshot: the WS1, WS2 & WS3 plan options]

You can select one of them based on the throughput you want to maintain.

Please note that vCPU indicates the number of cores available in one instance. It gives you an indicative maximum for concurrent jobs, which is controlled per core by Jobs.BackgroundJobs.NumWorkersPerProcessorCount in host.json and defaults to 192. On a WS2 instance (2 vCPUs), for example, the default allows roughly 2 × 192 = 384 concurrent jobs per instance. Depending on the latency of the workflow's job executions, CPU usage %, and memory utilization %, this value can be increased or decreased.

If CPU usage climbs to around 70%, you can decrease this number to reduce concurrent workflow job runs (throughput); if CPU usage is low, the compute is under-utilized and you can increase this number to maximize concurrent workflow job runs (throughput).

Important runtime configuration settings (for throughput)

Setting | Default | Purpose
--- | --- | ---
Jobs.BackgroundJobs.DispatchingWorkersPulseInterval | 1 sec | Controls the polling interval at which job dispatchers pick up the next job (message) from the storage queue.
Jobs.BackgroundJobs.NumWorkersPerProcessorCount | 192 | Controls/restricts concurrent workflow job runs (throughput) per core.
Jobs.BackgroundJobs.NumPartitionsInJobTriggersQueue | 1 | Helps only when multiple job dispatchers are running on a large number of instances across multiple compute nodes.

Please note that all of the above settings go in the host.json file; a combined example follows below.
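To wrap up, here is a sketch of a complete host.json with all three settings in place. The extension-bundle block and the specific values are illustrative assumptions (the pulse interval is written as a TimeSpan string), not tuning recommendations:

```json
{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle.Workflows",
    "version": "[1.*, 2.0.0)"
  },
  "extensions": {
    "workflow": {
      "settings": {
        "Jobs.BackgroundJobs.DispatchingWorkersPulseInterval": "00:00:01",
        "Jobs.BackgroundJobs.NumWorkersPerProcessorCount": "100",
        "Jobs.BackgroundJobs.NumPartitionsInJobTriggersQueue": "2"
      }
    }
  }
}
```

Measure job latency, CPU %, and memory % before and after each change; as discussed above, over-tuning any of these can reduce overall throughput instead of improving it.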

Thanks!