Azure Durable Functions - Constraints To Keep In Mind

Durable Functions in Azure solve a crucial problem: running a complex process that requires stateful coordination between components, in a serverless fashion. They combine the economics of serverless with the ability to run processes that take longer to finish and need intermediate state to be preserved. You still pay only for the time the involved functions actually run; the overall process can simply run for longer.

It's true that Durable Functions open up an opportunity to process workloads that take longer and are complex in terms of data transfer and/or stateful coordination between sub-processes. However, they come with certain constraints that can affect how sub-processes execute, how data is held and shared between components, and even the extent to which parallelism can be achieved.

Therefore, when we choose Durable Functions for a use case, it is important to also look at the rules and constraints we need to honor, so that we build our solutions in the most efficient and effective way.

Let's look at some of these constraints to keep in mind.

Deterministic nature of Durable Functions

Durable Functions let us define stateful workflows through orchestrator functions. Orchestrator functions use event sourcing to ensure reliable execution and to maintain local variable state. This replay behavior places constraints on the kind of code you can write in an orchestrator function, and the central one is that orchestrator functions must be deterministic: an orchestrator function is replayed multiple times, and it must produce the same result each time.

The Durable Task Framework attempts to detect violations of this deterministic behavior. If it finds one, it throws a NonDeterministicOrchestrationException.

As a result of the deterministic nature of orchestrator functions, some things that won't work are:

DateTimeOffset.UtcNow

This value will be different on each replay, which violates the deterministic behavior of the function. To get the current date and time in an orchestrator function, IDurableOrchestrationContext provides the CurrentUtcDateTime property, which is safe to use there. The property keeps the same value across replays until the orchestration moves forward in its history, for example through the IDurableOrchestrationContext.CreateTimer() method, which accepts a DateTime telling the context until when it has to sleep. Once the function awakens, CurrentUtcDateTime has the new value in place.
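
For example, here is a minimal sketch of the pattern (the function name and the 10-minute delay are illustrative, not from any official sample):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class DelayedWork
{
    [FunctionName("DelayedWork")]
    public static async Task RunOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Replay-safe: returns the same value on every replay of this
        // point in the orchestration history.
        DateTime startedAt = context.CurrentUtcDateTime;

        // Sleep durably; the orchestrator can be unloaded in the meantime.
        DateTime wakeAt = startedAt.AddMinutes(10);
        await context.CreateTimer(wakeAt, CancellationToken.None);

        // After the timer fires, CurrentUtcDateTime has moved forward,
        // and it does so deterministically across replays.
        DateTime resumedAt = context.CurrentUtcDateTime;
    }
}
```

The later sketches in this article omit the using directives and the surrounding static class for brevity; assume the same ones.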

GUIDs and UUIDs

Guid.NewGuid() creates a new GUID on every execution, so using it within an orchestrator function causes the function to behave in a non-deterministic way. To work around this, IDurableOrchestrationContext provides the NewGuid method, which creates a GUID that stays the same across replays.
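
A short sketch (the orchestrator and the "SaveOrder" activity are hypothetical):

```csharp
[FunctionName("CreateOrder")]
public static async Task RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    // Guid.NewGuid() would yield a different value on every replay.
    // context.NewGuid() is generated deterministically from the
    // orchestration state, so every replay observes the same value.
    Guid orderId = context.NewGuid();

    // "SaveOrder" is a hypothetical activity function.
    await context.CallActivityAsync("SaveOrder", orderId);
}
```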

Environment variables or nonconstant static variables

Environment variable values can change over time, and the same is true of nonconstant static variables, so reading them in an orchestrator function results in non-deterministic behavior. Instead, reference them from within client or activity functions.
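
For instance, a hypothetical activity function can read the setting and return it; the orchestrator then sees a stable value, because activity outputs are recorded in the history and reused on replay (the "API_BASE_URL" setting name is made up for this sketch):

```csharp
[FunctionName("GetApiBaseUrl")]
public static string GetApiBaseUrl([ActivityTrigger] IDurableActivityContext context)
{
    // Safe here: the result is persisted to the orchestration history,
    // so replays reuse the recorded value instead of re-reading it.
    return Environment.GetEnvironmentVariable("API_BASE_URL");
}

// In the orchestrator:
// string baseUrl = await context.CallActivityAsync<string>("GetApiBaseUrl", null);
```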

Threading APIs

The Durable Task Framework runs orchestrator code on a single thread, and that thread can't interact with any other threads. Introducing new threads into an orchestration's execution can result in non-deterministic execution or deadlocks. If such APIs are necessary, limit their use to activity functions.
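
The replay-safe way to get parallelism is to fan work out to activity functions and await the framework-created tasks, as in this sketch (the activity names are illustrative; it also assumes using System.Collections.Generic):

```csharp
[FunctionName("ProcessBatch")]
public static async Task<int[]> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    string[] items = await context.CallActivityAsync<string[]>("GetWorkItems", null);

    var tasks = new List<Task<int>>();
    foreach (string item in items)
    {
        // Each call is scheduled by the framework; no new threads are
        // introduced into the orchestrator itself.
        tasks.Add(context.CallActivityAsync<int>("ProcessItem", item));
    }

    // Awaiting durable tasks with Task.WhenAll is replay-safe;
    // Task.Run or new Thread(...) here would not be.
    return await Task.WhenAll(tasks);
}
```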

Input and output values for Durable Functions

Orchestrator functions can invoke other orchestrator functions (as sub-orchestrations) or activity functions, passing input and output values. For each function that participates in the process, those input and output values must be JSON-serializable, because they are persisted to the orchestration history table in Azure Table storage.

For stream, file, or byte data, prefer storing it in Azure Storage and working with it through activity functions. At a small scale, such data can be converted to Base64 strings and passed around as input or output values.
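
A sketch of that pattern: pass a small, JSON-serializable pointer object between functions and keep the heavy bytes in storage (the type and names are illustrative):

```csharp
// JSON-serializable input for an activity function. A Stream or
// FileStream property here would fail to persist to the history table.
public class ResizeRequest
{
    public string BlobName { get; set; }   // pointer to the real bytes in Azure Storage
    public int TargetWidth { get; set; }
}

// In the orchestrator: pass the pointer, not the file contents.
// string resultBlob = await context.CallActivityAsync<string>(
//     "ResizeImage", new ResizeRequest { BlobName = "in/photo.jpg", TargetWidth = 800 });
```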

Single-threaded behavior of Orchestrator Functions

The Durable Task Framework runs orchestrator code on a single thread. That thread can't interact with other threads, such as those scheduled by async APIs or created without going through IDurableOrchestrationContext. This single-threaded behavior is what keeps orchestrator functions deterministic and, in turn, ensures that the replay pattern works correctly and reliably.

Therefore, any async or threading API calls need to happen inside activity functions, and any blocking or delay logic should use alternatives such as durable timers, as in the sketch below.
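
For example, a polling loop that sleeps with a durable timer instead of Task.Delay (the "CheckJobStatus" activity and the 5-minute interval are illustrative):

```csharp
[FunctionName("PollUntilDone")]
public static async Task RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    while (!await context.CallActivityAsync<bool>("CheckJobStatus", null))
    {
        // Durable timer: persisted to history, survives the orchestrator
        // being unloaded, and replays deterministically. Task.Delay or
        // Thread.Sleep here would break the replay pattern.
        DateTime nextCheck = context.CurrentUtcDateTime.AddMinutes(5);
        await context.CreateTimer(nextCheck, CancellationToken.None);
    }
}
```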

I/O operations in orchestrator functions

Because orchestrator functions are deterministic and rely on replay to progress through a process, I/O operations are not allowed within their scope of execution. Replayed I/O operations can not only return or set different values each time, they can also corrupt state at the source or destination, and they can cause scaling and performance problems.

Therefore, when there is a need to perform I/O, invoke activity functions that contain the I/O logic. The input and output data for these activity functions can be moved around using the Azure Storage account connected to the durable function.
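
A sketch of an activity function doing the I/O, assuming the Azure.Storage.Blobs client library and an illustrative container name; only a small, JSON-serializable string travels back to the orchestrator:

```csharp
using Azure.Storage.Blobs;

[FunctionName("ReadBlobText")]
public static async Task<string> ReadBlobText([ActivityTrigger] string blobName)
{
    // I/O is fine here: an activity runs once per invocation and its
    // result is persisted, so replays never repeat this download.
    var client = new BlobClient(
        Environment.GetEnvironmentVariable("AzureWebJobsStorage"),
        "my-container", // assumed container name
        blobName);

    var download = await client.DownloadContentAsync();
    return download.Value.Content.ToString();
}
```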

HTTP endpoint calls in Durable Functions

From Durable Functions 2.0 onwards, orchestrator functions can invoke HTTP APIs. The orchestrator sends the HTTP request, and the response is serialized and persisted as a message in the Durable Functions storage provider, which keeps orchestration replay reliable and safe. However, HTTP calls made from orchestrator functions do not support all of the request features available with a native HttpClient. Some limitations are:

  • Additional latency is observed for HTTP requests made from within orchestrator functions, and customization of the HttpClient provided by the DurableOrchestrationContext is limited.
  • Orchestration performance degrades with large request or response messages, because large messages have to be compressed/decompressed and stored in/retrieved from blobs instead of queues.
  • Payloads of type Stream or byte[] are not supported.
  • For authorization, only Azure AD tokens acquired through managed identities are supported.

If any of these limitations affect your solution, it is advisable to make such calls from activity functions using customized HTTP clients.
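
For calls that do fit within the limitations, the built-in durable HTTP call looks roughly like this (the URL is a placeholder); the response is persisted to history, so replays do not re-issue the request:

```csharp
using System.Net;
using System.Net.Http;

[FunctionName("CheckServiceHealth")]
public static async Task<bool> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    DurableHttpResponse response = await context.CallHttpAsync(
        HttpMethod.Get, new Uri("https://example.com/api/health"));

    return response.StatusCode == HttpStatusCode.OK;
}
```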

Identity management in durable functions

HTTP APIs invoked from within durable functions can only use Azure Active Directory (Azure AD) tokens acquired through Azure managed identities for authorization. The orchestration context can acquire the OAuth 2.0 token, attach it to the associated HTTP request as a bearer token, and handle the refresh and disposal of these tokens implicitly.

This is a boon for those who can use Azure AD tokens for their use cases. However, if the API or service being invoked uses an identity mechanism other than Azure managed identities, then obtaining, maintaining, and refreshing tokens must be performed as part of request execution, and all of this has to happen within the scope of activity functions.
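
A sketch of attaching a managed-identity token to a durable HTTP call; the ARM endpoint and the {subscriptionId} placeholder are purely illustrative:

```csharp
[FunctionName("ListResourceGroups")]
public static async Task<string> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    var request = new DurableHttpRequest(
        HttpMethod.Get,
        new Uri("https://management.azure.com/subscriptions/{subscriptionId}" +
                "/resourcegroups?api-version=2021-04-01"),
        tokenSource: new ManagedIdentityTokenSource("https://management.core.windows.net/"));

    // The framework acquires the Azure AD token through the app's managed
    // identity, attaches it as a bearer token, and refreshes it as needed.
    DurableHttpResponse response = await context.CallHttpAsync(request);
    return response.Content;
}
```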

These are some of the constraints of Azure Durable Functions to keep in mind while designing a solution with them. Durable Functions solve crucial use cases with a serverless approach; it is just a matter of understanding whether the use case you have chosen to solve with Durable Functions fits within this set of constraints.

