
Mastering .NET 8 Resilience Pipelines: Internals, Custom Strategies, and Production-Grade Patterns

Introduction

Most articles stop at “add retry and circuit breaker”.

Real systems fail in complex, nonlinear ways—timeouts interact with retries, retries amplify load, and circuit breakers can become global bottlenecks.

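Resilience pipelines in .NET 8 do not replace Polly; they are built on Polly v8 (surfaced through the Microsoft.Extensions.Resilience and Microsoft.Extensions.Http.Resilience packages), and they represent a new execution model for fault handling in ASP.NET Core.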

This article goes beyond configuration and explores:

  • Pipeline execution internals

  • Strategy ordering and side effects

  • Custom resilience strategies

  • Per-request dynamic policies

  • Keyed pipelines and tenant isolation

  • Testing failure behavior deterministically

This is production-grade resilience engineering.

Mental Model: How a Resilience Pipeline Executes

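A resilience pipeline is a decorated execution chain. Each strategy wraps everything added after it, so the first strategy added is the outermost. For example: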

Request
 ↓
Timeout
 ↓
Retry
 ↓
Circuit Breaker
 ↓
Hedging
 ↓
Actual Operation

Key insight:

Order matters more than configuration

Example mistake:

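  • Retry outside timeout → the timeout applies to each attempt, so the worst case is (attempts × timeout) plus the retry delays

  • Timeout outside retry → one time budget covers all attempts; if a single slow attempt uses it up, retries never happen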
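
The difference between the two orderings is easy to observe. The following is a minimal sketch of a console program, assuming the Polly v8 NuGet package is referenced. The CountAttemptsAsync and NewRetry helpers and the 100 ms / 5 s durations are illustrative choices, not values from a real system.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.Retry;
using Polly.Timeout;

// Retry is added first, so it wraps the timeout: each attempt gets its own 100 ms.
ResiliencePipeline perAttemptTimeout = new ResiliencePipelineBuilder()
    .AddRetry(NewRetry())
    .AddTimeout(TimeSpan.FromMilliseconds(100))
    .Build();

// Timeout is added first, so it wraps the retry: all attempts share one 100 ms budget.
ResiliencePipeline totalTimeout = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromMilliseconds(100))
    .AddRetry(NewRetry())
    .Build();

int perAttemptCount = await CountAttemptsAsync(perAttemptTimeout);
int totalCount = await CountAttemptsAsync(totalTimeout);

Console.WriteLine($"retry outside timeout: {perAttemptCount} attempts"); // 3
Console.WriteLine($"timeout outside retry: {totalCount} attempts");      // 1

// Hypothetical helper: immediate retries keep the demo short.
static RetryStrategyOptions NewRetry() =>
    new RetryStrategyOptions { MaxRetryAttempts = 2, Delay = TimeSpan.Zero };

// Hypothetical helper: runs an operation that is always too slow and counts attempts.
static async Task<int> CountAttemptsAsync(ResiliencePipeline pipeline)
{
    int attempts = 0;
    try
    {
        await pipeline.ExecuteAsync(async token =>
        {
            attempts++;
            await Task.Delay(TimeSpan.FromSeconds(5), token);
        });
    }
    catch (TimeoutRejectedException)
    {
        // Expected in both orderings: the operation never finishes in time.
    }

    return attempts;
}
```

With the retry outside, every attempt times out and is retried. With the timeout outside, the first slow attempt consumes the whole budget and the cancellation is not retried.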

Strategy Ordering: The Correct Default

Recommended Order

Hedging
 → Timeout
   → Retry
     → Circuit Breaker
       → Operation

Why?

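  • Hedging is outermost, so each hedged attempt runs the full timeout, retry, and circuit breaker chain

  • Timeout caps the total time of each hedged attempt, retries included

  • Retry handles transient faults inside that time budget

  • Circuit breaker is innermost, so it observes the outcome of every individual attempt, not only the final result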

Building a Fully Controlled Pipeline

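// The first type argument is the key type, the second is the result type.
builder.Services.AddResiliencePipeline<string, HttpResponseMessage>(
    "advanced-pipeline",
    pipeline =>
    {
        pipeline
            // Outermost: start another attempt if none has completed within 150 ms.
            .AddHedging(new HedgingStrategyOptions<HttpResponseMessage>
            {
                MaxHedgedAttempts = 2, // in addition to the original attempt
                Delay = TimeSpan.FromMilliseconds(150)
            })
            // Time budget for each hedged attempt, retries included.
            .AddTimeout(TimeSpan.FromSeconds(2))
            .AddRetry(new RetryStrategyOptions<HttpResponseMessage>
            {
                MaxRetryAttempts = 2,
                // The default delay of 2 seconds would use up the whole timeout
                // before the first retry could start.
                Delay = TimeSpan.FromMilliseconds(200),
                ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
                    .Handle<HttpRequestException>()
                    // Server errors only; most 4xx responses fail the same way on retry.
                    .HandleResult(r => (int)r.StatusCode >= 500)
            })
            // Innermost: sees the outcome of every individual attempt.
            .AddCircuitBreaker(new CircuitBreakerStrategyOptions<HttpResponseMessage>
            {
                FailureRatio = 0.4,
                MinimumThroughput = 20,
                SamplingDuration = TimeSpan.FromSeconds(30),
                BreakDuration = TimeSpan.FromSeconds(10)
            });
    });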

This pipeline:

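  • Starts up to two extra parallel attempts when a response takes longer than 150 ms

  • Cancels each hedged attempt, retries included, after 2 seconds

  • Stops calling the downstream for 10 seconds once at least 40% of 20 or more calls in a 30-second window have failed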
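
Because hedging wraps retry, the two settings multiply. As a rough sketch of the arithmetic (the FanOut class and its method name are hypothetical, introduced only for illustration), the upper bound on downstream calls per logical request can be computed like this:

```csharp
using System;

public static class FanOut
{
    // Upper bound on downstream calls for one logical request when hedging wraps retry:
    // every hedged execution can run its initial attempt plus all of its retries.
    public static int WorstCaseCalls(int maxHedgedAttempts, int maxRetryAttempts)
    {
        if (maxHedgedAttempts < 0 || maxRetryAttempts < 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(maxHedgedAttempts), "Attempt counts must not be negative.");
        }

        int parallelExecutions = 1 + maxHedgedAttempts;  // original + hedged attempts
        int attemptsPerExecution = 1 + maxRetryAttempts; // initial + retries
        return parallelExecutions * attemptsPerExecution;
    }
}
```

For the settings above, FanOut.WorstCaseCalls(2, 2) gives 9: a single slow request can turn into as many as nine calls against a downstream that is already struggling. The timeout and the circuit breaker are what keep that bound from being reached for long.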

Keyed Pipelines: Tenant-Level Isolation (Advanced)

One of the least discussed but most powerful features is keyed pipelines.

Problem

  • A single failing tenant trips the circuit breaker

  • All tenants suffer

Solution: Keyed Pipelines

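// Register the registry so that per-tenant pipelines can be created on demand.
builder.Services.AddResiliencePipelineRegistry<string>();

Usage

// registry is an injected ResiliencePipelineRegistry<string>.
// The first call for a tenant builds its pipeline; later calls reuse it,
// so every tenant keeps its own circuit breaker state.
var tenantPipeline = registry.GetOrAddPipeline<HttpResponseMessage>(
    tenantId,
    (pipeline, context) =>
    {
        pipeline.AddCircuitBreaker(new CircuitBreakerStrategyOptions<HttpResponseMessage>
        {
            // context.PipelineKey is the tenant id this pipeline is being built for.
            FailureRatio = context.PipelineKey == "premium" ? 0.7 : 0.3,
            BreakDuration = TimeSpan.FromSeconds(10)
        });
    });

var response = await tenantPipeline.ExecuteAsync(
    async token => await httpClient.GetAsync(url, token));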

✔ One tenant fails → others remain healthy
✔ Enterprise-grade isolation
✔ Zero gateway dependency
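
The isolation can be checked in a few lines. This is a minimal sketch, assuming the Polly v8 package and a standalone ResiliencePipelineRegistry<string>; the ForTenant helper, the tenant ids, and the deliberately low thresholds are illustrative assumptions.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;
using Polly.Registry;

var registry = new ResiliencePipelineRegistry<string>();

// Tenant A fails twice, which is enough to open tenant A's circuit with the settings below.
for (int i = 0; i < 2; i++)
{
    try
    {
        await ForTenant("tenant-a").ExecuteAsync(async _ =>
        {
            await Task.Yield();
            throw new InvalidOperationException("downstream failure");
        });
    }
    catch (InvalidOperationException)
    {
        // Expected: the simulated downstream failure.
    }
}

bool tenantARejected = false;
try
{
    await ForTenant("tenant-a").ExecuteAsync(_ => ValueTask.CompletedTask);
}
catch (BrokenCircuitException)
{
    tenantARejected = true; // tenant A's circuit is open
}

bool tenantBServed = true;
try
{
    await ForTenant("tenant-b").ExecuteAsync(_ => ValueTask.CompletedTask);
}
catch (BrokenCircuitException)
{
    tenantBServed = false;
}

Console.WriteLine($"tenant-a rejected: {tenantARejected}, tenant-b served: {tenantBServed}");

// Hypothetical helper: one pipeline, and therefore one circuit breaker, per tenant id.
ResiliencePipeline ForTenant(string tenantId) =>
    registry.GetOrAddPipeline(tenantId, builder =>
    {
        builder.AddCircuitBreaker(new CircuitBreakerStrategyOptions
        {
            FailureRatio = 0.5,
            MinimumThroughput = 2, // low on purpose so the demo trips quickly
            SamplingDuration = TimeSpan.FromSeconds(30),
            BreakDuration = TimeSpan.FromSeconds(10)
        });
    });
```

Under these assumptions tenant-a's third call should be rejected while tenant-b is still served, because the two tenants never share circuit breaker state.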

Per-Request Dynamic Resilience (Rarely Covered)

Sometimes resilience cannot be static.

Example:

  • POST → no retry

  • GET → retry allowed

  • Admin requests → no hedging

Dynamic Override Using Context

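// Typed key shared by the caller and the retry predicate (declare it once, for example as a static field).
static readonly ResiliencePropertyKey<bool> AllowRetryKey = new("AllowRetry");

// Contexts are rented from a pool; they cannot be constructed directly.
ResilienceContext context = ResilienceContextPool.Shared.Get(cancellationToken);
context.Properties.Set(AllowRetryKey, false);

try
{
    await pipeline.ExecuteAsync(
        async ctx => await httpClient.PostAsync(url, content, ctx.CancellationToken),
        context);
}
finally
{
    // Always hand the context back to the pool.
    ResilienceContextPool.Shared.Return(context);
}

Custom Retry Predicate

// Inside the RetryStrategyOptions<HttpResponseMessage> initializer:
ShouldHandle = args =>
{
    // The caller opted out of retries for this request.
    if (args.Context.Properties.TryGetValue(AllowRetryKey, out bool allow) && !allow)
        return ValueTask.FromResult(false);

    // Otherwise retry only failed outcomes, never successful responses.
    return ValueTask.FromResult(
        args.Outcome.Exception is HttpRequestException ||
        args.Outcome.Result is { IsSuccessStatusCode: false });
},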

This enables business-aware resilience, not blind retries.
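
The value of the flag can be derived from the request itself. As a small sketch (the RetryRules class is hypothetical, and the rule is deliberately conservative), only read-only HTTP methods are treated as safe to retry:

```csharp
using System;
using System.Net.Http;

public static class RetryRules
{
    // Conservative rule: retry only methods that do not change server state.
    // PUT and DELETE are idempotent by specification, but whether a given endpoint
    // really honors that is an application decision, so they are excluded here.
    public static bool AllowsRetry(HttpMethod method)
    {
        ArgumentNullException.ThrowIfNull(method);

        return method == HttpMethod.Get
            || method == HttpMethod.Head
            || method == HttpMethod.Options;
    }
}
```

The result of RetryRules.AllowsRetry(request.Method) is the value to store under the AllowRetry property before executing the pipeline, which gives the GET and POST behavior listed earlier without a separate pipeline per method.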

Writing a Custom Resilience Strategy (Ultra-Advanced)

Scenario

You want to:

  • Block traffic during deployments

  • Allow only health checks

Custom Strategy

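public sealed class DeploymentBlockStrategy : ResilienceStrategy
{
    // Polly v8 strategies receive a state argument and work with Outcome<TResult>.
    protected override ValueTask<Outcome<TResult>> ExecuteCore<TResult, TState>(
        Func<ResilienceContext, TState, ValueTask<Outcome<TResult>>> callback,
        ResilienceContext context,
        TState state)
    {
        // Blocks every call routed through this pipeline. Health checks stay
        // reachable only if they do not run through it.
        if (DeploymentState.IsInProgress)
        {
            // Report the failure as an outcome instead of throwing inside the strategy.
            return Outcome.FromExceptionAsValueTask<TResult>(
                new InvalidOperationException("Deployment in progress"));
        }

        return callback(context, state);
    }
}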

Registering It

pipeline.AddStrategy(new DeploymentBlockStrategy());

This is framework-level extensibility, not middleware hacks.
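
DeploymentState is not defined in the snippet above. One possible shape, shown purely as a hypothetical sketch, is a process-wide flag that the deployment tooling flips through Begin and End. Both method names are assumptions; only IsInProgress appears in the original code.

```csharp
using System.Threading;

public static class DeploymentState
{
    // 0 = normal operation, 1 = deployment in progress.
    private static int _inProgress;

    // Read by the strategy on every execution, so it must be cheap and thread-safe.
    public static bool IsInProgress => Volatile.Read(ref _inProgress) == 1;

    // Hypothetical hooks for the deployment process to call.
    public static void Begin() => Interlocked.Exchange(ref _inProgress, 1);

    public static void End() => Interlocked.Exchange(ref _inProgress, 0);
}
```

A static flag covers a single process only. In a multi-instance deployment the same property would more likely read from shared configuration or a feature flag service.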

Deterministic Testing of Failure Scenarios

Problem

Resilience code is notoriously hard to test.

Solution: Virtual Time & Controlled Failures

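var attempts = 0;

var pipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 3,
        // Retry immediately; the default 2-second delay would slow the test down.
        Delay = TimeSpan.Zero
    })
    .Build();

// ExecuteAsync returns a ValueTask, so convert it for Assert.ThrowsAsync.
await Assert.ThrowsAsync<HttpRequestException>(() =>
    pipeline.ExecuteAsync(async _ =>
    {
        attempts++;
        throw new HttpRequestException();
    }).AsTask());

Assert.Equal(4, attempts); // Initial attempt + 3 retries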

You can now:

  • Assert retry counts

  • Assert circuit breaker transitions

  • Validate timeouts deterministically
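
Circuit breaker transitions can be asserted the same way. The sketch below assumes Polly v8 and uses a CircuitBreakerStateProvider to read the state from outside the pipeline. The thresholds are kept deliberately small so that two failures are enough, and plain console output stands in for test assertions.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

var stateProvider = new CircuitBreakerStateProvider();

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddCircuitBreaker(new CircuitBreakerStrategyOptions
    {
        FailureRatio = 0.5,
        MinimumThroughput = 2, // two calls are enough to evaluate the ratio
        SamplingDuration = TimeSpan.FromSeconds(30),
        BreakDuration = TimeSpan.FromSeconds(10),
        StateProvider = stateProvider // exposes the current state to the test
    })
    .Build();

CircuitState before = stateProvider.CircuitState;

// Two controlled failures: a 100% failure ratio at the minimum throughput.
for (int i = 0; i < 2; i++)
{
    try
    {
        await pipeline.ExecuteAsync(async _ =>
        {
            await Task.Yield();
            throw new InvalidOperationException("simulated failure");
        });
    }
    catch (InvalidOperationException)
    {
        // Expected: the failure is recorded by the circuit breaker and rethrown.
    }
}

CircuitState after = stateProvider.CircuitState;

Console.WriteLine($"{before} -> {after}");
```

Under these settings the state is expected to move from Closed to Open after the second failure. In a unit test the two Console values would become assertions on before and after.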

Observability: Signals That Actually Matter

Forget logging “retry happened.”

Track:

  • Retry amplification factor

  • Circuit breaker open duration

  • Hedged request waste ratio

  • Timeout vs cancellation ratio
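
Two of these signals are simple ratios over counters. The article does not define them precisely, so the definitions below are assumptions: amplification is total attempts divided by logical requests, and hedge waste is the share of hedged attempts whose result was discarded. The type and member names are hypothetical.

```csharp
using System;

// Counters an application could collect from retry and hedging callbacks.
public readonly record struct ResilienceCounters(
    long LogicalRequests,
    long TotalAttempts,
    long HedgedAttempts,
    long DiscardedHedgedAttempts);

public static class ResilienceMetrics
{
    // 1.0 means no extra load; 1.3 means 30% more downstream calls than requests.
    public static double RetryAmplification(ResilienceCounters c) =>
        c.LogicalRequests == 0 ? 0 : (double)c.TotalAttempts / c.LogicalRequests;

    // Share of hedged attempts that did not produce the response actually used.
    public static double HedgeWasteRatio(ResilienceCounters c) =>
        c.HedgedAttempts == 0 ? 0 : (double)c.DiscardedHedgedAttempts / c.HedgedAttempts;
}
```

For example, 1,000 logical requests that caused 1,300 attempts give an amplification factor of 1.3, and 150 discarded results out of 200 hedged attempts give a waste ratio of 0.75. A rising amplification factor during an incident is an early sign that retries are adding to the load rather than absorbing it.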

Hooking into Events

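var retryOptions = new RetryStrategyOptions
{
    // OnRetry is a callback on the retry options, not a method on the pipeline.
    OnRetry = args =>
    {
        logger.LogWarning(
            "Retry {Attempt} due to {Reason}",
            args.AttemptNumber, // zero-based
            args.Outcome.Exception?.Message);

        return default; // the callback must return a ValueTask
    }
};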

Common Anti-Patterns (Seen in Production)

  • Retrying non-idempotent writes (POST, or PUT against endpoints that are not truly idempotent)

  • Long retry delays

  • Global circuit breakers

  • Hedging on write operations

  • Timeouts longer than SLA

When Resilience Pipelines Change Architecture Decisions

With proper pipelines:

  • API Gateways become thinner

  • Fewer background queues needed

  • Faster recovery under partial outages

  • Better SLO compliance

This isn’t just a library—it’s an architectural primitive.

Final Thoughts

.NET 8 Resilience Pipelines are:

  • Composable

  • Context-aware

  • Extensible

  • Enterprise-ready

Most developers will use 10% of their power.

The remaining 90% is where high-scale, fault-tolerant systems are built.

If you master this, you’re not just writing APIs—you’re designing resilient distributed systems.