I’ll show a minimal working example (a drop-in `Program.cs`), explain middleware placement, per-endpoint policies, custom rejection handling, testing, and how to scale out with Redis for multi-node deployments. I’ll also call out common pitfalls and quick references. The official Microsoft Learn samples and docs for ASP.NET Core’s built-in rate limiter are the primary sources below.
1. Prerequisites

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
```

(Everything ships with .NET 7+ in the `Microsoft.AspNetCore` shared framework; no extra NuGet package is needed for basic use.)
2. Minimal working example (full `Program.cs`)
```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// 1) Register rate limiting service and policies
builder.Services.AddRateLimiter(options =>
{
    // set default rejection status code (optional)
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // global OnRejected handler (sets Retry-After when available)
    options.OnRejected = (context, ct) =>
    {
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString();
        }
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        return ValueTask.CompletedTask;
    };

    // Example policy: fixed window limiter named "FixedApi"
    options.AddFixedWindowLimiter("FixedApi", opt =>
    {
        opt.PermitLimit = 10;                 // 10 requests
        opt.Window = TimeSpan.FromMinutes(1); // per 1 minute
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 0;                   // don't queue (reject when exhausted)
    });

    // Example token-bucket partitioned by IP for burst control (GlobalLimiter)
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
    {
        var ip = context.Connection.RemoteIpAddress?.ToString() ?? "anon";
        return RateLimitPartition.GetTokenBucketLimiter(ip, _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = 100,
            TokensPerPeriod = 100,
            ReplenishmentPeriod = TimeSpan.FromMinutes(1),
            AutoReplenishment = true,
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
            QueueLimit = 0
        });
    });
});

var app = builder.Build();

// Middleware ordering: authentication -> UseRateLimiter -> authorization
app.UseHttpsRedirection();
app.UseAuthentication();
app.UseRateLimiter(); // run limiter after auth (so policies can use auth info). Important.
app.UseAuthorization();

// Apply per-endpoint policy
app.MapGet("/open", () => "open endpoint (global limiter applies)");
app.MapGet("/limited", () => "this endpoint is fixed-window limited")
   .RequireRateLimiting("FixedApi");

app.Run();
```
`AddFixedWindowLimiter` creates a named policy that you attach to endpoints with `.RequireRateLimiting("FixedApi")`. `GlobalLimiter` runs before any endpoint-specific limiter; here it applies an IP-partitioned token bucket to every request.
3. Where to put `UseRateLimiter()` (order)

Call `app.UseRateLimiter()` after `UseAuthentication` and before `UseAuthorization`, as in the example above: running it after authentication lets policies and partition keys use the authenticated user, while rejected requests are still short-circuited before the endpoint executes. When you use endpoint-specific policies, `UseRateLimiter` must also run after `UseRouting` (in minimal APIs, routing is wired up implicitly).
4. Rejection handling & headers

You can set `RejectionStatusCode` (the default is 503 Service Unavailable) and use `OnRejected` to add a `Retry-After` header or a custom response body. The snippet above sets 429 and reads `MetadataName.RetryAfter` when the limiter provides it.

Note: the concurrency limiter can’t calculate `Retry-After`, so that metadata is never available for it.
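As a sketch of a richer rejection response, `OnRejected` can write a body as well as headers (the JSON shape here is an illustrative assumption, not a documented format):

```csharp
// Inside builder.Services.AddRateLimiter(options => { ... }):
options.OnRejected = async (context, ct) =>
{
    var response = context.HttpContext.Response;
    response.StatusCode = StatusCodes.Status429TooManyRequests;

    // Retry-After is only available for limiters that can compute it
    // (fixed/sliding window, token bucket) — not the concurrency limiter.
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        response.Headers.RetryAfter = ((int)retryAfter.TotalSeconds).ToString();
    }

    response.ContentType = "application/json";
    await response.WriteAsync(
        "{\"error\":\"too_many_requests\",\"message\":\"Rate limit exceeded. Try again later.\"}",
        ct);
};
```

Writing the body yourself is useful when API clients expect machine-readable errors rather than an empty 429.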
5. Quick policy recipes (APIs you’ll use)

- `options.AddFixedWindowLimiter("name", opts => { ... })`: fixed window (simple count per window).
- `options.AddSlidingWindowLimiter("name", opts => { ... })`: smoother rolling window.
- `RateLimitPartition.GetTokenBucketLimiter(key, factory)`: token bucket for bursts.
- `options.AddConcurrencyLimiter("name", opts => { ... })`: limit concurrent in-flight requests.

(These helpers are all part of the built-in rate limiter system.)
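A sketch of the two recipes not shown in the full example (the policy names `"SlidingApi"` and `"HeavyWork"` and the option values are illustrative, not recommendations):

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Sliding window: 100 permits per minute, tracked across 6 ten-second segments,
    // so the count "slides" instead of resetting all at once.
    options.AddSlidingWindowLimiter("SlidingApi", opt =>
    {
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
        opt.SegmentsPerWindow = 6;
        opt.QueueLimit = 0;
    });

    // Concurrency: at most 20 requests in flight at once; up to 10 more may queue.
    options.AddConcurrencyLimiter("HeavyWork", opt =>
    {
        opt.PermitLimit = 20;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });
});
```

The concurrency limiter is the right tool for expensive endpoints (reports, exports) where the cost is "how many at once", not "how many per minute".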
6. Per-endpoint vs Global
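Named policies apply only where you opt in with `RequireRateLimiting`; `GlobalLimiter` applies to every request before any endpoint policy runs. A sketch of the opt-in/opt-out surface (the `"FixedApi"` policy is the one registered in the earlier example; the routes are illustrative):

```csharp
// Attach a named policy to a whole route group at once.
var api = app.MapGroup("/api").RequireRateLimiting("FixedApi");
api.MapGet("/orders", () => "limited along with the rest of the group");

// Opt a single endpoint out of rate limiting entirely (e.g. health probes).
app.MapGet("/health", () => "ok").DisableRateLimiting();

// Controllers can use the [EnableRateLimiting("FixedApi")] and
// [DisableRateLimiting] attributes for the same effect.
```

Even endpoints with no named policy are still subject to `GlobalLimiter` unless explicitly disabled.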
7. Testing (quick curl examples)

Call the limited endpoint repeatedly:

```shell
# hit the fixed-window endpoint 12 times
for i in $(seq 1 12); do curl -i http://localhost:5000/limited; echo; done
```
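To check the limit at a glance instead of scrolling through full responses, a variant that tallies status codes (assumes the app from section 2 is listening on localhost:5000; with a 10-permit window you should see a mix of 200s and 429s):

```shell
# Print only the HTTP status of each call, then count occurrences of each code.
for i in $(seq 1 12); do
  curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5000/limited
done | sort | uniq -c
```
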
8. Scaling to multiple nodes: Redis / distributed counters

The built-in limiters keep their counters in process memory, so they only limit a single instance. For multi-node deployments you need a shared store for the counters; Redis is the common choice, either via a community Redis backplane package or your own Redis-backed counter logic.

Other options: put coarse global rate limits on a CDN/API gateway (Cloudflare, Azure Front Door, AWS API Gateway) and use app-level policies for per-user quotas; many teams combine gateway and app policies for defense in depth.
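A minimal sketch of the shared-counter idea: a fixed-window check against a pluggable store. The interface and class names here are my own, not a library API; the in-memory store exists only so the sketch is runnable, and a production implementation would back `ICounterStore` with Redis (e.g. StackExchange.Redis `StringIncrementAsync` plus `KeyExpireAsync`, ideally made atomic with a Lua script) so all app instances share counters:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Store abstraction: in production, implement this against Redis so every
// node increments the same counter.
public interface ICounterStore
{
    // Increments the counter at `key`, returning the new count.
    // `expiry` is the TTL a Redis implementation would set (EXPIRE).
    Task<long> IncrementAsync(string key, TimeSpan expiry);
}

// In-memory stand-in used only to make the sketch self-contained.
public sealed class InMemoryCounterStore : ICounterStore
{
    private readonly ConcurrentDictionary<string, long> _counts = new();

    public Task<long> IncrementAsync(string key, TimeSpan expiry)
        => Task.FromResult(_counts.AddOrUpdate(key, 1, (_, v) => v + 1));
}

// Fixed-window check: the key embeds the window number, so counters
// roll over automatically when a new window starts.
public sealed class DistributedFixedWindowLimiter
{
    private readonly ICounterStore _store;
    private readonly int _permitLimit;
    private readonly TimeSpan _window;

    public DistributedFixedWindowLimiter(ICounterStore store, int permitLimit, TimeSpan window)
        => (_store, _permitLimit, _window) = (store, permitLimit, window);

    public async Task<bool> TryAcquireAsync(string partitionKey)
    {
        long windowNumber = DateTimeOffset.UtcNow.Ticks / _window.Ticks;
        string key = $"rl:{partitionKey}:{windowNumber}";
        long count = await _store.IncrementAsync(key, _window);
        return count <= _permitLimit;
    }
}
```

You would call `TryAcquireAsync` from custom middleware (or a custom `RateLimiterPolicy`) and return 429 when it yields `false`; the same key scheme works unchanged once the store is Redis-backed.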
9. Production considerations & pitfalls

- Partition key matters: partition by user id or API key for per-user quotas. IP partitioning can be spoofed; the Microsoft docs warn about IP spoofing and DoS, so don’t rely on IP alone for public endpoints.
- Queue vs reject: queueing lets a few extra requests wait, which is good for burst smoothing, but high queue limits can create resource pressure under load.
- Concurrency limiter: it controls concurrent in-flight requests but cannot compute `Retry-After`.
- Layering: push large-scale blocking to the edge (WAF/CDN/API gateway); the app should enforce per-user fairness and fine-grained limits.
- Observability: emit metrics (e.g. Prometheus), log `OnRejected` events, and track per-policy rejections so you can tune limits.
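For the observability point, a sketch of logging rejections from `OnRejected` (the `"RateLimiting"` category name is an arbitrary choice; this fragment goes inside `AddRateLimiter`):

```csharp
// Inside builder.Services.AddRateLimiter(options => { ... }):
options.OnRejected = (context, ct) =>
{
    // Resolve a logger from the request's service provider.
    var logger = context.HttpContext.RequestServices
        .GetRequiredService<ILoggerFactory>()
        .CreateLogger("RateLimiting");

    logger.LogWarning("Rate limit rejection: {Path} from {Ip}",
        context.HttpContext.Request.Path,
        context.HttpContext.Connection.RemoteIpAddress);

    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
    return ValueTask.CompletedTask;
};
```

Structured log fields like these are easy to turn into per-path rejection counts in whatever log/metrics backend you already use.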
10. Useful references / further reading

- Microsoft Learn: rate limiting middleware in ASP.NET Core (core patterns, `OnRejected`, `GlobalLimiter`, sample code).
- Microsoft Learn: client-side rate limiting for `HttpClient` (token-bucket handler examples).
- GitHub: community Redis backplane packages for .NET rate limiting and distributed-counter patterns.