I’ll show a minimal working example (drop-in Program.cs), explain placement, per-endpoint policies, custom rejection handling, testing, and how to scale with Redis for multi-node deployments. I’ll also call out common pitfalls and quick references.
Official Microsoft Learn samples and docs for ASP.NET Core’s built-in rate limiter are the primary sources below.
1. Prerequisites
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
(Everything lives in the Microsoft.AspNetCore ecosystem; the middleware is built into ASP.NET Core 7.0 and later, so no extra NuGet package is needed for basic use.)
2. Minimal working example (full Program.cs)
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
var builder = WebApplication.CreateBuilder(args);
// 1) Register rate limiting service and policies
builder.Services.AddRateLimiter(options =>
{
// set default rejection status code (optional)
options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
// global OnRejected handler (sets Retry-After when available)
options.OnRejected = (context, ct) =>
{
if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
{
context.HttpContext.Response.Headers.RetryAfter =
((int)retryAfter.TotalSeconds).ToString();
}
context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
return ValueTask.CompletedTask;
};
// Example policy: fixed window limiter named "FixedApi"
options.AddFixedWindowLimiter("FixedApi", opt =>
{
opt.PermitLimit = 10; // 10 requests
opt.Window = TimeSpan.FromMinutes(1); // per 1 minute
opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
opt.QueueLimit = 0; // don't queue (reject when exhausted)
});
// Example token-bucket partitioned by IP for burst control (GlobalLimiter)
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
var ip = context.Connection.RemoteIpAddress?.ToString() ?? "anon";
return RateLimitPartition.GetTokenBucketLimiter(ip, _ => new TokenBucketRateLimiterOptions
{
TokenLimit = 100,
TokensPerPeriod = 100,
ReplenishmentPeriod = TimeSpan.FromMinutes(1),
AutoReplenishment = true,
QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
QueueLimit = 0
});
});
});
var app = builder.Build();
// Middleware ordering: authentication -> UseRateLimiter -> authorization
app.UseHttpsRedirection();
app.UseAuthentication();
app.UseRateLimiter(); // run limiter after auth (so policies can use auth info). Important.
app.UseAuthorization();
// Apply per-endpoint policy
app.MapGet("/open", () => "open endpoint (global limiter applies)");
app.MapGet("/limited", () => "this endpoint is fixed-window limited")
.RequireRateLimiting("FixedApi");
app.Run();
AddFixedWindowLimiter creates a named policy that you attach to endpoints with .RequireRateLimiting("FixedApi"). GlobalLimiter runs before endpoint limiters and can apply a global IP-based token bucket. (Microsoft Learn)
3. Where to put UseRateLimiter() (order)
Call UseRateLimiter() after UseAuthentication() and before UseAuthorization(), as in the snippet above, so that rate-limiting policies can read the authenticated user when choosing a partition key.
4. Rejection handling & headers
You can set RejectionStatusCode (the default is 503 Service Unavailable) and use OnRejected to add a Retry-After header or a custom response body. The snippet above sets 429 and reads MetadataName.RetryAfter when the limiter provides it.
Note: The concurrency limiter can’t calculate Retry-After. (Microsoft Learn)
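To return a custom body as well as the header, the global handler can write JSON before completing. A minimal sketch (drop-in replacement for the OnRejected handler in the snippet above; the error payload shape is illustrative, not a standard):

```csharp
// Sketch: an OnRejected handler that also writes a small JSON body.
options.OnRejected = async (context, ct) =>
{
    var response = context.HttpContext.Response;
    response.StatusCode = StatusCodes.Status429TooManyRequests;

    // Retry-After is only available for limiters that can compute it
    // (not the concurrency limiter).
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        response.Headers.RetryAfter = ((int)retryAfter.TotalSeconds).ToString();
    }

    response.ContentType = "application/json";
    await response.WriteAsync(
        """{"error":"rate_limited","message":"Too many requests. Try again later."}""",
        ct);
};
```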
5. Quick policy recipes (APIs you’ll use)
options.AddFixedWindowLimiter("name", opts => { ... }) — fixed window (simple count per window).
options.AddSlidingWindowLimiter("name", opts => { ... }) — smoother rolling window.
RateLimitPartition.GetTokenBucketLimiter(key, factory) — token bucket for bursts.
options.AddConcurrencyLimiter("name", opts => { ... }) — limit concurrent in-flight requests.
(These helpers are part of the built-in rate limiter system; see Microsoft Learn.)
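The two recipes not shown in the full example above can be registered the same way. A sketch with illustrative limits (policy names "SlidingApi" and "HeavyWork" are placeholders):

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Sliding window: 10 requests per minute, counted across 6 ten-second
    // segments, so the allowance rolls forward instead of resetting at once.
    options.AddSlidingWindowLimiter("SlidingApi", opt =>
    {
        opt.PermitLimit = 10;
        opt.Window = TimeSpan.FromMinutes(1);
        opt.SegmentsPerWindow = 6;
        opt.QueueLimit = 0;
    });

    // Concurrency: at most 5 requests in flight at once; up to 10 more
    // wait in a queue, oldest first.
    options.AddConcurrencyLimiter("HeavyWork", opt =>
    {
        opt.PermitLimit = 5;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });
});
```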
6. Per-endpoint vs Global
Named policies apply only to endpoints that opt in with .RequireRateLimiting("name"); GlobalLimiter applies to every request before any endpoint policy runs, so a request to a limited endpoint must pass both.
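A minimal sketch of how the two layers combine (the "/health" endpoint is an illustrative addition):

```csharp
// GlobalLimiter (configured in AddRateLimiter) applies to every request first;
// named policies apply only where an endpoint opts in.
app.MapGet("/open", () => "global limiter only");

app.MapGet("/limited", () => "global limiter + FixedApi policy")
   .RequireRateLimiting("FixedApi");

// Endpoints can also opt out of rate limiting entirely:
app.MapGet("/health", () => "no limiting")
   .DisableRateLimiting();
```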
7. Testing (quick curl examples)
Call the limited endpoint repeatedly and watch the status codes:
# hit the fixed-window endpoint 12 times; after the 10-request limit, expect 429s
for i in $(seq 1 12); do
  curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5000/limited
done
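For automated coverage, the same check can run in-process. A sketch assuming the Microsoft.AspNetCore.Mvc.Testing package, xunit, and a Program class made visible to the test project (e.g. via `public partial class Program { }`):

```csharp
using System.Net;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class RateLimitTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public RateLimitTests(WebApplicationFactory<Program> factory)
        => _client = factory.CreateClient();

    [Fact]
    public async Task EleventhRequestIsRejected()
    {
        // PermitLimit is 10, so the 11th request in the window should fail.
        HttpResponseMessage? last = null;
        for (var i = 0; i < 11; i++)
            last = await _client.GetAsync("/limited");

        Assert.Equal(HttpStatusCode.TooManyRequests, last!.StatusCode);
    }
}
```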
8. Scaling to multiple nodes — Redis / distributed counters
App-level in-memory limiters only count requests on a single instance. For multi-node deployments you need a shared store (Redis is common): either a community Redis backplane package for the .NET rate limiter (see the references below), or a custom limiter backed by Redis counters.
You can also put coarse global limits on a CDN/API gateway (Cloudflare, Azure Front Door, AWS API Gateway) and keep app-level policies for per-user quotas: defense in depth. Many teams combine gateway and app policies.
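The core of a shared counter is small. A sketch of a fixed-window check over StackExchange.Redis (key naming, limits, and the helper itself are illustrative; not a production implementation, and a community backplane package is usually the better choice):

```csharp
using StackExchange.Redis;

public static class RedisFixedWindow
{
    // Returns true if the caller identified by `key` is still within
    // `limit` requests for the current window.
    public static async Task<bool> TryAcquireAsync(
        IDatabase db, string key, int limit, TimeSpan window)
    {
        // Bucket the current time into a window-sized slot so all nodes
        // agree on which counter they are incrementing.
        var slot = DateTimeOffset.UtcNow.ToUnixTimeSeconds() / (long)window.TotalSeconds;
        var bucket = $"rl:{key}:{slot}";

        // INCR is atomic across every app instance sharing this Redis.
        var count = await db.StringIncrementAsync(bucket);
        if (count == 1)
        {
            // First hit in this window: let the key expire with the window.
            await db.KeyExpireAsync(bucket, window);
        }
        return count <= limit;
    }
}
```

Call it from custom middleware before passing the request down the pipeline, and short-circuit with a 429 when it returns false.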
9. Production considerations & pitfalls
Partition key matters: partition by user id or API key for per-user quotas; IP partitioning can be spoofed, and Microsoft’s docs warn about IP spoofing and DoS. Don’t rely on client IP alone for public endpoints. (Microsoft Learn)
Queue vs reject: queueing lets a few extra requests wait — good for burst smoothing. But high queue limits can lead to resource pressure.
Concurrency limiter: it controls concurrent in-flight requests but cannot compute Retry-After. (Microsoft Learn)
Layering: push large-scale blocking to edge (WAF/CDN/API gateway) — apps should enforce per-user fairness and fine-grained limits.
Observability: emit metrics (Prometheus), log OnRejected events, track per-policy rejects so you can tune limits.
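The partition-key pitfall above can be sketched as a drop-in replacement for the GlobalLimiter in the earlier snippet: key on the authenticated user and fall back to IP only for anonymous callers (limits are illustrative; requires UseAuthentication to run before UseRateLimiter so User is populated):

```csharp
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    // Prefer a stable per-user identity; IP is only the anonymous fallback.
    var key = context.User.Identity?.IsAuthenticated == true
        ? $"user:{context.User.Identity.Name}"
        : $"ip:{context.Connection.RemoteIpAddress?.ToString() ?? "anon"}";

    return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1),
        QueueLimit = 0
    });
});
```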
10. Useful references / further reading
Microsoft Learn: rate limiting overview and samples (core patterns, OnRejected, GlobalLimiter, sample code).
Microsoft Learn: client-side HttpClient rate limiting (token-bucket handler examples).
GitHub: distributed patterns and a community Redis backplane repo for .NET rate limiting.