How to Implement API Rate Limiting in ASP.NET Core Using Middleware

Introduction

When building production-grade APIs in ASP.NET Core, uncontrolled traffic is one of the most common causes of performance degradation and security exposure. High-frequency requests—whether from legitimate spikes, poorly implemented clients, or malicious bots—can overwhelm your application.

API rate limiting introduces a controlled access policy: a client is allowed a fixed number of requests within a defined time window. Once the threshold is exceeded, further requests are temporarily rejected.

This guide demonstrates how to implement API rate limiting using custom middleware in ASP.NET Core, along with practical reasoning, real-world scenarios, and production considerations.

What is API Rate Limiting?

API rate limiting is a mechanism that restricts the number of API calls a client can make over a specified duration. It is a foundational aspect of ASP.NET Core API security and performance optimization.

Real-world analogy

Consider a customer support line that allows only a limited number of calls per minute per user. This ensures fair access and prevents system overload.

Why it matters

  • Prevents API abuse and brute-force attacks

  • Maintains consistent application performance

  • Protects backend resources (CPU, memory, DB)

  • Ensures fair usage across clients

How API Rate Limiting Works Internally

At a system level, rate limiting tracks:

  • Client identity (IP address, API key, or user ID)

  • Number of requests made

  • Time window for those requests

Based on these values, the system decides whether to:

  • Allow the request

  • Reject the request with HTTP 429 (Too Many Requests)

Step 1: Create a Rate Limit Options Class

public class RateLimitOptions
{
    public int MaxRequests { get; set; } = 5;
    public TimeSpan TimeWindow { get; set; } = TimeSpan.FromMinutes(1);
}

Explanation

This configuration class defines the rate limiting policy:

  • MaxRequests: Maximum number of allowed requests within the time window

  • TimeWindow: Duration in which those requests are counted

This abstraction allows easy configuration changes without modifying middleware logic.
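Because the policy lives in a plain options class, it can also be bound from configuration instead of being hard-coded. A minimal sketch, assuming an appsettings.json section named "RateLimit" (the section name is an assumption, not part of the code above):

```csharp
// appsettings.json (illustrative):
// "RateLimit": { "MaxRequests": 5, "TimeWindow": "00:01:00" }

// In Program.cs: bind the configuration section onto the options class.
// TimeSpan values bind from "hh:mm:ss" strings.
var rateLimitOptions = builder.Configuration
    .GetSection("RateLimit")
    .Get<RateLimitOptions>() ?? new RateLimitOptions();
```

Operators can then tune the limit per environment without a recompile.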

Step 2: Create a Request Tracking Model

public class ClientRequestInfo
{
    public int RequestCount { get; set; }
    public DateTime WindowStart { get; set; }
}

Explanation

This model stores request metadata per client:

  • RequestCount: Tracks how many requests have been made

  • WindowStart: Marks the beginning of the current time window

This enables accurate reset logic once the time window expires.

Step 3: Implement the Rate Limiting Middleware

using System.Collections.Concurrent;

public class RateLimitingMiddleware
{
    private readonly RequestDelegate _next;
    private readonly RateLimitOptions _options;
    private static readonly ConcurrentDictionary<string, ClientRequestInfo> _clients = new();

    public RateLimitingMiddleware(RequestDelegate next, RateLimitOptions options)
    {
        _next = next;
        _options = options;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var clientIp = context.Connection.RemoteIpAddress?.ToString();

        if (string.IsNullOrEmpty(clientIp))
        {
            await _next(context);
            return;
        }

        var client = _clients.GetOrAdd(clientIp, _ => new ClientRequestInfo
        {
            RequestCount = 0,
            WindowStart = DateTime.UtcNow
        });

        bool limitExceeded;

        // Decide inside the lock, but never await while holding it.
        lock (client)
        {
            if (DateTime.UtcNow - client.WindowStart > _options.TimeWindow)
            {
                client.RequestCount = 0;
                client.WindowStart = DateTime.UtcNow;
            }

            client.RequestCount++;
            limitExceeded = client.RequestCount > _options.MaxRequests;
        }

        if (limitExceeded)
        {
            context.Response.StatusCode = 429;
            context.Response.Headers["Retry-After"] =
                ((int)_options.TimeWindow.TotalSeconds).ToString();
            await context.Response.WriteAsync("Too many requests. Try again later.");
            return;
        }

        await _next(context);
    }
}

Explanation

This middleware executes on every request and applies the rate limiting logic:

Client Identification

  • Uses RemoteIpAddress to uniquely identify the requester (behind a reverse proxy or load balancer, enable forwarded-headers middleware so this reflects the real client rather than the proxy)

Data Storage

  • Uses ConcurrentDictionary to safely store request data in multi-threaded scenarios

Window Management

  • Checks whether the current time window has expired

  • Resets request count if needed

Enforcement

  • Increments request count

  • If limit is exceeded:

    • Returns HTTP 429

    • Adds Retry-After header to guide clients

Thread Safety

  • Uses lock to ensure consistent updates for each client

Step 4: Register Middleware in Program.cs

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

var app = builder.Build();

var rateLimitOptions = new RateLimitOptions
{
    MaxRequests = 5,
    TimeWindow = TimeSpan.FromMinutes(1)
};

app.UseMiddleware<RateLimitingMiddleware>(rateLimitOptions);

app.MapControllers();

app.Run();

Explanation

  • Middleware is injected into the HTTP pipeline using UseMiddleware

  • It executes before controllers

  • Every incoming request is evaluated against rate limiting rules

Step 5: Testing the Implementation

Tools

  • Postman

  • Curl

  • Browser

Expected Behavior

  • First 5 requests → Allowed

  • Additional requests → Blocked with HTTP 429

This confirms that the ASP.NET Core rate limiting middleware is functioning correctly.
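With the options above (5 requests per minute), a quick loop with curl makes the cutoff visible. The URL, port, and endpoint below are placeholders; substitute your own:

```shell
# Fire 7 requests in quick succession and print only the status codes.
# Adjust the URL to match your application's address and routes.
for i in $(seq 1 7); do
  curl -s -o /dev/null -w "request $i -> HTTP %{http_code}\n" http://localhost:5000/api/orders
done
```

The first five requests should report a success status and the remaining two should report 429.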

Before vs After Impact

Without Rate Limiting

  • Increased server load

  • Higher latency

  • Vulnerability to abuse

With Rate Limiting

  • Controlled request flow

  • Improved stability

  • Enhanced API security

Common Pitfalls

Using Only IP Address

  • Multiple users may share the same IP

  • Leads to unintended blocking
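One mitigation is to key on an API key when the client supplies one and fall back to the IP otherwise. A sketch of the idea for the middleware above; the X-Api-Key header name is an assumption:

```csharp
// Inside InvokeAsync: prefer an explicit API key over the raw IP
// (the header name "X-Api-Key" is illustrative).
var apiKey = context.Request.Headers["X-Api-Key"].ToString();
var clientKey = !string.IsNullOrEmpty(apiKey)
    ? $"key:{apiKey}"
    : $"ip:{context.Connection.RemoteIpAddress}";

// Use clientKey instead of clientIp when calling _clients.GetOrAdd(...)
```

Prefixing the key ("key:" vs "ip:") keeps the two identity spaces from colliding.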

In-Memory Storage Limitation

  • Data is lost on application restart, and entries for inactive clients accumulate unless evicted

  • Not suitable for distributed systems

Missing Retry Headers

  • Clients lack guidance on when to retry

Over-restricting Endpoints

  • Critical endpoints (health checks) should remain unrestricted
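A simple way to exempt such endpoints is a path check at the top of the middleware. A sketch, assuming the health endpoint lives at /health:

```csharp
// At the start of InvokeAsync: bypass rate limiting for health probes
// (the "/health" path is an assumption; match it to your routing).
if (context.Request.Path.StartsWithSegments("/health"))
{
    await _next(context);
    return;
}
```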

Production Considerations

Recommended Enhancements

  • Use Redis for distributed rate limiting

  • Implement advanced algorithms (sliding window, token bucket)

  • Apply per-user and per-endpoint policies

  • Integrate logging and monitoring
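As an illustration of one such algorithm, here is a minimal token-bucket sketch (a standalone class, not part of the middleware above): tokens refill continuously at a fixed rate and each request consumes one, so short bursts up to the capacity are allowed while the sustained rate stays bounded.

```csharp
using System;

// Minimal token-bucket sketch. Unlike the fixed window, there is no hard
// reset boundary: capacity governs burst size, refill rate governs throughput.
public class TokenBucket
{
    private readonly object _sync = new();
    private readonly double _capacity;        // maximum tokens (burst size)
    private readonly double _refillPerSecond; // sustained request rate
    private double _tokens;
    private DateTime _lastRefill;

    public TokenBucket(double capacity, double refillPerSecond)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _tokens = capacity;            // start full so initial bursts succeed
        _lastRefill = DateTime.UtcNow;
    }

    public bool TryConsume()
    {
        lock (_sync)
        {
            // Credit tokens for the time elapsed since the last call.
            var now = DateTime.UtcNow;
            var elapsed = (now - _lastRefill).TotalSeconds;
            _tokens = Math.Min(_capacity, _tokens + elapsed * _refillPerSecond);
            _lastRefill = now;

            if (_tokens < 1)
                return false;          // bucket empty: reject the request

            _tokens -= 1;
            return true;
        }
    }
}
```

In a middleware, each client key would map to its own bucket (for example in the same ConcurrentDictionary pattern used earlier).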

Built-in Rate Limiting in ASP.NET Core (.NET 7+)

using Microsoft.AspNetCore.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429; // the default rejection status is 503
    options.AddFixedWindowLimiter("fixed", opt =>
    {
        opt.PermitLimit = 5;
        opt.Window = TimeSpan.FromMinutes(1);
    });
});

app.UseRateLimiter();

// A named policy must be attached to endpoints to take effect
app.MapControllers().RequireRateLimiting("fixed");

Explanation

  • Uses built-in middleware provided by ASP.NET Core

  • More robust and better optimized than hand-rolled solutions, though its default limiters are still per-instance, so distributed deployments still need a shared store

  • Supports multiple algorithms and configurations
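Named policies can also be applied per controller or per action with attributes. A sketch; the controller and routes shown are illustrative:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("fixed")] // applies the named policy to every action here
public class OrdersController : ControllerBase
{
    [HttpGet]
    public IActionResult Get() => Ok("orders");

    [DisableRateLimiting] // individual actions can opt out, e.g. health-style probes
    [HttpGet("status")]
    public IActionResult Status() => Ok();
}
```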

Advantages

  • Protects APIs from misuse

  • Stabilizes performance under load

  • Improves overall reliability

Limitations

  • Requires proper tuning to avoid blocking valid users

  • In-memory implementations do not scale horizontally

Summary

API rate limiting in ASP.NET Core is essential for maintaining application stability, security, and fairness in high-traffic environments. By implementing a middleware-based solution, developers gain fine-grained control over request handling. While the custom approach is suitable for learning and smaller applications, production systems should adopt distributed strategies such as Redis-backed rate limiting or the built-in ASP.NET Core rate limiter for scalability and resilience.