
How to Implement API Rate Limiting Using Middleware in ASP.NET Core?

API Rate Limiting is a critical mechanism that controls the number of requests a client can make to an API within a specified time window. In modern ASP.NET Core Web API applications, especially those exposed publicly over the internet, rate limiting protects against abuse, denial-of-service attacks, brute-force attempts, and excessive resource consumption. Implementing rate limiting using middleware allows centralized control over request throttling and ensures consistent enforcement across all endpoints.

This article explains how to implement API rate limiting with middleware in ASP.NET Core, covering internal concepts, algorithms, practical implementation, real-world scenarios, advantages and disadvantages, and production best practices.

What Is API Rate Limiting?

API Rate Limiting restricts the number of HTTP requests a client (identified by IP address, API key, or user identity) can send within a defined time period.

In simple terms, it sets a speed limit for API usage.

Example:

  • 100 requests per minute per IP

  • 1,000 requests per hour per API key

  • 10 login attempts per 5 minutes per user

If a client exceeds the limit, the server responds with HTTP 429 (Too Many Requests).

Why Rate Limiting Is Important in Cloud Applications

In cloud-hosted ASP.NET Core applications:

  • APIs may be publicly accessible.

  • Bots can flood endpoints.

  • Attackers may attempt brute-force logins.

  • A single client may consume excessive CPU or database resources.

Without rate limiting, one malicious or misconfigured client can degrade service for all users.

Real-World Analogy

Think of an API like a highway toll booth.

If one vehicle keeps circling and blocking the booth, others cannot pass. Rate limiting acts like traffic control, ensuring fair access to all drivers.

Common Rate Limiting Algorithms

Before implementing middleware, it is important to understand common strategies.

1. Fixed Window

Allows a fixed number of requests in a specific time window.

Example: 100 requests per minute.

2. Sliding Window

Tracks requests over a rolling time period for smoother control.

3. Token Bucket

Tokens are added at a fixed rate. Each request consumes a token.

4. Leaky Bucket

Requests are processed at a fixed rate regardless of burst traffic.
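The token bucket idea above can be sketched in a few lines of C#. This is an illustrative, non-thread-safe model for learning purposes (the class name and parameters are invented for this example), not the implementation ASP.NET Core uses internally:

```csharp
using System;

// Illustrative token bucket: tokens refill at a fixed rate, and each
// request consumes one token. A request is allowed only if a token is free.
public class TokenBucket
{
    private readonly int _capacity;           // maximum tokens the bucket can hold
    private readonly double _refillPerSecond; // tokens added per second
    private double _tokens;
    private DateTime _lastRefill = DateTime.UtcNow;

    public TokenBucket(int capacity, double refillPerSecond)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _tokens = capacity; // start full, so bursts up to capacity are allowed
    }

    public bool TryConsume()
    {
        // Refill based on elapsed time, capped at capacity.
        var now = DateTime.UtcNow;
        _tokens = Math.Min(_capacity,
            _tokens + (now - _lastRefill).TotalSeconds * _refillPerSecond);
        _lastRefill = now;

        if (_tokens < 1) return false; // no token left: reject (caller returns 429)
        _tokens -= 1;
        return true;
    }
}
```

A bucket with capacity 5 allows a burst of 5 immediate requests; after that, requests are admitted roughly at the refill rate, which is why token bucket handles bursts better than a fixed window.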

Difference Between Rate Limiting Algorithms

Feature | Fixed Window | Sliding Window | Token Bucket | Leaky Bucket
Burst Handling | Weak | Moderate | Strong | Moderate
Implementation Complexity | Simple | Medium | Medium | Medium
Memory Usage | Low | Higher | Moderate | Moderate
Accuracy | Basic | High | High | Controlled
Suitable For | Basic APIs | Public APIs | High-traffic APIs | Stream control
Traffic Smoothness | Poor | Good | Very Good | Very Good

For most applications, Fixed Window or Token Bucket is sufficient.

Implementing Rate Limiting Using Built-In ASP.NET Core Middleware

ASP.NET Core 7 and later include built-in rate limiting support in the Microsoft.AspNetCore.RateLimiting middleware.

Step 1: Add Rate Limiting Service

In Program.cs:

using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    // Rejected requests get HTTP 503 by default; return 429 instead.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("fixed", opt =>
    {
        opt.PermitLimit = 100;                // requests allowed per window
        opt.Window = TimeSpan.FromMinutes(1); // window length
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 2;                   // requests queued once the limit is hit
    });
});
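The same registration pattern supports the other algorithms. For example, a token bucket policy can be added alongside the fixed window; the policy name "bucket" and the numbers below are illustrative:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.AddTokenBucketLimiter("bucket", opt =>
    {
        opt.TokenLimit = 100;                               // bucket capacity (max burst)
        opt.TokensPerPeriod = 20;                           // tokens refilled each period
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10); // refill interval
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 2;
    });
});
```

Sliding window (AddSlidingWindowLimiter) and concurrency (AddConcurrencyLimiter) policies follow the same shape.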

Step 2: Enable Middleware

app.UseRateLimiter(); // place before endpoint mappings such as MapControllers

Step 3: Apply Policy to Endpoints

app.MapControllers().RequireRateLimiting("fixed");
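Policies can also be enabled or disabled at the controller or action level with attributes. A sketch, assuming a hypothetical ProductsController:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("fixed")]   // the policy applies to every action here
public class ProductsController : ControllerBase
{
    [HttpGet]
    public IActionResult GetAll() => Ok();

    [HttpGet("health")]
    [DisableRateLimiting]       // opt this action out of the policy
    public IActionResult Health() => Ok();
}
```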

Requests to endpoints using the "fixed" policy are now limited to 100 per minute. Note that this simple policy maintains a single shared counter across all clients; enforcing the limit per client (per IP or per user) requires a partitioned limiter.

If the limit is exceeded, the API returns:

HTTP 429 Too Many Requests

(provided RejectionStatusCode is set to 429; the middleware's default rejection status is 503).

Custom Rate Limiting Middleware (Manual Implementation)

For educational purposes, here is a simplified custom middleware example.

public class SimpleRateLimitingMiddleware
{
    private readonly RequestDelegate _next;

    // In-memory per-IP counters: request count plus the start of the current window.
    // Note: entries are never pruned, so this grows with the number of distinct IPs.
    private static readonly Dictionary<string, (int Count, DateTime WindowStart)> _requests = new();

    public SimpleRateLimitingMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        var allowed = true;

        // The dictionary is shared across concurrent requests, so access must
        // be synchronized. Decide inside the lock; write the response outside it.
        lock (_requests)
        {
            if (_requests.TryGetValue(ip, out var entry) &&
                (DateTime.UtcNow - entry.WindowStart).TotalMinutes < 1)
            {
                if (entry.Count >= 100)
                {
                    allowed = false; // fixed window exhausted for this IP
                }
                else
                {
                    _requests[ip] = (entry.Count + 1, entry.WindowStart);
                }
            }
            else
            {
                // First request from this IP, or the previous window expired.
                _requests[ip] = (1, DateTime.UtcNow);
            }
        }

        if (!allowed)
        {
            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            return;
        }

        await _next(context);
    }
}

Register middleware:

app.UseMiddleware<SimpleRateLimitingMiddleware>();

Note: This approach is not suitable for distributed systems because it uses in-memory storage.
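For a per-client limit without writing custom middleware, the built-in limiter can instead be partitioned by client IP. A sketch (the policy name "per-ip" is illustrative):

```csharp
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // One fixed-window counter per remote IP address.
    options.AddPolicy("per-ip", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});
```

This achieves what the custom middleware above demonstrates, with queueing and metadata handled by the framework, though it is still in-memory and per-instance.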

Real Business Scenario: Public E-Commerce API

Consider an e-commerce platform exposing public product APIs.

Without rate limiting:

  • Bots scrape product data.

  • Inventory endpoints get flooded.

  • Database CPU spikes.

With rate limiting:

  • Each IP gets 200 requests per minute.

  • Excess traffic receives HTTP 429.

  • API remains stable.

Distributed Rate Limiting in Cloud Environments

In microservices deployed across multiple instances:

  • In-memory rate limiting fails.

  • Use distributed cache such as Redis.

  • API Gateway (e.g., Azure API Management) can enforce global throttling.

Architecture Example:

Client → API Gateway → ASP.NET Core API → Redis Counter → Response

This ensures consistent rate limits across instances.
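A common way to implement the Redis counter in the diagram above is a fixed-window key per client: INCR the key, and set an expiry on first use. A sketch using the StackExchange.Redis client (the key format and limit are illustrative):

```csharp
using StackExchange.Redis;

public class RedisRateLimiter
{
    private readonly IDatabase _db;

    public RedisRateLimiter(IConnectionMultiplexer redis) => _db = redis.GetDatabase();

    public async Task<bool> IsAllowedAsync(string clientId, int limit = 100)
    {
        // One counter per client per minute, e.g. "rl:203.0.113.7:2024-05-01T10:15".
        var key = $"rl:{clientId}:{DateTime.UtcNow:yyyy-MM-ddTHH:mm}";

        var count = await _db.StringIncrementAsync(key); // atomic across instances
        if (count == 1)
        {
            // First request in this window: expire the key when the window ends.
            await _db.KeyExpireAsync(key, TimeSpan.FromMinutes(1));
        }

        return count <= limit;
    }
}
```

Because INCR is atomic in Redis, every API instance increments the same counter, which is what keeps the limit consistent across a load-balanced deployment.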

Advantages of API Rate Limiting

  • Protects against abuse

  • Improves service availability

  • Prevents brute-force attacks

  • Controls infrastructure cost

  • Improves fairness among users

Disadvantages

  • May block legitimate burst traffic

  • Requires tuning based on traffic patterns

  • Adds additional complexity

  • Distributed implementation required for scaling

Common Mistakes Developers Make

  • Implementing in-memory rate limiting in load-balanced systems

  • Not returning HTTP 429 status code

  • Setting limits too low

  • Ignoring authentication-based limits

  • Not logging rate limit violations

When NOT to Use Strict Rate Limiting

  • Internal microservices communication

  • Trusted backend-to-backend communication

  • Low-traffic internal tools

However, even internal APIs benefit from some throttling safeguards.

Best Practices for Production

  • Use built-in ASP.NET Core rate limiting

  • Prefer token bucket for burst traffic

  • Implement distributed caching

  • Log and monitor 429 responses

  • Return Retry-After header

  • Apply different policies for authenticated and anonymous users

  • Combine with authentication and WAF rules
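Several of these practices (returning 429, sending Retry-After, logging violations) can be wired up through the built-in limiter's OnRejected callback. A sketch:

```csharp
using System.Globalization;
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = (context, cancellationToken) =>
    {
        // Tell well-behaved clients when to retry, if the limiter exposes it.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString(CultureInfo.InvariantCulture);
        }

        // Log the violation for monitoring and alerting.
        context.HttpContext.RequestServices
            .GetService<ILoggerFactory>()?
            .CreateLogger("RateLimiting")
            .LogWarning("Rate limit exceeded for {Path}", context.HttpContext.Request.Path);

        return ValueTask.CompletedTask;
    };
});
```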

Enterprise Architecture Flow Example

Client → API Gateway → Rate Limiting Policy → ASP.NET Core Middleware → Controller → Database → Response

This layered defense ensures scalability and resilience.

FAQ

What status code is used for rate limiting?

HTTP 429 (Too Many Requests).

Can we apply different limits per endpoint?

Yes. Policies can be configured per route or controller.

Should rate limiting be implemented at API or gateway level?

For large systems, both levels can be used for layered protection.

Conclusion

API rate limiting is a first line of defense for any publicly exposed ASP.NET Core API. The built-in rate limiting middleware makes it straightforward to apply fixed window, sliding window, token bucket, or concurrency policies centrally, while a custom middleware shows what happens under the hood. In-memory limiting is enough for a single instance, but load-balanced and cloud deployments need a distributed store such as Redis, or gateway-level throttling, to keep limits consistent across instances. Combined with well-tuned limits, HTTP 429 responses carrying a Retry-After header, and monitoring of violations, rate limiting keeps APIs stable, fair, and cost-efficient in production.