
How to Implement API Rate Limiting Using Middleware in ASP.NET Core?

API Rate Limiting is a critical mechanism that controls the number of requests a client can make to an API within a specified time window. In modern ASP.NET Core Web API applications, especially those exposed publicly over the internet, rate limiting protects against abuse, denial-of-service attacks, brute-force attempts, and excessive resource consumption. Implementing rate limiting using middleware allows centralized control over request throttling and ensures consistent enforcement across all endpoints.

This article explains how to implement API rate limiting with middleware in ASP.NET Core, covering internal concepts, algorithms, practical implementation, real-world scenarios, advantages and disadvantages, and production best practices.

What Is API Rate Limiting?

API Rate Limiting restricts the number of HTTP requests a client (identified by IP address, API key, or user identity) can send within a defined time period.

In simple terms, it sets a speed limit for API usage.

Example:

  • 100 requests per minute per IP

  • 1,000 requests per hour per API key

  • 10 login attempts per 5 minutes per user

If a client exceeds the limit, the server responds with HTTP 429 (Too Many Requests).

Why Rate Limiting Is Important in Cloud Applications

In cloud-hosted ASP.NET Core applications:

  • APIs may be publicly accessible.

  • Bots can flood endpoints.

  • Attackers may attempt brute-force logins.

  • A single client may consume excessive CPU or database resources.

Without rate limiting, one malicious or misconfigured client can degrade service for all users.

Real-World Analogy

Think of an API like a highway toll booth.

If one vehicle keeps circling and blocking the booth, others cannot pass. Rate limiting acts like traffic control, ensuring fair access to all drivers.

Common Rate Limiting Algorithms

Before implementing middleware, it is important to understand common strategies.

1. Fixed Window

Allows a fixed number of requests in a specific time window.

Example: 100 requests per minute.

2. Sliding Window

Tracks requests over a rolling time period for smoother control.

3. Token Bucket

Tokens are added at a fixed rate. Each request consumes a token.

4. Leaky Bucket

Requests are processed at a fixed rate regardless of burst traffic.
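The token bucket idea above can be sketched in a few lines of C#. This is an illustrative, non-thread-safe model for learning purposes (the class name and parameters are invented for this example), not the implementation ASP.NET Core uses internally:

```csharp
using System;

// Illustrative token bucket: tokens refill at a fixed rate, and each
// request consumes one token. A request is allowed only if a token is free.
public class TokenBucket
{
    private readonly int _capacity;           // maximum tokens the bucket can hold
    private readonly double _refillPerSecond; // tokens added per second
    private double _tokens;
    private DateTime _lastRefill = DateTime.UtcNow;

    public TokenBucket(int capacity, double refillPerSecond)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _tokens = capacity; // start full, so bursts up to capacity are allowed
    }

    public bool TryConsume()
    {
        // Refill based on elapsed time, capped at capacity.
        var now = DateTime.UtcNow;
        _tokens = Math.Min(_capacity,
            _tokens + (now - _lastRefill).TotalSeconds * _refillPerSecond);
        _lastRefill = now;

        if (_tokens < 1) return false; // no token left: reject (caller returns 429)
        _tokens -= 1;
        return true;
    }
}
```

A bucket with capacity 5 allows a burst of 5 immediate requests; after that, requests are admitted roughly at the refill rate, which is why token bucket handles bursts better than a fixed window.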

Difference Between Rate Limiting Algorithms

Feature | Fixed Window | Sliding Window | Token Bucket | Leaky Bucket
Burst Handling | Weak | Moderate | Strong | Moderate
Implementation Complexity | Simple | Medium | Medium | Medium
Memory Usage | Low | Higher | Moderate | Moderate
Accuracy | Basic | High | High | Controlled
Suitable For | Basic APIs | Public APIs | High-traffic APIs | Stream control
Traffic Smoothness | Poor | Good | Very Good | Very Good

For most applications, Fixed Window or Token Bucket is sufficient.

Implementing Rate Limiting Using Built-In ASP.NET Core Middleware

ASP.NET Core 7 and later include built-in rate limiting support in the Microsoft.AspNetCore.RateLimiting middleware.

Step 1: Add Rate Limiting Service

In Program.cs:

using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    // Rejected requests get HTTP 503 by default; return 429 instead.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("fixed", opt =>
    {
        opt.PermitLimit = 100;                // requests allowed per window
        opt.Window = TimeSpan.FromMinutes(1); // window length
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 2;                   // requests queued once the limit is hit
    });
});
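The same registration pattern supports the other algorithms. For example, a token bucket policy can be added alongside the fixed window; the policy name "bucket" and the numbers below are illustrative:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.AddTokenBucketLimiter("bucket", opt =>
    {
        opt.TokenLimit = 100;                               // bucket capacity (max burst)
        opt.TokensPerPeriod = 20;                           // tokens refilled each period
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10); // refill interval
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 2;
    });
});
```

Sliding window (AddSlidingWindowLimiter) and concurrency (AddConcurrencyLimiter) policies follow the same shape.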

Step 2: Enable Middleware

app.UseRateLimiter(); // place before endpoint mappings such as MapControllers

Step 3: Apply Policy to Endpoints

app.MapControllers().RequireRateLimiting("fixed");
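Policies can also be enabled or disabled at the controller or action level with attributes. A sketch, assuming a hypothetical ProductsController:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("fixed")]   // the policy applies to every action here
public class ProductsController : ControllerBase
{
    [HttpGet]
    public IActionResult GetAll() => Ok();

    [HttpGet("health")]
    [DisableRateLimiting]       // opt this action out of the policy
    public IActionResult Health() => Ok();
}
```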

Requests to endpoints using the "fixed" policy are now limited to 100 per minute. Note that this simple policy maintains a single shared counter across all clients; enforcing the limit per client (per IP or per user) requires a partitioned limiter.

If the limit is exceeded, the API returns:

HTTP 429 Too Many Requests

(provided RejectionStatusCode is set to 429; the middleware's default rejection status is 503).

Custom Rate Limiting Middleware (Manual Implementation)

For educational purposes, here is a simplified custom middleware example.

public class SimpleRateLimitingMiddleware
{
    private readonly RequestDelegate _next;

    // In-memory per-IP counters: request count plus the start of the current window.
    // Note: entries are never pruned, so this grows with the number of distinct IPs.
    private static readonly Dictionary<string, (int Count, DateTime WindowStart)> _requests = new();

    public SimpleRateLimitingMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        var allowed = true;

        // The dictionary is shared across concurrent requests, so access must
        // be synchronized. Decide inside the lock; write the response outside it.
        lock (_requests)
        {
            if (_requests.TryGetValue(ip, out var entry) &&
                (DateTime.UtcNow - entry.WindowStart).TotalMinutes < 1)
            {
                if (entry.Count >= 100)
                {
                    allowed = false; // fixed window exhausted for this IP
                }
                else
                {
                    _requests[ip] = (entry.Count + 1, entry.WindowStart);
                }
            }
            else
            {
                // First request from this IP, or the previous window expired.
                _requests[ip] = (1, DateTime.UtcNow);
            }
        }

        if (!allowed)
        {
            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            return;
        }

        await _next(context);
    }
}

Register middleware:

app.UseMiddleware<SimpleRateLimitingMiddleware>();

Note: This approach is not suitable for distributed systems because it uses in-memory storage.
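For a per-client limit without writing custom middleware, the built-in limiter can instead be partitioned by client IP. A sketch (the policy name "per-ip" is illustrative):

```csharp
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // One fixed-window counter per remote IP address.
    options.AddPolicy("per-ip", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});
```

This achieves what the custom middleware above demonstrates, with queueing and metadata handled by the framework, though it is still in-memory and per-instance.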

Real Business Scenario: Public E-Commerce API

Consider an e-commerce platform exposing public product APIs.

Without rate limiting:

  • Bots scrape product data.

  • Inventory endpoints get flooded.

  • Database CPU spikes.

With rate limiting:

  • Each IP gets 200 requests per minute.

  • Excess traffic receives HTTP 429.

  • API remains stable.

Distributed Rate Limiting in Cloud Environments

In microservices deployed across multiple instances:

  • In-memory rate limiting fails.

  • Use distributed cache such as Redis.

  • API Gateway (e.g., Azure API Management) can enforce global throttling.

Architecture Example:

Client → API Gateway → ASP.NET Core API → Redis Counter → Response

This ensures consistent rate limits across instances.
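A common way to implement the Redis counter in the diagram above is a fixed-window key per client: INCR the key, and set an expiry on first use. A sketch using the StackExchange.Redis client (the key format and limit are illustrative):

```csharp
using StackExchange.Redis;

public class RedisRateLimiter
{
    private readonly IDatabase _db;

    public RedisRateLimiter(IConnectionMultiplexer redis) => _db = redis.GetDatabase();

    public async Task<bool> IsAllowedAsync(string clientId, int limit = 100)
    {
        // One counter per client per minute, e.g. "rl:203.0.113.7:2024-05-01T10:15".
        var key = $"rl:{clientId}:{DateTime.UtcNow:yyyy-MM-ddTHH:mm}";

        var count = await _db.StringIncrementAsync(key); // atomic across instances
        if (count == 1)
        {
            // First request in this window: expire the key when the window ends.
            await _db.KeyExpireAsync(key, TimeSpan.FromMinutes(1));
        }

        return count <= limit;
    }
}
```

Because INCR is atomic in Redis, every API instance increments the same counter, which is what keeps the limit consistent across a load-balanced deployment.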

Advantages of API Rate Limiting

  • Protects against abuse

  • Improves service availability

  • Prevents brute-force attacks

  • Controls infrastructure cost

  • Improves fairness among users

Disadvantages

  • May block legitimate burst traffic

  • Requires tuning based on traffic patterns

  • Adds additional complexity

  • Distributed implementation required for scaling

Common Mistakes Developers Make

  • Implementing in-memory rate limiting in load-balanced systems

  • Not returning HTTP 429 status code

  • Setting limits too low

  • Ignoring authentication-based limits

  • Not logging rate limit violations

When NOT to Use Strict Rate Limiting

  • Internal microservices communication

  • Trusted backend-to-backend communication

  • Low-traffic internal tools

However, even internal APIs benefit from some throttling safeguards.

Best Practices for Production

  • Use built-in ASP.NET Core rate limiting

  • Prefer token bucket for burst traffic

  • Implement distributed caching

  • Log and monitor 429 responses

  • Return Retry-After header

  • Apply different policies for authenticated and anonymous users

  • Combine with authentication and WAF rules
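Several of these practices (returning 429, sending Retry-After, logging violations) can be wired up through the built-in limiter's OnRejected callback. A sketch:

```csharp
using System.Globalization;
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = (context, cancellationToken) =>
    {
        // Tell well-behaved clients when to retry, if the limiter exposes it.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString(CultureInfo.InvariantCulture);
        }

        // Log the violation for monitoring and alerting.
        context.HttpContext.RequestServices
            .GetService<ILoggerFactory>()?
            .CreateLogger("RateLimiting")
            .LogWarning("Rate limit exceeded for {Path}", context.HttpContext.Request.Path);

        return ValueTask.CompletedTask;
    };
});
```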

Enterprise Architecture Flow Example

Client → API Gateway → Rate Limiting Policy → ASP.NET Core Middleware → Controller → Database → Response

This layered defense ensures scalability and resilience.

FAQ

What status code is used for rate limiting?

HTTP 429 (Too Many Requests).

Can we apply different limits per endpoint?

Yes. Policies can be configured per route or controller.

Should rate limiting be implemented at API or gateway level?

For large systems, both levels can be used for layered protection.

Conclusion

API rate limiting is a first line of defense for any publicly exposed ASP.NET Core API. The built-in rate limiting middleware makes it straightforward to apply fixed window, sliding window, token bucket, or concurrency policies centrally, while a custom middleware shows what happens under the hood. In-memory limiting is enough for a single instance, but load-balanced and cloud deployments need a distributed store such as Redis, or gateway-level throttling, to keep limits consistent across instances. Combined with well-tuned limits, HTTP 429 responses carrying a Retry-After header, and monitoring of violations, rate limiting keeps APIs stable, fair, and cost-efficient in production.