
Multi-Layered Rate Limiting (User-Level, IP-Level, API-Level)

Introduction

Modern applications are exposed to high traffic, automation scripts, spikes, and sometimes malicious abuse.
If every request is processed equally, no matter who sends it, a single noisy client can exhaust shared resources and destabilize the system.

A multi-layered rate limiting architecture prevents overload by enforcing limits at:

  • IP Layer

  • User Identity Layer

  • Endpoint/API Layer

  • Tenant/Subscription Layer

  • Global Platform Layer

This ensures fairness, protects critical resources, mitigates denial-of-service conditions, and aligns system usage with business plans (for example, free versus enterprise users).

The goal is to design an architecture where limits are configurable, enforced with low latency, and flexible enough to support:

  • Sliding window

  • Token bucket

  • Fixed window

  • Burst handling

  • Grace periods

  • Dynamic throttling based on load

This article describes a production-focused implementation using:

  • SQL for configuration

  • Redis for fast counters

  • .NET API enforcement

  • Angular frontend feedback and retry UI

Architecture Overview

 ┌───────────────────────────────┐
 │           Client              │
 └───────┬────────┬─────────────┘
         │        │
         ▼        ▼
 ┌──────────────┐ ┌─────────────────┐
 │ Angular UI   │ │ Retry/Backoff UI│
 └───────┬──────┘ └─────────────────┘
         │
         ▼
 ┌──────────────────────┐
 │ API Gateway / Filter │
 └───────────┬──────────┘
             │
             ▼
 ┌─────────────────────────────────────┐
 │ Rate Limit Middleware (Tiered Rules)│
 └──────┬─────────────┬───────────────┘
        │             │
        ▼             ▼
 ┌─────────────┐  ┌─────────────────────┐
 │ Redis Cache │  │ SQL Metadata Store  │
 └──────┬──────┘  └─────────────┬───────┘
        │                        │
        └───────────┬────────────┘
                    ▼
             ┌──────────────┐
             │ API Handler  │
             └──────────────┘

Strategy Layers

1. IP-Based Limits

Purpose: Block bots, scrapers, and unknown traffic.

Examples:

  • 100 requests/minute per IP

  • Stricter rule for anonymous traffic

2. User-Level Limits

Applied after login.

Examples:

  • Free user: 500 requests/day

  • Paid license: unlimited, except on heavy API endpoints

3. API Endpoint-Level Limits

Some operations are costlier than others.

Examples:

Endpoint            Limit
/auth/login         5/minute
/search/global      25/minute
/download/report    3/hour
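
The endpoint table above could be held in code as a small lookup. The following is a sketch only: the names `EndpointLimit`, `ENDPOINT_LIMITS`, and `endpointLimitFor` are illustrative and not part of any framework, and the path normalization (dropping the query string and trailing slash) is an assumption about how requests should be matched.

```typescript
interface EndpointLimit {
  limit: number;         // maximum requests per window
  windowSeconds: number; // window length in seconds
}

// Values taken from the endpoint table above.
const ENDPOINT_LIMITS = new Map<string, EndpointLimit>([
  ["/auth/login", { limit: 5, windowSeconds: 60 }],
  ["/search/global", { limit: 25, windowSeconds: 60 }],
  ["/download/report", { limit: 3, windowSeconds: 3600 }],
]);

// Returns the limit for a request path, or undefined when the endpoint
// has no dedicated rule.
function endpointLimitFor(path: string): EndpointLimit | undefined {
  const queryStart = path.indexOf("?");
  const withoutQuery = queryStart >= 0 ? path.slice(0, queryStart) : path;
  // Remove trailing slashes, but keep the root path "/" intact.
  const trimmed = withoutQuery.length > 1 ? withoutQuery.replace(/\/+$/, "") : withoutQuery;
  return ENDPOINT_LIMITS.get(trimmed.toLowerCase());
}
```

Endpoints without an entry fall through to the other layers (IP, user, tenant, global).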

4. Tenant-Level Limits

Multi-tenant applications need business controls.

Example:

  • SaaS plan limits calls per tenant per day

5. System-Level Safety Throttle

When a traffic spike occurs, limits tighten dynamically.
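
One possible way to tighten limits under load is to scale the configured limit by a load factor. This is a minimal sketch under assumptions: `load` is a utilization figure between 0 and 1 (for example CPU), and the 0.7 threshold and 10% floor are made-up defaults, not values from this article.

```typescript
// Scale a configured limit down as load rises above a threshold.
function effectiveLimit(
  baseLimit: number,
  load: number,
  threshold: number = 0.7,  // assumed: start tightening above 70% load
  floorRatio: number = 0.1  // assumed: never go below 10% of the base limit
): number {
  const clamped = Math.min(Math.max(load, 0), 1);
  if (clamped <= threshold) {
    return baseLimit;
  }
  // Linear ramp: 100% of the limit at the threshold, floorRatio at full load.
  const overload = (clamped - threshold) / (1 - threshold);
  const ratio = floorRatio + (1 - overload) * (1 - floorRatio);
  return Math.max(1, Math.floor(baseLimit * ratio));
}
```

A linear ramp is only one choice; a stepped policy (normal, degraded, emergency) would be easier to reason about in incident reviews.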

Metadata Model (SQL)

CREATE TABLE RateLimitPolicy (
    PolicyId UNIQUEIDENTIFIER PRIMARY KEY,
    Scope NVARCHAR(50),  -- IP, USER, API, TENANT, GLOBAL
    Target NVARCHAR(200), -- endpoint or wildcard
    LimitCount INT,
    WindowSeconds INT,
    BurstAllowed BIT DEFAULT 0
);

Example entries

Scope    Target             Limit      Window (s)
IP       *                  100/min    60
USER     /search            30/min     60
API      /download/report   3/hour     3600
TENANT   *                  10k/day    86400
GLOBAL   *                  1M/hour    3600
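
To illustrate how these rows might be consumed by application code, the sketch below mirrors the table columns in a TypeScript type and selects the policies that apply to a request path. The type and function names are hypothetical, and matching is assumed to be either the `*` wildcard or an exact path match; prefix or pattern matching would need additional logic.

```typescript
type Scope = "IP" | "USER" | "API" | "TENANT" | "GLOBAL";

interface RateLimitPolicy {
  scope: Scope;
  target: string;        // endpoint path, or "*" for any endpoint
  limitCount: number;
  windowSeconds: number;
}

// The example entries from the table above.
const POLICIES: RateLimitPolicy[] = [
  { scope: "IP", target: "*", limitCount: 100, windowSeconds: 60 },
  { scope: "USER", target: "/search", limitCount: 30, windowSeconds: 60 },
  { scope: "API", target: "/download/report", limitCount: 3, windowSeconds: 3600 },
  { scope: "TENANT", target: "*", limitCount: 10000, windowSeconds: 86400 },
  { scope: "GLOBAL", target: "*", limitCount: 1000000, windowSeconds: 3600 },
];

// A policy applies when its target is the wildcard or equals the request path.
function policiesFor(path: string, policies: RateLimitPolicy[] = POLICIES): RateLimitPolicy[] {
  return policies.filter((p) => p.target === "*" || p.target === path);
}
```

For a request to `/search`, this returns the IP, USER, TENANT, and GLOBAL rows; the API row applies only to `/download/report`.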

Redis Counter Model

Redis stores counters temporarily for speed:

Key format:

ratelimit:{scope}:{target}:{identifier}:{timeWindowBucket}

Example:

ratelimit:USER:/search:user-34:2025-11-21-18:30

Value increments with each request and expires automatically.
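
The counter model can be sketched without a Redis connection. In the example below, `InMemoryCounterStore` is an in-memory stand-in for Redis `INCR` plus `EXPIRE`, not a real Redis client, and the time bucket is assumed to be the window index since the Unix epoch rather than the formatted timestamp shown above (this keeps `:` out of the time component).

```typescript
// Build the counter key for the window that contains `nowMs`.
function counterKey(
  scope: string, target: string, identifier: string,
  windowSeconds: number, nowMs: number
): string {
  const bucket = Math.floor(nowMs / 1000 / windowSeconds);
  return `ratelimit:${scope}:${target}:${identifier}:${bucket}`;
}

// In-memory stand-in for Redis INCR + EXPIRE (illustration only).
class InMemoryCounterStore {
  private entries = new Map<string, { count: number; expiresAtMs: number }>();

  increment(key: string, ttlSeconds: number, nowMs: number): number {
    const existing = this.entries.get(key);
    if (existing && existing.expiresAtMs > nowMs) {
      existing.count += 1;
      return existing.count;
    }
    // First hit in this window (or the old entry expired): start at 1.
    this.entries.set(key, { count: 1, expiresAtMs: nowMs + ttlSeconds * 1000 });
    return 1;
  }
}

// Fixed-window check: allowed while the counter stays within the limit.
function allowRequest(
  store: InMemoryCounterStore, scope: string, target: string, identifier: string,
  limit: number, windowSeconds: number, nowMs: number
): boolean {
  const key = counterKey(scope, target, identifier, windowSeconds, nowMs);
  return store.increment(key, windowSeconds, nowMs) <= limit;
}
```

Because the bucket is part of the key, a new window starts with a fresh counter and old keys simply expire.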

Enforcement Pipeline (.NET)

The middleware evaluates rules in decreasing order of priority:

Global → Tenant → API Endpoint → User → IP

The first rule that is violated blocks the request; later rules are not evaluated.
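
The ordering described above can be sketched independently of the web framework. The types and the `firstViolation` helper below are hypothetical; the `check` callback stands in for whatever counter lookup the limiter performs.

```typescript
type RuleScope = "GLOBAL" | "TENANT" | "API" | "USER" | "IP";

interface Rule {
  scope: RuleScope;
  name: string;
}

interface CheckResult {
  allowed: boolean;
  retryAfterSeconds: number;
}

// Evaluation order: Global → Tenant → API Endpoint → User → IP.
const SCOPE_ORDER: RuleScope[] = ["GLOBAL", "TENANT", "API", "USER", "IP"];

// Returns the first violated rule in priority order, or null if all pass.
function firstViolation(
  rules: Rule[],
  check: (rule: Rule) => CheckResult
): { rule: Rule; result: CheckResult } | null {
  const ordered = rules
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => SCOPE_ORDER.indexOf(a.scope) - SCOPE_ORDER.indexOf(b.scope));
  for (const rule of ordered) {
    const result = check(rule);
    if (!result.allowed) {
      return { rule, result }; // stop at the first violation
    }
  }
  return null;
}
```

One consequence of stopping early is that counters for lower-priority rules are not incremented for a request that a higher-priority rule already rejected.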

Sample Middleware Snippet

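// Requires: using System.Globalization;
// _policyStore is expected to return rules already sorted in priority order.
public async Task Invoke(HttpContext context)
{
    var rules = await _policyStore.GetRules(context);

    foreach (var rule in rules)
    {
        var result = await _limiter.Check(rule, context);

        if (!result.Allowed)
        {
            // Reject on the first violated rule and tell the client when to retry.
            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            context.Response.Headers["Retry-After"] =
                result.RetryAfterSeconds.ToString(CultureInfo.InvariantCulture);
            await context.Response.WriteAsync("Rate limit exceeded.");
            return; // short-circuit: the API handler is never reached
        }
    }

    // All rules passed: continue down the pipeline.
    await _next(context);
}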

Angular Handling

Detect HTTP 429

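// HttpErrorResponse is imported from '@angular/common/http'.
this.http.get(url).subscribe({
  next: data => this.handle(data),
  error: (err: HttpErrorResponse) => {
    // 429 Too Many Requests: pass the server's Retry-After hint to the UI.
    // headers.get() returns null when the header is missing or not exposed via CORS.
    if (err.status === 429) {
      this.showRetryMessage(err.headers.get('Retry-After'));
    }
  }
});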
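
The `Retry-After` header may carry either a number of seconds or an HTTP date, so the UI needs to interpret the raw string before showing a countdown. The helper below is a sketch; the name `retryAfterSeconds` is hypothetical and it is not part of Angular.

```typescript
// Convert a Retry-After header value into a number of seconds to wait.
// Returns null when the header is absent or cannot be interpreted.
function retryAfterSeconds(headerValue: string | null, nowMs: number = Date.now()): number | null {
  if (headerValue === null) {
    return null;
  }
  const trimmed = headerValue.trim();
  // Form 1: delay in seconds, e.g. "38".
  if (/^\d+$/.test(trimmed)) {
    return parseInt(trimmed, 10);
  }
  // Form 2: HTTP date, e.g. "Fri, 21 Nov 2025 18:53:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) {
    return null;
  }
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}
```

The result can be passed to the retry message or used as the minimum delay before an automatic retry.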

User UI Behavior

  • Show countdown timer

  • Auto-retry for background calls

  • Backoff delay mechanism

Example:

"You have reached the limit. Try again in 38 seconds."

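The backoff delay and the countdown message could be computed along these lines. This is a sketch under assumptions: the 500 ms base delay and 30 s cap are illustrative defaults, and jitter is left out to keep the example deterministic (adding a random component is common so that many clients do not retry at the same instant).

```typescript
// Delay before retry number `attempt` (0-based), in milliseconds.
// Uses exponential backoff, but never retries earlier than the server's
// Retry-After hint when one is available.
function backoffDelayMs(
  attempt: number,
  retryAfterSec: number | null,
  baseMs: number = 500,   // assumed default
  maxMs: number = 30000   // assumed default
): number {
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  const serverHintMs = retryAfterSec !== null ? retryAfterSec * 1000 : 0;
  return Math.max(exponential, serverHintMs);
}

// Text for the countdown shown to the user.
function retryMessage(secondsLeft: number): string {
  const unit = secondsLeft === 1 ? "second" : "seconds";
  return `You have reached the limit. Try again in ${secondsLeft} ${unit}.`;
}
```

A countdown timer can call `retryMessage` once per second with the remaining time and trigger the retry when it reaches zero.
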
Burst Mode Allowance

Burst mode permits short periods of high-volume usage while still capping the long-term average rate.

Technique: Token Bucket Algorithm

Redis keys:

bucket:{user}:{endpoint}

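Each request consumes one token. Tokens refill at a fixed rate, up to the bucket's capacity, so an idle client accumulates room for a burst.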
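
A minimal in-memory token bucket is sketched below to show the refill arithmetic. The class name and parameters are illustrative; in the architecture described here the bucket state would live in Redis under the `bucket:{user}:{endpoint}` key and be updated atomically, which this sketch does not attempt.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;        // maximum burst size
  private readonly refillPerSecond: number; // sustained rate

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes one token if the request is allowed.
  tryConsume(nowMs: number): boolean {
    // Refill lazily, based on the time elapsed since the last call.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With a capacity of 3 and a refill rate of 1 token per second, a client can send 3 requests at once and then about 1 request per second afterwards.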

Sliding vs Fixed Window

Mode             Benefit
Fixed Window     Simpler, predictable
Sliding Window   Fairer but heavier compute
Token Bucket     Smooth traffic, burst friendly

Many production systems combine:

  • Fixed window for security

  • Token bucket for user experience
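
To make the "fairer but heavier" trade-off concrete, here is a sketch of a sliding window limiter that keeps a log of request timestamps per identifier. The class is illustrative and in-memory only; a Redis-backed version would typically store the timestamps in a sorted set. The extra cost compared with a fixed window is visible in the code: every request stores a timestamp and prunes old ones, instead of incrementing one counter.

```typescript
class SlidingWindowLimiter {
  private timestamps = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowSeconds: number) {
    this.limit = limit;
    this.windowMs = windowSeconds * 1000;
  }

  allow(identifier: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only the requests that fall inside the trailing window.
    const recent = (this.timestamps.get(identifier) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) {
      recent.push(nowMs);
    }
    this.timestamps.set(identifier, recent);
    return allowed;
  }
}
```

Unlike a fixed window, a client cannot send a full quota just before a window boundary and another full quota just after it, because the window always trails the current request.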

System Observation Metrics

Track:

  • Requests allowed

  • Requests blocked

  • Rules tripped most often

  • Spikes and anomalies

Stored in:

  • SQL (history)

  • Elasticsearch (queries)

  • Prometheus/Grafana (dashboards)

Audit Log Entry Example

{"timestamp": "2025-11-21T18:52:11Z","rule": "USER-/search","identifier": "user-34","blocked": true,"limit": 30,"windowSeconds": 60}
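
Entries of this shape can be aggregated to answer the "rules tripped most often" question. The sketch below assumes the field names from the example entry; the `AuditEntry` type and `blockedCountsByRule` function are hypothetical names.

```typescript
interface AuditEntry {
  timestamp: string;
  rule: string;
  identifier: string;
  blocked: boolean;
  limit: number;
  windowSeconds: number;
}

// Count blocked requests per rule, most frequently tripped rule first.
function blockedCountsByRule(entries: AuditEntry[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    if (!entry.blocked) {
      continue; // allowed requests do not count as a trip
    }
    counts.set(entry.rule, (counts.get(entry.rule) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}
```

In practice this kind of aggregation would more likely run as a query in the log store; the function only shows the intended grouping.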

Performance Considerations

  • Redis Lua scripts should be used for atomic increments

  • Prevent key explosion by normalizing identifiers

  • Add circuit breakers for Redis outages (fallback rules)
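
As an illustration of identifier normalization, the sketch below canonicalizes identifiers and request paths before they are used in counter keys. The function names, the allowed character set, the 64-character cap, and the `{id}` placeholder are all assumptions. Note that truncation can make two long identifiers collide; hashing the value would avoid that.

```typescript
// Canonicalize an identifier (user name, API key label, and so on) so that
// variations in case or whitespace do not create separate counters.
function normalizeIdentifier(raw: string, maxLength: number = 64): string {
  const cleaned = raw.trim().toLowerCase().replace(/[^a-z0-9._@-]/g, "_");
  return cleaned.length > maxLength ? cleaned.slice(0, maxLength) : cleaned;
}

const GUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;

// Collapse per-resource paths such as /orders/123 into one key per route.
function normalizePath(path: string): string {
  const queryStart = path.indexOf("?");
  const base = (queryStart >= 0 ? path.slice(0, queryStart) : path).toLowerCase();
  return base
    .split("/")
    .map((segment) => (/^\d+$/.test(segment) || GUID_PATTERN.test(segment) ? "{id}" : segment))
    .join("/");
}
```

Without this step, every distinct query string or resource ID would produce its own Redis key, and limits keyed on the raw path would never trip.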

Fallback strategy:

If Redis is down → apply a temporary, strict system-wide safety limit
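
The fallback could be wrapped in a simple circuit breaker, sketched below. Everything here is hypothetical: `RedisGuard` is not a library class, `redisCheck` stands in for the real Redis-backed limiter call, and the fallback is a per-process fixed-window counter, so across several API instances the effective limit is multiplied by the instance count.

```typescript
interface GuardDecision {
  allowed: boolean;
  source: "redis" | "fallback";
}

class RedisGuard {
  private openUntilMs = 0; // while now < openUntilMs, Redis is skipped
  private fallbackWindowStartMs = 0;
  private fallbackCount = 0;
  private readonly cooldownMs: number;
  private readonly fallbackLimit: number;
  private readonly fallbackWindowMs: number;

  constructor(cooldownMs: number, fallbackLimit: number, fallbackWindowSeconds: number) {
    this.cooldownMs = cooldownMs;
    this.fallbackLimit = fallbackLimit;
    this.fallbackWindowMs = fallbackWindowSeconds * 1000;
  }

  async check(redisCheck: () => Promise<boolean>, nowMs: number): Promise<GuardDecision> {
    if (nowMs >= this.openUntilMs) {
      try {
        return { allowed: await redisCheck(), source: "redis" };
      } catch (_err) {
        // Redis failed: open the circuit and stop calling it for a while.
        this.openUntilMs = nowMs + this.cooldownMs;
      }
    }
    return { allowed: this.allowFallback(nowMs), source: "fallback" };
  }

  // Strict in-process fixed-window limit used while Redis is unavailable.
  private allowFallback(nowMs: number): boolean {
    if (nowMs - this.fallbackWindowStartMs >= this.fallbackWindowMs) {
      this.fallbackWindowStartMs = nowMs;
      this.fallbackCount = 0;
    }
    this.fallbackCount += 1;
    return this.fallbackCount <= this.fallbackLimit;
  }
}
```

Failing closed with a strict limit, as here, favors protection over availability; some systems choose to fail open instead, which is a business decision rather than a technical one.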

Real-World Enhancements

Feature                   Description
Dynamic Throttling        Limits tighten when CPU or DB load increases
Plan-Aware Limits         Different rules based on subscription
Whitelisting              Partner systems bypass limits
Application-Level Retry   UI gracefully handles throttling

Example Use Case

A SaaS analytics platform has:

  • Thousands of tenants

  • Millions of queries

  • Expensive search endpoints

Without rate limiting:

  • A few tenants could overload compute resources

  • API costs rise

  • System availability suffers

With multi-layered enforcement:

  • Each tenant has fair usage

  • Critical endpoints are protected

  • System remains predictable during spikes

Summary

A single rate limiting rule is not enough for enterprise systems.
A layered strategy allows fine-grained control, aligned to business rules and system stability.

A mature system applies:

  • Per-IP protection against public noise

  • Per-user fairness

  • Per-endpoint safety

  • Per-tenant business rules

  • Global throttling as a safety shield

With Redis for counters, SQL for metadata, .NET for enforcement, and Angular for user experience, the solution becomes scalable, configurable, and secure.