Multi-Layered Rate Limiting (User-Level, IP-Level, API-Level)

Rajesh Gami

Introduction

Modern applications face heavy traffic, automation scripts, sudden spikes, and occasional malicious abuse.
If every request is processed with equal priority regardless of who sends it, a single noisy client can exhaust shared resources and destabilize the system.

A multi-layered rate limiting architecture prevents overload by enforcing limits at:

IP Layer
User Identity Layer
Endpoint/API Layer
Tenant/Subscription Layer
Global Platform Layer

This ensures fairness, protects critical resources, prevents denial-of-service issues, and aligns system usage with business plans (free users vs enterprise).

The goal is to design an architecture where limits are configurable, enforced with low latency, and flexible enough to support:

Sliding window
Token bucket
Fixed window
Burst handling
Grace periods
Dynamic throttling based on load

This article describes a production-focused implementation using:

SQL for configuration
Redis for fast counters
.NET API enforcement
Angular frontend feedback and retry UI

Architecture Overview

 ┌───────────────────────────────┐
 │           Client              │
 └───────┬────────┬─────────────┘
         │        │
         ▼        ▼
 ┌──────────────┐ ┌─────────────────┐
 │ Angular UI   │ │ Retry/Backoff UI│
 └───────┬──────┘ └─────────────────┘
         │
         ▼
 ┌──────────────────────┐
 │ API Gateway / Filter │
 └───────────┬──────────┘
             │
             ▼
 ┌─────────────────────────────────────┐
 │ Rate Limit Middleware (Tiered Rules)│
 └──────┬─────────────┬───────────────┘
        │             │
        ▼             ▼
 ┌─────────────┐  ┌─────────────────────┐
 │ Redis Cache │  │ SQL Metadata Store  │
 └──────┬──────┘  └─────────────┬───────┘
        │                        │
        └───────────┬────────────┘
                    ▼
             ┌──────────────┐
             │ API Handler  │
             └──────────────┘

Strategy Layers

1. IP-Based Limits

Purpose: Block bots, scrapers, and unidentified traffic.

Examples:

100 requests/minute per IP
Stricter rule for anonymous traffic

2. User-Level Limits

Applied after login.

Examples:

Free user: 500 requests/day
Paid license: unlimited, except on heavy API endpoints

3. API Endpoint-Level Limits

Some operations are costlier than others.

Examples:

Endpoint	Limit
`/auth/login`	5/minute
`/search/global`	25/minute
`/download/report`	3/hour

4. Tenant-Level Limits

Multi-tenant applications need business controls.

Example:

SaaS plan limits calls per tenant per day

5. System-Level Safety Throttle

When a traffic spike occurs, limits tighten dynamically.
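
One way to picture the dynamic tightening is a function that scales a configured limit down as measured load rises. The sketch below is only an illustration: the function name `effectiveLimit`, the 70% soft threshold, and the 10% floor are assumptions, not values from this article.

```typescript
// Scales a configured limit down as load rises.
// All thresholds here are illustrative assumptions.
function effectiveLimit(baseLimit: number, loadFraction: number): number {
  const softThreshold = 0.7; // at or below this load (0..1) the limit is unchanged
  const minFraction = 0.1;   // never tighten below 10% of the configured limit

  if (loadFraction <= softThreshold) {
    return baseLimit;
  }

  const clamped = Math.min(loadFraction, 1);
  // Linear interpolation: 100% of the limit at the threshold,
  // minFraction of the limit at full load.
  const fraction =
    1 - ((clamped - softThreshold) / (1 - softThreshold)) * (1 - minFraction);

  // Round to avoid floating-point drift, and always allow at least one request.
  return Math.max(1, Math.round(baseLimit * fraction));
}
```

Here `loadFraction` could come from CPU or database utilization; the middleware would call this before comparing a counter against the limit.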

Metadata Model (SQL)

CREATE TABLE RateLimitPolicy (
    PolicyId UNIQUEIDENTIFIER PRIMARY KEY,
    Scope NVARCHAR(50),  -- IP, USER, API, TENANT, GLOBAL
    Target NVARCHAR(200), -- endpoint or wildcard
    LimitCount INT,
    WindowSeconds INT,
    BurstAllowed BIT DEFAULT 0
);

Example entries

Scope	Target	Limit	Window (s)
IP	*	100/min	60
USER	/search	30/min	60
API	/download/report	3/hour	3600
TENANT	*	10k/day	86400
GLOBAL	*	1M/hour	3600
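
As a rough sketch of how rows like these might be matched against an incoming request, the TypeScript below mirrors the RateLimitPolicy columns and treats `*` as a wildcard. The type and function names are hypothetical, and the prefix-matching rule (a target also covers its sub-paths) is an assumption rather than something the schema dictates.

```typescript
// Mirrors the RateLimitPolicy columns; names are illustrative.
type RuleScope = "IP" | "USER" | "API" | "TENANT" | "GLOBAL";

interface RateLimitPolicy {
  scope: RuleScope;
  target: string;        // endpoint path, or "*" for any endpoint
  limitCount: number;
  windowSeconds: number;
  burstAllowed: boolean;
}

// The example entries from the table above.
const examplePolicies: RateLimitPolicy[] = [
  { scope: "IP", target: "*", limitCount: 100, windowSeconds: 60, burstAllowed: false },
  { scope: "USER", target: "/search", limitCount: 30, windowSeconds: 60, burstAllowed: false },
  { scope: "API", target: "/download/report", limitCount: 3, windowSeconds: 3600, burstAllowed: false },
  { scope: "TENANT", target: "*", limitCount: 10000, windowSeconds: 86400, burstAllowed: false },
  { scope: "GLOBAL", target: "*", limitCount: 1000000, windowSeconds: 3600, burstAllowed: false },
];

// Assumption: a target matches the exact path or any sub-path beneath it.
function targetMatches(target: string, path: string): boolean {
  return target === "*" || path === target || path.startsWith(target + "/");
}

// Returns every policy that applies to the given request path.
function matchPolicies(policies: RateLimitPolicy[], path: string): RateLimitPolicy[] {
  return policies.filter((p) => targetMatches(p.target, path));
}
```

With these entries, a request to `/search/global` picks up the IP, USER, TENANT, and GLOBAL rows, while `/searching` does not match the USER row.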

Redis Counter Model

Redis stores counters temporarily for speed:

Key format:

ratelimit:{scope}:{target}:{identifier}:{timeWindowBucket}

Example:

ratelimit:USER:/search:user-34:2025-11-21-18:30

Each request increments the counter, and the key carries a TTL equal to the window length so it expires automatically.
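
The counter behavior can be sketched without a Redis connection. In the TypeScript below, an in-memory Map stands in for Redis (an assumption made so the example runs on its own), and the time bucket is the window index since the Unix epoch rather than a formatted timestamp. The function names are hypothetical.

```typescript
// In-memory stand-in for Redis (assumption): key -> { count, expiresAtMs }.
const counterStore = new Map<string, { count: number; expiresAtMs: number }>();

// Builds ratelimit:{scope}:{target}:{identifier}:{timeWindowBucket}.
// The bucket is the window index since the Unix epoch (an assumption).
function counterKey(scope: string, target: string, identifier: string,
                    windowSeconds: number, nowMs: number): string {
  const bucket = Math.floor(nowMs / 1000 / windowSeconds);
  return `ratelimit:${scope}:${target}:${identifier}:${bucket}`;
}

interface CounterResult {
  allowed: boolean;
  count: number;
  retryAfterSeconds: number;
}

// Increments the counter for this key and compares it to the limit.
function hitFixedWindow(key: string, limit: number,
                        windowSeconds: number, nowMs: number): CounterResult {
  const windowMs = windowSeconds * 1000;
  const windowEndMs = (Math.floor(nowMs / windowMs) + 1) * windowMs;

  let entry = counterStore.get(key);
  if (!entry || entry.expiresAtMs <= nowMs) {
    // Comparable to creating the key with a TTL that ends with the window.
    entry = { count: 0, expiresAtMs: windowEndMs };
    counterStore.set(key, entry);
  }

  entry.count += 1; // comparable to INCR
  const allowed = entry.count <= limit;
  return {
    allowed,
    count: entry.count,
    retryAfterSeconds: allowed ? 0 : Math.ceil((windowEndMs - nowMs) / 1000),
  };
}
```

Against a real Redis instance the increment and the expiry would need to be applied atomically, which this single-threaded stand-in gets for free.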

Enforcement Pipeline (.NET)

The middleware checks rules in decreasing order of priority:

Global → Tenant → API Endpoint → User → IP

The first violated rule blocks the request, and the remaining rules are skipped.
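
The ordering and short-circuit behavior described above can be sketched as follows. This is a language-neutral illustration in TypeScript, not the .NET middleware itself; `evaluateRules` and the `check` callback are hypothetical names, and the callback stands in for whatever consults the counter for one rule.

```typescript
// Priority order from the article: Global → Tenant → API Endpoint → User → IP.
const scopePriority = ["GLOBAL", "TENANT", "API", "USER", "IP"];

interface ScopedRule {
  scope: string;
  target: string;
}

interface CheckOutcome {
  allowed: boolean;
  retryAfterSeconds: number;
}

interface PipelineResult {
  allowed: boolean;
  blockedBy?: ScopedRule;
  retryAfterSeconds: number;
}

// `check` is a hypothetical callback that evaluates a single rule.
function evaluateRules(rules: ScopedRule[],
                       check: (rule: ScopedRule) => CheckOutcome): PipelineResult {
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...rules].sort(
    (a, b) => scopePriority.indexOf(a.scope) - scopePriority.indexOf(b.scope));

  for (const rule of ordered) {
    const outcome = check(rule);
    if (!outcome.allowed) {
      // Stop at the first violation; lower-priority rules are not evaluated.
      return { allowed: false, blockedBy: rule, retryAfterSeconds: outcome.retryAfterSeconds };
    }
  }
  return { allowed: true, retryAfterSeconds: 0 };
}
```

One consequence of this design is that counters for higher-priority rules have already been incremented by the time a lower-priority rule rejects the request; whether that is acceptable depends on how strictly each layer should be accounted.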

Sample Middleware Snippet

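// Middleware entry point, called once per request.
public async Task Invoke(HttpContext context)
{
    // Load the rules that apply to this request.
    var rules = await _policyStore.GetRules(context);

    foreach (var rule in rules)
    {
        // Evaluate the request against this rule's counter.
        var result = await _limiter.Check(rule, context);

        if (!result.Allowed)
        {
            // Short-circuit on the first violated rule and tell the client when to retry.
            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            context.Response.Headers["Retry-After"] = result.RetryAfterSeconds.ToString();
            await context.Response.WriteAsync("Rate limit exceeded.");
            return;
        }
    }

    // No rule was violated; pass the request to the next middleware.
    await _next(context);
}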

Angular Handling

Detect HTTP 429

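this.http.get(url).subscribe({
  next: data => this.handle(data),
  error: err => {
    // 429 Too Many Requests: the server is throttling this client.
    if (err.status === 429) {
      // Retry-After may be absent (null). For cross-origin calls the API must also
      // list it in Access-Control-Expose-Headers, or the browser hides it.
      this.showRetryMessage(err.headers.get('Retry-After'));
    }
  }
});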

User UI Behavior

Show countdown timer
Auto-retry for background calls
Backoff delay mechanism
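
The countdown and backoff behaviors both need a delay value. The sketch below shows one possible way to derive it: read Retry-After when the server sends it, and fall back to exponential backoff with jitter otherwise. The function names and the default base and cap values are assumptions for illustration.

```typescript
// Retry-After may be a number of seconds ("38") or an HTTP date.
// Returns the wait in whole seconds, or null if the header is missing or unparseable.
function parseRetryAfter(header: string | null, nowMs: number): number | null {
  if (header === null || header.trim() === "") {
    return null;
  }
  const value = header.trim();
  if (/^\d+$/.test(value)) {
    return parseInt(value, 10);
  }
  const dateMs = Date.parse(value);
  if (isNaN(dateMs)) {
    return null;
  }
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

// Exponential backoff with full jitter, for when the server gives no hint.
// `random` is injectable so the delay can be made deterministic in tests.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000,
                        random: () => number = Math.random): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Prefers the server's hint; otherwise backs off based on the attempt number.
function nextDelayMs(header: string | null, attempt: number, nowMs: number): number {
  const seconds = parseRetryAfter(header, nowMs);
  return seconds !== null ? seconds * 1000 : backoffDelayMs(attempt);
}
```

The countdown timer can display the parsed seconds directly, while background calls can schedule their retry with `nextDelayMs`.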

Example:

"You have reached the limit. Try again in 38 seconds."

Burst Mode Allowance

Burst mode allows short periods of high-volume usage while still capping average usage over the long term.

Technique: Token Bucket Algorithm

Redis keys:

bucket:{user}:{endpoint}

Each request consumes a token, and tokens refill at a fixed rate up to the bucket's capacity.
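
A minimal in-memory sketch of the token bucket follows. It keeps state in a Map instead of Redis so it can run on its own, and the capacity of 5 with a refill of 1 token per second is an arbitrary choice for illustration; the class and helper names are hypothetical.

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Refills based on elapsed time, then tries to take `cost` tokens.
  tryConsume(nowMs: number, cost = 1): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens < cost) {
      return false;
    }
    this.tokens -= cost;
    return true;
  }
}

// One bucket per bucket:{user}:{endpoint} key, created on first use.
const buckets = new Map<string, TokenBucket>();

function bucketFor(user: string, endpoint: string, nowMs: number): TokenBucket {
  const key = `bucket:${user}:${endpoint}`;
  let bucket = buckets.get(key);
  if (!bucket) {
    bucket = new TokenBucket(5, 1, nowMs); // capacity and rate are assumptions
    buckets.set(key, bucket);
  }
  return bucket;
}
```

The capacity sets how large a burst can be, and the refill rate sets the sustained request rate once the burst is spent.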

Sliding vs Fixed Window

Mode	Benefit
Fixed Window	Simpler, predictable
Sliding Window	Fairer but heavier compute
Token Bucket	Smooth traffic, burst friendly

Many production systems combine:

Fixed window for security
Token bucket for user experience
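
To make the fairness difference concrete, here is a minimal sliding window sketch using the "log" variant, which stores one timestamp per accepted request. That variant is chosen here for clarity and is an assumption; it is also where the heavier compute and memory cost comes from. The class name is hypothetical and state is kept in memory rather than Redis.

```typescript
class SlidingWindowLog {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowSeconds: number) {
    this.limit = limit;
    this.windowMs = windowSeconds * 1000;
  }

  // Allows the request if fewer than `limit` requests were accepted
  // in the window that ends at `nowMs`.
  tryAcquire(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) {
      recent.push(nowMs);
    }
    this.hits.set(key, recent);
    return allowed;
  }
}
```

With a limit of 2 per 60 seconds, two requests at 0:59 followed by one at 1:01 would all pass under a fixed window that resets at 1:00, whereas the sliding window rejects the third because all three fall within the same 60 seconds.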

System Observation Metrics

Track:

Requests allowed
Requests blocked
Rules that trip most often
Spikes and anomalies

Stored in:

SQL (history)
Elasticsearch (queries)
Prometheus/Grafana (dashboards)

Audit Log Entry Example

{"timestamp": "2025-11-21T18:52:11Z","rule": "USER-/search","identifier": "user-34","blocked": true,"limit": 30,"windowSeconds": 60}
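
Entries shaped like this can feed the "rules that trip most often" metric directly. The sketch below builds an entry with the same fields and tallies blocked requests per rule; the helper names are hypothetical and the in-memory aggregation is an assumption, since a real deployment would more likely query the log store.

```typescript
interface AuditEntry {
  timestamp: string;
  rule: string;
  identifier: string;
  blocked: boolean;
  limit: number;
  windowSeconds: number;
}

// Builds an entry in the same shape as the example above.
function toAuditEntry(scope: string, target: string, identifier: string,
                      blocked: boolean, limit: number, windowSeconds: number,
                      now: Date): AuditEntry {
  return {
    // toISOString() includes milliseconds; trim them to match the example format.
    timestamp: now.toISOString().replace(/\.\d{3}Z$/, "Z"),
    rule: `${scope}-${target}`,
    identifier,
    blocked,
    limit,
    windowSeconds,
  };
}

// Counts blocked requests per rule and returns the top N, most frequent first.
function topTrippingRules(entries: AuditEntry[], topN: number): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    if (entry.blocked) {
      counts.set(entry.rule, (counts.get(entry.rule) ?? 0) + 1);
    }
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN);
}
```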

Performance Considerations

Use Redis Lua scripts so that the increment and the key expiry are applied as one atomic step
Prevent key explosion by normalizing identifiers
Add circuit breakers for Redis outages (fallback rules)
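
As an illustration of identifier normalization, the sketch below collapses request URLs so that `/orders/1` and `/orders/2` share one counter key instead of creating a key each. The specific rules (strip the query string, lowercase, replace numeric and GUID segments with `{id}`) and the function name are assumptions about what "normalizing" could mean here.

```typescript
const guidPattern = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;

// Reduces a raw request URL to a stable target string for use in counter keys.
function normalizePath(rawUrl: string): string {
  // Drop the query string and fragment, then lowercase.
  const pathOnly = rawUrl.split(/[?#]/)[0].toLowerCase();

  const segments = pathOnly
    .split("/")
    .filter((segment) => segment.length > 0) // also removes trailing slashes
    .map((segment) =>
      /^\d+$/.test(segment) || guidPattern.test(segment) ? "{id}" : segment);

  return "/" + segments.join("/");
}
```

For example, `/Orders/12345/items?page=2` becomes `/orders/{id}/items`, so the number of distinct keys is bounded by the number of routes rather than the number of records.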

Fallback strategy:

If Redis is down → apply a temporary, strict system safety limit
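
A minimal sketch of that fallback follows, assuming the Redis-backed check is exposed as an async function that throws when Redis is unreachable. `withFallback`, the `LimitCheck` type, and the single shared in-process counter are all hypothetical; a fuller circuit breaker would also stop calling Redis for a cooldown period after repeated failures.

```typescript
// A rate limit check: resolves to true if the request is allowed.
type LimitCheck = (key: string) => Promise<boolean>;

// Wraps a primary (Redis-backed) check. If it throws, a strict
// in-process fixed-window limit is applied instead of failing open.
function withFallback(primary: LimitCheck, strictLimit: number, windowMs: number,
                      now: () => number = Date.now): LimitCheck {
  let windowStart = now();
  let count = 0;

  return async (key) => {
    try {
      return await primary(key);
    } catch {
      // Redis unavailable: one shared, strict counter for this process.
      const t = now();
      if (t - windowStart >= windowMs) {
        windowStart = t;
        count = 0;
      }
      count += 1;
      return count <= strictLimit;
    }
  };
}
```

Because the fallback counter lives in each process, the effective limit across several API instances is the strict limit multiplied by the instance count, which is worth factoring in when choosing the value.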

Real-World Enhancements

Feature	Description
Dynamic Throttling	Limits tighten when CPU or DB load increases
Plan-Aware Limits	Different rules based on subscription
Whitelisting	Partner systems bypass limits
Application-Level Retry	UI gracefully handles throttling

Example Use Case

A SaaS analytics platform has:

Thousands of tenants
Millions of queries
Expensive search endpoints

Without rate limiting:

A few tenants could overload compute resources
API costs rise
System availability suffers

With multi-layered enforcement:

Each tenant has fair usage
Critical endpoints are protected
System remains predictable during spikes

Summary

A single rate limiting rule is not enough for enterprise systems.
A layered strategy allows fine-grained control, aligned to business rules and system stability.

A mature system applies:

Per-IP protection against public noise
Per-user fairness
Per-endpoint safety
Per-tenant business rules
Global throttling as a safety shield

With Redis for counters, SQL for metadata, .NET for enforcement, and Angular for user experience, the solution becomes scalable, configurable, and secure.