Introduction
Modern applications are exposed to high traffic, automation scripts, spikes, and sometimes malicious abuse.
If every request is processed equally, no matter who sends it, the system becomes unstable.
A multi-layered rate limiting architecture prevents overload by enforcing limits at the IP, user, API endpoint, tenant, and system levels.
This ensures fairness, protects critical resources, prevents denial-of-service issues, and aligns system usage with business plans (free users vs enterprise).
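The goal is to design an architecture where limits are configurable, enforced with low latency, and flexible enough to support the features covered later in this article, such as burst allowance, plan-aware limits, and dynamic throttling.
This article describes a production-focused implementation using Redis for counters, SQL for policy metadata, .NET middleware for enforcement, and Angular for the user experience.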
Architecture Overview
  ┌───────────────────────────────┐
  │            Client             │
  └─────┬──────────────────┬──────┘
        │                  │
        ▼                  ▼
┌──────────────┐  ┌──────────────────┐
│  Angular UI  │  │ Retry/Backoff UI │
└───────┬──────┘  └──────────────────┘
        │
        ▼
┌──────────────────────┐
│ API Gateway / Filter │
└───────┬──────────────┘
        │
        ▼
┌──────────────────────────────────────┐
│ Rate Limit Middleware (Tiered Rules) │
└──────┬─────────────────────┬─────────┘
       │                     │
       ▼                     ▼
┌─────────────┐   ┌────────────────────┐
│ Redis Cache │   │ SQL Metadata Store │
└──────┬──────┘   └──────────┬─────────┘
       │                     │
       └──────────┬──────────┘
                  ▼
           ┌─────────────┐
           │ API Handler │
           └─────────────┘
Strategy Layers
1. IP-Based Limits
Purpose: Block bots, scrapers, unknown traffic.
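Example: 100 requests per minute per IP address across all endpoints, as in the IP row of the example policy entries below.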
2. User-Level Limits
Applied after login.
Example: 30 requests per minute per user on /search, as in the USER row of the example policy entries below.
3. API Endpoint-Level Limits
Some operations are costlier than others.
Examples:
| Endpoint | Limit |
|---|---|
| /auth/login | 5/minute |
| /search/global | 25/minute |
| /download/report | 3/hour |
4. Tenant-Level Limits
Multi-tenant applications need business controls.
Example: 10,000 requests per day per tenant, as in the TENANT row of the example policy entries below.
5. System-Level Safety Throttle
When a traffic spike occurs, limits tighten dynamically.
Metadata Model (SQL)
CREATE TABLE RateLimitPolicy (
    PolicyId      UNIQUEIDENTIFIER PRIMARY KEY,
    Scope         NVARCHAR(50),   -- IP, USER, API, TENANT, GLOBAL
    Target        NVARCHAR(200),  -- endpoint or wildcard
    LimitCount    INT,
    WindowSeconds INT,
    BurstAllowed  BIT DEFAULT 0
);
Example entries
| Scope | Target | Limit | Window (s) |
|---|---|---|---|
| IP | * | 100/min | 60 |
| USER | /search | 30/min | 60 |
| API | /download/report | 3/hour | 3600 |
| TENANT | * | 10k/day | 86400 |
| GLOBAL | * | 1M/hour | 3600 |
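To make the metadata model concrete, the sketch below mirrors the example rows in application code and selects the policies that apply to a request path. The type and function names are hypothetical, and the matching rule (a `*` target matches everything, any other target must match exactly) is an assumption, since the schema comment only says "endpoint or wildcard".

```typescript
// Hypothetical in-memory mirror of the RateLimitPolicy table.
type PolicyScope = "IP" | "USER" | "API" | "TENANT" | "GLOBAL";

interface RateLimitPolicy {
  scope: PolicyScope;
  target: string; // endpoint path, or "*" for any endpoint
  limitCount: number;
  windowSeconds: number;
  burstAllowed: boolean;
}

// The five example rows from the table above.
const examplePolicies: RateLimitPolicy[] = [
  { scope: "IP", target: "*", limitCount: 100, windowSeconds: 60, burstAllowed: false },
  { scope: "USER", target: "/search", limitCount: 30, windowSeconds: 60, burstAllowed: false },
  { scope: "API", target: "/download/report", limitCount: 3, windowSeconds: 3600, burstAllowed: false },
  { scope: "TENANT", target: "*", limitCount: 10000, windowSeconds: 86400, burstAllowed: false },
  { scope: "GLOBAL", target: "*", limitCount: 1000000, windowSeconds: 3600, burstAllowed: false },
];

// Assumption: "*" matches every path; anything else must match exactly.
function policyApplies(policy: RateLimitPolicy, path: string): boolean {
  return policy.target === "*" || policy.target === path;
}

// Return every policy that should be evaluated for the given path.
function policiesForPath(policies: RateLimitPolicy[], path: string): RateLimitPolicy[] {
  return policies.filter((p) => policyApplies(p, path));
}
```

With these rows, a request to `/search` picks up the IP, USER, TENANT, and GLOBAL policies, while a path with no dedicated row only picks up the three wildcard policies.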
Redis Counter Model
Redis stores counters temporarily for speed:
Key format:
ratelimit:{scope}:{target}:{identifier}:{timeWindowBucket}
Example:
ratelimit:USER:/search:user-34:2025-11-21-18:30
Value increments with each request and expires automatically.
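The key format and the increment-then-expire behaviour can be sketched as follows. This is an illustration only: `rateLimitKey` and `InMemoryCounterStore` are hypothetical names, the in-memory store stands in for Redis `INCR` plus `EXPIRE`, and the bucket is assumed to be the UTC start of the current window at minute resolution.

```typescript
// Zero-pad a number to two digits for the timestamp part of the key.
function pad2(n: number): string {
  return n < 10 ? `0${n}` : `${n}`;
}

// Build a key in the article's format. The bucket is the UTC start of the
// current window (assumption: windows are whole minutes or longer).
function rateLimitKey(
  scope: string,
  target: string,
  identifier: string,
  windowSeconds: number,
  nowMs: number,
): string {
  const windowMs = windowSeconds * 1000;
  const start = new Date(Math.floor(nowMs / windowMs) * windowMs);
  const bucket =
    `${start.getUTCFullYear()}-${pad2(start.getUTCMonth() + 1)}-${pad2(start.getUTCDate())}` +
    `-${pad2(start.getUTCHours())}:${pad2(start.getUTCMinutes())}`;
  return `ratelimit:${scope}:${target}:${identifier}:${bucket}`;
}

// In-memory stand-in for Redis INCR + EXPIRE, for illustration only.
class InMemoryCounterStore {
  private entries: { [key: string]: { count: number; expiresAtMs: number } } = {};

  // Increment the counter, creating it with a TTL on first use
  // (or when the previous entry has already expired).
  increment(key: string, ttlSeconds: number, nowMs: number): number {
    const existing = this.entries[key];
    if (existing && existing.expiresAtMs > nowMs) {
      existing.count += 1;
      return existing.count;
    }
    this.entries[key] = { count: 1, expiresAtMs: nowMs + ttlSeconds * 1000 };
    return 1;
  }
}
```

A request is over the limit when the value returned by `increment` exceeds the policy's `LimitCount`. Setting the TTL to the window length lets old buckets disappear without a cleanup job.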
Enforcement Pipeline (.NET)
The middleware checks rules in decreasing priority:
Global → Tenant → API Endpoint → User → IP
The first violated rule blocks the request.
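The ordering logic can be illustrated independently of the middleware. In the sketch below, `ScopedRule`, `orderRules`, and `firstViolation` are hypothetical names, and `currentCounts` is an assumed snapshot of the counters per scope rather than a real Redis lookup.

```typescript
type RuleScope = "GLOBAL" | "TENANT" | "API" | "USER" | "IP";

interface ScopedRule {
  scope: RuleScope;
  limitCount: number;
}

// Evaluation order from the article: Global → Tenant → API Endpoint → User → IP.
const SCOPE_ORDER: RuleScope[] = ["GLOBAL", "TENANT", "API", "USER", "IP"];

// Return a copy of the rules sorted into evaluation order.
function orderRules(rules: ScopedRule[]): ScopedRule[] {
  return rules
    .slice()
    .sort((a, b) => SCOPE_ORDER.indexOf(a.scope) - SCOPE_ORDER.indexOf(b.scope));
}

// Return the first rule whose current count exceeds its limit, or null.
// `currentCounts` is a hypothetical snapshot of counters keyed by scope.
function firstViolation(
  rules: ScopedRule[],
  currentCounts: Record<RuleScope, number>,
): ScopedRule | null {
  for (const rule of orderRules(rules)) {
    if (currentCounts[rule.scope] > rule.limitCount) {
      return rule;
    }
  }
  return null;
}
```

When both the user and IP counters are over their limits, the user rule is reported because it is evaluated first.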
Sample Middleware Snippet
public async Task Invoke(HttpContext context)
{
    // Rules that apply to this request, assumed to arrive in priority order.
    var rules = await _policyStore.GetRules(context);

    foreach (var rule in rules)
    {
        var result = await _limiter.Check(rule, context);
        if (!result.Allowed)
        {
            // The first violated rule short-circuits the pipeline with HTTP 429.
            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            context.Response.Headers["Retry-After"] = result.RetryAfterSeconds.ToString();
            await context.Response.WriteAsync("Rate limit exceeded.");
            return;
        }
    }

    // No rule was violated, so pass the request to the next middleware.
    await _next(context);
}
Angular Handling
Detect HTTP 429
this.http.get(url).subscribe({
  next: data => this.handle(data),
  error: err => {
    if (err.status === 429) {
      this.showRetryMessage(err.headers.get('Retry-After'));
    }
  }
});
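The handler above passes the raw `Retry-After` header to the UI. The HTTP specification allows that header to be either a number of seconds or an HTTP date, and it may be missing entirely, so a small parsing helper is useful. The sketch below is one possible shape; `retryAfterSeconds` and `retryMessage` are hypothetical helpers, not part of Angular.

```typescript
// Convert a Retry-After header value into whole seconds to wait.
// Handles delay-seconds ("38") and HTTP-date forms; returns null if unusable.
function retryAfterSeconds(headerValue: string | null, nowMs: number): number | null {
  if (headerValue === null) return null;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) return parseInt(trimmed, 10);
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) return null;
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

// Build the user-facing text, falling back to a generic message
// when the header is absent or cannot be parsed.
function retryMessage(headerValue: string | null, nowMs: number): string {
  const seconds = retryAfterSeconds(headerValue, nowMs);
  return seconds === null
    ? "You have reached the limit. Please try again shortly."
    : `You have reached the limit. Try again in ${seconds} seconds.`;
}
```

In a component, `showRetryMessage` could call `retryMessage(header, Date.now())` and disable the triggering button until the wait has elapsed.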
User UI Behavior
Example:
"You have reached the limit. Try again in 38 seconds."
Burst Mode Allowance
Burst mode permits short periods of high-volume usage while still capping the long-term average rate.
Technique: Token Bucket Algorithm
Redis keys:
bucket:{user}:{endpoint}
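Each request consumes one token. Tokens are refilled at a fixed rate up to the bucket's capacity, and a request is rejected when the bucket is empty.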
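A minimal in-memory token bucket is sketched below to show the refill and consume steps. The class name, the constructor parameters, and the choice to pass the clock in as `nowMs` are assumptions made for the illustration; a Redis-backed version would keep the token count and last refill time under the `bucket:{user}:{endpoint}` key and update them atomically.

```typescript
// Minimal token bucket. Time is passed in so the behaviour is deterministic.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number, // maximum burst size
    private readonly refillPerSecond: number, // sustained request rate
    nowMs: number,
  ) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Add tokens for the elapsed time, capped at capacity.
  private refill(nowMs: number): void {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
  }

  // Consume one token if available; otherwise reject the request.
  tryConsume(nowMs: number): boolean {
    this.refill(nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With a capacity of 3 and a refill rate of 1 token per second, three back-to-back requests succeed, the fourth is rejected, and one more request is allowed a second later.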
Sliding vs Fixed Window
| Mode | Benefit |
|---|---|
| Fixed Window | Simpler, predictable |
| Sliding Window | Fairer but heavier compute |
| Token Bucket | Smooth traffic, burst friendly |
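The trade-off between the first two rows can be seen in a small comparison. The sketch below implements a fixed window counter and a sliding window log (one of several sliding-window variants) with the same limit. Both class names are hypothetical and the state is held in memory purely for illustration.

```typescript
// Fixed window: one counter per aligned time window.
class FixedWindowCounter {
  private windowIndex = -1;
  private count = 0;

  constructor(private readonly limit: number, private readonly windowMs: number) {}

  tryAcquire(nowMs: number): boolean {
    const index = Math.floor(nowMs / this.windowMs);
    if (index !== this.windowIndex) {
      // A new window has started, so the counter resets.
      this.windowIndex = index;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}

// Sliding window log: keep request timestamps and count those still inside
// the window ending at the current time.
class SlidingWindowLog {
  private timestampsMs: number[] = [];

  constructor(private readonly limit: number, private readonly windowMs: number) {}

  tryAcquire(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have slid out of the window.
    while (this.timestampsMs.length > 0 && this.timestampsMs[0] <= cutoff) {
      this.timestampsMs.shift();
    }
    if (this.timestampsMs.length >= this.limit) return false;
    this.timestampsMs.push(nowMs);
    return true;
  }
}
```

With a limit of 2 per second, the fixed window accepts requests at 900 ms, 950 ms, 1000 ms, and 1050 ms, which is four requests in 150 ms because the counter resets at the boundary. The sliding log rejects the last two, at the cost of storing one timestamp per accepted request.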
Most production systems combine more than one of these modes.
System Observation Metrics
Track:
Requests allowed
Requests blocked
Rules tripping most
Spikes and anomalies
Stored in:
Audit Log Entry Example
{"timestamp": "2025-11-21T18:52:11Z","rule": "USER-/search","identifier": "user-34","blocked": true,"limit": 30,"windowSeconds": 60}
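As a rough illustration of how such entries and the tracked counts might be produced in code, the sketch below builds an entry in the same shape as the example and tallies allowed and blocked requests per rule. `AuditEntry`, `makeAuditEntry`, and `RateLimitMetrics` are hypothetical names, and the in-memory tally is a stand-in for whatever metrics backend is used.

```typescript
interface AuditEntry {
  timestamp: string;
  rule: string;
  identifier: string;
  blocked: boolean;
  limit: number;
  windowSeconds: number;
}

// Build an entry with a second-resolution UTC timestamp, as in the example.
function makeAuditEntry(
  rule: string,
  identifier: string,
  blocked: boolean,
  limit: number,
  windowSeconds: number,
  nowMs: number,
): AuditEntry {
  const timestamp = new Date(nowMs).toISOString().replace(/\.\d{3}Z$/, "Z");
  return { timestamp, rule, identifier, blocked, limit, windowSeconds };
}

// In-memory tally of allowed/blocked requests and which rules trip most.
class RateLimitMetrics {
  allowed = 0;
  blocked = 0;
  private tripsByRule: { [rule: string]: number } = {};

  record(entry: AuditEntry): void {
    if (entry.blocked) {
      this.blocked += 1;
      this.tripsByRule[entry.rule] = (this.tripsByRule[entry.rule] || 0) + 1;
    } else {
      this.allowed += 1;
    }
  }

  // Rule that blocked the most requests so far, or null if none did.
  mostTrippedRule(): string | null {
    let best: string | null = null;
    let bestCount = 0;
    for (const rule in this.tripsByRule) {
      if (!Object.prototype.hasOwnProperty.call(this.tripsByRule, rule)) continue;
      if (this.tripsByRule[rule] > bestCount) {
        best = rule;
        bestCount = this.tripsByRule[rule];
      }
    }
    return best;
  }
}
```

Serializing an entry with `JSON.stringify` yields one JSON object per line, which suits log shippers that expect line-delimited JSON.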
Performance Considerations
Redis Lua scripts should be used for atomic increments
Prevent key explosion by normalizing identifiers
Add circuit breakers for Redis outages (fallback rules)
Fallback strategy:
If Redis Down → Temporary Strict System Safety Limit
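One way to read the fallback rule is as a circuit breaker around the counter store. The sketch below assumes a synchronous, hypothetical `IncrementFn` that throws when the store is unreachable. After a configurable number of consecutive failures it stops calling the store for a cooldown period and enforces a strict local limit instead. All names and thresholds are illustrative.

```typescript
// Hypothetical counter backend: returns the new count for a key, or throws
// when the backing store (Redis in this article) is unreachable.
type IncrementFn = (key: string) => number;

class FallbackLimiter {
  private consecutiveFailures = 0;
  private circuitOpenUntilMs = 0;
  private localWindowIndex = -1;
  private localCount = 0;

  constructor(
    private readonly increment: IncrementFn,
    private readonly normalLimit: number,
    private readonly strictLimit: number, // safety limit while degraded
    private readonly windowMs: number,
    private readonly failureThreshold: number,
    private readonly cooldownMs: number,
  ) {}

  allow(key: string, nowMs: number): boolean {
    // Only call the store while the circuit is closed.
    if (nowMs >= this.circuitOpenUntilMs) {
      try {
        const count = this.increment(key);
        this.consecutiveFailures = 0;
        return count <= this.normalLimit;
      } catch (_err) {
        this.consecutiveFailures += 1;
        if (this.consecutiveFailures >= this.failureThreshold) {
          // Open the circuit: skip the store until the cooldown ends.
          this.circuitOpenUntilMs = nowMs + this.cooldownMs;
        }
      }
    }
    return this.allowLocally(nowMs);
  }

  // Strict process-wide fixed window used while the store is unavailable.
  private allowLocally(nowMs: number): boolean {
    const index = Math.floor(nowMs / this.windowMs);
    if (index !== this.localWindowIndex) {
      this.localWindowIndex = index;
      this.localCount = 0;
    }
    if (this.localCount >= this.strictLimit) return false;
    this.localCount += 1;
    return true;
  }
}
```

The local limit in this sketch is per process, not shared across instances, so `strictLimit` would need to be sized with the number of running instances in mind.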
Real-World Enhancements
| Feature | Description |
|---|---|
| Dynamic Throttling | Limits tighten when CPU or DB load increases |
| Plan-Aware Limits | Different rules based on subscription |
| Whitelisting | Partner systems bypass limits |
| Application-Level Retry | UI gracefully handles throttling |
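Dynamic throttling can be as simple as scaling a configured limit by a load signal. The sketch below is one hypothetical way to do that: `effectiveLimit`, the 0 to 1 load scale, and the 0.7 / 0.95 / 10% thresholds are all assumptions chosen for illustration, not values from this article.

```typescript
// Scale a configured limit down as load rises.
// Assumptions: `load` is a 0..1 utilisation figure (CPU or DB), and the
// thresholds below are illustrative rather than recommended values.
function effectiveLimit(baseLimit: number, load: number): number {
  const softThreshold = 0.7; // start tightening here
  const hardThreshold = 0.95; // at or beyond this, keep only a floor
  const floorFraction = 0.1; // never drop below 10% of the base limit

  if (load <= softThreshold) return baseLimit;
  if (load >= hardThreshold) {
    return Math.max(1, Math.floor(baseLimit * floorFraction));
  }
  // Linear interpolation between the full limit and the floor.
  const t = (load - softThreshold) / (hardThreshold - softThreshold);
  const fraction = 1 - t * (1 - floorFraction);
  return Math.max(1, Math.floor(baseLimit * fraction));
}
```

The limiter would then compare the counter against `effectiveLimit(policy.limitCount, currentLoad)` instead of the stored limit, so the policy table itself does not need to change during a spike.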
Example Use Case
A SaaS analytics platform has:
Without rate limiting:
With multi-layered enforcement:
Each tenant has fair usage
Critical endpoints are protected
System remains predictable during spikes
Summary
A single rate limiting rule is not enough for enterprise systems.
A layered strategy allows fine-grained control, aligned to business rules and system stability.
A mature system applies:
Per-IP protection against public noise
Per-user fairness
Per-endpoint safety
Per-tenant business rules
Global throttling as a safety shield
With Redis for counters, SQL for metadata, .NET for enforcement, and Angular for user experience, the solution becomes scalable, configurable, and secure.