Introduction
Many teams running Rust services on Kubernetes face a frustrating situation: monitoring dashboards show memory usage well under the limit, yet the pod suddenly gets OOMKilled. This often happens without warning and leads to confusion, restarts, and production incidents.
In simple terms, Kubernetes and Rust view memory very differently. Rust focuses on performance and memory reuse, while Kubernetes enforces hard memory limits. When these two perspectives meet, an app can be killed even though it appears healthy. This article explains why this happens, using real-world examples, simple language, and practical guidance.
What Developers Usually See in Production
Before diving into causes, here is what teams commonly report:
Pods get OOMKilled even though dashboards show usage below the limit
Kills cluster around startup or sudden traffic bursts
No application errors appear in the logs before the kill
This leads to a common reaction: “But memory was under the limit.”
Wrong Assumption vs Reality
Wrong assumption: the app only exceeds the memory limit when dashboards show it exceeding the limit.
Reality: Kubernetes kills the container the moment it exceeds the limit, even briefly, and metrics often lag behind real usage.
Think of it like a speed camera: you may see your average speed is fine, but a short spike still triggers a ticket.
How Kubernetes Decides to OOMKill a Pod
Kubernetes enforces memory limits using Linux cgroups.
Key points:
Memory limits are strict, not flexible
Even a short spike can trigger a kill
There is no grace period to free memory; the kernel OOM killer acts immediately
Real-world analogy:
“Your app is allowed into a room with a strict headcount. Even if someone steps in for a few seconds, security throws everyone out immediately.”
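Inside a container, that strict headcount is visible as a cgroup file. A minimal sketch of reading it (this assumes cgroup v2, where the limit lives in `/sys/fs/cgroup/memory.max`; on a machine outside a container the file may be absent or read "max", meaning unlimited):

```rust
use std::fs;

/// Parse the contents of cgroup v2 `memory.max`:
/// the file holds either "max" (no limit) or a byte count.
fn parse_memory_max(raw: &str) -> Option<u64> {
    match raw.trim() {
        "max" => None,
        n => n.parse().ok(),
    }
}

fn main() {
    // Falls back to "max" when not running under a cgroup v2 limit.
    let raw = fs::read_to_string("/sys/fs/cgroup/memory.max")
        .unwrap_or_else(|_| "max".to_string());
    match parse_memory_max(&raw) {
        Some(bytes) => println!("hard memory limit: {} bytes", bytes),
        None => println!("no memory limit set"),
    }
}
```

The moment the cgroup's usage crosses this number, the kernel kills the process; nothing in the application gets a chance to react.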
Why Rust Memory Spikes Are Hard to See
Rust's memory allocator reserves memory aggressively for performance and keeps freed memory around for reuse.
What actually happens:
Memory is reserved quickly during spikes
The allocator keeps memory for reuse
Kubernetes counts memory held by the allocator as used, even if the application has already freed it internally
What developers see:
“Memory looks stable, but the pod keeps dying.”
The spike already happened. The metrics just did not catch it in time.
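One way to make those invisible spikes visible is to track peak usage inside the process itself. A sketch of a wrapper allocator that remembers the high-water mark even after the spike is gone (illustrative, not production-grade accounting; it wraps the system allocator with two atomic counters):

```rust
// A global allocator wrapper that records both current and peak usage,
// so a brief spike stays visible long after dashboards have moved on.

use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static CURRENT: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);

struct PeakTracking;

unsafe impl GlobalAlloc for PeakTracking {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let p = System.alloc(layout);
        if !p.is_null() {
            let now = CURRENT.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            PEAK.fetch_max(now, Ordering::Relaxed); // remember the high-water mark
        }
        p
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
        CURRENT.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static A: PeakTracking = PeakTracking;

fn main() {
    let spike = vec![0u8; 50 * 1024 * 1024]; // brief 50 MB spike
    drop(spike);
    // After the spike, CURRENT is low again, but PEAK remembers it.
    println!(
        "current: {} bytes, peak: {} bytes",
        CURRENT.load(Ordering::Relaxed),
        PEAK.load(Ordering::Relaxed)
    );
}
```

Logging the peak at intervals, or on shutdown, tells you how close the process really came to the limit between metric samples.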
Startup Spikes Are a Common Killer
Rust apps often allocate a lot of memory during startup:
Warming caches before serving traffic
Loading configuration, lookup tables, or other data into memory
Creating connection pools and per-thread buffers
Before vs After Story
Before:
“Our Rust service never even started in Kubernetes.”
After:
“We increased the memory limit slightly and delayed cache warm-up. The service started reliably.”
Startup spikes alone can exceed limits even if steady-state memory is low.
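The "delayed cache warm-up" fix from the story above can be as simple as building the cache lazily on first use, so its allocation does not stack on top of everything else initializing at startup. A minimal sketch (the cache contents here are hypothetical placeholders):

```rust
use std::sync::OnceLock;

// Hypothetical cache of precomputed values; a real service might warm
// this from a database or a file instead.
static CACHE: OnceLock<Vec<u64>> = OnceLock::new();

fn cache() -> &'static Vec<u64> {
    // Built on first access, not at startup, moving the memory spike
    // away from the moment everything else is initializing.
    CACHE.get_or_init(|| (0..1_000_000).map(|i| i * i).collect())
}

fn main() {
    // At startup, nothing has been allocated for the cache yet.
    assert!(CACHE.get().is_none());
    // The first request pays the warm-up cost once.
    println!("entry 10: {}", cache()[10]);
}
```

The trade-off is that the first request is slower; for startup-constrained pods that is usually the right trade.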
Thread Stacks Push Apps Over the Edge
Each Rust thread allocates stack memory.
Real-world example:
“A service using 16 threads at Rust's default 2 MiB stack size silently reserves about 32 MB just for stacks.”
In Kubernetes, this hidden cost can push memory over the limit without obvious signs in application-level metrics.
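When worker threads only do light work, the stack reservation can be reduced explicitly. A sketch using the standard library's thread builder (256 KiB is an illustrative choice, not a recommendation; deep recursion or large stack buffers need more):

```rust
use std::thread;

/// Spawn a worker with an explicit 256 KiB stack instead of the
/// 2 MiB default that Rust gives spawned threads.
fn run_on_small_stack() -> u32 {
    thread::Builder::new()
        .stack_size(256 * 1024)
        .spawn(|| (0..100u32).sum::<u32>())
        .expect("failed to spawn thread")
        .join()
        .expect("worker panicked")
}

fn main() {
    // 16 threads at the 2 MiB default reserve ~32 MiB just for stacks.
    println!(
        "default reservation for 16 threads: {} bytes",
        16 * 2 * 1024 * 1024
    );
    println!("result from small-stack worker: {}", run_on_small_stack());
}
```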
Reserved Memory vs Used Memory Confusion
Kubernetes does not know how much memory your app is actively using. It only sees what the process holds, including memory the allocator has reserved but the app is not currently using.
Example:
“The app is using 400 MB, but Kubernetes sees 520 MB reserved and kills it.”
This makes Rust apps look worse than they are.
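The same reserved-versus-used gap exists in miniature inside every Rust process. A Vec's capacity is reserved memory, its length is used memory, and the excess can be handed back explicitly; a small sketch as an analogy:

```rust
/// Demonstrate the gap between used and reserved memory inside one Vec,
/// and how shrink_to_fit hands the excess back after a spike.
fn used_vs_reserved() -> (usize, usize, usize) {
    let mut buf: Vec<u8> = Vec::with_capacity(1_000_000); // reserve ~1 MB
    buf.extend_from_slice(b"hello");                      // actually use 5 bytes
    let reserved_before = buf.capacity();
    buf.shrink_to_fit();                                  // release the excess
    (buf.len(), reserved_before, buf.capacity())
}

fn main() {
    let (used, before, after) = used_vs_reserved();
    println!("used: {used} B, reserved before: {before} B, after shrink: {after} B");
}
```

From the kernel's point of view, the pre-shrink number is what counts against the cgroup limit, regardless of how little of it the app is touching.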
Why Increasing Memory Limits Often ‘Fixes’ It
Increasing limits works because:
The allocator has room to reuse memory
Startup spikes fit within limits
Short bursts no longer cause kills
But blindly increasing limits increases cloud costs and hides real issues.
When Memory Requests vs Limits Matter
If memory requests are too low:
The scheduler may pack the pod onto a node with too little real memory, making it an early eviction candidate under node memory pressure
If limits are too tight:
Normal startup and traffic spikes trigger OOMKills
Real-world guidance:
“Set requests near average usage and limits with enough headroom for spikes.”
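In a container spec, that guidance looks roughly like this (the numbers are hypothetical and should come from observed usage for your service):

```yaml
# Sketch of a container's resources stanza: request near steady-state
# usage, limit with headroom for startup and short bursts.
resources:
  requests:
    memory: "512Mi"   # roughly the observed average usage
  limits:
    memory: "1Gi"     # headroom for startup spikes and traffic bursts
```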
Why Rust Feels More Sensitive Than Other Languages
Languages with garbage collectors may return memory gradually or pause execution.
Rust:
Allocates eagerly and keeps memory for reuse
Returns memory to the OS less predictably
Has no garbage collector pausing to compact or gradually release memory
This makes Rust faster, but also less forgiving in tight container limits.
How Teams Should Fix This in Practice
If You Are Running APIs
Put a size cap on in-memory caches
Size limits for peak concurrent load, not average traffic
If You Are Running Batch Jobs
Process data in chunks instead of loading whole datasets at once
Budget memory for the job's peak working set, not its steady state
If You Are Using Kubernetes
Do not set limits too close to averages
Watch memory during startup, not just runtime
Correlate OOMKills with logs and startup events
Simple Mental Checklist
When a Rust pod gets OOMKilled, ask:
Did memory spike briefly?
Are thread stacks adding up?
Is the limit too close to average usage?
Are caches unbounded?
Are metrics lagging behind reality?
Usually, the answer points directly to the fix.
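The "unbounded caches" item on the checklist has a simple structural fix: cap the cache's size so it cannot grow without limit. A minimal sketch (the eviction policy here is plain FIFO for brevity; a production service would likely want LRU):

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal bounded cache: evicts the oldest entry once `cap` is reached.
/// The point is only that the cache cannot grow without bound.
struct BoundedCache {
    cap: usize,
    map: HashMap<String, u64>,
    order: VecDeque<String>,
}

impl BoundedCache {
    fn new(cap: usize) -> Self {
        Self { cap, map: HashMap::new(), order: VecDeque::new() }
    }

    fn insert(&mut self, key: String, value: u64) {
        if !self.map.contains_key(&key) {
            if self.order.len() == self.cap {
                // Evict the oldest entry before growing past the cap.
                if let Some(oldest) = self.order.pop_front() {
                    self.map.remove(&oldest);
                }
            }
            self.order.push_back(key.clone());
        }
        self.map.insert(key, value);
    }

    fn get(&self, key: &str) -> Option<u64> {
        self.map.get(key).copied()
    }
}

fn main() {
    let mut cache = BoundedCache::new(2);
    cache.insert("a".into(), 1);
    cache.insert("b".into(), 2);
    cache.insert("c".into(), 3); // "a" is evicted
    println!("a: {:?}, c: {:?}", cache.get("a"), cache.get("c"));
}
```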
Summary
Rust apps get OOMKilled in Kubernetes even when memory looks fine because Kubernetes enforces strict limits, Rust allocators reserve memory aggressively, and monitoring tools lag behind real spikes. What appears as a memory leak is often a short-lived spike or reserved memory being counted as used. By allowing headroom, controlling startup behavior, understanding reserved vs used memory, and setting realistic limits, teams can run Rust services reliably without unnecessary crashes.