Sliding Log Memory Footprint Tuning

Q: What maxmemory-policy should a limiter Redis use?

Use noeviction or volatile-ttl, never allkeys-lru. Under allkeys-lru Redis can evict a live limiter key during memory pressure, resetting that client's quota to zero and removing the limit. Failing loudly on a full instance is safer than silently un-limiting traffic.

A sliding log buys you exact request accounting, and it bills you for that exactness in memory: every request inside the window is a discrete entry you must store, sort, and prune. This guide sits under the Sliding Log Counters parent topic and gives you the per-key math, the real Redis sorted-set overhead, and the levers — log-length caps, approximate alternatives, and TTL eviction — that keep a log affordable at scale. The audience is platform engineers sizing Redis for a billing-critical limiter who need numbers, not hand-waving.

The problem in concrete numbers

Say you enforce 1,000 requests/minute per API key with a sliding log, and at peak you have 50,000 active keys each near their limit. Naively you might budget 50,000 × 1,000 × 8 bytes ≈ 400 MB for the timestamps. The real figure is several times higher, because a Redis sorted set (ZSET) is not a packed array of 8-byte doubles — it is a hash table plus a skiplist, and each member carries the timestamp string, a double score, and skiplist node pointers. Under-budget this and Redis hits maxmemory, your eviction policy fires, and a silently evicted key stops rate-limiting entirely. The diagram below breaks down where the bytes actually go.

Per-key memory math

Start from the entry count, then multiply by realistic per-member overhead.

Entries per key is bounded by the limit, not the window: a 1,000/min limit stores at most ~1,000 live members per key regardless of how the minute is spread.
Bytes per member on a 64-bit Redis with the skiplist encoding is typically 64–100 bytes, once you add the member string (a timestamp:random string to keep members unique), the score, the dictionary entry, and skiplist pointers. Small sets use the compact listpack encoding and cost far less, but a 1,000-entry set is well past zset-max-listpack-entries (default 128) and will be a full skiplist.
Per-key fixed cost adds the key name, the top-level ZSET object, and expiry metadata — order of a few hundred bytes, amortized away once the set is large.

So the working estimate is active_keys × entries_per_key × ~85 bytes. For the example above that is ≈ 4 GB, an order of magnitude over the naive doubles-only guess. Always validate against your own build with MEMORY USAGE rl:sliding:{key} on a representative key rather than trusting the formula blind.

Decision table: memory vs accuracy

Option	Per-key memory	Accuracy	When to use
Full sliding log	O(limit), ~85 B/req	Exact, audit-grade	Billing, compliance, hard quotas
Capped log (ZREMRANGEBYRANK)	O(cap), bounded	Exact up to the cap; over-cap requests under-counted	High limits where only recent N events matter
Sliding-window approximation	O(1), 2 counters	±(prev-window weighting error)	Large limits, millions of keys, soft SLAs
Token bucket	O(1), 2 fields	Burst-tolerant, no per-event record	Traffic shaping where exact counts are not required
Probabilistic (e.g. sampled log)	O(1/sample-rate)	Statistical, not exact	Telemetry-grade limits, abuse heuristics

The full log is the only row that survives an audit; everything below it trades exactness for a flat memory profile. Most platforms run the log only on billing-critical routes and an approximation everywhere else.

Step-by-step: bound the footprint

Implement the controls in order; each one caps a different growth axis.

Measure real per-member cost with MEMORY USAGE Measure real per-member cost with `MEMORY USAGE` on a key filled to the limit, then size capacity from the measured number, not the formula.
Prune on every evaluation with ZREMRANGEBYSCORE key -inf (now-window) Prune on every evaluation with `ZREMRANGEBYSCORE key -inf (now-window)` so expired members never accumulate between requests.
Cap log length with ZREMRANGEBYRANK key 0 -(cap+1) Cap log length with `ZREMRANGEBYRANK key 0 -(cap+1)` to hard-bound entries per key when the limit is very high.
Set a PEXPIRE of window + slack Set a `PEXPIRE` of `window + slack` on every write so abandoned keys self-evict and never linger past one window.
Configure maxmemory-policy to noeviction (or volatile-ttl Configure `maxmemory-policy` to `noeviction` (or `volatile-ttl`) so a memory ceiling fails loudly instead of silently dropping live limiter keys.
Alert on used_memory Alert on `used_memory` and per-key cardinality so growth is visible before it reaches the ceiling.

-- Capped, self-pruning sliding log. KEYS[1]=key.
-- ARGV: now(ms), window(ms), limit, cap (max stored members), slack(ms)
local key    = KEYS[1]
local now    = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit  = tonumber(ARGV[3])
local cap    = tonumber(ARGV[4])
local slack  = tonumber(ARGV[5])

-- 1) Drop everything older than the rolling horizon (bounds memory by recency).
redis.call('ZREMRANGEBYSCORE', key, '-inf', '(' .. (now - window))

-- 2) Hard cap: keep only the most recent `cap` members (bounds memory by count).
--    Negative ranks index from the end; remove the oldest beyond the cap.
local size = redis.call('ZCARD', key)
if size > cap then
  redis.call('ZREMRANGEBYRANK', key, 0, size - cap - 1)
end

-- 3) Count and decide. Cardinality now reflects live, capped entries.
local count = redis.call('ZCARD', key)
if count >= limit then
  return { 0, count }                       -- denied
end

-- 4) Log this request with a unique member, then refresh TTL so idle keys die.
redis.call('ZADD', key, now, now .. ':' .. math.random(1000000))
redis.call('PEXPIRE', key, window + slack)
return { 1, count + 1 }

The cap and the TTL are the two load-bearing lines: the cap bounds a single hot key from ballooning past cap members even if a client floods it, and the PEXPIRE guarantees that the tens of thousands of one-shot keys created by scanners are reclaimed within one window rather than living until a manual sweep.

Approximate alternatives when O(N) is too expensive

When the full log will not fit, two substitutions keep exactness where it matters and shed it where it does not:

Sliding-window approximation — replace the log with two fixed-window counters and weight the previous window by its remaining overlap, as covered in implementing sliding window in memory. This is O(1) per key and is accurate to within a few percent for steady traffic, the standard choice for high-cardinality, non-billing limits.
Capped log — keep an exact log but cap stored members at, say, 1.2 × limit. You retain exact enforcement at and just above the limit (the only region that matters for a 429 decision) while refusing to store an unbounded tail during a flood.

Reserve the uncapped log for routes where every request must be reconcilable; for sub-millisecond decisioning under multi-node load, fold in the consistency rules from distributed algorithm sync so capped pruning stays deterministic across Redis nodes.

Gotchas & edge cases

maxmemory-policy allkeys-lru silently disables limits. If Redis evicts a hot limiter key under memory pressure, that key’s quota resets to zero and the client is briefly unlimited. Use noeviction or volatile-ttl for limiter keyspaces and alert on evictions.
Capping changes semantics during a flood. Once you cap entries, a client sending far above the limit is under-counted past the cap. That is fine for a deny decision (already at limit) but means the log is no longer a complete audit trail above the cap — do not cap a billing log.
Unique-member randomness inflates the string. now .. ':' .. random keeps members unique but lengthens the member string; keep the suffix short (a small random int) to avoid paying extra bytes on every entry.
ZCARD after pruning, not before. Counting before ZREMRANGEBYSCORE over-counts expired members and produces false 429s; always prune first within the same atomic script.
TTL slack must exceed clock skew. A PEXPIRE of exactly window can evict a key whose newest member is still live if node clocks disagree; add a slack margin (10% of window) so live keys never expire early.

Verification & testing

Drive a key to its limit, then inspect the real footprint and confirm the cap holds.

# Fill one key to ~1000 members, then measure its true memory and cardinality.
KEY="rl:sliding:acct_42:search"
for i in $(seq 1 1000); do
  redis-cli ZADD "$KEY" "$(date +%s%3N)" "$i:$RANDOM" >/dev/null
done
redis-cli MEMORY USAGE "$KEY"   # bytes for this single key — multiply by active keys
redis-cli ZCARD "$KEY"          # confirm the cap bounds this at your configured cap

Multiply MEMORY USAGE by your projected active-key count, compare against the Redis maxmemory headroom, and watch redis_memory_used_bytes plus per-key cardinality under load; see Prometheus metrics for rate limiting for wiring those gauges into alerts.

Frequently Asked Questions

Why is my sliding log using far more memory than limit × 8 bytes?

Because a Redis sorted set is a hash table plus a skiplist, not a packed array of doubles. Each member carries a member string, an 8-byte score, a dictionary entry, and skiplist node pointers — typically 64–100 bytes per request on a 64-bit build. Measure with MEMORY USAGE rather than assuming the doubles-only figure.

Does capping the log break exact rate limiting?

Not for the deny decision. If you cap at or just above the limit, the cardinality is still exact in the region that decides a 429. What you lose is the complete record above the cap, so do not cap a log you rely on for billing reconciliation — only cap when the log is purely an enforcement primitive.

What maxmemory-policy should a limiter Redis use?

Use noeviction or volatile-ttl, never allkeys-lru. Under allkeys-lru Redis can evict a live limiter key during memory pressure, resetting that client's quota to zero and removing the limit entirely. Failing loudly on a full instance is safer than silently un-limiting traffic.

When should I drop the log for the sliding-window approximation?

When you have high per-key limits times high key cardinality and the route is not billing- or compliance-critical. The two-counter weighted approximation is O(1) per key and accurate to within a few percent for steady traffic, so it scales to millions of keys at a flat memory cost the log cannot match.

Sliding Log Counters — the parent topic with the full log implementation and Redis patterns.
When to Use Sliding Log Over Token Bucket — the accuracy-versus-throughput decision this tuning supports.
Implementing Sliding Window in Memory — the O(1) weighted approximation that replaces the log at scale.
Distributed Algorithm Sync — keeping pruning deterministic across Redis nodes.