When to Use Sliding Log Over Token Bucket
Selecting a rate-limiting algorithm requires balancing mathematical precision against infrastructure constraints. For platform teams and backend engineers, the decision between a sliding log and a token bucket hinges on whether exact request accounting or burst-tolerant throughput is the primary architectural driver.
Architectural Decision Matrix: Precision vs. Throughput
Before selecting a throttling strategy, engineers must evaluate whether strict compliance or high-throughput performance is the primary constraint. Foundational trade-offs are detailed in Core Rate Limiting Algorithms & Theory, which outlines why fixed-window approximations fail under strict SLA requirements.
The sliding log algorithm maintains a precise, timestamped record of every request within a rolling window. Unlike token buckets, which refill capacity at a fixed rate and permit controlled bursts, sliding logs enforce strict, continuous boundaries. This architectural choice is mandated when:
- Exact request counting is required for billing or compliance: Financial reconciliation and regulatory audit trails demand deterministic, non-approximate request tallies.
- Rolling window boundaries must eliminate edge-case spikes: Fixed windows inherently allow boundary doubling (up to 2x the quota when a burst straddles a window edge), and token buckets permit bursts up to bucket capacity. Sliding logs mathematically eliminate both by evaluating a continuous time horizon (see the sketch after this list).
- Infrastructure budget supports O(N) memory scaling: Each request consumes a discrete entry in the backing store. If your platform can absorb linear memory growth in exchange for absolute accuracy, the sliding log is the correct primitive.
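To make the rolling-window semantics concrete, here is a minimal single-process sketch; the `SlidingLog` class name and its parameters are illustrative, and production deployments use shared storage such as the Redis pattern below:

```python
import time
from collections import deque

class SlidingLog:
    def __init__(self, max_requests: int, window_s: float):
        self.max_requests = max_requests
        self.window_s = window_s
        self.log = deque()  # one timestamp per admitted request -> O(N) memory

    def allow(self) -> bool:
        now = time.monotonic()
        # Evict entries that have aged out of the rolling window
        while self.log and self.log[0] <= now - self.window_s:
            self.log.popleft()
        if len(self.log) >= self.max_requests:
            return False  # quota exhausted over the continuous horizon
        self.log.append(now)
        return True

limiter = SlidingLog(max_requests=100, window_s=60.0)
print(limiter.allow())  # True until 100 requests land inside any 60s span
```

Because the window is evaluated continuously against real timestamps, there is no boundary at which the admitted count can double, which is the property fixed windows cannot provide.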
Exact Configuration & Implementation Patterns
Production-grade sliding log counters require atomic execution to prevent race conditions and ensure accurate eviction. The following Redis/Lua implementation provides a single-pass evaluation pattern that runs atomically inside Redis.
Storage & Key Architecture
- Storage Backend: Redis Sorted Set (`ZADD`/`ZREMRANGEBYSCORE`)
- Key Schema: `rl:sliding:{tenant_id}:{route_hash}`
- Member Uniqueness: `timestamp:random_suffix` (a per-request random suffix prevents collisions between concurrent requests)
- Cleanup Strategy: Lazy eviction on read (`ZREMRANGEBYSCORE` executed during quota check)
```lua
-- Atomic sliding log evaluation
-- KEYS[1] = rl:sliding:{tenant_id}:{route_hash}
-- ARGV[1] = max_requests, ARGV[2] = window_ms
-- ARGV[3] = client-generated random suffix (e.g. a UUID); Redis seeds
--           math.random deterministically inside scripts, so uniqueness
--           must come from the caller
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])

-- Derive the timestamp from Redis itself so window boundaries never
-- depend on application server clocks (see failure modes below)
redis.replicate_commands() -- required on Redis 3.2-4.x; a no-op on 5+
local t = redis.call('TIME')
local now = t[1] * 1000 + math.floor(t[2] / 1000) -- milliseconds

-- Evict expired entries outside the rolling window
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)

-- Count active requests in the current window
local count = redis.call('ZCARD', KEYS[1])

-- Enforce limit
if count >= limit then
  return 0
end

-- Log new request with a unique member
redis.call('ZADD', KEYS[1], now, now .. ':' .. ARGV[3])
return 1
```
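A minimal client-side invocation via redis-py might look like the following; the `allow` helper, file name, and default quota values are illustrative:

```python
import uuid
import redis

# Load the Lua script above (saved here as sliding_log.lua) and register it;
# register_script uses EVALSHA with an automatic EVAL fallback.
r = redis.Redis(host="localhost", port=6379)
with open("sliding_log.lua") as f:
    sliding_log = r.register_script(f.read())

def allow(tenant_id: str, route_hash: str,
          limit: int = 100, window_ms: int = 60_000) -> bool:
    key = f"rl:sliding:{tenant_id}:{route_hash}"
    # uuid4 supplies the per-request random suffix expected in ARGV[3]
    return sliding_log(keys=[key], args=[limit, window_ms, uuid.uuid4().hex]) == 1
```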
Configuration Parameters:
| Parameter | Value | Purpose |
|---|---|---|
| `window_ms` | `60000` | Rolling evaluation window (1 minute) |
| `max_requests` | `100` | Hard quota threshold per window |
| `member_uniqueness` | `timestamp:random_suffix` | Guarantees sorted set cardinality matches request count |
| `cleanup_strategy` | `lazy_eviction_on_read` | Defers memory reclamation to quota evaluation cycles |
The sorted set approach guarantees exact cardinality within the rolling window, though memory overhead scales linearly. Implementation benchmarks and memory optimization patterns are documented in Sliding Log Counters.
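As a rough illustration of that linear scaling, the back-of-envelope below estimates per-key memory. The ~100 bytes-per-entry figure is an assumption covering the member string plus sorted-set overhead; verify with `MEMORY USAGE` against your own workload:

```python
# Worst-case cardinality for one tenant key is peak rate x window length
peak_rps = 500
window_s = 60
bytes_per_entry = 100          # assumed: member string + sorted-set overhead
entries = peak_rps * window_s  # worst-case entries inside the rolling window
print(f"~{entries * bytes_per_entry / 1_048_576:.1f} MiB per key")  # ~2.9 MiB
```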
Failure Mode Analysis & Mitigation
| Failure Mode | Trigger | Impact | Mitigation |
|---|---|---|---|
| Memory Exhaustion | High-traffic endpoints with extended rolling windows | Redis OOM eviction bypasses rate limiting entirely | Enforce a strict `maxmemory-policy`, implement shard-by-tenant routing, or cap sorted set cardinality with a hard TTL fallback |
| Clock Skew & Distributed Desync | Multi-region deployments without synchronized NTP | Inaccurate window boundaries cause false 429s or quota leaks | Always derive timestamps via the Redis `TIME` command; never trust application server clocks |
| P99 Latency Amplification | Synchronous `ZCARD` + `ZADD` execution per request | Increased tail latency under burst traffic | Pipeline commands, implement async batch logging, or deploy read replicas for quota checks |
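A sketch of the cardinality-cap-plus-TTL mitigation from the first row might look like this; `MAX_CARDINALITY` and the doubled-window TTL are illustrative choices:

```python
import redis

r = redis.Redis()
MAX_CARDINALITY = 10_000
WINDOW_MS = 60_000

def enforce_memory_guardrails(key: str) -> None:
    # Keep only the newest MAX_CARDINALITY members (lowest scores are oldest)
    r.zremrangebyrank(key, 0, -(MAX_CARDINALITY + 1))
    # Hard TTL fallback: idle keys expire even if lazy eviction never runs
    r.pexpire(key, WINDOW_MS * 2)
```

Running this periodically (or after each quota check) bounds both per-key cardinality and the lifetime of abandoned keys, so a traffic spike cannot push Redis into OOM eviction.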
Validation & Production Rollout Protocol
Migrating from token bucket to sliding log requires rigorous shadow testing to verify quota accuracy without disrupting live traffic. Follow this phased rollout methodology:
- Deploy in Passive Mode: Run the sliding log counter alongside the existing token bucket implementation. Do not enforce `429` responses; only log evaluation results (a minimal shadow-mode harness follows this list).
- Log Discrepancies: Compare exact sliding log counts against token bucket approximations. Track divergence rates during peak traffic and boundary transitions.
- Validate Latency Thresholds: Ensure the `lua_script_execution_time_p99` delta remains under 5ms. If latency exceeds thresholds, optimize the Lua script or scale Redis horizontally.
- Enable Active Enforcement: Switch to blocking mode only after a 7-day stability window confirms zero false positives and acceptable memory growth curves.
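One possible shape for the passive-mode harness, assuming `token_bucket_allow` and `sliding_log_allow` are callables wrapping your existing limiter implementations:

```python
import logging
from typing import Callable

log = logging.getLogger("rl.shadow")

def check_rate_limit(tenant_id: str, route_hash: str,
                     token_bucket_allow: Callable[[str, str], bool],
                     sliding_log_allow: Callable[[str, str], bool]) -> bool:
    enforced = token_bucket_allow(tenant_id, route_hash)  # live decision
    shadow = sliding_log_allow(tenant_id, route_hash)     # logged only
    if enforced != shadow:
        log.info("rl_divergence tenant=%s route=%s bucket=%s log=%s",
                 tenant_id, route_hash, enforced, shadow)
    return enforced  # the sliding log result never blocks traffic in this phase
```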
Critical Monitoring Metrics
- `sorted_set_cardinality_growth_rate`: Tracks linear memory consumption per tenant
- `lua_script_execution_time_p99`: Ensures atomic evaluation does not degrade API response times
- `false_positive_429_ratio`: Validates exact quota enforcement accuracy
- `redis_memory_fragmentation_ratio`: Detects underlying allocator inefficiencies during high-churn eviction cycles
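For the two Redis-side metrics, a sampling sketch via redis-py might look like this; `emit` stands in for your metrics client, while the latency and false-positive ratios are measured application-side from script timings and shadow-mode comparisons:

```python
import redis

r = redis.Redis()

def sample_metrics(key: str, emit) -> None:
    # Emit raw cardinality; derive the growth rate in your metrics pipeline
    emit("sorted_set_cardinality_growth_rate", r.zcard(key))
    # INFO MEMORY exposes the allocator fragmentation ratio directly
    frag = r.info("memory").get("mem_fragmentation_ratio")
    emit("redis_memory_fragmentation_ratio", frag)
```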
Key Takeaways
- Sliding log is strictly superior for compliance, billing, and exact quota enforcement. It eliminates boundary artifacts and provides deterministic request accounting.
- Token bucket remains optimal for burst-tolerant, memory-constrained edge proxies. Use it when throughput and low-latency approximation outweigh strict counting requirements.
- Always implement circuit breakers to degrade gracefully to token bucket on Redis failure. Maintain fallback routing to prevent complete API unavailability during storage outages.
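As one possible shape for that circuit-breaker fallback (the minimal `TokenBucket` stand-in and the exception choices are illustrative, not a production limiter):

```python
import time
import redis

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at bucket capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

fallback = TokenBucket(rate=100 / 60, capacity=100)  # mirror the 100 req/min quota

def allow_with_fallback(key: str, sliding_log_allow) -> bool:
    try:
        return sliding_log_allow(key)  # exact path (Redis sliding log)
    except (redis.exceptions.ConnectionError, redis.exceptions.TimeoutError):
        return fallback.allow()        # approximate path keeps the API available
```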