Cache Strategies: Cross-Chapter Comparison Reference
comparisons cache redis memcached
Type: Cross-chapter reference — NOT a chapter notes file
Covers: Vol 1 Ch01, Ch04, Ch06, Ch08, Ch11, Ch14, Ch15 · Vol 2 Ch05, Ch10, Ch13
Last Updated: 2026-04-13
1. Cache Placement Strategies
Where you place a cache determines latency, control, and consistency trade-offs.
| Placement | Location | Latency | Who controls TTL | Typical use |
|---|---|---|---|---|
| Client-side | Browser, mobile app memory | Sub-ms | Client | Static assets, user prefs |
| CDN / Edge | ISP-adjacent PoPs worldwide | ~5–20ms from user | CDN + origin headers | Images, JS/CSS, video segments |
| Load balancer | Reverse proxy (NGINX, HAProxy) | ~1ms | Proxy config | Session affinity, small static pages |
| App-level cache | In-process memory (guava, caffeine) | Sub-ms | Application code | Hot config data, per-JVM cache |
| Distributed cache | Redis / Memcached cluster | ~1ms network | Application code | Session data, DB query results, counters |
| Database query cache | DB engine buffer pool (InnoDB buffer) | Transparent | DB engine | Hot table rows, index pages |
| Object storage cache | CDN in front of S3 | ~10–50ms | CDN + origin | Video chunks (YouTube), file downloads |
Rule of thumb: Cache as close to the user as possible. Move further in only when you need write consistency or shared state across servers.
2. Cache Eviction Policies
When the cache is full, one of these decides what gets removed.
| Policy | Full name | Best for | Worst for | Memory overhead | Complexity |
|---|---|---|---|---|---|
| LRU | Least Recently Used | General purpose; temporal locality | Scan-heavy workloads (evicts hot data on sequential scan) | Low (doubly-linked list + hashmap) | Low |
| LFU | Least Frequently Used | Stable hot-set (recommendations, leaderboard) | New entries starved out early; bursty access patterns | Medium (frequency counter per key) | Medium |
| FIFO | First In First Out | Simple queue semantics | No regard for hotness | Very low | Very low |
| Random | Evict a random entry | Approximating LRU cheaply | Unpredictable eviction of hot data | Negligible | Very low |
| TTL-based | Expire after N seconds | Time-sensitive data (sessions, tokens, rate limit windows) | Long-lived data forced to re-fetch unnecessarily | Minimal | Low |
| SLRU | Segmented LRU | Mixed hot/warm workloads (DB buffer pools) | Added implementation complexity | Low | Medium |
Redis default: LRU (configurable per policy via maxmemory-policy). Common choices: allkeys-lru, volatile-lru, allkeys-lfu.
SDI relevance:
- Gaming leaderboard (Vol 2 Ch10): LFU or sorted set with explicit TTL keeps hot scores
- Rate limiter (Vol 1 Ch04): TTL-based expiry of counters per window
- URL shortener (Vol 1 Ch08): LRU — recent redirects are hottest
3. Cache Invalidation Strategies
The hardest problem in caching is knowing when cached data is stale.
| Strategy | Write path | Read path | Consistency | Write latency | Risk | When to use |
|---|---|---|---|---|---|---|
| TTL expiry | Write to DB only | Read cache; miss → DB → cache | Eventual (stale until TTL) | Low | Serves stale data up to TTL | Session tokens, rate-limit windows, feed previews |
| Cache-aside (lazy loading) | Write to DB; optionally delete cache key | Check cache; miss → DB → populate cache | Eventual | Low | Thundering herd on popular miss | Most general-purpose reads; URL shortener, news feed |
| Write-through | Write to cache AND DB synchronously | Always hit cache | Strong | Higher (2 writes) | Wasted writes for rarely-read data | User profile, session store — data read soon after write |
| Write-around | Write directly to DB, bypass cache | Read cache; miss → DB → populate cache | Eventual | Low | Cold reads after every write | Large file metadata, infrequently-read blobs |
| Write-back (write-behind) | Write to cache; async flush to DB | Read from cache | Eventual | Very low | Data loss if cache crashes before flush | High-throughput counters, view counts, like counts |
Pros/cons at a glance:
- Write-through: No stale reads, but wastes writes on cold keys
- Cache-aside: Only cache what’s needed, but first request always slow
- Write-back: Fastest writes, but durability risk — never use for financial data
- Write-around: Best when writes are rarely re-read (video upload metadata)
4. Distributed Cache Challenges
Cache Stampede (Thundering Herd)
Problem: Popular cache key expires → thousands of requests simultaneously miss and all query the DB.
Solutions:
- Mutex / distributed lock: First request acquires lock, fetches from DB, populates cache. Others wait.
- Probabilistic early expiration: Randomly refresh cache slightly before TTL expires based on request rate.
- Background refresh: Background job pre-warms cache before TTL expires (News Feed feed rebuilding).
- Stale-while-revalidate: Serve stale data while async refresh happens.
SDI example: News feed (Vol 1 Ch11) — celebrity posts expiring could cause stampede; solved with pre-computation and background refresh.
Cache Penetration (Null Value Attacks)
Problem: Requests for keys that don’t exist in DB (e.g., invalid user IDs) always miss cache and hit DB.
Solutions:
- Cache null values: Store
nullor sentinel in cache with short TTL (e.g., 30 sec). - Bloom filter: Before hitting cache, check a bloom filter. If key definitely not in DB → reject immediately.
SDI example: URL shortener (Vol 1 Ch08) — invalid short codes shouldn’t fan through to MySQL.
Cache Avalanche (Mass Expiry)
Problem: Many cache keys expire at the same time → sudden DB spike.
Solutions:
- Stagger TTLs: Add random jitter to TTL (
TTL = base_ttl + random(0, 300)) to spread expiry. - Warm cache on startup: Pre-load hot keys before traffic hits.
- Circuit breaker on DB: If DB is overwhelmed, return degraded response rather than cascading failure.
SDI example: Metrics monitoring (Vol 2 Ch05) — downsampled aggregates written with staggered TTLs to avoid mass expiry at the same minute boundary.
Cache Hotspot (Hot Key Problem)
Problem: One key gets disproportionate traffic (celebrity, viral post, trending item) → single Redis node bottleneck.
Solutions:
- Local in-process cache: Cache the hot key in each app server’s memory (short TTL, 1–5 sec).
- Key replication: Replicate hot key to multiple Redis slots (
hot_key:shard_0,hot_key:shard_1), read from random shard. - Read replicas: Route reads to Redis replicas, not just primary.
SDI examples:
- News feed (Vol 1 Ch11): Celebrity users with millions of followers create hot keys in fanout-on-write; hybrid model mitigates this.
- Gaming leaderboard (Vol 2 Ch10): Top-ranked players’ scores accessed constantly.
- Ad click aggregation (Vol 2 Ch06): Top-100 ads endpoint → Redis sorted set hotspot.
5. Which SDI Chapter Uses What Cache Strategy
| Chapter | Cache technology | What is cached | Strategy | Why |
|---|---|---|---|---|
| Vol 1 Ch04 — Rate Limiter | Redis | Request counters per user/IP | TTL expiry, atomic INCR | Sub-ms counter ops; TTL auto-resets windows |
| Vol 1 Ch06 — Key-Value Store | Redis / custom | The store itself is a cache | Write-through or write-back | Core topic — Dynamo-style design |
| Vol 1 Ch08 — URL Shortener | Redis | shortCode → longURL mapping | Cache-aside + TTL | Read-heavy redirects; 80% hit rate expected |
| Vol 1 Ch10 — Notification System | Redis | User device tokens, preferences | Cache-aside | Avoid DB on every notification fan-out |
| Vol 1 Ch11 — News Feed | Redis | Pre-built timelines per user | Write-through on fanout | Feed must load < 200ms; can’t query DB on read |
| Vol 1 Ch12 — Chat System | Redis | Online presence, last-seen | TTL (heartbeat refresh) | Presence expires naturally if heartbeat stops |
| Vol 1 Ch13 — Search Autocomplete | Redis / Trie cache | Top-K suggestions per prefix | Write-through (batch) | Autocomplete must respond < 100ms |
| Vol 1 Ch14 — YouTube | CDN + Redis | Video segments (CDN); video metadata (Redis) | Pull CDN; Cache-aside | Serve video from edge; metadata on API calls |
| Vol 1 Ch15 — Google Drive | Redis | File metadata, chunk upload state | Cache-aside | Metadata reads vastly outnumber writes |
| Vol 2 Ch05 — Metrics Monitoring | Redis | Recent metric aggregates | Write-through (streaming results) | Fast dashboard queries against recent data |
| Vol 2 Ch06 — Ad Click Aggregation | Redis | Real-time click counts (sorted sets) | Write-back to batch layer | High-frequency updates to counts |
| Vol 2 Ch10 — Gaming Leaderboard | Redis Sorted Set | Full leaderboard | Write-through on score update | Sorted set ZADD/ZRANK are O(log N) |
| Vol 2 Ch13 — Stock Exchange | Redis | Order book snapshots, market data | Write-through (microsecond) | Matching engine state must be instantly readable |
6. Key Numbers
Redis Performance
| Metric | Value |
|---|---|
| Throughput (single node) | ~100K–1M ops/sec (simple GET/SET) |
| Read latency (local) | < 1ms (sub-millisecond) |
| Write latency (local) | < 1ms |
| Max memory per node | ~100–300 GB typical deployment |
| Persistence overhead | ~10–30% write slowdown with AOF |
| Cluster max nodes | 1000 nodes (official recommendation) |
Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLog | String (key-value) only |
| Persistence | Yes (RDB snapshots + AOF log) | No (memory only) |
| Replication | Master-replica + Sentinel | No built-in replication |
| Cluster mode | Yes (Redis Cluster with hash slots) | Yes (client-side sharding) |
| Pub/Sub | Yes | No |
| Atomic operations | Yes (INCR, LPUSH, ZADD are atomic) | Yes (CAS) |
| Multithreading | Single-threaded (I/O multi-threaded in 6.x) | Multithreaded |
| Memory efficiency | ~2–3x overhead vs raw data | Slightly more memory efficient |
| Typical use in SDI | Sessions, counters, leaderboards, timelines, pub/sub | Simple key-value cache at extreme throughput |
Choose Redis when you need data structures (sorted sets for leaderboard, lists for queues, pub/sub for real-time).
Choose Memcached when you need pure key-value at maximum throughput with minimum overhead and zero persistence needed.
7. Interview Decision Framework
“Which cache strategy should I choose?”
START
│
▼
Is the data written frequently AND re-read immediately?
├── Yes → Write-through (keep cache warm on every write)
└── No
│
▼
Is the data written frequently but rarely re-read right after?
├── Yes → Write-around (bypass cache on write; lazy load on read)
└── No
│
▼
Is ultra-low write latency critical (e.g., counters, view counts)?
├── Yes → Write-back (async flush) — ONLY if data loss is tolerable
└── No → Cache-aside (lazy loading) — DEFAULT CHOICE
│
▼
Add TTL to all keys to prevent stale data buildup
Add jitter to TTL to prevent cache avalanche
Consider null-value caching to prevent penetration
Quick rules for SDI interviews:
- Default to cache-aside with TTL — works for 80% of cases
- Use write-through when cache miss = visible latency (news feed, leaderboard)
- Use write-back only for high-frequency low-stakes writes (view counts, click counts)
- Always ask: “What happens on cache miss?” and “What happens when cache is down?”
- For distributed systems: mention thundering herd protection (mutex or stagger TTL)
- Redis sorted set = leaderboard; Redis counter + TTL = rate limiting; Redis list = message queue
See also:
- key-patterns > 1. Caching Strategies — cache invalidation and write strategies
- distributed-system-components > 14. Cache — Redis vs Memcached, eviction policies
- ch04-rate-limiter — Redis counter pattern
- ch11-news-feed — Timeline cache and fanout
- ch10-gaming-leaderboard — Redis sorted set as cache + data store