Chapter 10 Flashcards - Real-time Gaming Leaderboard
flashcards volume2 leaderboard redis sorted-set real-time
What Redis data structure powers a real-time leaderboard, and why?
?
Redis Sorted Set (ZSET). Each member is a player_id; each member has an associated score (float). Redis maintains members in sorted order at all times using a skip list internally. Operations: O(log N) insert/update, O(log N) rank lookup, O(log N + K) top-K range query. Compare to SQL ORDER BY which requires a full index scan O(N) for rank lookup. Purpose-built for leaderboard-style ranked queries.
What are the four most important Redis Sorted Set commands for a leaderboard?
?
ZADD leaderboard score member → Add or update a player’s score. O(log N). ZREVRANGE leaderboard 0 9 WITHSCORES → Top 10 players (highest to lowest). O(log N + K). ZREVRANK leaderboard member → Player’s 0-indexed rank (highest score = rank 0). O(log N). ZINCRBY leaderboard delta member → Atomically increment a player’s score. O(log N). These four cover: update, top-N display, user rank lookup, and score increment.
What does ZINCRBY do and why is it preferred over ZADD for score updates?
?
ZINCRBY leaderboard:chess 100 “alice” atomically adds 100 to alice’s current score and updates her position in the sorted set. ZADD with a new absolute score requires first reading the current score (GET), adding, then writing back — a read-modify-write that risks race conditions under concurrent updates. ZINCRBY is a single atomic Redis operation. Since games award delta points (earned 100 points this round), ZINCRBY maps directly to the use case.
What is the time complexity of each Redis Sorted Set leaderboard operation?
?
ZADD (add/update): O(log N). ZINCRBY (increment): O(log N). ZREVRANGE top-K: O(log N + K) — walk the skip list to start position + scan K elements. ZREVRANK (rank lookup): O(log N) — no counting loop needed. ZSCORE (get score): O(1) — direct hash lookup. ZCARD (total count): O(1). Compare rank lookup to SQL: ZREVRANK is O(log N) vs SQL COUNT query O(N). At 5M players, O(log 5M) ≈ 23 operations vs O(5M).
Why does SQL ORDER BY fail for leaderboard rank lookups at scale?
?
Finding a player’s rank requires: SELECT COUNT(*) + 1 FROM scores WHERE game_id = X AND score > (player’s score). At 5M players, this is a full table scan through the index — O(N). Even with an index on (game_id, score), counting all rows with a higher score remains O(N) in the worst case. Redis ZREVRANK does the same in O(log N) because the sorted set skip list stores rank information structurally. At 5M players this is the difference between 23 operations vs 5 million.
What key naming convention should you use for game leaderboards in Redis?
?
Use descriptive composite keys: leaderboard:{game_id} for global leaderboard, leaderboard:{game_id}:weekly:{year_week} for weekly (e.g., leaderboard:chess:weekly:202415), leaderboard:{game_id}:monthly:{year_month} for monthly (e.g., leaderboard:chess:monthly:2024-04), leaderboard:{game_id}:region:{region_code} for regional. One sorted set per scope per game. Each gets its own ZADD/ZINCRBY writes. Score updates can fan out to multiple keys atomically using Redis pipelines.
How do you implement score tie-breaking in a Redis Sorted Set?
?
Encode a composite score as a single float: composite = score × 10^13 - timestamp_ms. Higher score → higher composite. Equal score → whoever reached it earlier (smaller timestamp) gets larger composite → ranks higher. Store composite in Redis instead of raw score. Example: Alice score=1450 at t=1000 → 14499999999000. Bob score=1450 at t=2000 → 14499999998000. ZREVRANK will return Alice above Bob. The tiebreak is deterministic and fair (first to achieve wins) with no extra fields.
How do you persist Redis Sorted Set leaderboard data?
?
AOF (Append-Only File): Every write command appended to disk log. On restart, Redis replays log. Set appendfsync everysec for at most 1 second of data loss. RDB snapshots: Periodic point-in-time snapshots. Faster restart but more data loss between snapshots. MySQL dual write: Score Processor writes to both Redis AND MySQL. MySQL = source of truth. On Redis failure, rebuild sorted set from MySQL: SELECT all scores → bulk ZADD. Production recommendation: Redis AOF for fast recovery + MySQL for full durability and history.
What is the difference between Redis AOF and RDB persistence?
?
AOF (Append-Only File): Logs every write command. On restart, replay all commands to reconstruct state. Config: appendfsync always (no loss), everysec (1 sec loss), no (OS decides). Grows large over time — needs periodic BGREWRITEAOF to compact. RDB (Redis Database Snapshot): Fork-based point-in-time dump of all data. Compact binary file. Faster restart (load snapshot vs replay log). More data loss (e.g., 5 min if snapshot every 5 min). For leaderboards: use AOF with everysec + RDB as backup. AOF = durability, RDB = fast recovery.
How do you implement historical (weekly/monthly) leaderboards?
?
Use separate Redis sorted sets per time period rather than resetting in place. On each score update: ZINCRBY leaderboard:chess delta alice AND ZINCRBY leaderboard:chess:weekly:202415 delta alice (fan out). At period end: 1) Read top-N from Redis (ZREVRANGE). 2) Archive to MySQL (weekly_archive table). 3) Set Redis EXPIRE on old key (7-day grace period for queries). 4) New period starts with a fresh key (no reset, just a new key). Never reset in-place — risks data corruption on failure mid-reset.
What happens when a weekly leaderboard period ends — walk through the transition.
?
At Sunday midnight: 1. ZREVRANGE leaderboard:chess:weekly:202415 0 999 WITHSCORES → get top-1000 final standings. 2. INSERT INTO weekly_archive (game_id, week, rank, player_id, score) VALUES … → persist to MySQL. 3. EXPIRE leaderboard:chess:weekly:202415 604800 → keep Redis key alive 7 more days for recent-history queries. 4. New key leaderboard:chess:weekly:202416 is created automatically on first ZINCRBY of the new week. Result: Clean transition, no data loss, history preserved in MySQL, short-term queries served from Redis.
How do you shard a leaderboard across multiple Redis instances for very large games?
?
Shard by hash(player_id) % N across N Redis instances. Each shard is a sorted set holding a subset of players (by ID). Score update: route to shard = hash(player_id) % N, apply ZINCRBY there. Top-10 global: Query all N shards for their top-10, then K-way merge in the application layer (min-heap, O(N log K)). User rank: Get user’s score from their shard, then ZCOUNT on each other shard for players above that score, sum = global rank. Use when single game has > 50–100M active players. 5M DAU fits easily in one sorted set.
How do you compute a global top-10 by merging results from N Redis shards?
?
K-way merge: 1. Query each of the N shards: ZREVRANGE shard_k 0 9 WITHSCORES → N lists of up to 10 players. 2. Load all results into a min-heap of size 10 (keyed by score). 3. Extract top-10 from the heap. Cost: O(N × 10 × log(10)) ≈ O(N) — very fast for small N. This is the standard merge approach for distributed sorted data. Alternative: use ZUNIONSTORE (Redis union) if all shards are on the same Redis Cluster — Redis can merge sorted sets server-side. ZUNIONSTORE destination N key1 key2 … → O(N × K log K).
How does a score update pipeline using Kafka work?
?
Game Server emits event: {game_id, player_id, delta, event_id, timestamp} → Kafka topic “score-events” (partitioned by game_id for ordering). Score Processor (Kafka consumer): 1. Check event_id for deduplication (store seen IDs in Redis/DB). 2. Validate delta and player. 3. ZINCRBY Redis sorted set. 4. UPDATE MySQL player_scores. 5. Commit Kafka offset. Benefits: Decouples game server from score storage, buffers traffic spikes (tournaments), enables replay on failure, multiple consumers (leaderboard, analytics, notifications).
Why partition Kafka score events by game_id?
?
Partitioning by game_id ensures all score events for a given game are processed in order by the same consumer. This matters for deduplication (event_id tracking per game) and for consistent ranking updates (no out-of-order score decrements from replay). It also means a single consumer handles all events for one game, making state management simpler. If one game has enormous volume, split to multiple partitions using composite key: hash(game_id + player_id). Each consumer in the consumer group owns a subset of partitions.
What does idempotency mean in the context of score updates, and how is it implemented?
?
Idempotency means processing the same score event twice has the same effect as processing it once. Without it, Kafka replay (on consumer failure) would double-count points. Implementation: Each game event has a unique event_id (UUID). Score Processor checks: IF EXISTS(SELECT 1 FROM processed_events WHERE event_id = X) THEN skip. ELSE: process (ZINCRBY + MySQL update) + INSERT INTO processed_events. Use a short TTL on processed_events (e.g., 7 days) since replay risk drops over time. Alternatively, use Redis SET NX (set-if-not-exists) for fast deduplication check.
How do you show a player their rank when they are NOT in the top-10?
?
ZREVRANK leaderboard:{game_id} {player_id} returns the 0-indexed rank of any player in the sorted set regardless of whether they are in the top-10. If Alice is ranked 42nd: ZREVRANK leaderboard:chess “alice” → 41 (0-indexed) → display rank 42. Combine with ZSCORE leaderboard:chess “alice” → get their score. Return both: { player_id: “alice”, rank: 42, score: 1450 }. This is one of the key advantages of Redis Sorted Set — O(log N) rank for ANY member, not just top members.
What scale targets should you state for the gaming leaderboard problem?
?
DAU: 5 million. Concurrent players: ~25 updates/sec at low load. Score updates: 50 million/day ≈ 580/sec average, ~2,500/sec peak. Top-N displayed: Top 10. Latency target: < 100ms for leaderboard read and rank lookup. Availability: 99.99%. Redis capacity: 5M players × ~50 bytes per sorted set member (player_id + score) ≈ 250 MB per game leaderboard — fits entirely in memory. At 580 writes/sec, a single Redis instance handles this trivially (Redis handles 100K+ ops/sec).
What is the Redis ZCOUNT command and when is it used in leaderboard design?
?
ZCOUNT key min max returns the number of members with scores between min and max. O(log N). Used for: User rank in sharded leaderboard — to count how many players have a score above the user’s score, query each shard with ZCOUNT leaderboard:chess:shardK (user_score+1) +inf, then sum across shards + ZREVRANK in user’s shard. Also useful for leaderboard analytics (how many players are in a score tier), and page count (how many pages of players are above a certain score).
How do you implement a friends-only leaderboard?
?
Query-time approach (best for small friend lists < 500): 1. Fetch user’s friend list from social graph service. 2. ZSCORE leaderboard:{game_id} {friend_id} for each friend → O(1) per friend. 3. Sort friend scores in application layer. 4. Rank = position in that sorted list. Advantage: No pre-computation, always fresh. Pre-compute approach (for large friend networks): Maintain a per-user sorted set of friend scores, updated on every friend score change. Expensive at scale — a player with 10K friends triggers 10K updates per score change. Use query-time for most cases.
What is ZREVRANGE vs ZRANGE in Redis and when do you use each?
?
ZRANGE key start stop [WITHSCORES]: Returns members from lowest to highest score (ascending). Range is by rank index (0 = lowest score). ZREVRANGE key start stop [WITHSCORES]: Returns members from highest to lowest score (descending). Range is by rank index (0 = highest score). For a leaderboard: ZREVRANGE leaderboard:chess 0 9 WITHSCORES → top 10 (rank 1 to rank 10). ZRANGE leaderboard:chess 0 9 WITHSCORES → bottom 10 (lowest scores). Always use ZREVRANGE for leaderboards. Note: Redis 6.2+ unified these into ZRANGE with REV flag.
Why is Redis Sorted Set internally implemented as a skip list (not a B-tree)?
?
A skip list provides O(log N) for insert, delete, and rank queries with simpler implementation than a balanced BST. Crucially, a skip list naturally supports rank queries (how many elements are less than X?) by storing a count of nodes below each forward pointer. Redis augments the standard skip list with per-node span counts, enabling ZREVRANK in O(log N) with no additional data structure. A B-tree does not inherently support rank queries efficiently. Hash maps are also maintained alongside the skip list for O(1) ZSCORE lookups.
How does score update work when using Redis Cluster (not single Redis)?
?
Redis Cluster shards keys by hash slot (16384 slots, hash based on key). All sorted set operations for leaderboard:chess go to the node owning that key’s hash slot. For a single game, all operations (ZADD, ZREVRANGE, ZREVRANK) hit the same node — single-key operations, no cross-slot coordination needed. Problem: one very large sorted set can’t be split across nodes automatically by Redis Cluster (a single key = one node). Solution for > 100M players: application-level sharding with multiple keys (leaderboard:chess:shard0, leaderboard:chess:shard1, …).
What are the main differences between a global leaderboard and a regional leaderboard in this design?
?
Global: One sorted set per game — leaderboard:chess. Includes all players worldwide. Top-N is the global top-N. Regional: One sorted set per game per region — leaderboard:chess:region:us-east. Score updates fan out to both global and regional keys (ZINCRBY on both, pipelined). Top-N in a region is a ZREVRANGE on the regional key. User rank within region = ZREVRANK on regional key. Cost: 2× writes (global + regional), 2× Redis memory. Benefit: low-latency regional queries, regional tournaments, reduced merge complexity.
How do you handle a player who creates a new account mid-week for historical leaderboard?
?
When a new player earns their first points, ZINCRBY on the weekly sorted set creates their entry with that delta as their starting score (ZINCRBY on a non-existent member creates it with score 0 + delta). No special initialization needed — Redis creates the entry lazily. If the player had no activity in a week, they simply have no entry in that week’s sorted set. ZREVRANK returns nil for a player not in the set — the API should handle nil as “not ranked” (player hasn’t scored this week).
What should you say first when asked “Design a Gaming Leaderboard” in an interview?
?
Immediately clarify: DAU, score update frequency, top-N count, need for user’s own rank, historical leaderboards. Then say: The core data structure is Redis Sorted Set because it gives O(log N) insert and rank lookup. Sketch: Score update → Kafka → Score Processor → ZINCRBY on Redis sorted set + write to MySQL. Read path: ZREVRANGE for top-N, ZREVRANK for user rank. Mention persistence: AOF + MySQL dual write. Then go deeper on sharding (if > 100M players), historical periods, tie-breaking. Lead with Redis Sorted Set — it’s the clear, correct answer.
Total Cards: 25
Review Time: 20-25 minutes
Priority: HIGH - Common at mid/senior level; Redis Sorted Set commands must be memorized
Last Updated: 2026-04-13