Storage Systems: Cross-Chapter Comparison Reference

comparisons storage database sql nosql

Type: Cross-chapter reference — NOT a chapter notes file
Covers: Vol 1 Ch01, Ch06, Ch08, Ch12, Ch13, Ch14, Ch15 · Vol 2 Ch05, Ch06, Ch07, Ch09, Ch10, Ch11, Ch13
Last Updated: 2026-04-13

1. Storage Type Overview

Type	Representative systems	1-line description	Best use case
Relational DB	MySQL, PostgreSQL, Oracle	Structured tables with ACID transactions and joins	User accounts, orders, financial ledgers, anything needing strong consistency + complex queries
Document DB	MongoDB, Couchbase, DynamoDB (doc mode)	Schema-flexible JSON documents, nested structures	User profiles, product catalogs, content management — flexible evolving schema
Wide-column DB	Cassandra, HBase, ScyllaDB	Rows with dynamic columns; optimized for time-series and high write throughput	Chat messages, activity feeds, time-series metrics, IoT events
Key-value store	Redis, DynamoDB (KV mode), Memcached	Pure key → value lookups, sub-millisecond latency	Sessions, caches, rate limit counters, feature flags
Time-series DB	InfluxDB, TimescaleDB, Prometheus	Optimized for timestamped data; built-in downsampling	Infrastructure metrics, IoT sensors, financial tick data
Graph DB	Neo4j, Amazon Neptune, JanusGraph	Nodes and edges; optimized for traversal queries	Social graphs, fraud detection, recommendation engines, knowledge graphs
Object storage	AWS S3, Google Cloud Storage, Azure Blob	Flat key-based storage for large blobs, accessed via HTTP	Videos, images, backups, data lake, large file downloads
Block storage	AWS EBS, GCP Persistent Disk	Raw blocks attached to VMs, like a virtual disk	OS volumes, database data files, low-latency VM storage
Search engine	Elasticsearch, OpenSearch, Solr	Inverted index for full-text search; faceting and ranking	Product search, log analytics, autocomplete, content discovery
Data warehouse	Snowflake, BigQuery, Redshift, ClickHouse	Columnar OLAP; optimized for analytical queries on large historical datasets	BI dashboards, ad hoc analytics, business reporting

2. SQL vs NoSQL Decision Guide

Choose SQL (Relational) when:

Data has clear relationships (foreign keys, joins matter)
You need ACID transactions (bank transfers, e-commerce orders, hotel reservations)
Schema is stable and well-defined upfront
Complex queries (multi-table joins, aggregations, reporting)
Data volume fits one powerful server or a few replicas
Consistency is more important than availability

Typical SDI examples: User accounts (everywhere), hotel reservations (Vol 2 Ch07), payment ledger (Vol 2 Ch11), order history (Vol 2 Ch07)

Choose NoSQL when:

Schema is flexible or evolving rapidly
You need horizontal write scalability beyond what one DB can handle
Queries are simple (key lookups, range scans by primary key)
Eventual consistency is acceptable
High write throughput is required (>10K writes/sec)
Time-series or append-only data patterns

Typical SDI examples: Chat messages (Cassandra — Vol 1 Ch12), metrics (InfluxDB/Cassandra — Vol 2 Ch05), video metadata analytics (Cassandra — Vol 1 Ch14)

Hybrid approaches (most large systems):

SQL for user accounts + metadata (strong consistency needed)
NoSQL for activity data + events (high volume, eventual consistency OK)
Redis as cache layer in front of both
Object storage for binary blobs (videos, files)

3. NoSQL Options Comparison Table

Feature	Cassandra	DynamoDB	MongoDB	Redis	HBase
Data model	Wide-column (partition key + clustering columns)	Key-value / document hybrid	JSON documents	Key-value + rich structures (lists, sets, sorted sets)	Wide-column (Hadoop-backed)
Query flexibility	Low — must design table for query patterns	Low — query only by PK or GSI	High — rich query language, aggregation pipeline	Low — key lookups + range within sorted sets	Low — row key + column family scans
Consistency model	Tunable (ONE, QUORUM, ALL)	Strong or eventual (configurable per op)	Tunable (read concern)	Strong on single node; eventual on cluster	Strong (via HBase MVCC)
Write throughput	Very high (LSM tree, no leader)	Very high (managed, serverless scale)	Medium-high	Extremely high (in-memory)	Very high
Read latency	~1–5ms (disk-backed)	~1–5ms (managed SSD)	~1–10ms	Sub-millisecond (in-memory)	~1–10ms
Horizontal scale	Yes — consistent hashing, no master	Yes — fully managed partitioning	Yes (sharding)	Yes (Redis Cluster)	Yes (HDFS-backed)
Replication	Configurable replication factor (RF=3 typical)	Managed multi-AZ	Replica sets	Master-replica + Sentinel / Cluster	Managed by HBase
ACID transactions	Lightweight transactions (LWT) only	Single-item; multi-item transactions added	Multi-doc transactions (4.0+)	Single-command atomic; Lua scripts for multi-step	No
Best for	Time-ordered data, chat history, events, metrics	Serverless / AWS-native, user sessions, game state	Flexible schema, content, catalogs	Caching, leaderboards, session store, counters	Hadoop ecosystem, analytics over HDFS
SDI chapters	Chat (Ch12), Metrics (V2 Ch05), YouTube analytics	Hotel booking (V2 Ch07), Gaming leaderboard (V2 Ch10)	Google Maps POI (V2 Ch03)	Everywhere as cache; leaderboard (V2 Ch10)	Chat (Ch12 — HBase as alternative to Cassandra)

4. Which Storage Does Each SDI System Use and Why

Chapter	Primary DB	Secondary storage	Cache	Object storage	Key design decision
Vol 1 Ch08 — URL Shortener	MySQL (shortCode → longURL)	—	Redis (hot redirects)	—	Relational for uniqueness constraints + simple schema; Redis absorbs 80% of reads
Vol 1 Ch10 — Notification System	MySQL (user prefs, device tokens)	—	Redis (device token cache)	—	Relational for structured user data; notification state in DB for deduplication
Vol 1 Ch11 — News Feed	MySQL (users, friendships)	Cassandra / Redis (post storage, timelines)	Redis (prebuilt feed per user)	S3 (images, videos in posts)	Social graph in SQL; high-volume post fan-out in Redis/Cassandra
Vol 1 Ch12 — Chat System	MySQL / PostgreSQL (user accounts, groups)	Cassandra / HBase (message history)	Redis (presence, last-seen, recent messages)	—	Time-ordered messages → wide-column DB; presence ephemeral in Redis
Vol 1 Ch13 — Search Autocomplete	MySQL (search log analytics)	Trie / prefix cache (in-memory or Redis)	Redis (top-K suggestions per prefix)	—	Trie structure for prefix matching; Redis caches hot prefixes
Vol 1 Ch14 — YouTube	MySQL (video metadata, user data)	Cassandra (view counts, analytics)	Redis (hot metadata) CDN (video segments)	S3 (raw + transcoded video files)	Binary blobs to S3; structured metadata to MySQL; high-write analytics to Cassandra
Vol 1 Ch15 — Google Drive	MySQL (file metadata, user data, permissions)	Block storage (file chunks)	Redis (metadata cache, upload session state)	S3 / GCS (file chunks / blocks)	Files split into blocks; metadata relational; strong consistency for metadata
Vol 2 Ch03 — Google Maps	PostgreSQL + PostGIS (map geometry)	MongoDB / custom tile store	CDN (map tiles)	S3 (satellite imagery, tiles)	Geospatial queries need PostGIS; tiles are static blobs for CDN
Vol 2 Ch05 — Metrics Monitoring	InfluxDB / TimescaleDB / Cassandra (metric time-series)	—	Redis (recent aggregates for dashboards)	S3 (cold metric archive)	Write-heavy time-series → dedicated TSDB; downsampling over time
Vol 2 Ch06 — Ad Click Aggregation	MySQL (ad definitions, billing records)	Cassandra / ClickHouse (aggregated click data)	Redis (real-time counters, top-100 sorted set)	S3 (raw click event archive)	Raw events in Kafka; streaming aggregation to Redis; batch to ClickHouse for accuracy
Vol 2 Ch07 — Hotel Reservation	MySQL (hotels, rooms, reservations)	Redis (room inventory cache)	Redis (availability cache)	—	ACID critical (double booking = disaster); relational with optimistic locking
Vol 2 Ch09 — S3 Object Storage	SQL (metadata: bucket/object manifest)	Custom distributed object store (the chapter IS about this)	—	Itself (the system being designed)	Separates metadata (relational) from data (custom shard-based blob store)
Vol 2 Ch10 — Gaming Leaderboard	MySQL (user profiles, game records)	Redis Sorted Set (live leaderboard)	Redis is both cache and primary store for leaderboard	—	Redis ZADD/ZRANK are O(log N); Redis IS the leaderboard store
Vol 2 Ch11 — Payment System	MySQL / PostgreSQL (accounts, transactions)	—	Redis (idempotency key store)	—	ACID mandatory; double-entry ledger in relational DB; idempotency keys in Redis
Vol 2 Ch12 — Digital Wallet	MySQL / PostgreSQL (wallet balances, transaction log)	—	Redis (balance cache, idempotency)	—	Strong consistency for balance; event sourcing for audit trail
Vol 2 Ch13 — Stock Exchange	MySQL (user accounts, order records)	Kafka (event sourcing / audit log)	Redis (order book snapshots, market data)	—	Matching engine state in Redis for microsecond access; Kafka as immutable event log

5. Database Scaling Patterns

Read Replicas

Route all reads to replicas, writes to primary
Replication lag = eventual consistency on reads
SDI use: Almost every chapter (URL shortener, news feed, notifications)
Rule of thumb: Use when read:write ratio > 5:1

Sharding Strategies

Strategy	How it works	Pros	Cons	SDI chapters
Range-based	Shard by value range (user IDs 0-1M → shard 1)	Easy range queries	Hotspots if data is skewed	Web crawler (crawl by URL lexicographic range)
Hash-based	`shard = hash(key) % N`	Even distribution	No range queries; resharding requires moving all data	URL shortener (by hash of shortCode)
Consistent hashing	Hash ring; each server owns an arc	Adding/removing servers moves minimal data	More complex; needs virtual nodes for balance	Key-value store (Ch06), distributed cache (Ch05)
Geographic	Route users to region-local shard	Low latency; data residency compliance	Cross-region queries fail; uneven distribution	Google Maps (region-based tiles), Hotel reservation (region-based hotels)
Directory-based	Lookup service maps key → shard	Flexible, no re-hashing	Lookup service is a bottleneck and SPOF	Large-scale systems with complex routing

Vertical vs Horizontal Scaling

Aspect	Vertical (scale up)	Horizontal (scale out)
Mechanism	Bigger machine (more CPU, RAM, disk)	More machines (add nodes)
Limit	Hard ceiling (~several hundred GB RAM)	Virtually unlimited
Cost	Expensive per unit	Cheaper at scale
Complexity	Simple (no distributed problems)	High (sharding, consistency, resharding)
When to use	First step — always exhaust vertical first	When vertical is maxed or cost-prohibitive
SDI threshold	Start vertical up to ~10K QPS	Shard when single DB can’t keep up

6. Consistency Models Recap

Model	What it guarantees	Latency trade-off	When it appears in SDI
Strong consistency	Read always returns the latest write	Higher (must coordinate across replicas)	Payment (Ch11), Hotel reservation (Ch07), Stock exchange (Ch13) — financial data cannot be stale
Sequential consistency	All operations appear to happen in some sequential order consistent across all nodes	Medium	Distributed KV stores (Ch06) — order matters for correctness
Causal consistency	Causally related operations seen in causal order; unrelated operations may differ	Medium	Chat systems — message B sent in reply to A must arrive after A
Read-your-writes	After a write, same client’s reads reflect that write	Medium	User profile updates — user expects to see their own edits immediately
Eventual consistency	All replicas converge eventually if writes stop	Low	News feed (Ch11), metrics (V2 Ch05), URL shortener (Ch08) — slight staleness acceptable
Monotonic reads	Once a value is read, subsequent reads never return an older value	Medium	Feed loading — don’t let users see “older” posts appear after newer ones

7. Storage Estimation Quick Reference

Data type	Typical size	Notes
UUID / GUID	16 bytes	Preferred over auto-increment for distributed systems
Int (32-bit)	4 bytes	User ID, counter
Long (64-bit)	8 bytes	Timestamp, large ID
URL (average)	~100 bytes	Short URL ~20 bytes, long URL ~200 bytes
Tweet / short message	~280 bytes (chars)	~500 bytes with metadata
Chat message	~100–500 bytes	Text + metadata
User profile row	~1 KB	All fields combined
Small image (thumbnail)	~100 KB	Profile photo, thumbnail
Image (full resolution)	~1–5 MB	Photos
Video (1 min, 720p)	~100 MB	H.264 encoded
Video (1 hr, 1080p)	~5–10 GB	Before CDN chunking
Metric data point	~100 bytes	Timestamp + value + labels
Log entry	~200–500 bytes	Level + message + context

Back-of-envelope formula:

Storage needed = daily_writes × avg_entry_size × retention_days × replication_factor

Example — Chat system (Vol 1 Ch12):

50M DAU × 40 messages/day × 200 bytes × 365 days × 3 replicas
= 50M × 40 × 200 × 365 × 3
≈ 438 TB / year
→ Need sharded Cassandra cluster

8. CAP Theorem and SDI Chapter Mapping

CAP theorem: A distributed data store can only guarantee two of three properties simultaneously:

Consistency: Every read returns the latest write (or an error)
Availability: Every request returns a response (may be stale)
Partition tolerance: System continues operating despite network partitions

Network partitions are unavoidable in distributed systems → you always have P. The real choice is CP vs AP.

System	CAP choice	Reasoning	SDI chapter
MySQL (relational DB)	CP	Returns error rather than stale data; failover can cause brief unavailability	Payment (Ch11), Hotel reservation (V2 Ch07), stock exchange (V2 Ch13)
Cassandra	AP	Tunable; default is AP (eventual consistency); can configure CP with `QUORUM`	Chat history (Ch12), metrics (V2 Ch05)
DynamoDB	AP (default) / CP (optional)	Eventual consistency reads by default; strong consistency available	Hotel booking (V2 Ch07), gaming leaderboard (V2 Ch10)
HBase	CP	Strong consistency for row-level reads/writes	Chat system (Ch12 — HBase alternative)
Redis	CP (single node) / AP (Cluster)	Single node: strong; Cluster in split-brain: may serve stale	Rate limiter (Ch04), leaderboard (V2 Ch10)
ZooKeeper	CP	Used for leader election, config; guarantees linearizability	Key-value store (Ch06), service registry

Practical SDI interview guidance:

Financial systems (payment, stock exchange, hotel reservation): Always choose CP. “I would rather fail the request than process a stale balance.”
Social/analytics systems (news feed, metrics, leaderboard): AP is fine. “A slightly stale feed is acceptable; availability matters more.”
The real world uses tunable consistency (Cassandra QUORUM, DynamoDB strong reads) to get the level needed per operation.

9. Storage System Selection Flowchart

What kind of data are you storing?
  │
  ├── Binary blobs (video, images, files > 1 MB)
  │     → Object storage (S3, GCS)
  │       + CDN in front for reads
  │
  ├── Structured relational data (users, orders, reservations)
  │     → Relational DB (MySQL / PostgreSQL)
  │       + Read replicas for scale
  │       + Redis cache for hot reads
  │
  ├── Time-ordered events (messages, metrics, logs, clicks)
  │     → Wide-column DB (Cassandra / HBase)
  │       or Time-series DB (InfluxDB / TimescaleDB) for metrics
  │
  ├── Simple key-value (sessions, tokens, counters, feature flags)
  │     → Redis (in-memory, sub-ms)
  │       or DynamoDB (persistent, serverless)
  │
  ├── Full-text searchable content (products, articles, logs)
  │     → Elasticsearch / OpenSearch
  │       (sync from primary DB via CDC or dual-write)
  │
  ├── Graph relationships (social network, fraud detection)
  │     → Graph DB (Neo4j, Neptune)
  │       or simulate with relational adjacency table at small scale
  │
  └── Flexible / evolving schema (catalog, CMS, user-generated content)
        → Document DB (MongoDB / DynamoDB document mode)

Layered storage architecture (appears in most complex SDI chapters):

[Hot path — sub-ms]       Redis (in-memory cache)
       ↑ cache miss
[Warm path — 1-10ms]      Primary DB (MySQL / Cassandra)
       ↑ cold data
[Cold path — 10-100ms]    Object storage / Data warehouse (S3 / ClickHouse)
       ↑ batch analytics
[Archive — minutes]       Glacier / cheap object storage (7-year retention)

See also:

key-patterns > 4. SQL vs NoSQL — Decision guide for SQL vs NoSQL
key-patterns > 2. Database Sharding — Sharding strategies detail
key-patterns > 3. Consistent Hashing — Hash ring pattern
distributed-system-components > 13. Databases — SQL and NoSQL overview
distributed-system-components > 16. Object Storage — S3 and blob storage
estimation-cheatsheet — Storage and throughput numbers
ch04-distributed-message-queue — Storage for message persistence
ch09-s3-object-storage — Object storage internals

Study Notes by Niladri & AI

Explorer

storage-systems