Storage Systems: Cross-Chapter Comparison Reference

comparisons storage database sql nosql

Type: Cross-chapter reference — NOT a chapter notes file
Covers: Vol 1 Ch01, Ch06, Ch08, Ch12, Ch13, Ch14, Ch15 · Vol 2 Ch05, Ch06, Ch07, Ch09, Ch10, Ch11, Ch13
Last Updated: 2026-04-13


1. Storage Type Overview

TypeRepresentative systems1-line descriptionBest use case
Relational DBMySQL, PostgreSQL, OracleStructured tables with ACID transactions and joinsUser accounts, orders, financial ledgers, anything needing strong consistency + complex queries
Document DBMongoDB, Couchbase, DynamoDB (doc mode)Schema-flexible JSON documents, nested structuresUser profiles, product catalogs, content management — flexible evolving schema
Wide-column DBCassandra, HBase, ScyllaDBRows with dynamic columns; optimized for time-series and high write throughputChat messages, activity feeds, time-series metrics, IoT events
Key-value storeRedis, DynamoDB (KV mode), MemcachedPure key → value lookups, sub-millisecond latencySessions, caches, rate limit counters, feature flags
Time-series DBInfluxDB, TimescaleDB, PrometheusOptimized for timestamped data; built-in downsamplingInfrastructure metrics, IoT sensors, financial tick data
Graph DBNeo4j, Amazon Neptune, JanusGraphNodes and edges; optimized for traversal queriesSocial graphs, fraud detection, recommendation engines, knowledge graphs
Object storageAWS S3, Google Cloud Storage, Azure BlobFlat key-based storage for large blobs, accessed via HTTPVideos, images, backups, data lake, large file downloads
Block storageAWS EBS, GCP Persistent DiskRaw blocks attached to VMs, like a virtual diskOS volumes, database data files, low-latency VM storage
Search engineElasticsearch, OpenSearch, SolrInverted index for full-text search; faceting and rankingProduct search, log analytics, autocomplete, content discovery
Data warehouseSnowflake, BigQuery, Redshift, ClickHouseColumnar OLAP; optimized for analytical queries on large historical datasetsBI dashboards, ad hoc analytics, business reporting

2. SQL vs NoSQL Decision Guide

Choose SQL (Relational) when:

  • Data has clear relationships (foreign keys, joins matter)
  • You need ACID transactions (bank transfers, e-commerce orders, hotel reservations)
  • Schema is stable and well-defined upfront
  • Complex queries (multi-table joins, aggregations, reporting)
  • Data volume fits one powerful server or a few replicas
  • Consistency is more important than availability

Typical SDI examples: User accounts (everywhere), hotel reservations (Vol 2 Ch07), payment ledger (Vol 2 Ch11), order history (Vol 2 Ch07)

Choose NoSQL when:

  • Schema is flexible or evolving rapidly
  • You need horizontal write scalability beyond what one DB can handle
  • Queries are simple (key lookups, range scans by primary key)
  • Eventual consistency is acceptable
  • High write throughput is required (>10K writes/sec)
  • Time-series or append-only data patterns

Typical SDI examples: Chat messages (Cassandra — Vol 1 Ch12), metrics (InfluxDB/Cassandra — Vol 2 Ch05), video metadata analytics (Cassandra — Vol 1 Ch14)

Hybrid approaches (most large systems):

  • SQL for user accounts + metadata (strong consistency needed)
  • NoSQL for activity data + events (high volume, eventual consistency OK)
  • Redis as cache layer in front of both
  • Object storage for binary blobs (videos, files)

3. NoSQL Options Comparison Table

FeatureCassandraDynamoDBMongoDBRedisHBase
Data modelWide-column (partition key + clustering columns)Key-value / document hybridJSON documentsKey-value + rich structures (lists, sets, sorted sets)Wide-column (Hadoop-backed)
Query flexibilityLow — must design table for query patternsLow — query only by PK or GSIHigh — rich query language, aggregation pipelineLow — key lookups + range within sorted setsLow — row key + column family scans
Consistency modelTunable (ONE, QUORUM, ALL)Strong or eventual (configurable per op)Tunable (read concern)Strong on single node; eventual on clusterStrong (via HBase MVCC)
Write throughputVery high (LSM tree, no leader)Very high (managed, serverless scale)Medium-highExtremely high (in-memory)Very high
Read latency~1–5ms (disk-backed)~1–5ms (managed SSD)~1–10msSub-millisecond (in-memory)~1–10ms
Horizontal scaleYes — consistent hashing, no masterYes — fully managed partitioningYes (sharding)Yes (Redis Cluster)Yes (HDFS-backed)
ReplicationConfigurable replication factor (RF=3 typical)Managed multi-AZReplica setsMaster-replica + Sentinel / ClusterManaged by HBase
ACID transactionsLightweight transactions (LWT) onlySingle-item; multi-item transactions addedMulti-doc transactions (4.0+)Single-command atomic; Lua scripts for multi-stepNo
Best forTime-ordered data, chat history, events, metricsServerless / AWS-native, user sessions, game stateFlexible schema, content, catalogsCaching, leaderboards, session store, countersHadoop ecosystem, analytics over HDFS
SDI chaptersChat (Ch12), Metrics (V2 Ch05), YouTube analyticsHotel booking (V2 Ch07), Gaming leaderboard (V2 Ch10)Google Maps POI (V2 Ch03)Everywhere as cache; leaderboard (V2 Ch10)Chat (Ch12 — HBase as alternative to Cassandra)

4. Which Storage Does Each SDI System Use and Why

ChapterPrimary DBSecondary storageCacheObject storageKey design decision
Vol 1 Ch08 — URL ShortenerMySQL (shortCode → longURL)Redis (hot redirects)Relational for uniqueness constraints + simple schema; Redis absorbs 80% of reads
Vol 1 Ch10 — Notification SystemMySQL (user prefs, device tokens)Redis (device token cache)Relational for structured user data; notification state in DB for deduplication
Vol 1 Ch11 — News FeedMySQL (users, friendships)Cassandra / Redis (post storage, timelines)Redis (prebuilt feed per user)S3 (images, videos in posts)Social graph in SQL; high-volume post fan-out in Redis/Cassandra
Vol 1 Ch12 — Chat SystemMySQL / PostgreSQL (user accounts, groups)Cassandra / HBase (message history)Redis (presence, last-seen, recent messages)Time-ordered messages → wide-column DB; presence ephemeral in Redis
Vol 1 Ch13 — Search AutocompleteMySQL (search log analytics)Trie / prefix cache (in-memory or Redis)Redis (top-K suggestions per prefix)Trie structure for prefix matching; Redis caches hot prefixes
Vol 1 Ch14 — YouTubeMySQL (video metadata, user data)Cassandra (view counts, analytics)Redis (hot metadata) CDN (video segments)S3 (raw + transcoded video files)Binary blobs to S3; structured metadata to MySQL; high-write analytics to Cassandra
Vol 1 Ch15 — Google DriveMySQL (file metadata, user data, permissions)Block storage (file chunks)Redis (metadata cache, upload session state)S3 / GCS (file chunks / blocks)Files split into blocks; metadata relational; strong consistency for metadata
Vol 2 Ch03 — Google MapsPostgreSQL + PostGIS (map geometry)MongoDB / custom tile storeCDN (map tiles)S3 (satellite imagery, tiles)Geospatial queries need PostGIS; tiles are static blobs for CDN
Vol 2 Ch05 — Metrics MonitoringInfluxDB / TimescaleDB / Cassandra (metric time-series)Redis (recent aggregates for dashboards)S3 (cold metric archive)Write-heavy time-series → dedicated TSDB; downsampling over time
Vol 2 Ch06 — Ad Click AggregationMySQL (ad definitions, billing records)Cassandra / ClickHouse (aggregated click data)Redis (real-time counters, top-100 sorted set)S3 (raw click event archive)Raw events in Kafka; streaming aggregation to Redis; batch to ClickHouse for accuracy
Vol 2 Ch07 — Hotel ReservationMySQL (hotels, rooms, reservations)Redis (room inventory cache)Redis (availability cache)ACID critical (double booking = disaster); relational with optimistic locking
Vol 2 Ch09 — S3 Object StorageSQL (metadata: bucket/object manifest)Custom distributed object store (the chapter IS about this)Itself (the system being designed)Separates metadata (relational) from data (custom shard-based blob store)
Vol 2 Ch10 — Gaming LeaderboardMySQL (user profiles, game records)Redis Sorted Set (live leaderboard)Redis is both cache and primary store for leaderboardRedis ZADD/ZRANK are O(log N); Redis IS the leaderboard store
Vol 2 Ch11 — Payment SystemMySQL / PostgreSQL (accounts, transactions)Redis (idempotency key store)ACID mandatory; double-entry ledger in relational DB; idempotency keys in Redis
Vol 2 Ch12 — Digital WalletMySQL / PostgreSQL (wallet balances, transaction log)Redis (balance cache, idempotency)Strong consistency for balance; event sourcing for audit trail
Vol 2 Ch13 — Stock ExchangeMySQL (user accounts, order records)Kafka (event sourcing / audit log)Redis (order book snapshots, market data)Matching engine state in Redis for microsecond access; Kafka as immutable event log

5. Database Scaling Patterns

Read Replicas

  • Route all reads to replicas, writes to primary
  • Replication lag = eventual consistency on reads
  • SDI use: Almost every chapter (URL shortener, news feed, notifications)
  • Rule of thumb: Use when read:write ratio > 5:1

Sharding Strategies

StrategyHow it worksProsConsSDI chapters
Range-basedShard by value range (user IDs 0-1M → shard 1)Easy range queriesHotspots if data is skewedWeb crawler (crawl by URL lexicographic range)
Hash-basedshard = hash(key) % NEven distributionNo range queries; resharding requires moving all dataURL shortener (by hash of shortCode)
Consistent hashingHash ring; each server owns an arcAdding/removing servers moves minimal dataMore complex; needs virtual nodes for balanceKey-value store (Ch06), distributed cache (Ch05)
GeographicRoute users to region-local shardLow latency; data residency complianceCross-region queries fail; uneven distributionGoogle Maps (region-based tiles), Hotel reservation (region-based hotels)
Directory-basedLookup service maps key → shardFlexible, no re-hashingLookup service is a bottleneck and SPOFLarge-scale systems with complex routing

Vertical vs Horizontal Scaling

AspectVertical (scale up)Horizontal (scale out)
MechanismBigger machine (more CPU, RAM, disk)More machines (add nodes)
LimitHard ceiling (~several hundred GB RAM)Virtually unlimited
CostExpensive per unitCheaper at scale
ComplexitySimple (no distributed problems)High (sharding, consistency, resharding)
When to useFirst step — always exhaust vertical firstWhen vertical is maxed or cost-prohibitive
SDI thresholdStart vertical up to ~10K QPSShard when single DB can’t keep up

6. Consistency Models Recap

ModelWhat it guaranteesLatency trade-offWhen it appears in SDI
Strong consistencyRead always returns the latest writeHigher (must coordinate across replicas)Payment (Ch11), Hotel reservation (Ch07), Stock exchange (Ch13) — financial data cannot be stale
Sequential consistencyAll operations appear to happen in some sequential order consistent across all nodesMediumDistributed KV stores (Ch06) — order matters for correctness
Causal consistencyCausally related operations seen in causal order; unrelated operations may differMediumChat systems — message B sent in reply to A must arrive after A
Read-your-writesAfter a write, same client’s reads reflect that writeMediumUser profile updates — user expects to see their own edits immediately
Eventual consistencyAll replicas converge eventually if writes stopLowNews feed (Ch11), metrics (V2 Ch05), URL shortener (Ch08) — slight staleness acceptable
Monotonic readsOnce a value is read, subsequent reads never return an older valueMediumFeed loading — don’t let users see “older” posts appear after newer ones

7. Storage Estimation Quick Reference

Data typeTypical sizeNotes
UUID / GUID16 bytesPreferred over auto-increment for distributed systems
Int (32-bit)4 bytesUser ID, counter
Long (64-bit)8 bytesTimestamp, large ID
URL (average)~100 bytesShort URL ~20 bytes, long URL ~200 bytes
Tweet / short message~280 bytes (chars)~500 bytes with metadata
Chat message~100–500 bytesText + metadata
User profile row~1 KBAll fields combined
Small image (thumbnail)~100 KBProfile photo, thumbnail
Image (full resolution)~1–5 MBPhotos
Video (1 min, 720p)~100 MBH.264 encoded
Video (1 hr, 1080p)~5–10 GBBefore CDN chunking
Metric data point~100 bytesTimestamp + value + labels
Log entry~200–500 bytesLevel + message + context

Back-of-envelope formula:

Storage needed = daily_writes × avg_entry_size × retention_days × replication_factor

Example — Chat system (Vol 1 Ch12):

50M DAU × 40 messages/day × 200 bytes × 365 days × 3 replicas
= 50M × 40 × 200 × 365 × 3
≈ 438 TB / year
→ Need sharded Cassandra cluster

8. CAP Theorem and SDI Chapter Mapping

CAP theorem: A distributed data store can only guarantee two of three properties simultaneously:

  • Consistency: Every read returns the latest write (or an error)
  • Availability: Every request returns a response (may be stale)
  • Partition tolerance: System continues operating despite network partitions

Network partitions are unavoidable in distributed systems → you always have P. The real choice is CP vs AP.

SystemCAP choiceReasoningSDI chapter
MySQL (relational DB)CPReturns error rather than stale data; failover can cause brief unavailabilityPayment (Ch11), Hotel reservation (V2 Ch07), stock exchange (V2 Ch13)
CassandraAPTunable; default is AP (eventual consistency); can configure CP with QUORUMChat history (Ch12), metrics (V2 Ch05)
DynamoDBAP (default) / CP (optional)Eventual consistency reads by default; strong consistency availableHotel booking (V2 Ch07), gaming leaderboard (V2 Ch10)
HBaseCPStrong consistency for row-level reads/writesChat system (Ch12 — HBase alternative)
RedisCP (single node) / AP (Cluster)Single node: strong; Cluster in split-brain: may serve staleRate limiter (Ch04), leaderboard (V2 Ch10)
ZooKeeperCPUsed for leader election, config; guarantees linearizabilityKey-value store (Ch06), service registry

Practical SDI interview guidance:

  • Financial systems (payment, stock exchange, hotel reservation): Always choose CP. “I would rather fail the request than process a stale balance.”
  • Social/analytics systems (news feed, metrics, leaderboard): AP is fine. “A slightly stale feed is acceptable; availability matters more.”
  • The real world uses tunable consistency (Cassandra QUORUM, DynamoDB strong reads) to get the level needed per operation.

9. Storage System Selection Flowchart

What kind of data are you storing?
  │
  ├── Binary blobs (video, images, files > 1 MB)
  │     → Object storage (S3, GCS)
  │       + CDN in front for reads
  │
  ├── Structured relational data (users, orders, reservations)
  │     → Relational DB (MySQL / PostgreSQL)
  │       + Read replicas for scale
  │       + Redis cache for hot reads
  │
  ├── Time-ordered events (messages, metrics, logs, clicks)
  │     → Wide-column DB (Cassandra / HBase)
  │       or Time-series DB (InfluxDB / TimescaleDB) for metrics
  │
  ├── Simple key-value (sessions, tokens, counters, feature flags)
  │     → Redis (in-memory, sub-ms)
  │       or DynamoDB (persistent, serverless)
  │
  ├── Full-text searchable content (products, articles, logs)
  │     → Elasticsearch / OpenSearch
  │       (sync from primary DB via CDC or dual-write)
  │
  ├── Graph relationships (social network, fraud detection)
  │     → Graph DB (Neo4j, Neptune)
  │       or simulate with relational adjacency table at small scale
  │
  └── Flexible / evolving schema (catalog, CMS, user-generated content)
        → Document DB (MongoDB / DynamoDB document mode)

Layered storage architecture (appears in most complex SDI chapters):

[Hot path — sub-ms]       Redis (in-memory cache)
       ↑ cache miss
[Warm path — 1-10ms]      Primary DB (MySQL / Cassandra)
       ↑ cold data
[Cold path — 10-100ms]    Object storage / Data warehouse (S3 / ClickHouse)
       ↑ batch analytics
[Archive — minutes]       Glacier / cheap object storage (7-year retention)

See also: