What's the primary concern?
│
├─ Data is tree-shaped (one-to-many, few cross-references)?
│ └─ → Document model (MongoDB, Firestore, CouchDB)
│
├─ Complex many-to-many joins or strong aggregate queries?
│ └─ → Relational model (PostgreSQL, MySQL, CockroachDB)
│
├─ Relationships ARE the data; traversal is primary query?
│ └─ → Graph model (Neo4j, Amazon Neptune, Dgraph)
│
├─ Need full audit trail + temporal queries + multiple views?
│ └─ → Event Sourcing + CQRS (Kafka, Axon, EventStoreDB)
│
├─ Analytical aggregates over billions of rows?
│ └─ → Column-oriented DW with star schema (Snowflake, BigQuery)
│
└─ Flexible client-driven queries over existing backend?
└─ → GraphQL API layer (Hasura, Apollo, PostGraphile)
Normalization vs Denormalization
Normalized (Relational) Denormalized (Document / OLAP)
────────────────────── ──────────────────────────────
city stored as ID → cities table city name stored in every row
Pro: No duplication, easy updates Pro: Fast reads, no joins
Pro: IDs stable even if value changes Pro: Simple queries
Con: JOIN required for reads Con: Duplication, costly updates
Use: OLTP, frequently changing data Use: OLAP, read-heavy, rarely changes
Schema Trade-offs
Schema-on-write (Relational) Schema-on-read (Document)
──────────────────────────── ─────────────────────────
DB validates every INSERT/UPDATE Application validates at read time
ALTER TABLE to change structure Add new fields freely, no migration
Slow migrations on large tables Instant "migration" (just add fields)
Safer for uniform, structured data Better for heterogeneous, evolving data
Catch errors early Bugs surface at read time
Cypher vs SQL: Same Query, Different Languages
Goal: Find people born in France who live in UK cities
SQL (verbose, requires self-joins):
SELECT p.name, c.name
FROM persons p
JOIN locations born ON p.born_in = born.id
JOIN locations city ON p.lives_in = city.id
JOIN locations country ON city.country_id = country.id
WHERE born.name = 'France' AND country.name = 'UK';
Cypher (concise, reads like the domain):
MATCH (p:Person)-[:BORN_IN]->(:Country {name:'France'}),
(p)-[:LIVES_IN]->(:City)-[:IN]->(:Country {name:'UK'})
RETURN p.name
Property Graph Building Blocks
Vertex Edge
────────────────── ──────────────────────────────
• Unique ID • Unique ID
• Set of outgoing edges • Tail vertex (origin)
• Set of incoming edges • Head vertex (destination)
• Key-value properties • Label (describes relationship)
• Key-value properties
Example:
(Alice:Person {age:30}) -[:WORKS_AT {since:2020}]-> (Acme:Company {sector:'Tech'})