Modern Trends & Deviations from DDIA (2017 → 2026)
Overview
Designing Data-Intensive Applications (DDIA) was published in March 2017. This document tracks significant changes, new technologies, and evolved practices in data-intensive applications since then.
Major Technology Shifts
Cloud-Native & Serverless
Book Context (2017):
- Cloud computing was established but still maturing
- Infrastructure management was more manual
Current State (2026):
- Serverless databases and computing are mainstream
- Multi-cloud and hybrid cloud architectures
- Kubernetes and containerization standard
- Managed services for most data infrastructure
Stream Processing
Book Context (2017):
- Apache Kafka already widely deployed; Apache Flink and similar engines still emerging
- Stream processing less mature than batch processing
Current State (2026):
- Real-time data processing is often the default choice for new pipelines
- Event-driven architectures widespread
- Apache Kafka ubiquitous
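As a rough illustration of what "real-time processing" means in practice, the sketch below counts events per fixed, non-overlapping (tumbling) window. It is a minimal in-memory sketch, not how Kafka Streams or Flink implement windowing: the `(timestamp, key)` event shape and the function name `tumbling_window_counts` are assumptions made for this example, and it ignores out-of-order events, watermarks, and durable state.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_in_seconds, key) tuples; this shape
    is hypothetical and chosen only for the example.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each timestamp to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


events = [(0, "click"), (30, "click"), (59, "view"), (60, "click"), (125, "view")]
result = tumbling_window_counts(events, window_seconds=60)
# result == {(0, "click"): 2, (0, "view"): 1, (60, "click"): 1, (120, "view"): 1}
```

A production stream processor would additionally decide when a window is complete and how to handle late events, which is where most of the real complexity lies.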
NewSQL & Distributed Databases
Book Context (2017):
- CAP theorem trade-offs widely discussed, although the book itself argues CAP is of limited practical use
- Limited distributed SQL options
Current State (2026):
- CockroachDB, YugabyteDB, TiDB mature
- Strongly consistent transactions combined with horizontal scalability
- Spanner-inspired systems common
AI/ML Integration
Book Context (2017):
- ML mostly separate concern
- Limited mention of AI workloads
Current State (2026):
- Vector search for embeddings (dedicated databases such as Pinecone and Weaviate, or the pgvector extension for PostgreSQL)
- ML feature stores standard
- Real-time inference pipelines
- LLM-specific data infrastructure
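To make the vector-search item above concrete, here is a minimal sketch of nearest-neighbour lookup by cosine similarity. It is a brute-force scan over an in-memory dictionary; real vector stores typically rely on approximate nearest-neighbour indexes (for example HNSW) rather than scanning every vector. The function names, document IDs, and three-dimensional vectors are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the IDs of the k vectors in `index` most similar to `query`."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones usually have hundreds of dimensions.
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
nearest = top_k([1.0, 0.0, 0.0], index, k=2)
# nearest == ["doc_a", "doc_b"]
```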
Chapter-Specific Updates
Chapter 1: Reliability, Scalability, Maintainability
Reliability Evolution:
- Book (2017): Covers hardware faults, software errors, and human errors, with fault tolerance mainly through redundancy
- Now (2026):
  - Chaos Engineering mainstream (intentionally inject failures to test resilience)
  - SRE practices standard (error budgets, SLOs/SLIs)
  - Observability platforms (Datadog, New Relic, Honeycomb) replacing simple monitoring
  - AIOps using ML to predict and prevent failures
  - Multi-region active-active architectures for disaster recovery
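The error-budget idea mentioned above reduces to simple arithmetic: an availability SLO of 99.9% over a window permits 0.1% of requests to fail, and the budget is whatever portion of that allowance is still unspent. The sketch below shows the calculation under the assumption of a request-based SLI; the function name and the example numbers are hypothetical.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Summarise error-budget usage for a request-based availability SLO.

    slo_target is a fraction, e.g. 0.999 for "99.9% of requests succeed".
    """
    allowed_failures = (1 - slo_target) * total_requests
    remaining_failures = allowed_failures - failed_requests
    if allowed_failures > 0:
        budget_consumed = failed_requests / allowed_failures
    else:
        # A 100% SLO leaves no budget at all; any failure exceeds it.
        budget_consumed = float("inf") if failed_requests else 0.0
    return {
        "allowed_failures": allowed_failures,
        "remaining_failures": remaining_failures,
        "budget_consumed": budget_consumed,
    }


report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
# Roughly 1000 failures are allowed, so about a quarter of the budget is used.
```

Teams commonly use the consumed fraction as a release gate: once it approaches 1.0, risky changes are paused until the window rolls over.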
Scalability Evolution:
- Book (2017): Manual scaling decisions, load balancers, database sharding
- Now (2026):
  - Kubernetes auto-scaling (HPA, VPA, cluster autoscaler)
  - Serverless platforms abstract most scaling decisions (AWS Lambda, Cloud Run), within concurrency limits and cold-start constraints
  - Event-driven architectures for better scale (KEDA, Knative)
  - Global edge networks (Cloudflare Workers, Fastly Compute, formerly Compute@Edge)
  - FinOps: cost optimization now treated as seriously as performance
  - Predictive auto-scaling using ML
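The core of Kubernetes HPA scaling is a ratio rule documented as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below reproduces only that rule plus min/max clamping. It is a simplification for illustration: the real controller also accounts for unready pods, missing metrics, and stabilization windows, and the function and parameter names here are not a Kubernetes API.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified version of the HPA replica calculation (illustrative only)."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


scale_up = desired_replicas(4, current_metric=90, target_metric=60)    # 6
no_change = desired_replicas(4, current_metric=63, target_metric=60)   # 4
scale_down = desired_replicas(4, current_metric=30, target_metric=60)  # 2
capped = desired_replicas(8, current_metric=120, target_metric=60)     # 10
```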
Maintainability Evolution:
- Book (2017): Frames maintainability as operability, simplicity, and evolvability
- Now (2026):
  - Platform Engineering teams (Internal Developer Platforms)
  - Infrastructure as Code standard practice (Terraform, Pulumi, CDK)
  - GitOps for operations (desired state kept in git)
  - Developer Experience (DevEx) metrics tracked
  - AI code assistants (GitHub Copilot, Claude Code)
  - Automated dependency updates (Dependabot, Renovate)
  - Supply chain security (SBOM, vulnerability scanning)
New Considerations Not in Book:
- Sustainability: Carbon-aware computing, green cloud regions
- Compliance: GDPR, data residency, right to deletion
- Cost Attribution: FinOps practices, cost per feature/tenant
- Developer Productivity: DORA metrics, DevEx benchmarking
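As a small illustration of the cost-attribution item above, one common starting point is to split a shared bill in proportion to each tenant's measured usage. The sketch below assumes a single usage metric per tenant (for example request count or GB stored); the function name, tenant names, and figures are invented for the example, and real FinOps tooling usually blends several metrics and handles shared overhead separately.

```python
def attribute_costs(total_cost, usage_by_tenant):
    """Split total_cost across tenants in proportion to their usage."""
    total_usage = sum(usage_by_tenant.values())
    if total_usage == 0:
        # No recorded usage: nothing to attribute.
        return {tenant: 0.0 for tenant in usage_by_tenant}
    return {
        tenant: total_cost * usage / total_usage
        for tenant, usage in usage_by_tenant.items()
    }


# Hypothetical monthly bill of 1200 split by request volume.
shares = attribute_costs(1200.0, {"tenant_a": 600, "tenant_b": 300, "tenant_c": 100})
# shares == {"tenant_a": 720.0, "tenant_b": 360.0, "tenant_c": 120.0}
```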
Chapter 2: Data Models
- Updates:
Chapter 3: Storage & Retrieval
- Updates:
Technologies to Explore
- Vector databases
- DuckDB for OLAP
- Delta Lake, Apache Iceberg
- dbt for data transformations
- Modern observability (OpenTelemetry)
Deprecated or Declining
- Technologies mentioned in book that are less relevant now
Last Updated: 2026-04-08