Chapter 16: Space-Based Architecture

fsa architecture-styles space-based scalability in-memory

Status: Notes complete


Overview

Space-Based Architecture (SBA) is designed specifically to address extreme scalability and elasticity requirements — scenarios where concurrent user load is highly variable and can spike massively (e.g., a concert going on sale, a flash sale, a major online auction event). It achieves this by removing the database from the hot path entirely, replacing it with replicated in-memory data grids distributed across processing units.

The name derives from the concept of tuple spaces from parallel computing — a shared memory space partitioned across distributed nodes. Each processing unit holds a complete copy of application state in memory, enabling any unit to handle any request without coordination.


Topology

         [Users / Clients]
               |
               v
    +---------------------+
    |   Messaging Grid    |  <-- routes requests to available processing units
    +---------------------+
         |          |
         v          v
  +-----------+  +-----------+
  |  PU-1     |  |  PU-2     |   Processing Units (stateful, in-memory)
  | [App Code]|  | [App Code]|   Each holds full in-memory data replica
  | [In-Mem   |  | [In-Mem   |
  | Data Grid]|  | Data Grid]|
  +-----------+  +-----------+
       |    \       /    |
       |     \     /     |
       |   Data Grid     |   <-- replicates state changes between PUs
       |    (sync)       |
       |                 |
       v                 v
  +-----------+     +-----------+
  | Data Pump |     | Data Pump |   async writes to persistence
  +-----------+     +-----------+
         |                 |
         v                 v
  +--------------+  +--------------+
  | Data Writer  |  | Data Reader  |
  +--------------+  +--------------+
              \          /
               v        v
           +------------------+
           |    Database      |   (off hot path — backend only)
           +------------------+

Virtualized Middleware Components

SBA relies on a Virtualized Middleware layer that provides four key components (three grids plus a deployment manager):

1. Messaging Grid

Routes incoming requests to available processing units. Acts as a load balancer but with awareness of which processing units are available. When load increases and new processing units spin up, the messaging grid automatically begins routing requests to them. When load decreases and units shut down, it routes away from them.
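
The routing behaviour described above can be sketched as a small round-robin router that only considers registered units. This is an illustrative sketch, not how any particular product implements its messaging grid; the class `MessagingGrid` and its `register`, `deregister`, and `route` methods are hypothetical names.

```python
class MessagingGrid:
    """Hypothetical router: round-robins requests across available processing units."""

    def __init__(self):
        self._units = []   # names of processing units currently available
        self._next = 0     # counter used to pick the next unit in turn

    def register(self, unit):
        # Called when a new processing unit spins up.
        if unit not in self._units:
            self._units.append(unit)

    def deregister(self, unit):
        # Called when a processing unit shuts down.
        if unit in self._units:
            self._units.remove(unit)

    def route(self, request):
        # Pick the next available unit; the request itself is not inspected here.
        if not self._units:
            raise RuntimeError("no processing units available")
        unit = self._units[self._next % len(self._units)]
        self._next += 1
        return unit


grid = MessagingGrid()
grid.register("PU-1")
grid.register("PU-2")
first_two = [grid.route("req-a"), grid.route("req-b")]

grid.register("PU-3")      # scale-out: a new unit becomes routable
grid.deregister("PU-1")    # scale-in: a stopped unit is no longer chosen
after_change = {grid.route(f"req-{i}") for i in range(4)}
```

After the membership change, requests only reach PU-2 and PU-3. A real messaging grid would also track unit health and in-flight requests, which this sketch leaves out.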

2. Data Grid

The most critical component. Manages data replication between all active processing units. When one processing unit updates its in-memory data (e.g., a user updates their account), the data grid propagates that update to every other processing unit, either synchronously or, more commonly, asynchronously with very low latency. The goal is for all units to converge on the same in-memory view of the data; brief staleness while a change is replicating is possible.

Key responsibilities:

  • Replication of state changes across processing units
  • Conflict resolution (last-write-wins or custom strategies)
  • Data partitioning when needed
  • Memory management
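
As a rough illustration of the first two responsibilities, the sketch below replicates each change to peer caches and resolves conflicts with last-write-wins. It assumes write timestamps are comparable across units, and it replaces network delivery with direct method calls; `ReplicatedCache` and `Change` are hypothetical names, not the API of a real data grid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    key: str
    value: object
    timestamp: float   # time of the write (assumed comparable across units)
    origin: str        # processing unit that made the change


class ReplicatedCache:
    """Hypothetical per-unit cache that pushes every change to its peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self._data = {}   # key -> most recent Change accepted for that key

    def put(self, key, value, timestamp):
        change = Change(key, value, timestamp, self.name)
        self.apply(change)
        for peer in self.peers:
            peer.apply(change)   # a real grid would send this over the network

    def apply(self, change):
        current = self._data.get(change.key)
        # Last-write-wins: keep the newer change; ties are broken by origin name
        # so every unit reaches the same decision.
        if current is None or (change.timestamp, change.origin) > (
            current.timestamp,
            current.origin,
        ):
            self._data[change.key] = change

    def get(self, key):
        change = self._data.get(key)
        return change.value if change else None


pu1, pu2 = ReplicatedCache("PU-1"), ReplicatedCache("PU-2")
pu1.peers, pu2.peers = [pu2], [pu1]

pu1.put("seat-12A", "held-by-alice", timestamp=2.0)
pu2.put("seat-12A", "held-by-bob", timestamp=1.0)   # older write, arrives later
```

Both units end up holding Alice's newer write, because the older change is discarded wherever it arrives. Last-write-wins silently drops the losing update, which is why some entities need a stricter strategy.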

3. Processing Grid (Optional)

When a request requires orchestration across multiple specialized processing units, the Processing Grid mediates the interaction. It orchestrates parallel processing by coordinating between different processing unit types (e.g., product processing unit, order processing unit) for a single request.
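
A minimal sketch of that mediation, assuming two specialized unit types reachable as plain functions: the orchestrator fans one request out to both in parallel and merges the results. The functions `check_inventory`, `price_order`, and `place_order`, along with the stock limit of 5, are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def check_inventory(order):
    # Stand-in for a call to a product processing unit.
    return {"in_stock": order["quantity"] <= 5}


def price_order(order):
    # Stand-in for a call to an order processing unit.
    return {"total": order["quantity"] * order["unit_price"]}


def place_order(order):
    """Hypothetical processing-grid step: call both unit types in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        inventory = pool.submit(check_inventory, order)
        pricing = pool.submit(price_order, order)
        result = {**inventory.result(), **pricing.result()}
    result["accepted"] = result["in_stock"]
    return result


outcome = place_order({"quantity": 2, "unit_price": 40})
```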

4. Deployment Manager

Manages dynamic scaling of processing units based on load. Monitors CPU, memory, throughput, and request queue depth. Spins up new processing unit instances when load increases and removes them when load decreases. Integrates with cloud auto-scaling infrastructure (e.g., Kubernetes Horizontal Pod Autoscaler).
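
The scaling decision can be illustrated with the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that rule with minimum and maximum bounds. The function name `desired_units` and the default bounds are assumptions, and the real HPA adds tolerances and stabilization windows that are omitted here.

```python
import math


def desired_units(current_units, current_metric, target_metric,
                  min_units=1, max_units=20):
    """Return how many processing units to run for the observed load."""
    if current_units <= 0 or target_metric <= 0:
        raise ValueError("current_units and target_metric must be positive")
    raw = math.ceil(current_units * current_metric / target_metric)
    # Clamp to the configured bounds so a spike cannot scale without limit.
    return max(min_units, min(max_units, raw))


scale_out = desired_units(4, current_metric=90, target_metric=60)
scale_in = desired_units(4, current_metric=15, target_metric=60)
capped = desired_units(10, current_metric=300, target_metric=60)
```

With four units at 90% CPU against a 60% target, the rule asks for six units. At 15% it asks for one, and an extreme spike is capped at `max_units`.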


Style Specifics

Processing Unit (PU)

The core component of SBA. A processing unit is a self-contained, stateful application instance that contains:

  • Application/business logic
  • An in-memory data grid (local cache / distributed cache shard)
  • Messaging grid listener (to receive routed requests)
  • Data pump (to async-write changes to the database)

Multiple processing unit instances run in parallel. Because each holds a complete in-memory copy of relevant data, any PU can serve any request independently — no coordination needed for reads.

In-Memory Data Grid

Each processing unit’s local in-memory data store holds the complete working dataset. This eliminates database calls for read operations (the dominant operation in most web applications), replacing millisecond-scale database round trips with in-process memory lookups that are orders of magnitude faster.

Technologies: Hazelcast, Apache Ignite, Oracle Coherence, Redis (with replication), GemFire/Apache Geode.

Data Pump

The Data Pump is the async bridge between in-memory state and persistent database storage. It sends data change events (writes, updates, deletes) from a processing unit to the Data Writer via messaging. This is a non-blocking, asynchronous operation — the processing unit never waits for the database write to complete.

The Data Pump ensures the database is eventually updated but never blocks user-facing processing.
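
A minimal sketch of this non-blocking hand-off, assuming an in-process `queue.Queue` stands in for the messaging channel and a plain dict stands in for the database. The names `ProcessingUnit`, `update`, and `data_writer` are illustrative only.

```python
import queue
import threading


class ProcessingUnit:
    def __init__(self, pump):
        self.cache = {}     # in-memory state served to users
        self._pump = pump   # queue standing in for the messaging channel

    def update(self, key, value):
        self.cache[key] = value                  # memory write on the hot path
        self._pump.put(("upsert", key, value))   # enqueue and return immediately


def data_writer(pump, database):
    # Drains change events in order and applies them to the stand-in database.
    while True:
        event = pump.get()
        if event is None:   # sentinel: no more events
            break
        op, key, value = event
        if op == "upsert":
            database[key] = value


pump = queue.Queue()   # unbounded, so put() does not block the caller
database = {}          # dict standing in for the real database
writer = threading.Thread(target=data_writer, args=(pump, database))
writer.start()

pu = ProcessingUnit(pump)
pu.update("order-1", {"status": "paid"})
pu.update("order-1", {"status": "shipped"})

pump.put(None)   # shut the writer down once the pending events are drained
writer.join()
```

The call to `update` returns as soon as the event is queued, and the writer thread catches up on its own schedule. An in-process queue loses its contents on a crash, so a production data pump would normally use durable messaging.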

Data Writer

Receives data change events from the Data Pump and persists them to the database. Handles the translation from in-memory data structures to database rows. May handle batching, conflict resolution, and retry logic for failed writes.
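
The sketch below shows one way a Data Writer might batch events and stay safe under redelivery. It assumes each event carries a monotonically increasing `version`, and it uses SQLite (3.24 or later, for upsert support) as a stand-in database. The `accounts` table and the helpers `open_store` and `write_batch` are hypothetical.

```python
import sqlite3


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id TEXT PRIMARY KEY, balance INTEGER NOT NULL, version INTEGER NOT NULL)"
    )
    return conn


def write_batch(conn, events):
    """Persist a batch of change events; replaying an event leaves the row unchanged."""
    with conn:   # one transaction per batch, rolled back if any write fails
        for event in events:
            conn.execute(
                "INSERT INTO accounts (id, balance, version) VALUES (?, ?, ?) "
                "ON CONFLICT(id) DO UPDATE SET "
                "balance = excluded.balance, version = excluded.version "
                "WHERE excluded.version > accounts.version",
                (event["id"], event["balance"], event["version"]),
            )


conn = open_store()
events = [
    {"id": "acct-1", "balance": 100, "version": 1},
    {"id": "acct-1", "balance": 80, "version": 2},
]
write_batch(conn, events)
write_batch(conn, events)   # simulated redelivery of the same batch
row = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 'acct-1'"
).fetchone()
```

Because the update only applies when the incoming version is newer, the redelivered batch has no effect and the row stays at version 2.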

Data Reader

Loads data from the database into the in-memory data grid at startup time (cold start) or when a new processing unit instance is created. After initial load, the data grid replication handles state propagation — the Data Reader is only needed at initialization.
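
A cold-start load might look like the following sketch, which reads a table into the unit's cache and only then marks the unit ready for traffic. SQLite stands in for the database, and the `products` table, the `ProcessingUnit` class, and its `ready` flag are assumptions made for illustration.

```python
import sqlite3


class ProcessingUnit:
    def __init__(self):
        self.cache = {}      # in-memory data, empty until loaded
        self.ready = False   # a readiness check would report this flag

    def load_from(self, conn):
        # Data Reader step: pull the working set from the database into memory.
        rows = conn.execute("SELECT sku, name, stock FROM products")
        self.cache = {
            sku: {"name": name, "stock": stock} for sku, name, stock in rows
        }
        self.ready = True    # only now should requests be routed to this unit


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, stock INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("sku-1", "Poster", 40), ("sku-2", "T-shirt", 12)],
)

pu = ProcessingUnit()
ready_before_load = pu.ready
pu.load_from(conn)
```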


Data Topologies

Space-Based Architecture is primarily an in-memory architecture. The canonical data topology:

  • In-memory data grids across processing units hold the hot working set (the data being actively operated on)
  • Single shared database (relational or otherwise) serves as the durable backend — the system of record for persistence
  • Data flows: User request → Processing Unit (in-memory) → Data Pump (async) → Data Writer → Database
  • On startup: Database → Data Reader → In-memory Data Grid

The database is off the hot path — it receives writes asynchronously and is only read at startup. This is the fundamental inversion that enables extreme scalability.


Cloud Considerations

Concern                        Cloud Approach
-----------------------------  ---------------------------------------------------------------------------
In-memory data grid            Hazelcast Cloud, Azure Cache for Redis (clustered), AWS ElastiCache
Processing unit orchestration  Kubernetes (HPA for auto-scaling), AWS ECS, Azure AKS
Data persistence backend       Any relational/NoSQL DB (Aurora, Azure SQL, Cosmos DB)
Deployment manager             Kubernetes HPA + custom metrics, KEDA (Kubernetes Event-Driven Autoscaling)
Processing grid coordination   Service mesh (Istio), custom orchestration

Kubernetes is the natural fit for SBA: processing units become Pods, the HPA implements the Deployment Manager, and the data grid runs as a StatefulSet or external managed service.


Common Risks

Frequent Database Reads on Startup: When many processing unit instances start simultaneously (e.g., after a full outage or major scale-out event), all instances trigger Data Readers concurrently, potentially overwhelming the database. Mitigations: staggered startup, pre-warm strategies, read replicas.
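
Staggered startup, one of the mitigations listed, could be approximated by giving each unit its own slot within a startup window plus some random jitter. The function `startup_delays` and its parameters are hypothetical; a real deployment would more likely rely on its orchestrator's rollout controls.

```python
import random


def startup_delays(unit_count, window_seconds, rng=None):
    """Spread unit start times across a window so Data Readers do not all hit the database at once."""
    if unit_count <= 0:
        return []
    rng = rng or random.Random()
    slot = window_seconds / unit_count
    # Unit i starts somewhere inside its own slot [i * slot, (i + 1) * slot].
    return [i * slot + rng.uniform(0, slot) for i in range(unit_count)]


delays = startup_delays(unit_count=6, window_seconds=60, rng=random.Random(42))
```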

Data Synchronization Issues: The data grid must replicate state changes to all active processing units. Network partitions or slow replication can cause processing units to operate on stale data, leading to inconsistencies. This is the central consistency risk of SBA.

High Data Volume / Memory Limits: In-memory storage is expensive and limited. If the working dataset grows beyond available memory across processing units, the architecture faces severe pressure — either memory expansion costs or data eviction strategies (working only on subsets of data).
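
A back-of-the-envelope sizing check makes the constraint concrete. The sketch assumes full replication (every unit holds the whole working set) and a guessed overhead factor for object headers and indexes. The record counts, sizes, and budgets are made-up inputs, not measurements.

```python
def per_unit_memory_gib(record_count, avg_record_bytes, overhead_factor=1.5):
    """Estimate the memory one processing unit needs to hold the full working set."""
    return record_count * avg_record_bytes * overhead_factor / 2**30


def fits(record_count, avg_record_bytes, unit_budget_gib, overhead_factor=1.5):
    needed = per_unit_memory_gib(record_count, avg_record_bytes, overhead_factor)
    return needed <= unit_budget_gib


# Hypothetical workload: 5 million records of about 2 KiB each.
needed_gib = per_unit_memory_gib(5_000_000, 2048)
ok_on_16 = fits(5_000_000, 2048, unit_budget_gib=16)
ok_on_8 = fits(5_000_000, 2048, unit_budget_gib=8)
```

Under these assumptions each unit needs a little over 14 GiB, so a 16 GiB budget fits and an 8 GiB budget does not. Because of full replication, that cost is paid once per running unit.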

Data Collisions: When two processing units simultaneously modify the same data record (e.g., two users booking the last ticket), collision handling is needed. Requires optimistic locking, versioning, or last-write-wins strategies, all of which add complexity.
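
Optimistic locking with a version number is one of the strategies named above. The sketch below shows the check inside a single process: a write succeeds only if the record's version still matches what the writer last read. `SeatInventory` and `VersionConflict` are hypothetical names, and a real data grid would apply an equivalent check when replicated changes arrive from other units.

```python
import threading


class VersionConflict(Exception):
    """Raised when a record changed after the caller last read it."""


class SeatInventory:
    def __init__(self, seats):
        self._lock = threading.Lock()   # makes the check-and-write atomic locally
        self._seats = {seat: (None, 0) for seat in seats}   # seat -> (holder, version)

    def read(self, seat):
        with self._lock:
            return self._seats[seat]

    def reserve(self, seat, holder, expected_version):
        with self._lock:
            _, version = self._seats[seat]
            if version != expected_version:
                raise VersionConflict(f"{seat} is now at version {version}")
            self._seats[seat] = (holder, version + 1)


inventory = SeatInventory(["12A"])
_, seen_by_alice = inventory.read("12A")
_, seen_by_bob = inventory.read("12A")   # both users see version 0

inventory.reserve("12A", "alice", seen_by_alice)
try:
    inventory.reserve("12A", "bob", seen_by_bob)
    bob_won = True
except VersionConflict:
    bob_won = False   # Bob must re-read and retry, or be told the seat is gone
```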

Cache-Database Inconsistency Window: Between the time a processing unit writes to its in-memory grid and the time the corresponding Data Pump event is persisted to the database, the two stores are inconsistent. A crash during this window can lose any change that has not yet been durably queued or replicated to another unit. Mitigations: durable messaging for the data pump, write-ahead logging.

Testing Complexity: Distributed in-memory state is difficult to reproduce in test environments. The behavior of data grid replication, collision resolution, and startup data loading must all be explicitly tested.


Governance

  • Define data partitioning strategy up front: which data lives in which in-memory grid, how large is the working set, what data can be evicted.
  • Establish collision resolution policies for every entity type: last-write-wins, optimistic locking, or custom merge strategies.
  • Define startup sequencing: Data Readers must complete loading before processing units begin serving requests. Use readiness probes in Kubernetes.
  • Set memory budgets per processing unit and alert thresholds — memory pressure is a leading indicator of architectural breakdown.
  • Define data pump SLAs: maximum acceptable lag between in-memory write and database persistence.
  • Enforce idempotent Data Writers — the data pump may deliver the same event more than once.
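
The data pump SLA from the list above could be monitored by tracking the oldest event that has been enqueued but not yet persisted. This sketch assumes events are persisted in the order they were enqueued; `PumpLagMonitor` and its methods are hypothetical.

```python
from collections import deque


class PumpLagMonitor:
    """Tracks how long the oldest unpersisted change has been waiting."""

    def __init__(self, max_lag_seconds):
        self.max_lag_seconds = max_lag_seconds
        self._pending = deque()   # enqueue times of unpersisted events, oldest first

    def enqueued(self, at):
        self._pending.append(at)

    def persisted(self):
        # Assumes the Data Writer confirms events in enqueue order.
        self._pending.popleft()

    def lag(self, now):
        return now - self._pending[0] if self._pending else 0.0

    def breached(self, now):
        return self.lag(now) > self.max_lag_seconds


monitor = PumpLagMonitor(max_lag_seconds=2.0)
monitor.enqueued(at=10.0)
monitor.enqueued(at=11.0)

lag_early = monitor.lag(now=11.5)            # oldest event has waited 1.5 s
breached_late = monitor.breached(now=13.0)   # oldest event has waited 3.0 s
monitor.persisted()                          # the event from t=10.0 is now durable
breached_after = monitor.breached(now=13.0)  # oldest pending is now from t=11.0
```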

Team Topology

SBA tends toward a centralized, platform-oriented team structure. The in-memory data grid, data pump, and deployment manager are shared infrastructure components that a platform team owns. Application teams implement processing unit logic.

The tight coupling of in-memory state across all processing units (via the data grid) makes SBA less amenable to the fully independent team ownership model of microservices. It is better suited to a single product team or a small number of tightly coordinated teams.


Architectural Characteristics Ratings

Characteristic    Rating  Notes
----------------  ------  ---------------------------------------------------------------------------------------------
Agility           ★★☆☆☆   Changes to data model affect all processing units simultaneously; coordinated deployment needed
Deployability     ★★★☆☆   Processing units can be independently deployed; data grid coordination adds complexity
Testability       ★★☆☆☆   In-memory state replication and collision scenarios are difficult to reproduce in tests
Performance       ★★★★★   Memory-speed data access, no database on hot path; peak performance architecture
Scalability       ★★★★★   Dynamic addition/removal of processing units; near-linear horizontal scalability
Development Ease  ★★☆☆☆   Complex infrastructure, collision handling, startup sequencing, data grid expertise required
Simplicity        ★☆☆☆☆   One of the most operationally complex architecture styles; significant infrastructure overhead
Cost              ★☆☆☆☆   In-memory infrastructure (data grids, multiple PU instances) is expensive; cloud costs high

When to Use

  • Applications with extremely high, variable concurrent user load (concert ticket sales, flash sales, online auctions)
  • Systems where database throughput is the proven scalability bottleneck and cannot be addressed otherwise
  • Use cases where the working dataset fits in distributed memory across the processing unit farm
  • Real-time applications requiring sub-millisecond read latency that databases cannot provide
  • Systems that can tolerate eventual consistency between in-memory state and persistent storage

When Not to Use

  • Applications with large datasets that cannot fit in distributed memory
  • Systems requiring strong, immediate consistency between persistent storage and application state
  • Cost-sensitive applications — the infrastructure cost is very high
  • Applications with low or steady, predictable load that a well-tuned traditional architecture can handle
  • Small teams without expertise in distributed data grid technologies
  • Systems where data is the primary asset and any consistency window is unacceptable (financial ledgers, healthcare records)

Examples and Use Cases

  • Concert/Event Ticketing Systems: Massive concurrent load when popular events go on sale. SBA enables tens of thousands of simultaneous seat reservation requests processed in memory, with the database updated asynchronously.
  • Online Auction Platforms: High-concurrency bidding with real-time state updates. Each bid updates in-memory state and is replicated to the other processing units by the data grid, while the database is updated asynchronously.
  • Online Gaming Leaderboards: Real-time score updates for millions of concurrent players — reads and writes all in memory, database synced asynchronously.
  • High-Frequency Trading Support Systems: Pre-trade risk checks and position management requiring sub-millisecond latency — database latency is architecturally unacceptable.
  • Flash Sale E-Commerce: Brief periods of extreme load (Black Friday spikes) requiring instant horizontal scale-out.

Key Takeaways

  1. Database off the Hot Path: The fundamental design principle of SBA — remove the database from user-facing request processing by replacing it with replicated in-memory data grids.
  2. Four Virtualized Middleware Components: Messaging Grid (routing), Data Grid (state replication), Processing Grid (optional orchestration), Deployment Manager (dynamic scaling) — together they constitute the SBA infrastructure.
  3. Processing Unit is the Atom: Each PU is a self-contained stateful application instance with its own in-memory data copy — any PU can serve any request.
  4. Data Pump is Async: Writes to the database are asynchronous via the Data Pump — the PU never waits for database persistence, enabling maximum throughput.
  5. Data Reader is for Startup Only: Data Readers load state from the database into memory when a new PU starts. Afterward, the Data Grid handles state via replication.
  6. Memory is the Constraint: Working dataset must fit in distributed memory — when data volume exceeds memory capacity, the architecture breaks down.
  7. Highest Performance and Scalability: Like EDA, SBA achieves 5-star ratings for both — it is one of only two styles in FSA to achieve this for both characteristics.
  8. Highest Cost: The in-memory infrastructure (multiple PU instances, data grid licensing/management) makes SBA the most expensive architecture style to operate.
  9. Collision Handling Is Required: Concurrent writes to the same data from different PUs require an explicit conflict resolution strategy (optimistic locking, versioning, last-write-wins).
  10. Niche but Irreplaceable: SBA is a specialized tool for extreme scalability scenarios. For most applications, simpler styles are more appropriate. When the use case fits (ticket sales, auctions), no other style can match it.

Related

  • Chapter 15: Event-Driven Architecture (complementary async style; data pumps in SBA often use EDA patterns)
  • Chapter 17: Choosing the Right Architecture Style
  • Hazelcast documentation — distributed in-memory computing
  • Apache Ignite documentation — distributed database and caching
  • Oracle Coherence documentation
  • Kubernetes HPA documentation — auto-scaling for processing units

Last Updated: 2026-05-29