Chapter 16 Flashcards — Space-Based Architecture
flashcards fsa space-based scalability in-memory
What is the core design goal of Space-Based Architecture?
?
To achieve extreme scalability and elasticity by removing the database from the hot path entirely, replacing synchronous database access with replicated in-memory data grids distributed across multiple processing units. Any processing unit can serve any request without database coordination.
Where does the name “Space-Based Architecture” come from?
?
From the concept of tuple spaces in parallel computing — a shared memory space partitioned across distributed nodes. Each processing unit contributes to and reads from this shared “space” of in-memory data, enabling parallel processing without centralized coordination.
What is a Processing Unit (PU) in Space-Based Architecture?
?
A Processing Unit is a self-contained, stateful application instance containing: application/business logic, a local in-memory data grid (full replica of working data), a messaging grid listener, and a data pump for async database writes. Multiple PUs run in parallel — any PU can handle any request.
What are the four components of Virtualized Middleware in SBA?
?
1. Messaging Grid: routes incoming requests to available processing units. 2. Data Grid: replicates state changes between all active processing units. 3. Processing Grid: orchestrates parallel/coordinated processing across multiple PU types (optional). 4. Deployment Manager: dynamically spins PU instances up/down based on load.
What is the role of the Data Grid in Space-Based Architecture?
?
The Data Grid is the most critical middleware component. It ensures that when one processing unit updates its in-memory state, all other active processing units receive that update. This keeps all PUs synchronized with a consistent view of the data without querying a database.
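The replication idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real data grid product works: `ReplicatedGrid`, `connect`, and `apply_remote` are hypothetical names, replication is a synchronous in-process call, and there is no failure handling.

```python
class ReplicatedGrid:
    """Hypothetical in-memory grid held by one processing unit."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def connect(self, other):
        # Link two grids so each forwards its local writes to the other.
        self.peers.append(other)
        other.peers.append(self)

    def put(self, key, value):
        # Local write first, then push the change to every peer.
        self.data[key] = value
        for peer in self.peers:
            peer.apply_remote(key, value)

    def apply_remote(self, key, value):
        # Apply a replicated change without re-broadcasting it.
        self.data[key] = value


def build_cluster():
    a, b, c = ReplicatedGrid("pu-a"), ReplicatedGrid("pu-b"), ReplicatedGrid("pu-c")
    a.connect(b)
    a.connect(c)
    b.connect(c)
    return a, b, c
```

A write on any one grid, for example `a.put("seat-12A", "reserved")`, becomes visible on `b` and `c` without a database read.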
What is the role of the Data Pump in Space-Based Architecture?
?
The Data Pump asynchronously sends data change events (writes, updates, deletes) from a processing unit to the Data Writer, which persists them to the database. The Data Pump is non-blocking — the PU never waits for the database write, enabling maximum throughput on the hot path.
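A minimal sketch of the non-blocking hand-off, assuming an in-process `queue.Queue` stands in for the Data Pump channel and a plain dict stands in for the database. `ProcessingUnit`, `data_writer`, and `run_demo` are hypothetical names chosen for illustration.

```python
import queue
import threading


class ProcessingUnit:
    """Hypothetical PU: updates memory, then enqueues the change."""

    def __init__(self, pump):
        self.grid = {}      # local in-memory data (a plain dict here)
        self.pump = pump    # stands in for the Data Pump channel

    def update(self, key, value):
        self.grid[key] = value                  # hot path touches memory only
        self.pump.put(("upsert", key, value))   # unbounded queue: returns at once


def data_writer(pump, database):
    """Hypothetical Data Writer: drains the pump and persists each change."""
    while True:
        event = pump.get()
        if event is None:       # sentinel tells the writer to stop
            break
        op, key, value = event
        if op == "delete":
            database.pop(key, None)
        else:
            database[key] = value


def run_demo():
    pump, database = queue.Queue(), {}
    writer = threading.Thread(target=data_writer, args=(pump, database))
    writer.start()
    pu = ProcessingUnit(pump)
    pu.update("seat-12A", "reserved")
    pump.put(None)
    writer.join()
    return pu.grid, database
```

The PU's `update` call returns as soon as the event is enqueued; the database catches up later on the writer thread.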
What is the role of the Data Reader in Space-Based Architecture?
?
The Data Reader loads data from the database into a processing unit’s in-memory data grid at startup time. After initialization, the Data Grid handles state propagation via replication. The Data Reader is only invoked when a new PU instance starts cold.
What is the role of the Deployment Manager in SBA?
?
The Deployment Manager implements dynamic scaling — it monitors load (CPU, throughput, request queue depth) and automatically spins up new processing unit instances under high load and removes them when load decreases. In cloud environments, it maps to Kubernetes HPA or cloud auto-scaling policies.
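The scaling decision can be illustrated with the replica formula documented for the Kubernetes HPA: desired = ceil(current replicas × current metric / target metric). The sketch below ignores the HPA's tolerance band and stabilization windows; the function name and the `min_replicas`/`max_replicas` defaults are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=50):
    """Return the PU count that brings the per-PU metric back to target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 PUs running at 90% CPU against a 60% target yields 6 PUs; the same 4 PUs at 30% CPU yields 2.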
What is the role of the Messaging Grid in SBA?
?
The Messaging Grid routes incoming user requests to available processing units, acting as a load balancer with dynamic awareness of which PU instances are currently active. As new PUs come online (scale-out), the messaging grid immediately begins routing requests to them.
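A toy router showing the dynamic-membership behavior described above. It assumes simple round-robin selection, which is only one possible policy; `MessagingGrid`, `register`, `deregister`, and `route` are hypothetical names.

```python
class MessagingGrid:
    """Hypothetical round-robin router over the currently active PUs."""

    def __init__(self):
        self._units = []
        self._next = 0

    def register(self, pu_id):
        # Called when a new PU instance comes online.
        if pu_id not in self._units:
            self._units.append(pu_id)

    def deregister(self, pu_id):
        # Called when a PU instance is removed during scale-in.
        if pu_id in self._units:
            self._units.remove(pu_id)

    def route(self, request):
        if not self._units:
            raise RuntimeError("no active processing units")
        pu_id = self._units[self._next % len(self._units)]
        self._next += 1
        return pu_id
```

A PU registered mid-stream starts receiving requests on the next pass through the rotation, and a deregistered PU stops receiving them immediately.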
What is the fundamental way SBA achieves extreme scalability?
?
By removing the database from the hot path. All reads during request processing are served from in-memory data grids at memory speed rather than database speed. Database writes happen asynchronously via data pumps. Adding more processing units increases capacity near-linearly, because no database bottleneck limits scale.
What technologies implement in-memory data grids in SBA?
?
Common in-memory data grid technologies: Hazelcast, Apache Ignite, Oracle Coherence, GemFire/Apache Geode, and Redis (clustered). These handle distributed in-memory storage, replication, partitioning, and failover across processing unit instances.
What is the Frequent Database Reads risk in SBA?
?
When many processing units start simultaneously (scale-out event or recovery from outage), all instances trigger Data Readers concurrently, potentially overwhelming the database with reads. Mitigations: staggered startup sequences, pre-warm strategies, dedicated read replicas for startup loading.
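One way to stagger startup is to give each instance a base delay by index plus random jitter, so Data Readers do not all hit the database at the same instant. This is a sketch of that single mitigation; `startup_delay`, `slot_seconds`, and `jitter_seconds` are hypothetical names and values.

```python
import random


def startup_delay(instance_index, slot_seconds=2.0, jitter_seconds=1.0, rng=random):
    """Seconds a PU should wait before its Data Reader starts loading."""
    # Each instance gets its own time slot, plus jitter to avoid lockstep.
    return instance_index * slot_seconds + rng.uniform(0, jitter_seconds)
```

With the defaults, instance 0 starts within the first second, instance 1 between 2 and 3 seconds, and so on.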
What is the Data Synchronization risk in SBA?
?
The Data Grid must replicate state changes to all active processing units. Network partitions or replication lag can cause different PUs to operate on stale or inconsistent data, leading to incorrect business outcomes. This is the central consistency risk of SBA.
What is a Data Collision in SBA and how is it handled?
?
A Data Collision occurs when two processing units simultaneously modify the same data record (e.g., two users both attempting to buy the last available ticket). Handling it requires an explicit conflict resolution strategy, chosen per entity type: optimistic locking (version numbers), last-write-wins, or custom merge logic.
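Optimistic locking with version numbers can be sketched as a compare-and-set on the record. The example below is single-process and illustrative only; `Grid`, `Record`, `VersionConflict`, and `compare_and_set` are hypothetical names, not the API of any data grid product.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: int
    version: int = 0


class VersionConflict(Exception):
    """Raised when the record changed since the caller read it."""


class Grid:
    def __init__(self):
        self._records = {}

    def create(self, key, value):
        self._records[key] = Record(value)

    def read(self, key):
        rec = self._records[key]
        return rec.value, rec.version

    def compare_and_set(self, key, new_value, expected_version):
        rec = self._records[key]
        if rec.version != expected_version:
            raise VersionConflict(key)
        rec.value = new_value
        rec.version += 1


def buy_last_ticket():
    grid = Grid()
    grid.create("tickets_left", 1)
    # Two buyers read the same version before either one writes.
    value_a, version_a = grid.read("tickets_left")
    value_b, version_b = grid.read("tickets_left")
    grid.compare_and_set("tickets_left", value_a - 1, version_a)  # buyer A wins
    try:
        grid.compare_and_set("tickets_left", value_b - 1, version_b)
        second_succeeded = True
    except VersionConflict:
        second_succeeded = False  # buyer B must re-read and retry
    return grid.read("tickets_left"), second_succeeded
```

The second buyer's write is rejected because the version moved from 0 to 1, so the ticket count never goes negative.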
What is the High Data Volume risk in SBA?
?
SBA stores the working dataset entirely in distributed memory. If the dataset grows beyond available memory across all processing units, the architecture faces a fundamental constraint — either expensive memory expansion or data eviction/segmentation strategies. SBA is unsuitable for very large datasets.
What is the Cache-Database Inconsistency Window risk in SBA?
?
Between the moment a PU writes to its in-memory data grid and the moment the Data Pump's update reaches the database, an inconsistency window exists. If the processing unit crashes during this window, the in-memory update is lost and the database has not yet been updated, causing data loss. Mitigations: durable messaging for data pumps, write-ahead logs.
What architectural ratings does Space-Based Architecture receive for Performance and Scalability?
?
SBA receives 5 stars (★★★★★) for both Performance and Scalability — the maximum rating. Memory-speed data access with no database on the hot path delivers peak performance; dynamic scaling of stateless (from the database’s perspective) processing units enables near-linear horizontal scalability.
What architectural rating does SBA receive for Cost, and why?
?
SBA receives 1 star (★☆☆☆☆) for Cost — the lowest possible. In-memory infrastructure (distributed data grid licensing and hardware, multiple persistent PU instances, cloud memory costs) is the most expensive among all architecture styles. It is only justified when extreme scalability is a hard requirement.
What architectural rating does SBA receive for Simplicity?
?
SBA receives 1 star (★☆☆☆☆) for Simplicity — tied with EDA for the lowest. The combination of virtualized middleware, data grid replication, collision handling, startup sequencing, data pumps, and dynamic scaling creates enormous operational and developmental complexity.
What are the two defining use cases for Space-Based Architecture?
?
Concert/Event Ticketing (e.g., a popular concert goes on sale — millions of concurrent users attempting seat reservations) and Online Auctions (high-concurrency bidding with real-time state). Both involve extreme concurrent load spikes with a small working dataset that fits in distributed memory.
How does Kubernetes support Space-Based Architecture in the cloud?
?
Processing units become Pods, the Kubernetes Horizontal Pod Autoscaler (HPA) implements the Deployment Manager, the data grid runs as a StatefulSet or managed cloud service (Hazelcast Cloud, Redis), and a Service or Ingress acts as the Messaging Grid. KEDA can extend HPA with custom scaling metrics.
What is the relationship between the Data Pump and Event-Driven Architecture in SBA?
?
The Data Pump typically uses an event-driven/async messaging pattern internally — publishing data change events to a message queue or topic that the Data Writer consumes. This means EDA patterns (acknowledgment, DLQ, idempotency) apply to the persistence layer of SBA.
Why must Data Writers be idempotent in SBA?
?
The Data Pump may deliver the same data change event more than once (at-least-once delivery semantics). If a Data Writer is not idempotent, duplicate events can cause double-writes or incorrect state in the database. Upsert operations and event deduplication are common solutions.
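A minimal sketch of an idempotent writer combining both solutions: deduplication by event ID and an upsert. It assumes each event carries a unique `event_id` field (a hypothetical name) and keeps the seen-ID set in memory. A real writer would need to store those IDs durably, ideally in the same transaction as the write.

```python
class IdempotentDataWriter:
    """Hypothetical Data Writer that tolerates at-least-once delivery."""

    def __init__(self):
        self.database = {}   # stands in for the real database
        self._seen = set()   # IDs of events already applied

    def handle(self, event):
        """Apply the event once; return False if it was a duplicate."""
        if event["event_id"] in self._seen:
            return False
        # Upsert: the same key always ends in the same state.
        self.database[event["key"]] = event["value"]
        self._seen.add(event["event_id"])
        return True
```

Delivering the same event twice leaves the database unchanged after the first application.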
When is Space-Based Architecture the wrong choice?
?
SBA is wrong when: the dataset is too large for distributed memory; strong immediate consistency between memory and database is required; the application has steady, predictable load that a simpler architecture handles fine; cost is a primary constraint; or the team lacks expertise in distributed data grid technologies.
What is the startup sequencing governance requirement for SBA?
?
Data Readers must complete their initial data load before processing units begin accepting requests. In Kubernetes this is enforced via readiness probes — PUs report not-ready until their in-memory data grid is fully populated from the database. Without this, PUs serve requests with incomplete data.
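The gating logic behind such a readiness probe can be sketched as a flag that flips only after the initial load finishes. `WarmingProcessingUnit` and its methods are hypothetical; the 200/503 status codes mirror what an HTTP readiness endpoint would typically return.

```python
class WarmingProcessingUnit:
    """Hypothetical PU that refuses traffic until its grid is populated."""

    def __init__(self):
        self.grid = {}
        self._ready = False

    def load_from_database(self, rows):
        # Data Reader step: copy every row into memory, then mark ready.
        for key, value in rows:
            self.grid[key] = value
        self._ready = True

    def readiness_status(self):
        # What a readiness endpoint would return to the orchestrator.
        return 200 if self._ready else 503

    def handle(self, key):
        if not self._ready:
            raise RuntimeError("processing unit is still loading data")
        return self.grid.get(key)
```

Until `load_from_database` completes, the probe reports 503 and the orchestrator keeps the PU out of rotation.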
How does SBA compare to microservices in terms of data ownership?
?
Unlike microservices (where each service owns its isolated data store), SBA uses a shared in-memory data grid replicated across all processing units, with a single shared database as the backend. This tight data coupling makes SBA less aligned with independent team ownership — better for centralized or small team structures.
What does the Processing Grid do in Space-Based Architecture?
?
The Processing Grid (optional component) orchestrates requests that require coordination across multiple specialized processing unit types. For example, an order request may need both a Product PU and an Order PU — the Processing Grid mediates their interaction and assembles the combined response.
Total Cards: 27
Estimated Review Time: 20-25 minutes
Priority: MEDIUM
Last Updated: 2026-05-29