Chapter 12: Design a Digital Wallet

volume2 digital-wallet fintech consistency balance

Status: 🟩 Interview ready
Difficulty: Very Hard
Time to complete: 60 min read + practice

Overview

A digital wallet stores money electronically and lets users transfer funds instantly between each other — think Venmo, Cash App, Google Pay balance, or the PayPal balance feature. Unlike a payment system (which moves money via external card networks), a digital wallet manages internal balances: the money lives inside the platform and moves between ledger entries.

Why this matters:

Common interview question at fintech companies and companies with peer-to-peer features
Forces you to think deeply about distributed transactions, consistency, and failure handling
Teaches 2-Phase Commit, Saga pattern, event sourcing, and CQRS — foundational distributed systems patterns
The hardest constraint: exactly-once transfer (no double debit, no lost money)

Problem Statement

Design a digital wallet service where:

Users can store a monetary balance
Users can transfer money to each other instantly
Transfers are atomic and exactly-once (no double debit, no lost credits)
Balances are always accurate (strong consistency)
System is highly available (99.99%)

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

Transfer between users? → Yes, peer-to-peer (Venmo-style)
Load money in / withdraw money out? → Yes (but assume the pay-in/pay-out layer exists separately)
Scale? → 1 million transactions per day
Multi-currency? → Yes, handle multiple currencies with conversion
Historical balance? → Yes, users can view transaction history
Is this a standalone wallet or part of a larger payment platform? → Standalone wallet service

Scope:

Transfer money between two user wallets
Query wallet balance
View transaction history
Exactly-once transfer with idempotency
Concurrent transfers handled correctly

Non-Functional Requirements

Exactly-once transfer: No double debit under any failure scenario
Strong consistency: A user’s balance must always reflect all completed transfers
High availability: 99.99% — wallets must be accessible
Durability: No money lost even if servers crash
Auditability: Every balance change immutably recorded
Low latency: < 500ms for transfer confirmation
Read-your-writes: After a transfer, the sender sees their updated balance immediately

Scale Estimates

Transactions per day: 1,000,000
Average TPS: ~12
Peak TPS: ~100

Per transfer: 2 ledger entries (debit + credit)
Ledger entries/day: 2,000,000
Ledger entries/year: ~730 million

Balance snapshot frequency: every 1,000 events per user
Users: 10 million
Active wallets stored in DB: 10 million rows

Step 2: High-Level Design (10 min)

Core Entities

Wallet: Each user has a wallet with a balance per currency
Transfer: Intent to move money (with idempotency key)
Ledger Entry: Immutable record of every balance change

The Distributed Transaction Challenge

The hardest part of a digital wallet is that User A’s wallet and User B’s wallet may live on different database nodes (when data is sharded). Deducting from A and crediting B must be atomic — either both happen or neither happens.

┌─────────────────────────────────────────────────────────────────┐
│       The Distributed Transaction Problem                       │
│                                                                 │
│  User A wallet: DB Node 1 (shard A-M)                          │
│  User B wallet: DB Node 2 (shard N-Z)                          │
│                                                                 │
│  Transfer $50 from A to B:                                      │
│    Step 1: Debit A by $50   → DB Node 1                        │
│    Step 2: Credit B by $50  → DB Node 2                        │
│                                                                 │
│  What if Step 1 succeeds but Step 2 fails?                      │
│    → A lost $50, B never got it — MONEY LOST! ❌               │
│                                                                 │
│  What if we crash between Step 1 and Step 2?                    │
│    → Same problem — partial execution ❌                        │
└─────────────────────────────────────────────────────────────────┘

Three Approaches to Distributed Transactions

We evaluate three progressively sophisticated approaches:

Approach	Consistency	Availability	Complexity	Recommended?
2-Phase Commit (2PC)	Strong	Low (blocking)	Medium	Only for small scale
Saga Pattern	Eventual	High	High	Good for microservices
Event Sourcing + CQRS	Strong (per user)	High	Highest	Best for financial wallets

Basic Architecture

┌──────────────────────────────────────────────────────────┐
│                  Digital Wallet Platform                 │
│                                                          │
│  ┌────────────┐   ┌───────────────┐   ┌───────────────┐  │
│  │  Wallet    │   │   Transfer    │   │    Ledger     │  │
│  │  Service   │   │   Service     │   │    Service    │  │
│  │            │   │ (Orchestrator)│   │ (Append-only) │  │
│  │  - Balance │   │  - Create     │   │  - Events     │  │
│  │    queries │   │    transfers  │   │  - History    │  │
│  │  - Balance │   │  - Idempotent │   │  - Snapshots  │  │
│  │    updates │   │  - State mgmt │   │               │  │
│  └────────────┘   └───────────────┘   └───────────────┘  │
│                                                          │
│  ┌──────────────────────────────────────────────────┐    │
│  │              Event Store (Append-only)           │    │
│  │     All wallet events: CREDITED, DEBITED         │    │
│  └──────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────┘

API Design

Transfer money:

POST /v1/wallet/transfers

Request:
{
  "from_user_id": "user_A",
  "to_user_id": "user_B",
  "amount": 5000,          // In cents
  "currency": "USD",
  "idempotency_key": "550e8400-e29b-41d4-a716-446655440000",
  "note": "Dinner split"
}

Response (202 Accepted):
{
  "transfer_id": "txn_xyz789",
  "status": "PENDING",
  "created_at": "2026-04-13T10:00:00Z"
}

Query balance:

GET /v1/wallet/{user_id}/balance

Response:
{
  "user_id": "user_A",
  "balances": [
    { "currency": "USD", "amount": 15000 },  // $150.00
    { "currency": "EUR", "amount": 8000 }    //  €80.00
  ]
}

Step 3: Deep Dive (25 min)

Approach 1: Distributed Transactions with 2-Phase Commit (2PC)

How it works:

2PC involves a coordinator that orchestrates commit or abort across multiple database nodes (participants).

┌─────────────────────────────────────────────────────────────────┐
│                    2-Phase Commit (2PC)                         │
│                                                                 │
│  Phase 1 — Prepare ("Can you commit?")                          │
│                                                                 │
│  Coordinator → DB Node 1: "Prepare debit A $50 — can you?"     │
│  Coordinator → DB Node 2: "Prepare credit B $50 — can you?"    │
│                                                                 │
│  DB Node 1: Acquires lock, writes to WAL, responds "YES"       │
│  DB Node 2: Acquires lock, writes to WAL, responds "YES"       │
│                                                                 │
│  Phase 2 — Commit (or Abort)                                   │
│                                                                 │
│  If ALL responded YES:                                          │
│    Coordinator → DB Node 1: "COMMIT"                           │
│    Coordinator → DB Node 2: "COMMIT"                           │
│    Both apply changes, release locks                           │
│                                                                 │
│  If ANY responded NO:                                           │
│    Coordinator → DB Node 1: "ABORT"                            │
│    Coordinator → DB Node 2: "ABORT"                            │
│    Both rollback, release locks                                 │
└─────────────────────────────────────────────────────────────────┘

Failure modes of 2PC:

Problem 1: Coordinator failure after Phase 1
  - Nodes are locked waiting for Phase 2 decision
  - Resources blocked indefinitely until coordinator recovers
  - This is the "blocking" problem of 2PC

Problem 2: Network partition
  - Node 1 committed, Node 2 never got the commit message
  - Inconsistent state: A debited, B not credited

Problem 3: Both nodes down
  - If a node crashes between Prepare and Commit,
    on recovery it checks its WAL and waits for coordinator
  - If coordinator is also down, the node is stuck

When 2PC is acceptable:

Both shards in the same datacenter (low latency, low partition risk)
Single relational DB with distributed transactions (CockroachDB, Google Spanner)
Scale is small enough that blocking is not a concern
Example: PostgreSQL with cross-shard transactions using a distributed DB

2PC summary:

Attribute	Value
Consistency	Strong
Availability	Low (blocking on coordinator failure)
Latency	Multiple round trips
Scalability	Poor (locks resources across nodes)
Best for	Small scale, same-datacenter deployments

Approach 2: Saga Pattern (Preferred for Microservices)

The idea: Break the distributed transaction into a sequence of local transactions. Each step publishes an event. If any step fails, execute compensating transactions to undo previous steps.

Two flavors of Saga:

Choreography-based Saga (Event-driven, no central coordinator)

┌──────────────────────────────────────────────────────────────┐
│          Choreography Saga: A pays B $50                     │
│                                                              │
│  1. Transfer Service creates PENDING transfer                 │
│     Publishes event: TransferInitiated{A→B, $50}             │
│               ↓                                              │
│  2. Wallet Service for A listens:                            │
│     Debit A $50, publishes: MoneyDebited{A, $50, transfer_id}│
│               ↓                                              │
│  3. Wallet Service for B listens:                            │
│     Credit B $50, publishes: MoneyCredited{B, $50, transfer_id}│
│               ↓                                              │
│  4. Transfer Service listens:                                │
│     Marks transfer SUCCESS                                   │
│                                                              │
│  On failure at step 3 (Credit B fails):                      │
│  Publishes: CreditFailed{B, $50, transfer_id}                │
│               ↓                                              │
│  Wallet Service for A listens:                               │
│  Compensating transaction: Credit A back $50                 │
│  (Reversal entry in ledger)                                  │
└──────────────────────────────────────────────────────────────┘

Pros: No central coordinator, each service is fully independent, highly available.
Cons: Hard to debug (events scattered across services), no easy global view of saga state, complex error paths.

Orchestration-based Saga (Central orchestrator manages steps)

┌──────────────────────────────────────────────────────────────┐
│          Orchestration Saga: A pays B $50                    │
│                                                              │
│          ┌─────────────────────────────────────┐            │
│          │      Transfer Saga Orchestrator      │            │
│          └───────────────┬─────────────────────┘            │
│                          │                                   │
│    Step 1: Debit A       │        Step 2: Credit B           │
│    ┌─────────┐ ←──────── │ ──────────→ ┌─────────┐          │
│    │Wallet A │           │             │Wallet B │          │
│    └─────────┘           │             └─────────┘          │
│         OK               │                  OK              │
│    ──────────→           │             ←──────────          │
│                          │                                   │
│    Step 3: Mark SUCCESS  │                                   │
│    ┌─────────────┐ ←──── │                                   │
│    │ Transfer DB │       │                                   │
│    └─────────────┘       │                                   │
│                                                              │
│    If Step 2 (Credit B) fails:                               │
│    Orchestrator calls COMPENSATE on Step 1:                  │
│      → Credit A back $50 (compensating transaction)         │
│      → Mark transfer FAILED                                  │
└──────────────────────────────────────────────────────────────┘

Pros: Central visibility of saga state, easier to debug, explicit compensation logic.
Cons: Orchestrator is a single point of logic (not of failure if persisted), can become a bottleneck.

Compensating transactions — key concept:

In Saga, you cannot "rollback" across service boundaries.
Instead, you write a compensating transaction:
  - Debit A succeeded → Compensating: Credit A back
  - Credit B failed → No compensation needed (it never applied)

Compensating transactions are:
  - NEW ledger entries (not rollbacks)
  - Idempotent (safe to retry)
  - Must always succeed (can't fail a compensation!)

Saga summary:

Attribute	Value
Consistency	Eventual (brief window of inconsistency during saga)
Availability	High (no distributed locks)
Latency	Low (local transactions only)
Scalability	Excellent
Best for	Microservices across different databases

Approach 3: Event Sourcing + CQRS (Recommended for Financial Wallets)

Core idea: Never store the current balance directly. Instead, store every event that affected the balance (append-only). Balance = replay of all events. Use CQRS to separate writes (events) from reads (materialized balance view).

┌──────────────────────────────────────────────────────────────┐
│              Event Sourcing for Wallet                       │
│                                                              │
│  Traditional approach (WRONG for financial):                 │
│  UPDATE wallets SET balance = balance - 50 WHERE user='A'    │
│  → Overwrites history, hard to audit                         │
│                                                              │
│  Event Sourcing approach (CORRECT):                          │
│  INSERT INTO events (user_id, type, amount, transfer_id)     │
│             VALUES ('A', 'DEBITED', -50, 'txn_xyz')          │
│             VALUES ('B', 'CREDITED', +50, 'txn_xyz')         │
│  → Never update, only append                                 │
│  → Balance = SUM of all events for that user                 │
└──────────────────────────────────────────────────────────────┘

Event store schema:

CREATE TABLE wallet_events (
    event_id      BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id       VARCHAR(255) NOT NULL,
    event_type    ENUM('CREDITED', 'DEBITED') NOT NULL,
    amount        BIGINT NOT NULL,          -- Always positive
    direction     ENUM('IN', 'OUT') NOT NULL,
    transfer_id   VARCHAR(255) NOT NULL,
    currency      CHAR(3) NOT NULL,
    created_at    TIMESTAMP DEFAULT NOW(),
    -- NO updated_at, NO deletes
    INDEX idx_user_id (user_id, created_at)
);

Computing balance from events:

-- Balance for user A in USD:
SELECT SUM(CASE WHEN direction = 'IN' THEN amount
                WHEN direction = 'OUT' THEN -amount END)
  FROM wallet_events
 WHERE user_id = 'A'
   AND currency = 'USD';

Problem: Computing balance by replaying ALL events is O(n) and gets slower as history grows.

Solution: Snapshots

┌──────────────────────────────────────────────────────────────┐
│                  Snapshot Pattern                            │
│                                                              │
│  Every N events (e.g., 1000), create a balance snapshot:    │
│                                                              │
│  snapshot_at_event_5000: { balance: 15000, currency: USD }  │
│                                                              │
│  Current balance = snapshot_balance + SUM(events after 5000)│
│                                                              │
│  Instead of replaying 10,000 events,                        │
│  replay only the last ~1000 since the snapshot.             │
│                                                              │
│  Snapshot table:                                             │
│  CREATE TABLE wallet_snapshots (                             │
│      user_id       VARCHAR(255),                            │
│      currency      CHAR(3),                                 │
│      balance       BIGINT,                                  │
│      last_event_id BIGINT,  -- Snapshot covers events ≤ this│
│      created_at    TIMESTAMP,                               │
│      PRIMARY KEY (user_id, currency, last_event_id)         │
│  );                                                         │
└──────────────────────────────────────────────────────────────┘

CQRS — Command Query Responsibility Segregation:

┌──────────────────────────────────────────────────────────────┐
│                   CQRS Architecture                          │
│                                                              │
│  COMMAND SIDE (Writes)                                       │
│  ┌──────────────────┐                                        │
│  │  Transfer API    │  ← POST /transfers                     │
│  └────────┬─────────┘                                        │
│           │ Validate + write events                          │
│           ↓                                                  │
│  ┌──────────────────┐                                        │
│  │   Event Store    │  (append-only, single source of truth) │
│  └────────┬─────────┘                                        │
│           │ Publish events                                   │
│           ↓                                                  │
│  ┌──────────────────┐                                        │
│  │ Event Processor  │  (async, builds read models)           │
│  └────────┬─────────┘                                        │
│           │                                                  │
│  QUERY SIDE (Reads)                                          │
│           ↓                                                  │
│  ┌──────────────────┐                                        │
│  │  Balance View DB │  ← Materialized current balances       │
│  │ (pre-computed)   │  ← Updated by event processor         │
│  └──────────────────┘                                        │
│           ↑                                                  │
│  ┌──────────────────┐                                        │
│  │  Balance API     │  ← GET /wallet/balance                 │
│  └──────────────────┘                                        │
└──────────────────────────────────────────────────────────────┘

Why CQRS for wallets:

Write path: append event to immutable store (fast, simple)
Read path: query pre-computed balance from materialized view (fast O(1) lookup)
Event processor rebuilds read model if it gets corrupted — replay all events
Read and write paths can scale independently

Event Sourcing + CQRS summary:

Attribute	Value
Consistency	Strong per-user (events ordered), eventual across users
Availability	High (writes go to append-only store)
Auditability	Perfect (full history always available)
Complexity	Highest
Scalability	Excellent (shard event store by user_id)
Best for	Financial wallets requiring audit trail

Deep Dive: Distributed Locking for Concurrent Transfers

Problem: Two concurrent transfers from User A’s wallet:

Transfer 1: A → B $50 (A has $100)
Transfer 2: A → C $80 (A has $100)

Both read balance: $100
Both check: $100 > $50 ✅ and $100 > $80 ✅
Both debit A:
  A's balance: $100 - $50 = $50 (Transfer 1)
  A's balance: $100 - $80 = $20 (Transfer 2)
  Actual A balance: $20 (double spent $130 from a $100 wallet!) ❌

Solution 1: Database-level locking (SELECT FOR UPDATE)

BEGIN;
SELECT balance FROM wallets WHERE user_id = 'A' FOR UPDATE;
-- Only this transaction holds the lock
-- Update, check, debit
UPDATE wallets SET balance = balance - 50 WHERE user_id = 'A';
COMMIT;
-- Lock released, next transaction proceeds

Solution 2: Optimistic locking (version number)

-- Read with version
SELECT balance, version FROM wallets WHERE user_id = 'A';
-- balance=100, version=5
 
-- Update only if version hasn't changed
UPDATE wallets SET balance = 50, version = 6
 WHERE user_id = 'A' AND version = 5;  -- Compare-and-swap
 
-- If 0 rows updated: conflict detected, retry

Solution 3: Redis distributed lock (for cross-service locking)

-- Acquire lock on user A's wallet
SET lock:wallet:A <random_value> NX EX 5
-- NX: only set if not exists (atomic)
-- EX 5: auto-release in 5 seconds (prevent deadlock)

-- If acquired: proceed with transfer
-- If not acquired: retry after backoff

-- Release lock (only if we own it)
if redis.get("lock:wallet:A") == random_value:
    redis.delete("lock:wallet:A")

Recommendation: Use database-level locking (SELECT FOR UPDATE) for single-node wallets. Use Redis distributed lock or optimistic locking for cross-service scenarios. With event sourcing, use optimistic concurrency (version check on event_id) to prevent concurrent writes.

Deep Dive: Consistency Guarantees

Read-your-writes (critical for UX):

Problem:
  User A transfers $50 → write goes to primary DB
  User A checks balance → read goes to replica (lag!)
  User A sees OLD balance → confusing UX

Solution: Route each user's reads to the primary after writes
  - After a write, client stores "last_write_timestamp"
  - Read requests include "min_freshness": last_write_timestamp
  - Read routed to primary if replica lag > threshold
  - Or: always route same user's requests to same node (sticky reads)

Monotonic reads:

Problem:
  User B receives $50 transfer
  Request 1 → replica 1: balance = $150 (has the credit)
  Request 2 → replica 2: balance = $100 (hasn't caught up yet)
  Balance goes backward!

Solution: Route reads to same replica per session
  Or: use primary for balance queries

Complete Transfer Flow (Event Sourcing approach)

┌──────────────────────────────────────────────────────────────────┐
│         Transfer $50 from User A to User B                       │
│                                                                  │
│  1. POST /v1/wallet/transfers                                     │
│     {from: A, to: B, amount: 50, idempotency_key: uuid}          │
│               ↓                                                  │
│  2. Transfer Service: Check idempotency key (already done?)       │
│     → New → Continue                                             │
│               ↓                                                  │
│  3. Transfer Service: Check A's balance ≥ $50                    │
│     (Query materialized balance view — fast O(1))                │
│               ↓                                                  │
│  4. Transfer Service: Write to Event Store (ATOMIC):             │
│     a. INSERT event: A DEBITED $50, transfer_id=xyz              │
│     b. INSERT event: B CREDITED $50, transfer_id=xyz             │
│     c. INSERT transfer record: status=SUCCESS                    │
│     (All in ONE database transaction on the event store)         │
│               ↓                                                  │
│  5. Transfer Service: Store idempotency_key with result          │
│               ↓                                                  │
│  6. Return 200: { transfer_id: xyz, status: SUCCESS }            │
│               ↓                                                  │
│  7. (Async) Event Processor reads new events:                    │
│     → Updates materialized balance view:                         │
│        A's balance: 10000 - 5000 = 5000                          │
│        B's balance: 8000 + 5000 = 13000                          │
│               ↓                                                  │
│  8. (Async) Notification Service: Notify A and B via push/email  │
└──────────────────────────────────────────────────────────────────┘

Key insight: Steps 4a, 4b, 4c happen in ONE atomic DB transaction on the event store. Because both events are on the same DB (even if users are on different shards, the event store can co-locate them by transfer_id), this is a LOCAL transaction — no distributed transaction needed.

Design Summary

Final Architecture

┌───────────────────────────────────────────────────────────────┐
│                   Digital Wallet Platform                     │
│                                                               │
│  Client ──→ API Gateway ──→ Transfer Service                  │
│                                  │                            │
│                    ┌─────────────┼──────────────┐             │
│                    ↓             ↓              ↓             │
│              ┌──────────┐  ┌──────────┐  ┌──────────────┐    │
│              │Idempotency│ │ Balance  │  │  Event Store │    │
│              │  Cache   │  │ Check DB │  │ (append-only)│    │
│              │ (Redis)  │  │(read mdl)│  │              │    │
│              └──────────┘  └──────────┘  └──────┬───────┘    │
│                                                  │            │
│                                    ┌─────────────▼──────┐     │
│                                    │  Event Processor   │     │
│                                    │  (async consumer)  │     │
│                                    └─────────────┬──────┘     │
│                                                  │            │
│                             ┌────────────────────▼──────┐     │
│                             │  Materialized Balance View │     │
│                             │  (Pre-computed per user)   │     │
│                             └───────────────────────────┘     │
└───────────────────────────────────────────────────────────────┘

Key Decisions Summary

Decision	Choice	Reasoning
Transaction approach	Event sourcing + CQRS	Auditability, replay, no distributed lock needed
Avoiding 2PC	Local event store transaction	Both A’s debit and B’s credit in one DB transaction
Concurrency	Optimistic locking on event_id	No blocking locks, retry on conflict
Read performance	Materialized balance view	O(1) balance query, not O(n) event replay
Performance at scale	Snapshots every 1,000 events	Bounds replay cost without losing history
Distributed locking	Redis SET NX (only if needed)	Fast, auto-expiring, prevents double debit
Idempotency	UUID per transfer, stored in cache	Prevents double transfer on retry
Consistency	Read-your-writes via primary routing	User always sees own latest balance

Interview Questions & Answers

Q: Why is 2PC not recommended for a digital wallet at scale?
A: 2PC is a blocking protocol. After Phase 1 (Prepare), all database nodes hold locks and wait for Phase 2 (Commit/Abort). If the coordinator crashes after Phase 1, all nodes are blocked indefinitely until the coordinator recovers. In a high-scale system, this blocks thousands of concurrent transfers. Also, 2PC has multiple network round-trips per transaction, adding latency. For a wallet at scale, prefer event sourcing where both debit and credit events are written in a single local DB transaction, completely avoiding cross-shard coordination.

Q: Explain Saga pattern and compensating transactions.
A: A Saga breaks a distributed transaction into a sequence of local transactions, each publishing an event on success. If any step fails, the saga executes compensating transactions on all previously completed steps. For a wallet transfer: Step 1 debits A (local transaction), Step 2 credits B (local transaction). If Step 2 fails, a compensating transaction credits A back. Compensating transactions are new ledger entries (not rollbacks) and must always succeed. The Saga pattern avoids distributed locks but introduces brief eventual consistency.

Q: What is event sourcing and why is it ideal for financial systems?
A: Event sourcing stores every state change as an immutable event in an append-only log, rather than storing current state. Balance = sum/replay of all events. Ideal for financial systems because: (1) complete audit trail by design — every cent is always traceable, (2) time travel — reconstruct state at any past point by replaying events up to that time, (3) no destructive updates — regulatory compliance requirement, (4) event replay allows rebuilding any derived view (balance, history, analytics) if it gets corrupted. The trade-off is increased complexity and the need for snapshots to keep query performance acceptable.

Q: What is CQRS and how does it pair with event sourcing?
A: CQRS (Command Query Responsibility Segregation) separates the write model (commands that append events) from the read model (queries that read pre-computed views). With event sourcing: the command side appends events to the event store; an async event processor builds and updates materialized views (e.g., current balance table) from those events; the query side reads from the fast materialized view without touching the event store. Benefits: write and read paths scale independently, read queries are O(1) not O(n), event store is never burdened with ad-hoc read queries.

Q: How do you ensure strong consistency with event sourcing in a digital wallet?
A: Strong consistency per-user is achieved by using optimistic concurrency: each event has a sequence number, and a new event is only accepted if its expected_sequence = last_event_sequence + 1 (compare-and-swap). This serializes events per user. Cross-user consistency (e.g., A’s debit and B’s credit are atomic) is achieved by writing both events in a single local DB transaction keyed by transfer_id — not a distributed transaction. Read-your-writes is ensured by routing the sender’s balance query to the primary DB or using a consistency token.

Key Takeaways

Exactly-once transfer requires idempotency keys (UUID per transfer, reused on retry) — same principle as payments.
2PC is blocking — if the coordinator fails after Prepare, all nodes are locked. Avoid at scale.
Saga pattern avoids distributed locks by using local transactions and compensating transactions for rollback, but introduces eventual consistency.
Event sourcing is the gold standard for financial wallets — append-only log, full audit trail, time travel, no destructive updates.
CQRS pairs with event sourcing: command side writes events (slow-to-write, durable), query side reads pre-computed materialized views (fast O(1) reads).
Snapshots make event sourcing practical — without them, replaying 10 years of events per balance query is infeasible.
Distributed lock (Redis SET NX) is the escape hatch for concurrent transfers when optimistic locking would cause too many retries.

ch11-payment-system - Payment system (pay-in/pay-out via PSP)
distributed-system-components - Distributed databases, event stores, Redis
key-patterns > Event Sourcing - Event sourcing pattern deep dive
key-patterns > CQRS - Command Query Responsibility Segregation
ch05-consistent-hashing - Sharding wallets across DB nodes

Practice this design! Know all three transaction approaches and their trade-offs. Be ready to:

Draw the event sourcing + CQRS architecture
Explain why 2PC blocks and when to avoid it
Walk through a Saga with a compensating transaction
Explain snapshot pattern for event sourcing performance
Compare this design to the payment system design and articulate the key differences

Last Updated: 2026-04-13
Status: Very common in fintech interviews — Must know!

Study Notes by Niladri & AI

Explorer

ch12-digital-wallet

Chapter 12: Design a Digital Wallet

Overview

Problem Statement

Step 1: Requirements & Scope (5 min)

Functional Requirements

Non-Functional Requirements

Scale Estimates

Step 2: High-Level Design (10 min)

Core Entities

The Distributed Transaction Challenge

Three Approaches to Distributed Transactions

Basic Architecture

API Design

Step 3: Deep Dive (25 min)

Approach 1: Distributed Transactions with 2-Phase Commit (2PC)

Approach 2: Saga Pattern (Preferred for Microservices)

Choreography-based Saga (Event-driven, no central coordinator)

Orchestration-based Saga (Central orchestrator manages steps)

Approach 3: Event Sourcing + CQRS (Recommended for Financial Wallets)

Deep Dive: Distributed Locking for Concurrent Transfers

Deep Dive: Consistency Guarantees

Complete Transfer Flow (Event Sourcing approach)

Design Summary

Final Architecture

Key Decisions Summary

Interview Questions & Answers

Key Takeaways

Graph View

Table of Contents

Backlinks

Study Notes by Niladri & AI

Explorer

ch12-digital-wallet

Chapter 12: Design a Digital Wallet

Overview

Problem Statement

Step 1: Requirements & Scope (5 min)

Functional Requirements

Non-Functional Requirements

Scale Estimates

Step 2: High-Level Design (10 min)

Core Entities

The Distributed Transaction Challenge

Three Approaches to Distributed Transactions

Basic Architecture

API Design

Step 3: Deep Dive (25 min)

Approach 1: Distributed Transactions with 2-Phase Commit (2PC)

Approach 2: Saga Pattern (Preferred for Microservices)

Choreography-based Saga (Event-driven, no central coordinator)

Orchestration-based Saga (Central orchestrator manages steps)

Approach 3: Event Sourcing + CQRS (Recommended for Financial Wallets)

Deep Dive: Distributed Locking for Concurrent Transfers

Deep Dive: Consistency Guarantees

Complete Transfer Flow (Event Sourcing approach)

Design Summary

Final Architecture

Key Decisions Summary

Interview Questions & Answers

Key Takeaways

Related Resources

Graph View

Table of Contents

Backlinks