Chapter 13: Design a Stock Exchange

volume2 stock-exchange trading matching-engine low-latency

Status: 🟩 Interview ready - The most complex chapter!
Difficulty: Extreme
Time to complete: 60 min read + practice


Overview

A stock exchange is an electronic marketplace where buyers and sellers trade financial securities (stocks, options, futures). Major exchanges include:

  • NYSE (New York Stock Exchange) β€” largest by market cap
  • NASDAQ β€” tech-heavy, fully electronic
  • CBOE (Chicago Board Options Exchange) β€” options-focused

Core job: Match buy and sell orders, execute trades, and report prices in real-time.

Why this matters:

  • Most technically demanding system design question β€” tests everything
  • Extreme non-functional requirements (sub-millisecond latency, strict ordering)
  • Covers event sourcing, low-latency architectures, and financial systems
  • Demonstrates mastery of distributed systems under ultra-strict guarantees

Problem Statement

Design a stock exchange system that:

  • Matches buy and sell orders for stocks with microsecond-to-millisecond latency
  • Guarantees strict price-time priority for fair trade execution
  • Provides real-time market data (price feeds) to all participants
  • Maintains a complete, immutable audit trail for regulatory compliance
  • Handles circuit breakers and risk management to ensure market stability

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

  • What order types should we support? β†’ Market, Limit, Stop orders
  • What is the trading window? β†’ 9:30 AM – 4:00 PM EST (market hours)
  • Do we need pre-market / after-hours? β†’ Out of scope for now
  • What is the matching priority? β†’ Price-time priority (FIFO at same price level)
  • Do we need audit trail? β†’ Yes, regulatory requirement (SEC, FINRA)
  • What instruments? β†’ Equities (stocks) only for now

Scope:

  • Accept and validate incoming orders (market, limit, stop)
  • Match orders using price-time priority
  • Publish trade executions and market data in real-time
  • Maintain complete audit trail for all orders and trades
  • Risk management: circuit breakers, position limits, pre-trade risk checks

Non-Functional Requirements

  • Ultra-low latency: Order matching P99 < 1 millisecond (microseconds in practice)
  • High throughput: 1 billion orders/day (~100,000 orders/second peak)
  • Strict ordering: Every order has a unique, monotonically increasing sequence number
  • Fault tolerance: No lost orders; recovery must replay exactly same sequence
  • Fairness: Same price β†’ older order executes first (no favoritism)
  • Availability: 99.99% during market hours
  • Regulatory compliance: Full audit trail, immutable event log

Step 2: High-Level Design (10 min)

Order Lifecycle

[Trading Client]
      |
      | (FIX Protocol / WebSocket)
      v
[Client Gateway]          ← Validates format, authenticates client
      |
      v
[Order Manager]           ← Validates risk (pre-trade), routes to engine
      |
      v
[Sequencer]               ← Assigns monotonically increasing sequence number
      |
      v
[Matching Engine]         ← Core: Matches buy/sell orders in order book
      |              |
      v              v
[Trade Reporter]  [Order Book Updates]
      |                   |
      v                   v
[Market Data          [Client Gateway]  ← Send fills/rejects back to client
 Publisher]
      |
      v
[Market Data Feed]        ← Broadcasts prices/trades to all subscribers

Core Components

ComponentRoleKey Concern
Client GatewayOrder ingestion, FIX protocol parsing, client authProtocol handling
Order ManagerOrder validation, pre-trade risk checks, routingRisk management
SequencerAssign strict monotonic sequence numbers to all messagesStrict ordering
Matching EngineMaintains order book, matches buy/sell ordersCorrectness + latency
Market Data PublisherBroadcasts price/volume data to market participantsFan-out throughput
Trade ReporterSends execution reports back to clientsReliable delivery
Risk EnginePosition limits, circuit breakers, margin calculationsMarket safety

Protocol: FIX (Financial Information eXchange)

FIX Message Example (New Order):
8=FIX.4.4          (Protocol version)
35=D               (Message type: New Order Single)
49=CLIENT1         (Sender ID)
56=EXCHANGE        (Target ID)
11=ORDER-001       (Client Order ID)
55=AAPL            (Symbol)
54=1               (Side: 1=Buy, 2=Sell)
38=100             (Quantity)
40=2               (Order type: 2=Limit)
44=150.00          (Limit price)

FIX is the universal protocol for institutional trading β€” standardized, battle-tested, used by every major exchange.


Step 3: Deep Dive (30 min)

Order Book

The order book is the central data structure of the matching engine. It holds all outstanding (unmatched) orders for a given symbol.

Structure:

Symbol: AAPL
                  BIDS (Buy side)              ASKS (Sell side)
Price   Qty   Orders    |    Price   Qty   Orders
------  ----  ------    |    ------  ----  ------
$150.05  300  [O1, O2]  |   $150.10  200  [O5]
$150.00  500  [O3, O4]  |   $150.15  400  [O6, O7]
$149.95  200  [O8]      |   $150.20  100  [O9]

Best Bid: $150.05     Best Ask: $150.10
Spread = $150.10 - $150.05 = $0.05

Sorting rules:

  • Bid side (buy orders): Sorted by price descending (highest price first β€” most eager buyer)
  • Ask side (sell orders): Sorted by price ascending (lowest price first β€” most eager seller)
  • At same price level: FIFO β€” earlier order (lower sequence number) executes first

Data structure choice:

Order Book Implementation:
- bids: TreeMap<Long, LinkedList<Order>>   (price β†’ queue of orders, descending)
- asks: TreeMap<Long, LinkedList<Order>>   (price β†’ queue of orders, ascending)
- orderIndex: HashMap<String, Order>       (orderId β†’ Order, for O(1) cancel/modify)

Key operations:
- Add order:    O(log N) to find price level, O(1) to append to queue
- Cancel order: O(1) using orderIndex, O(1) to remove from doubly-linked list
- Match:        O(1) to get best bid/ask from TreeMap head, O(1) to dequeue

Why doubly-linked list at each price level?

  • O(1) insertion at tail (new order)
  • O(1) removal from any position (cancel order β€” using pointer from orderIndex)
  • O(1) removal from head (trade execution β€” FIFO)

Matching Algorithm: Price-Time Priority

Algorithm (runs on every new order):

function matchOrder(incomingOrder):
    if incomingOrder.side == BUY:
        while incomingOrder.qty > 0 and asks is not empty:
            bestAsk = asks.firstEntry()           // Lowest ask price
            if incomingOrder.type == MARKET or
               incomingOrder.limitPrice >= bestAsk.price:
                fill = min(incomingOrder.qty, bestAsk.orders.peek().qty)
                executeTrade(incomingOrder, bestAsk.orders.peek(), fill)
                incomingOrder.qty -= fill
                bestAsk.orders.peek().qty -= fill
                if bestAsk.orders.peek().qty == 0:
                    bestAsk.orders.poll()        // Remove fully filled order
                if bestAsk.orders.isEmpty():
                    asks.remove(bestAsk.price)   // Remove empty price level
            else:
                break                            // No match possible
        if incomingOrder.qty > 0 and incomingOrder.type == LIMIT:
            addToOrderBook(incomingOrder)        // Rest on book
    // Mirror logic for SELL side

function executeTrade(buyer, seller, qty):
    price = seller.price                         // Resting order's price wins
    trade = Trade(buyer.id, seller.id, price, qty, timestamp)
    publishTrade(trade)
    sendExecutionReport(buyer, FILL)
    sendExecutionReport(seller, FILL)

Order Types

TypeDescriptionExecutionRisk
MarketExecute immediately at best available priceImmediate, guaranteed fillSlippage β€” may get terrible price in thin market
LimitExecute only at specified price or betterMay rest on book if no matchNo slippage, but may not fill
StopTrigger market/limit order when price hits trigger levelConverts to market/limit on triggerGap risk on open

Limit order resting on book (most common):

Limit Buy 100 AAPL @ $150.00
β†’ Check if any asks <= $150.00
β†’ No match found
β†’ Order rests on bid side at $150.00
β†’ Will execute when seller hits $150.00 or below

Sequencer: The Ordering Heart of the Exchange

The sequencer is arguably the most critical architectural component. It solves the problem: β€œHow do you guarantee strict ordering when multiple gateways receive orders concurrently?”

Problem without sequencer:

Gateway 1 receives Order A at t=10:00:00.000001
Gateway 2 receives Order B at t=10:00:00.000002
Both sent to matching engine simultaneously
β†’ Race condition: which order is "first"?
β†’ Non-deterministic matching = unfair market

Sequencer solution:

All incoming messages β†’ Sequencer (single point)
                           |
                    Assigns sequence number
                    (monotonically increasing: 1, 2, 3, ...)
                    Writes to persistent event log
                           |
                    Broadcasts to matching engine
                    (matching engine processes in strict seq order)

Properties:

  • Single sequencer = single source of truth for ordering
  • Sequence number persisted to durable log before processing
  • Matching engine consumes messages in strict sequence order
  • On recovery: replay from last checkpoint by re-reading event log

Implementation: Ring buffer (like LMAX Disruptor) in front of matching engine β€” lock-free, cache-friendly, microsecond-latency message passing.

Sequencer Event Log Entry:
{
  seqNum: 1000042,
  timestamp: 1713000000000001,  // Nanoseconds
  type: "NEW_ORDER",
  orderId: "CLIENT1-00001",
  symbol: "AAPL",
  side: "BUY",
  type: "LIMIT",
  price: 15000,                 // Integer cents to avoid floating point
  qty: 100
}

Low-Latency Architecture Techniques

The matching engine must operate at microsecond latency. Standard software approaches are too slow.

1. LMAX Disruptor (Ring Buffer)

Traditional queue (with locks):
  Producer β†’ [Lock] β†’ Queue β†’ [Lock] β†’ Consumer
  Latency: ~1ms (lock contention, GC pauses)

LMAX Disruptor (lock-free ring buffer):
  Producer β†’ [Ring Buffer] β†’ Consumer
  Latency: ~100 nanoseconds

Ring Buffer:
  - Fixed-size array (power of 2, e.g., 1 million slots)
  - Producer writes to next slot using atomic increment of sequence number
  - Consumer reads from current sequence position
  - No locks, no GC pressure (pre-allocated), cache-friendly (sequential access)

Why it works: CPU caches love sequential memory access. Ring buffer = no pointer chasing, no heap allocation, no GC.

2. Memory-Mapped Files (mmap)

Traditional file I/O:
  Write to file β†’ Kernel buffer β†’ Disk
  Read from file β†’ Kernel buffer β†’ User space copy
  Overhead: 2 memory copies, syscall

Memory-mapped file:
  File mapped directly into process address space
  Write to memory = write to file (OS handles persistence)
  No syscall overhead, no data copy
  Perfect for event log (sequencer persistence)

3. Kernel Bypass Networking (DPDK / RDMA)

Normal networking:
  NIC β†’ Kernel (interrupt) β†’ Socket buffer β†’ User space copy β†’ Application
  Latency: ~50 microseconds

Kernel bypass (DPDK):
  NIC β†’ User space directly (polling mode driver)
  Application polls NIC directly, no kernel involvement
  Latency: ~1-5 microseconds

Used by HFT firms and some exchanges for the β€œlast mile” of latency.

4. Co-location

Trading firms pay exchange data centers to place their servers physically inside the exchange data center β€” sometimes in the same rack as matching engines. Speed-of-light matters at sub-microsecond scale.

5. CPU and Memory Optimizations

- CPU pinning: Bind matching engine thread to dedicated CPU core
  (no context switches, no OS scheduler interference)
- NUMA awareness: Matching engine memory on same NUMA node as CPU
  (cross-NUMA access adds ~100ns)
- Huge pages: Reduce TLB misses (2MB pages vs 4KB pages)
- Disable hyper-threading: Reduces cache contention on matching engine core
- Avoid GC: Use C++/Java with GC paused, or off-heap memory allocation

Latency Budget Breakdown

Component               Target Latency
─────────────────────────────────────────
NIC β†’ Application       1-5 ΞΌs   (with DPDK)
Order validation        ~1 ΞΌs
Sequencer write         ~5 ΞΌs    (mmap to SSD)
Ring buffer enqueue     ~100 ns
Matching engine         ~1-10 ΞΌs
Execution report        ~5 ΞΌs

Total P99 latency:      < 100 ΞΌs (top exchanges achieve 10-50 ΞΌs)

Event Sourcing for Audit Trail

Why event sourcing?

Regulatory requirement: every order, modification, cancellation, and trade must be preserved permanently and exactly.

Event Sourcing Model:
- Every state change = an immutable event appended to the log
- Current state = replay of all past events
- Events are NEVER deleted or modified (append-only)

Events:
1. ORDER_RECEIVED  { orderId, symbol, side, type, price, qty, timestamp }
2. ORDER_VALIDATED { orderId, timestamp }
3. ORDER_SEQUENCED { orderId, seqNum, timestamp }
4. ORDER_MATCHED   { orderId, tradeId, fillPrice, fillQty, timestamp }
5. ORDER_CANCELED  { orderId, timestamp, reason }
6. TRADE_EXECUTED  { tradeId, buyOrderId, sellOrderId, price, qty, timestamp }

Benefits:

  • Audit trail: Complete history of every order β€” required by SEC/FINRA
  • Replay: Can reconstruct exact order book state at any past point in time
  • Debugging: Reproduce any bug by replaying events to that moment
  • Recovery: Restart matching engine by replaying events from last checkpoint

Implementation:

Event store options:
- Custom append-only log (exchange-grade, used in practice)
- Apache Kafka (distributed, durable, replayable)
- PostgreSQL with append-only inserts (for smaller scale)

Checkpoint strategy:
- Take order book snapshot every N events (e.g., every 100,000 events)
- On recovery: load latest snapshot + replay events since checkpoint
- Recovery time = time to load snapshot + time to replay delta

Risk Management

Pre-Trade Risk Checks (Before Matching)

Order received β†’ Risk Engine β†’ Approve/Reject β†’ Sequencer β†’ Matching Engine

Checks:
1. Position limit: Does this trade exceed max position for this client?
   e.g., "Client cannot hold more than 10,000 shares of AAPL"
2. Credit limit: Does client have enough buying power?
   e.g., "Client has $500K credit; this order costs $600K" β†’ REJECT
3. Order size limit: Is this order suspiciously large?
   e.g., "Single order > 10% of ADV (average daily volume)" β†’ flag
4. Symbol validation: Is AAPL a valid, tradable symbol right now?
5. Price validation: Is $1,500,000 for AAPL a typo or fat finger?
   Fat finger check: price within X% of current market price

Circuit Breaker (Market-Wide Stability)

Level 1: S&P 500 drops 7%   β†’ Halt trading for 15 minutes
Level 2: S&P 500 drops 13%  β†’ Halt trading for 15 minutes
Level 3: S&P 500 drops 20%  β†’ Halt trading for remainder of day

Single stock circuit breaker:
- Price moves > 5% in 5 minutes β†’ Limit Up-Limit Down (LULD) pause
- LULD: No trades outside price band (ref price Β± 5%)

Implementation:

Circuit breaker monitors:
- Price movement: Last trade price vs reference price (5-minute average)
- Volume: Unusual volume spike (> 10x normal)
- Order book imbalance: All orders one-sided

On trigger:
- Stop accepting new orders (but allow cancels)
- Broadcast halt message to all participants
- After cool-down period, reopen with call auction

Post-Trade Risk

After each trade execution:
- Update client's real-time position and P&L
- Update margin requirements
- Alert risk team if position nears limits
- Report large trades (> threshold) to regulators in real-time

Market Data Feed

Two types of data published:

1. Full Order Book (L2 data):
   - All price levels with total quantity at each level
   - Incremental updates (add order, remove order, execute trade)
   - Subscribers: Professional traders, algorithmic trading systems

2. Top of Book (L1 data):
   - Best bid price, best bid qty
   - Best ask price, best ask qty
   - Last trade price and qty
   - Subscribers: Retail trading apps, price display boards

Publication mechanism:
- Multicast UDP: Single packet β†’ all subscribers simultaneously
  (Lower latency than TCP unicast to each subscriber)
- Sequence numbers on market data messages (detect gaps)
- Gap recovery: Request retransmission or use snapshot feed

Market data publishing pipeline:

Matching Engine
     |
     | (trade executed / order book changed)
     v
Market Data Handler
     |
     +β†’ Snapshot feed (periodic full book snapshot)
     +β†’ Incremental feed (real-time updates via multicast UDP)
     |
     v
[Broker A]  [Broker B]  [HFT Firm]  [Data Vendor (Bloomberg)]

Recovery Strategy

Failure scenario: Matching engine crashes mid-session

Recovery steps:
1. Load latest order book snapshot (taken every N events)
2. Read event log from snapshot sequence number forward
3. Replay all events in strict sequence order
4. Order book is reconstructed to exact pre-crash state
5. Resume processing new orders from where we left off

Critical guarantee:
- Same events replayed in same order β†’ identical order book state
- This is why strict sequencing is so important
- Non-determinism would make recovery impossible

Recovery time target: < 30 seconds (exchange rule: halt < 30 sec)

Handling Market Open (Call Auction)

At 9:30 AM, there are accumulated overnight orders. The exchange doesn’t just start matching β€” it runs a call auction:

Pre-open (9:00 - 9:30 AM):
- Accept limit orders but don't match yet
- Show indicative opening price (what price maximizes executed shares)

Opening auction (9:30 AM):
- Calculate equilibrium price: price that maximizes matched quantity
- Execute all matching orders simultaneously at that single price
- Publish opening trade

Example:
Buy orders:  100 @ $150, 200 @ $149, 300 @ $148
Sell orders: 150 @ $148, 200 @ $149, 100 @ $150
Equilibrium: $149 (matches 400 shares on each side)
β†’ All orders at $149 or better matched at $149

Design Summary

Complete Exchange Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         EXCHANGE PERIMETER                          β”‚
β”‚                                                                     β”‚
β”‚  [FIX Client]──→[Client Gateway]──→[Order Manager]                 β”‚
β”‚  [Web Client]──→[Client Gateway]    (risk checks)                  β”‚
β”‚                        ↑                  β”‚                         β”‚
β”‚              (execution reports)          β”‚                         β”‚
β”‚                                           β–Ό                         β”‚
β”‚                              β”Œβ”€β”€β”€β”€[Sequencer]────┐                  β”‚
β”‚                              β”‚  (event log/mmap) β”‚                  β”‚
β”‚                              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                  β”‚
β”‚                                        β”‚                            β”‚
β”‚                             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚                             β”‚   Matching Engine   β”‚                 β”‚
β”‚                             β”‚  (LMAX Disruptor    β”‚                 β”‚
β”‚                             β”‚   ring buffer,      β”‚                 β”‚
β”‚                             β”‚   order book        β”‚                 β”‚
β”‚                             β”‚   TreeMap+LinkedList)β”‚                β”‚
β”‚                             β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                                      β”‚                              β”‚
β”‚              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”‚
β”‚              β”‚                       β”‚                   β”‚          β”‚
β”‚              β–Ό                       β–Ό                   β–Ό          β”‚
β”‚      [Trade Reporter]    [Market Data Publisher]  [Risk Engine]     β”‚
β”‚     (execution fills)   (multicast UDP feed)    (circuit breaker)  β”‚
β”‚                                      β”‚                              β”‚
β”‚                             [Bloomberg/Reuters]                     β”‚
β”‚                             [Broker Data Feeds]                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                       β”‚
                              [Audit / Compliance]
                              (immutable event log,
                               trade surveillance)

Key Decisions Summary

DecisionChoiceReasoning
Ordering guaranteeSingle sequencer + monotonic seq numbersEliminates race conditions in matching
Order book data structureTreeMap + doubly-linked list per price levelO(1) match/cancel, O(log N) add
Matching algorithmPrice-time priority (FIFO)Industry standard, fair
Inter-component messagingLMAX Disruptor ring bufferLock-free, < 1ΞΌs latency
PersistenceMemory-mapped files (mmap)Near-zero overhead persistence
NetworkingDPDK kernel bypass1-5ΞΌs NIC-to-app latency
Audit trailEvent sourcing (append-only log)Regulatory compliance + replay
RecoverySnapshot + event replayDeterministic state reconstruction
RiskPre-trade + circuit breakersMarket stability
Market dataMulticast UDP + sequence numbersLow-latency fan-out

Interview Questions & Answers

Q: Why do you need a sequencer? Why not just let the matching engine handle ordering?
A: Without a sequencer, orders arriving at multiple gateways simultaneously create a race condition β€” no single β€œtruth” about which order came first. The sequencer is a single point that assigns monotonically increasing sequence numbers before any order reaches the matching engine. This means all matching engine processing is deterministic and strictly ordered. It also enables recovery: replay the event log in sequence order and you get exactly the same order book state.

Q: Walk me through the order book data structure and why you chose it.
A: The order book uses a TreeMap<price, LinkedList<Order>> separately for bids and asks. TreeMap gives O(log N) insertion but O(1) access to best bid/ask (head of the map). Each price level is a doubly-linked list to support O(1) FIFO: new orders append to tail, executions remove from head, cancels use a pointer from a HashMap to remove from any position in O(1). This combination gives us O(1) matching, O(1) cancel, and O(log N) add β€” which is the industry-standard approach.

Q: What is the LMAX Disruptor and why is it used in trading systems?
A: LMAX Disruptor is a lock-free inter-thread communication library using a ring buffer. Traditional queues use locks, which cause thread contention and unpredictable latency spikes. The Disruptor uses a pre-allocated fixed-size ring buffer where producers and consumers communicate via atomic sequence number updates β€” no locks, no heap allocations, no GC. Sequential memory access maximizes CPU cache utilization. This gives < 1 microsecond latency compared to > 1 millisecond for locked queues. LMAX Exchange processes 6 million transactions/second with 1-microsecond latency using this approach.

Q: How do you handle a matching engine crash? How do you guarantee no orders are lost?
A: Event sourcing plus checkpointing. Every message is written to the sequencer’s durable event log (memory-mapped file) with a sequence number before processing. Periodically, a snapshot of the full order book state is taken. On crash: load the latest snapshot, then replay all events from the snapshot’s sequence number forward in strict order. Since matching is deterministic (same inputs β†’ same outputs), the order book is reconstructed identically. Recovery target is under 30 seconds.

Q: What is a circuit breaker in a stock exchange context? How does it work?
A: A circuit breaker halts trading when price movement exceeds a threshold, preventing panic-driven crashes. Market-wide circuit breakers (NYSE rules): S&P 500 drops 7% β†’ 15-minute halt; 13% β†’ another 15-minute halt; 20% β†’ halt for the day. Single-stock circuit breakers (LULD): if a stock price moves more than 5% in 5 minutes, a short pause allows the market to stabilize. Implementation: the risk engine monitors the rolling price change; on trigger, it sends a halt message to the sequencer, which stops accepting new orders (but allows cancels). After the cooling-off period, trading resumes with a brief auction.

Q: Why do you use integers for prices instead of floating point?
A: Floating-point arithmetic is non-deterministic across hardware and introduces rounding errors β€” catastrophic in financial systems. A price of 150.04999999… or 150.05 = 15005 cents. Integer comparison is exact, consistent, and faster. All exchanges use integer price representation internally.


Key Takeaways

  1. Sequencer is the architectural centerpiece β€” single source of ordering truth, enables deterministic matching and crash recovery
  2. Order book = TreeMap + doubly-linked list β€” gives O(1) match/cancel, O(log N) add; price levels are linked lists for FIFO
  3. Price-time priority β€” best price wins; ties broken by arrival time (sequence number); this is the law of fair markets
  4. LMAX Disruptor β€” lock-free ring buffer; removes all locks and GC pressure from the hot path; enables sub-microsecond inter-thread messaging
  5. Event sourcing is not optional β€” regulatory requirement AND the mechanism for crash recovery; immutable append-only event log
  6. Two types of risk checks β€” pre-trade (stop bad orders before they match) and circuit breakers (halt trading to prevent market crashes)
  7. Use integers for prices β€” floating-point is non-deterministic; store price as integer cents/basis points
  8. Low-latency is a holistic effort β€” kernel bypass (DPDK), memory-mapped files, CPU pinning, NUMA awareness, co-location; no single trick is enough


Practice this design! The stock exchange is the hardest system design question. Be ready to:

  1. Draw the full architecture with all components and data flows
  2. Explain the order book data structure from scratch (TreeMap + linked list)
  3. Justify why the sequencer is needed (race condition elimination)
  4. Walk through price-time priority matching with a worked example
  5. Explain LMAX Disruptor vs traditional queues

Last Updated: 2026-04-13
Status: The most complex chapter β€” know it cold for FAANG and fintech interviews!