Chapter 13 Flashcards - Stock Exchange
flashcards volume2 stock-exchange trading matching-engine low-latency
What is a stock exchange’s core job and why is it technically hard?
?
Core job: Match buy and sell orders for securities, execute trades, and publish prices in real-time. Technically hard because: sub-millisecond P99 latency (< 1ms, often microseconds), strict ordering guarantee (price-time priority, no race conditions), 1 billion orders/day throughput, immutable audit trail for regulators, and crash recovery must produce byte-identical state. Combines every hard distributed systems problem simultaneously.
What are the three main order types in a stock exchange?
?
Market order: Execute immediately at best available price, guaranteed fill, risk of slippage in thin markets. Limit order: Execute only at specified price or better, may rest on book if no immediate match, no slippage risk but no guaranteed fill. Stop order: Dormant until price hits trigger level, then converts to market or limit order. Most exchange volume is limit orders resting in the book.
What is the order book? Describe its structure.
?
The order book is the central data structure of the matching engine. It holds all outstanding (unmatched) orders for a symbol, separated into: Bid side (buy orders) — sorted by price descending (highest first, most eager buyer at top), Ask side (sell orders) — sorted by price ascending (lowest first, most eager seller at top). At each price level, orders are queued FIFO. Spread = best ask price − best bid price. A tight spread means a liquid market.
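A minimal sketch of this layout, assuming prices are integer cents and each side is a plain dict from price to a FIFO deque. The class and method names are illustrative only. A real engine would keep price levels in a sorted structure instead of scanning with `max`/`min`.

```python
from collections import deque

class OrderBook:
    """Toy order book: one FIFO queue of (order_id, qty) per price level."""

    def __init__(self):
        self.bids = {}  # price in cents -> deque of (order_id, qty)
        self.asks = {}

    def add(self, side, price, order_id, qty):
        levels = self.bids if side == "buy" else self.asks
        # New orders join the back of the queue at their price level.
        levels.setdefault(price, deque()).append((order_id, qty))

    def best_bid(self):
        return max(self.bids) if self.bids else None  # most eager buyer

    def best_ask(self):
        return min(self.asks) if self.asks else None  # most eager seller

    def spread(self):
        bid, ask = self.best_bid(), self.best_ask()
        return None if bid is None or ask is None else ask - bid

book = OrderBook()
book.add("buy", 15000, "b1", 100)
book.add("buy", 14990, "b2", 50)
book.add("sell", 15010, "s1", 70)
book.add("sell", 15025, "s2", 30)
print(book.best_bid(), book.best_ask(), book.spread())  # 15000 15010 10
```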
What data structure implements the order book and why?
?
TreeMap<price, LinkedList
What is price-time priority (FIFO) matching?
?
The universal rule for fair order matching: Price wins first — the order at the best price (highest bid or lowest ask) always matches first. Time breaks ties — if two orders are at the same price, the one that arrived earlier (lower sequence number) executes first. Example: Two limit buy orders both at $150.00 — the one that arrived at 9:30:01 fills before the one at 9:30:02. This is enforced by the doubly-linked list FIFO structure at each price level.
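The rule can be illustrated as a sort key: price first, then sequence number. This is a sketch with hypothetical names (`Bid`, `priority_order`). A real book maintains this order incrementally instead of sorting.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bid:
    order_id: str
    price: int  # integer cents
    seq: int    # sequence number assigned on arrival

def priority_order(bids):
    """Return bids in match order: highest price first, then earliest arrival."""
    return sorted(bids, key=lambda b: (-b.price, b.seq))

bids = [
    Bid("late_same_price", 15000, seq=2),
    Bid("early_same_price", 15000, seq=1),
    Bid("better_price_but_latest", 15005, seq=3),
]
ranked = [b.order_id for b in priority_order(bids)]
print(ranked)
```

The better-priced bid jumps ahead even though it arrived last. The two bids at 15000 keep their arrival order.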
What is the spread in an order book and what does it indicate?
?
Spread = Best Ask Price − Best Bid Price. Example: best bid $150.09, best ask $150.10 → spread = $0.01. A tight spread (e.g., $0.01) means high liquidity: many buyers and sellers, easy to trade without price impact. A wide spread means low liquidity: fewer participants, larger price impact per trade. Market makers earn the spread by continuously posting bid and ask quotes.
What is the sequencer and why is it the most critical component?
?
The sequencer assigns a monotonically increasing sequence number to every message before it reaches the matching engine. It is the single source of ordering truth. Without it: orders arriving at multiple gateways simultaneously create a race condition — matching becomes non-deterministic (unfair). With it: all matching engine processing is strictly deterministic (same sequence → same result). The sequencer also writes every message to a durable event log, enabling crash recovery by replaying events in order.
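A toy sequencer, assuming an in-memory list stands in for the durable event log and a lock stands in for the single writer thread a real sequencer would likely use. `Sequencer`, `submit`, and `gateway` are hypothetical names.

```python
import itertools
import threading

class Sequencer:
    def __init__(self):
        self._next = itertools.count(1)
        self._lock = threading.Lock()
        self.log = []  # stand-in for a durable, append-only event log

    def submit(self, message):
        # Number assignment and log append happen as one atomic step,
        # so log order always equals sequence order.
        with self._lock:
            event = (next(self._next), message)
            self.log.append(event)
        return event

def gateway(sequencer, name, count):
    for i in range(count):
        sequencer.submit(f"{name}-order-{i}")

sequencer = Sequencer()
threads = [
    threading.Thread(target=gateway, args=(sequencer, f"gw{g}", 100))
    for g in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

seqs = [seq for seq, _ in sequencer.log]
print(seqs[:5], len(seqs))
```

Four gateways submit concurrently, yet the log holds one gap-free total order that the matching engine can process deterministically.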
What is the LMAX Disruptor and why does trading use it?
?
LMAX Disruptor is a lock-free inter-thread communication library using a pre-allocated ring buffer. Traditional queues use locks (thread contention) and often allocate per message (GC pressure), which causes latency spikes. Disruptor: fixed-size array, producers/consumers coordinate through atomic sequence counters (no locks, no heap allocation on the hot path, no GC), sequential memory access maximizes CPU cache hits. Achieves < 100 nanosecond mean inter-thread latency, versus tens of microseconds on average (with millisecond-scale tail spikes) for a lock-based queue in LMAX's published benchmarks. LMAX reported processing about 6 million orders/second on a single thread using this.
What is a ring buffer and how does it work in the context of the Disruptor?
?
A ring buffer is a fixed-size, circular array (power of 2 size, e.g., 2^20 slots). Producer writes to slot at (sequenceNumber % bufferSize), Consumer reads from slot at its current sequence. No dynamic allocation (pre-allocated), no garbage collection. When producer laps consumer (buffer full), producer waits — but this is rare with properly sized buffers. Cache-friendly because sequential memory access. The Disruptor uses this as a high-speed inter-thread message passing mechanism.
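A single-threaded sketch of the slot arithmetic, assuming one producer and one consumer. It is not the Disruptor's actual API. Because the size is a power of two, `seq & (size - 1)` gives the same slot as `seq % size`. `None` is used as the "empty" signal, so this toy cannot carry `None` as a payload.

```python
class RingBuffer:
    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self._slots = [None] * size  # allocated once, reused forever
        self._mask = size - 1
        self._write_seq = 0  # next sequence the producer will publish
        self._read_seq = 0   # next sequence the consumer will read

    def try_publish(self, item):
        if self._write_seq - self._read_seq == len(self._slots):
            return False  # full: publishing would lap the consumer
        self._slots[self._write_seq & self._mask] = item
        self._write_seq += 1
        return True

    def try_consume(self):
        if self._read_seq == self._write_seq:
            return None  # empty
        item = self._slots[self._read_seq & self._mask]
        self._read_seq += 1
        return item

rb = RingBuffer(4)
published = [rb.try_publish(i) for i in range(5)]  # fifth publish hits a full buffer
first = rb.try_consume()
retry = rb.try_publish(99)  # slot 0 is free again and gets reused
drained = []
while (item := rb.try_consume()) is not None:
    drained.append(item)
print(published, first, retry, drained)
```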
What is event sourcing and why is it mandatory in a stock exchange?
?
Event sourcing: every state change is stored as an immutable event appended to a log. Current state = replay of all events. Events are never updated or deleted. Mandatory in stock exchange because: 1) Regulatory requirement — SEC/FINRA require complete audit trail of every order, modification, cancellation, and trade. 2) Crash recovery — replay the event log to reconstruct exact order book state after failure. 3) Debugging — replay events to reproduce any bug exactly. 4) Historical queries — reconstruct book state at any past time.
How does crash recovery work in a stock exchange?
?
Uses snapshot + event replay: Periodically take a snapshot of the full order book state (every N events, e.g., 100,000). Each snapshot tagged with the sequence number it covers. On crash: 1) Load latest snapshot (fast, restores most state), 2) Read event log from snapshot’s sequence number forward, 3) Replay all subsequent events in strict sequence order, 4) Order book is reconstructed identically (deterministic matching). Recovery target: < 30 seconds (exchange rules require this). Strict sequencing is what makes recovery deterministic.
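A sketch of snapshot plus replay, assuming a deliberately tiny state model (order id to remaining quantity) and hypothetical event tuples. The point is that a deterministic `apply` function makes the recovered state equal to the live state.

```python
import copy

def apply(state, event):
    """Deterministic transition: state maps order_id -> remaining qty."""
    kind, order_id, qty = event
    if kind == "new":
        state[order_id] = qty
    elif kind == "fill":
        state[order_id] -= qty
        if state[order_id] == 0:
            del state[order_id]
    elif kind == "cancel":
        state.pop(order_id, None)
    return state

def recover(snapshot_state, snapshot_seq, log):
    """Load the snapshot, then replay only events after its sequence number."""
    state = copy.deepcopy(snapshot_state)
    for seq, event in log:
        if seq > snapshot_seq:
            apply(state, event)
    return state

log = [
    (1, ("new", "a", 100)),
    (2, ("new", "b", 50)),
    (3, ("fill", "a", 40)),
    (4, ("cancel", "b", 0)),
    (5, ("new", "c", 10)),
    (6, ("fill", "a", 60)),
]

live = {}
for _, event in log:
    apply(live, event)

snapshot = {}
for _, event in log[:3]:  # snapshot covers sequence numbers 1..3
    apply(snapshot, event)

recovered = recover(snapshot, 3, log)
print(recovered == live, recovered)
```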
What is kernel bypass networking (DPDK) and when do exchanges use it?
?
Normal networking: NIC → kernel interrupt → socket buffer → user-space copy → application. Latency: ~50 microseconds. DPDK (Data Plane Development Kit): NIC talks directly to user-space application via polling mode driver, bypassing the kernel entirely. Latency: 1-5 microseconds. Used for the lowest-latency links in an exchange — e.g., between client gateway and matching engine, or for market data dissemination. Requires dedicated CPU cores for polling (busy-wait). Also used: RDMA (Remote Direct Memory Access) for inter-machine zero-copy transfers.
What is co-location in the context of stock exchanges?
?
Co-location: trading firms pay the exchange to place their servers physically inside the exchange’s data center, sometimes in the same rack as the matching engine. Motivation: speed of light matters at sub-microsecond scale. A 100-meter fiber round trip = ~1 microsecond delay. Firms closer to the matching engine get orders there faster and receive market data first. Exchanges charge significant fees for co-location space and provide all participants with equal-length cables (fairness rule — no firm can get a physically shorter wire).
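The ~1 microsecond figure can be checked with a quick calculation. The refractive index of 1.47 is an assumed typical value for silica fiber, not a number from the cards.

```python
C_VACUUM_M_PER_S = 299_792_458
FIBER_REFRACTIVE_INDEX = 1.47  # assumption: typical silica fiber

def fiber_delay_ns(length_m):
    """One-way propagation delay through fiber, in nanoseconds."""
    return length_m * FIBER_REFRACTIVE_INDEX / C_VACUUM_M_PER_S * 1e9

round_trip_ns = 2 * fiber_delay_ns(100)
print(f"{round_trip_ns:.0f} ns round trip over 100 m of fiber")
```

That works out to roughly 5 ns per meter, which is why cable length is equalized across co-located participants.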
What are the low-latency optimization techniques for a matching engine?
?
1. LMAX Disruptor (lock-free ring buffer, < 100ns inter-thread). 2. Kernel bypass (DPDK) (1-5μs NIC-to-app latency). 3. Memory-mapped files (near-zero overhead persistence). 4. CPU pinning (dedicated core for matching engine, no context switches). 5. NUMA awareness (matching engine memory on same NUMA node as CPU, avoids ~100ns cross-NUMA penalty). 6. Huge pages (2MB pages reduce TLB misses). 7. Avoid GC (use off-heap memory, pre-allocate objects, never allocate on hot path). 8. Co-location (physical proximity to exchange).
What is pre-trade risk and what checks does it perform?
?
Pre-trade risk checks happen before an order reaches the sequencer/matching engine. Gate: reject bad orders before they can affect the market. Checks: 1) Position limit: does this trade exceed max allowed position for this client? 2) Credit/buying power: does client have sufficient funds/margin? 3) Fat finger check: is the price wildly wrong (e.g., order to buy AAPL @ $1,500,000)? 4) Order size limit: is this order unusually large vs average daily volume? 5) Symbol validity: is this symbol tradeable right now? Pre-trade risk prevents market disruptions before they happen.
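The checks can be sketched as a single gate function. All names, limits, and the 10% fat-finger band below are hypothetical. Prices are integer cents and the reference price is assumed to be the last trade.

```python
from dataclasses import dataclass

@dataclass
class Order:
    client: str
    symbol: str
    side: str   # "buy" or "sell"
    qty: int
    price: int  # integer cents

@dataclass
class Limits:
    max_position: int
    buying_power: int            # cents
    max_order_qty: int
    fat_finger_pct: float = 0.10  # assumed band around the last price

def check_order(order, limits, position, last_price, tradeable):
    """Return None if the order passes, else the rejection reason."""
    if order.symbol not in tradeable:
        return "symbol not tradeable"
    if order.qty > limits.max_order_qty:
        return "order size limit"
    if abs(order.price - last_price) > limits.fat_finger_pct * last_price:
        return "fat finger"
    signed_qty = order.qty if order.side == "buy" else -order.qty
    if abs(position + signed_qty) > limits.max_position:
        return "position limit"
    if order.side == "buy" and order.qty * order.price > limits.buying_power:
        return "insufficient buying power"
    return None

limits = Limits(max_position=1000, buying_power=20_000_000, max_order_qty=500)
symbols = {"AAPL"}
ok = check_order(Order("c1", "AAPL", "buy", 100, 15010), limits, 0, 15000, symbols)
fat = check_order(Order("c1", "AAPL", "buy", 100, 150_000_000), limits, 0, 15000, symbols)
pos = check_order(Order("c1", "AAPL", "buy", 100, 15000), limits, 950, 15000, symbols)
bad = check_order(Order("c1", "ZZZZ", "buy", 100, 15000), limits, 0, 15000, symbols)
print(ok, fat, pos, bad)
```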
What is a circuit breaker in a stock exchange? How does it work?
?
A circuit breaker halts trading when price movement exceeds a threshold, preventing cascading crashes. Market-wide (US rules): S&P 500 drops 7% from the prior close → 15-min halt; drops 13% → another 15-min halt; drops 20% → halt for the day. Single-stock (LULD, Limit Up-Limit Down): price moves outside a percentage band (e.g., 5% for large, liquid stocks) around the average price of the preceding 5 minutes → trading pause. Implementation: risk engine monitors rolling price change; on trigger, sends halt to sequencer, which stops accepting new orders (but allows cancels). After the cooling-off period, trading resumes with a brief call auction to find a fair reopening price. Named “circuit breaker” by analogy to electrical circuit breakers that prevent overload.
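The market-wide thresholds map naturally to a small decision function. This sketch uses integer arithmetic to avoid float comparisons at the exact thresholds, and it ignores any time-of-day conditions a real rulebook may apply. The function name and return strings are illustrative.

```python
def market_wide_action(prev_close, current):
    """Map the index decline from the previous close to a halt action."""
    drop = prev_close - current
    # Compare drop/prev_close against each threshold without dividing.
    if drop * 100 >= 20 * prev_close:
        return "halt for the day"
    if drop * 100 >= 13 * prev_close:
        return "halt 15 min (13% level)"
    if drop * 100 >= 7 * prev_close:
        return "halt 15 min (7% level)"
    return "continue"

for level in (4700, 4650, 4300, 4000):
    print(level, market_wide_action(5000, level))
```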
What is the FIX protocol and why do exchanges use it?
?
FIX (Financial Information eXchange) is the universal messaging standard for electronic trading, used by virtually every exchange, broker, and trading firm since 1992. It defines message formats for: new orders, cancel requests, execution reports, order status, market data subscriptions. FIX is a text-based tag=value format: battle-tested, vendor-neutral, and understood by all market participants, but not byte-efficient. The Client Gateway parses incoming FIX messages and translates them to internal format. Alternatives: FAST (FIX Adapted for Streaming) for market data, proprietary binary protocols for ultra-low latency.
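A sketch of the gateway's parse-and-translate step for the classic FIX tag=value encoding. The tags used are standard (35 MsgType, 55 Symbol, 54 Side, 38 OrderQty, 44 Price), but the sample message omits required header and trailer fields such as BodyLength and CheckSum for brevity. The internal dict format is an assumption.

```python
from decimal import Decimal

SOH = "\x01"  # FIX field delimiter

def parse_fix(raw):
    """Split a tag=value FIX message into {tag_number: value}."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields

def to_internal(fields):
    """Translate a NewOrderSingle (35=D) into a hypothetical internal order."""
    if fields[35] != "D":
        raise ValueError("not a NewOrderSingle")
    return {
        "symbol": fields[55],
        "side": "buy" if fields[54] == "1" else "sell",  # 1=Buy, 2=Sell
        "qty": int(fields[38]),
        "price_cents": int(Decimal(fields[44]) * 100),
    }

raw = SOH.join(["8=FIX.4.2", "35=D", "55=AAPL", "54=1", "38=100", "40=2", "44=150.05"]) + SOH
order = to_internal(parse_fix(raw))
print(order)
```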
What is a market data feed and what are its two main variants?
?
Market data feed broadcasts price and trade information to all market participants in real-time. Two variants: 1) L1 (Top of Book): only best bid price/qty, best ask price/qty, and last trade. Used by retail apps, display boards. 2) L2 (Full Order Book): all price levels with total quantity. Incremental updates (add, remove, match events). Used by professional traders and algorithms. Published via multicast UDP (single packet reaches all subscribers simultaneously — lower latency than TCP unicast). Messages include sequence numbers so recipients detect and recover gaps.
Why do exchanges store prices as integers rather than floating-point numbers?
?
Floating-point arithmetic introduces rounding errors, and results can differ slightly across compilers and hardware. Example: a computed price can come out as 150.04999999… instead of 150.05. Stored as an integer, $150.05 = 15005 cents, or for sub-cent precision = 1500500 ten-thousandths of a dollar. Integer comparison is exact, consistent, and faster.
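A short demonstration of the problem and the integer fix. `to_ticks` is a hypothetical helper. The default of 10,000 ticks per dollar matches the ten-thousandths example above.

```python
from decimal import Decimal

def to_ticks(price_text, ticks_per_dollar=10_000):
    """Convert a decimal price string to an exact integer tick count."""
    ticks = Decimal(price_text) * ticks_per_dollar
    if ticks != ticks.to_integral_value():
        raise ValueError("price is finer than the tick size")
    return int(ticks)

float_sum = 0.1 + 0.2
print(float_sum == 0.3)                                     # False: binary rounding error
print(to_ticks("0.1") + to_ticks("0.2") == to_ticks("0.3"))  # True: exact integers
print(to_ticks("150.05"), to_ticks("150.05", 100))           # 1500500 15005
```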
What is the order lifecycle from client to execution?
?
1. Client sends FIX order to Client Gateway (protocol parse, authentication). 2. Order Manager performs pre-trade risk checks (position limits, credit, fat finger). 3. Sequencer assigns monotonic sequence number, writes to event log. 4. Matching Engine processes in sequence order: check order book for matches, execute trades or rest on book. 5. Trade Reporter sends execution reports back to buyer and seller via Client Gateway. 6. Market Data Publisher broadcasts order book changes and trade prints via multicast. 7. All events written to audit log permanently.
What is a call auction and when does it run?
?
A call auction accumulates orders and matches them all simultaneously at a single price rather than matching continuously. Used at: Market open (9:30 AM), where overnight orders have accumulated and the auction finds the equilibrium price that maximizes shares executed; and market reopen after a circuit breaker halt, where it stabilizes the price after disruption. Algorithm: for each candidate price, matched volume = min(total buy quantity at that price or higher, total sell quantity at that price or lower); pick the price that maximizes matched volume. All orders at that price or better execute at the same single price. This prevents the chaos of continuous matching when there’s a large order imbalance.
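A sketch of the price-discovery step, assuming orders are (limit price in cents, quantity) pairs. It picks the price that maximizes matched volume and, on ties, keeps the lowest such price. Real exchanges apply further tie-break rules that are not modeled here.

```python
def auction_price(buys, sells):
    """Return (clearing_price, matched_qty) that maximizes executed volume."""
    candidates = sorted({p for p, _ in buys} | {p for p, _ in sells})
    best_price, best_qty = None, 0
    for price in candidates:
        demand = sum(q for p, q in buys if p >= price)   # buyers willing to pay this much
        supply = sum(q for p, q in sells if p <= price)  # sellers willing to accept this
        matched = min(demand, supply)
        if matched > best_qty:
            best_price, best_qty = price, matched
    return best_price, best_qty

buys = [(10100, 300), (10000, 200), (9900, 400)]
sells = [(9800, 100), (9900, 200), (10000, 300), (10200, 500)]
print(auction_price(buys, sells))  # (10000, 500)
```

At 10000, demand is 500 and supply is 600, so 500 shares trade. Every other candidate price matches fewer.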
What is the difference between pre-trade and post-trade risk management?
?
Pre-trade risk (before matching): Reject orders that would violate limits. Checks: position limits, buying power/credit, fat-finger detection, symbol validity. Goal: prevent bad orders from entering the market. Fast — must add < 1 microsecond latency or traders will route elsewhere. Post-trade risk (after matching): Monitor portfolio exposure in real-time after each trade. Update realized P&L, unrealized P&L, margin requirements. Alert risk team as positions approach limits. Generate regulatory trade reports. Goal: ongoing surveillance and risk monitoring. Can be slightly slower since it runs asynchronously after execution.
How does the matching engine handle a market order vs a limit order?
?
Market order: Match immediately against best available price on the opposite side. Keep consuming price levels until fully filled. No price constraint means it fills whenever the opposite side has liquidity, but with potential slippage. If the opposite side runs out of orders, the market order fills partially or not at all. Limit order: Only match against orders at the limit price or better. If there is no match (or only a partial match), the unfilled quantity rests on the book at the limit price, waiting. Example: Limit Buy 100 @ $150 matches asks priced at $150 or below; whatever is unfilled rests on the bid side at $150.
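Both behaviors can be sketched in one function, where `limit=None` stands for a market order. The ask side is assumed to be a dict from integer price to a FIFO deque of `[order_id, qty]`. All names are illustrative.

```python
from collections import deque

def match_buy(asks, qty, limit=None):
    """Match an incoming buy against the ask side.

    Returns (fills, unfilled_qty); fills are (resting_id, price, qty).
    """
    fills = []
    while qty > 0 and asks:
        best = min(asks)  # lowest ask matches first
        if limit is not None and best > limit:
            break  # limit order never trades through its price
        queue = asks[best]
        resting = queue[0]  # oldest order at this level (FIFO)
        traded = min(qty, resting[1])
        fills.append((resting[0], best, traded))
        qty -= traded
        resting[1] -= traded
        if resting[1] == 0:
            queue.popleft()
        if not queue:
            del asks[best]
    return fills, qty

def fresh_asks():
    return {15000: deque([["s1", 50]]), 15010: deque([["s2", 100]])}

market_fills, market_left = match_buy(fresh_asks(), 120)
limit_fills, limit_left = match_buy(fresh_asks(), 120, limit=15000)
print(market_fills, market_left)
print(limit_fills, limit_left)
```

The market order walks up to 15010 to fill completely (slippage). The limit order fills 50 at 15000, and the remaining 70 would rest on the bid side at 15000.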
What guarantees does strict sequencing provide, and what would break without it?
?
With strict sequencing: Every order has a unique monotonic sequence number. Matching engine processes in strict order = deterministic. Same sequence → same order book state always. Recovery = replay same events = identical state. Fairness = time-of-arrival determined by sequence number, not race conditions. Without it: Multiple gateways submit orders simultaneously → race condition in matching engine → non-deterministic which order “arrived first” → unfair market → potential regulatory violations → non-reproducible bugs → impossible to recover to exact pre-crash state. The sequencer is the architectural solution to the fundamental problem of concurrent order arrival.
What makes the stock exchange the hardest system design question?
?
It combines every hard distributed systems challenge simultaneously: 1) Ultra-low latency (microseconds, not milliseconds) — requires DPDK, ring buffers, CPU pinning. 2) Strict ordering guarantee — requires sequencer, not just distributed locking. 3) Regulatory compliance — mandatory event sourcing, immutable audit trail. 4) Deterministic recovery — crash recovery must produce byte-identical state. 5) Complex data structures — order book with O(1) match, cancel, FIFO. 6) Risk management — pre-trade checks, circuit breakers, post-trade monitoring. 7) Market data fanout — multicast to thousands of subscribers in real-time. No other design question has all of these.
How does market data multicast work and why is UDP used instead of TCP?
?
Market data is published via multicast UDP: the exchange sends a single UDP packet to a multicast group address, and all subscribed servers receive it simultaneously (handled by network switches/routers). TCP would require sending a separate stream to each subscriber — N connections, N times the bandwidth. Why UDP over TCP: 1) Lower latency (no TCP handshake, no ACK overhead). 2) Single packet reaches all subscribers simultaneously. 3) Packet loss is handled by sequence numbers — if a gap is detected, subscriber requests retransmission from a separate recovery feed (or uses the snapshot feed to catch up). Real-time price data is better to drop than to delay.
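Gap detection on the subscriber side can be sketched as follows. `detect_gaps` is a hypothetical helper. It returns the missing ranges a subscriber would request from the recovery feed and ignores duplicate or late packets.

```python
def detect_gaps(received_seqs):
    """Return a list of (first_missing, last_missing) sequence ranges."""
    gaps = []
    expected = None
    for seq in received_seqs:
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))  # packets expected..seq-1 were lost
        if expected is None or seq >= expected:
            expected = seq + 1
    return gaps

print(detect_gaps([1, 2, 3, 6, 7, 10]))  # [(4, 5), (8, 9)]
```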
Total Cards: 26
Review Time: 20-30 minutes
Priority: HIGH — Most complex Vol 2 chapter. Critical for fintech and FAANG interviews.
Last Updated: 2026-04-13