Chapter 8: Design a Distributed Email Service

volume2 email smtp imap distributed-systems

Status: 🟩 Interview ready
Difficulty: Hard
Time to complete: 50 min read + practice


Overview

A distributed email service lets billions of users send, receive, store, and search emails at massive scale. Gmail and Outlook are the canonical examples, each handling hundreds of billions of emails per year.

Why this matters:

  • Complex distributed storage problem (not just a simple CRUD app)
  • Teaches email protocols (SMTP, IMAP, POP3) interviewers expect you to know
  • Deep dive into storage systems: when to use Cassandra vs object store vs Elasticsearch
  • Common at Google, Microsoft, Yahoo interviews

Problem Statement

Design a distributed email service that:

  • Handles 1 billion users, 10 billion emails per day
  • Reliably delivers emails (no loss, at-least-once)
  • Stores emails with metadata (sender, subject, date, labels/folders)
  • Supports full-text search within email body and subject
  • Supports email threading (conversation view)
  • Handles attachments up to 25MB
  • Sends push notifications for new emails
  • Provides mobile sync protocol

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

  • Scale? → 1 billion users, 10 billion emails/day
  • Is delivery guarantee required? → At-least-once with deduplication
  • Search? → Full-text search on subject, sender, body
  • Attachments? → Yes, up to 25MB
  • Threading? → Yes, conversation grouping
  • Real-time notifications? → Yes, push notifications for new emails
  • Mobile sync? → Yes
  • Spam filtering? → Yes

Scope:

  • Send emails (to users, to external SMTP servers)
  • Receive emails (from external SMTP servers)
  • Read, delete, search emails
  • Organize with labels/folders
  • Thread emails into conversations
  • Push notifications for new email

Non-Functional Requirements

  • Reliability: No email loss (at-least-once delivery)
  • Scale: 1B users, 10B emails/day
  • Availability: 99.99% (emails are mission-critical)
  • Search performance: Search results < 500ms
  • Storage durability: 99.9999999% (nine nines for email data)
  • Low latency delivery: Email delivered within seconds
  • Attachment deduplication: Same file shared between emails not stored twice

Scale Estimation

Users:               1,000,000,000 (1B)
Emails/day:          10,000,000,000 (10B)
Emails/second:       10B / 86,400 = ~115,740 emails/sec
Peak:                ~3× average = ~350,000 emails/sec

Storage per email:
  - Metadata (headers, labels, status):  ~1 KB
  - Body (average HTML email):           ~50 KB
  - Attachment (some emails):            ~500 KB avg if attached

Daily storage (metadata only):
  10B emails × 1 KB = 10 TB/day metadata

Daily storage (body):
  10B emails × 50 KB = 500 TB/day

1-year storage for bodies:
  500 TB/day × 365 = ~182 PB/year (with compression and dedup: much less)

Attachments (assume 20% of emails have attachments):
  2B emails × 500 KB = 1 PB/day
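
The arithmetic above can be reproduced with a short script. This is a sketch that uses only the figures stated in this section and assumes decimal units (1 KB = 1,000 bytes); the variable names are illustrative.

```python
# Back-of-envelope capacity numbers from the estimates above.
# Assumption: decimal units (1 KB = 1e3 bytes, 1 TB = 1e12, 1 PB = 1e15).
KB, TB, PB = 10**3, 10**12, 10**15

EMAILS_PER_DAY = 10 * 10**9
SECONDS_PER_DAY = 86_400

avg_rate = EMAILS_PER_DAY / SECONDS_PER_DAY      # emails per second
peak_rate = 3 * avg_rate                         # assumed 3x peak factor

metadata_per_day = EMAILS_PER_DAY * 1 * KB       # ~1 KB metadata per email
body_per_day = EMAILS_PER_DAY * 50 * KB          # ~50 KB body per email
body_per_year = body_per_day * 365

emails_with_attachment = EMAILS_PER_DAY * 20 // 100   # 20% of emails
attachments_per_day = emails_with_attachment * 500 * KB

print(f"average rate:    {avg_rate:,.0f} emails/sec")
print(f"peak rate:       {peak_rate:,.0f} emails/sec")
print(f"metadata/day:    {metadata_per_day / TB:,.0f} TB")
print(f"bodies/day:      {body_per_day / TB:,.0f} TB")
print(f"bodies/year:     {body_per_year / PB:,.1f} PB")
print(f"attachments/day: {attachments_per_day / PB:,.1f} PB")
```

Attachments dominate raw volume at these assumptions, which is why content-hash deduplication is worth calling out in the interview.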

Step 2: High-Level Design (10 min)

Email Protocols

SMTP (Simple Mail Transfer Protocol):

  • Used for sending and relaying email
  • Port 25 (server-to-server relay), 587 (client submission with STARTTLS), or 465 (implicit TLS)
  • Sender → Sender's SMTP server → Recipient's SMTP server
  • Text-based command/response protocol

IMAP (Internet Message Access Protocol):

  • Used for receiving/accessing email
  • Port 993 (SSL)
  • Emails stay on server, synced to client
  • Supports multiple devices, folders, read/unread state
  • Modern clients (Gmail app, Outlook) use IMAP or a custom sync protocol

POP3 (Post Office Protocol 3):

  • Used for downloading email
  • Port 995 (SSL)
  • Downloads and deletes from server (or leaves copy)
  • No sync, so not suitable for multi-device use
  • Legacy protocol, largely replaced by IMAP

SMTP vs IMAP vs POP3:

Protocol   Direction   Stores on server?   Multi-device?   Use today
SMTP       Sending     No (transport)      N/A             Yes (sending)
IMAP       Receiving   Yes                 Yes             Yes (receiving)
POP3       Receiving   No (downloads)      No              Legacy only

Sending Email Flow

┌──────────────────────────────────────────────────────────────┐
│                     Sending an Email                         │
│                                                              │
│ User App                                                     │
│   │                                                          │
│   ▼  HTTPS / SMTP submission                                 │
│ API Server / SMTP Submission Server                          │
│   │                                                          │
│   ▼ async                                                    │
│ Message Queue (Kafka)                                        │
│   │                                                          │
│   ▼                                                          │
│ Outgoing SMTP Worker                                         │
│   ├─ DKIM-sign message (recipient checks SPF/DKIM)           │
│   ├─ Spam check                                              │
│   ├─ Resolve recipient's MX record (DNS)                     │
│   └─ SMTP connect → Recipient's Mail Server                  │
│         ▼                                                    │
│       External Recipient (e.g., Yahoo, Outlook)              │
│         OR                                                   │
│       Internal user (route to receiving pipeline below)      │
└──────────────────────────────────────────────────────────────┘

Receiving Email Flow

┌──────────────────────────────────────────────────────────────┐
│                    Receiving an Email                        │
│                                                              │
│ External Sender's SMTP Server                                │
│   │  SMTP (port 25)                                          │
│   ▼                                                          │
│ Incoming SMTP Server (our system)                            │
│   ├─ Validate: SPF, DKIM, DMARC                              │
│   ├─ Rate limit incoming                                     │
│   └─ Push to Message Queue                                   │
│         │                                                    │
│         ▼  Kafka                                             │
│   Mail Processing Workers                                    │
│   ├─ Spam filtering (ML + rules)                             │
│   ├─ Virus/attachment scanning                               │
│   ├─ Attachment extraction → Object Store (S3)               │
│   ├─ Assign labels (Inbox, Spam, etc.)                       │
│   ├─ Assign conversation_id (threading)                      │
│   └─ Store email metadata + body                             │
│         │                                                    │
│         ▼                                                    │
│   User's Mailbox Storage                                     │
│   ├─ Metadata → Cassandra (fast read, multi-device sync)     │
│   ├─ Body → Object Store (S3, immutable)                     │
│   └─ Search Index → Elasticsearch (async, eventual)          │
│         │                                                    │
│         ▼                                                    │
│   Push Notification Service                                  │
│   └─ Notify user's devices (iOS, Android, web)               │
└──────────────────────────────────────────────────────────────┘

Data Models

Email Metadata (Cassandra):

Table: email_metadata
  user_id          UUID         -- partition key (owner of mailbox)
  message_id       TIMEUUID     -- clustering key (time-based, sorted descending)
  conversation_id  UUID         -- for threading
  subject          TEXT
  sender           TEXT
  recipients       LIST<TEXT>
  date             TIMESTAMP
  read             BOOLEAN
  labels           SET<TEXT>    -- ["INBOX", "STARRED", "WORK"]
  body_s3_key      TEXT         -- reference to body in S3
  attachment_ids   LIST<UUID>   -- references to attachment metadata
  size_bytes       INT
  has_attachment   BOOLEAN

Email Body (S3):

Key:     emails/{year}/{month}/{day}/{message_id}.eml
Content: Full RFC 2822 email body (text or HTML or multipart)
ACL:     Private (accessed via signed URLs)

Attachment (S3 + metadata table):

Table: attachment_metadata
  attachment_id    UUID  (PK)
  message_id       UUID
  content_hash     TEXT  (SHA-256 of file content)
  s3_key           TEXT  (reference to S3 object)
  filename         TEXT
  mime_type        TEXT
  size_bytes       INT

S3 Key:  attachments/{content_hash}  (dedup by hash!)

Step 3: Deep Dive (25 min)

Email Storage at Scale: Why Not Standard Databases?

Option 1: Relational DB (MySQL, PostgreSQL)

Problems at 10B emails/day:
  - Write throughput: 115,740 inserts/sec vs. a MySQL max of ~10K TPS per node
  - Full-text search: LIKE '%query%' doesn't scale at billions of rows
  - JOINs: email + metadata + labels + attachments = expensive JOINs
  - Sharding: email data doesn't fit on one server
  - Column size: storing email body as BLOB is inefficient

Option 2: Standard NoSQL (MongoDB, DynamoDB)

Problems:
  - No efficient full-text search built in
  - MongoDB struggles with 182 PB of data at Gmail scale
  - Limited support for complex label/conversation queries
  - Still need a separate search solution

Option 3: Custom Distributed Storage (Gmail's approach)

Gmail uses Bigtable (wide-column store) for email storage.
Microsoft Exchange uses its own custom storage engine.

Key insight: Email has a natural schema:
  - Keyed by (user_id, timestamp) → wide-column fits perfectly
  - Read patterns: "give me all emails for user X sorted by time"
  - Append-heavy (new emails added, rarely modified)

Recommended Architecture for the Interview:

Data                                            Storage                             Why
Email metadata (headers, labels, read status)   Cassandra                           High write throughput, partition by user_id, sorted by time
Email body (HTML/text content)                  S3 (object store)                   Immutable large blobs, cheap, durable
Attachments                                     S3 (deduplicated by content hash)   Same file across multiple emails stored once
Search index                                    Elasticsearch                       Full-text search, near-real-time
Unread counts, session state                    Redis                               Fast counters, ephemeral state

Cassandra Data Model for Email Metadata

Why Cassandra?

  • Partition by user_id → all of a user's emails on same node(s)
  • Cluster by message_id (time-based UUID) → sorted by time automatically
  • High write throughput (10B emails/day = 115K writes/sec → Cassandra handles this)
  • No single point of failure (masterless, multi-datacenter replication)
  • TTL support for automatic deletion of old emails
-- CQL Schema
CREATE TABLE email_metadata (
    user_id         UUID,
    message_id      TIMEUUID,   -- time-based UUID, sorts chronologically
    conversation_id UUID,
    subject         TEXT,
    sender          TEXT,
    read            BOOLEAN,
    labels          SET<TEXT>,
    body_s3_key     TEXT,
    has_attachment  BOOLEAN,
    size_bytes      INT,
    PRIMARY KEY (user_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);  -- newest first

-- Query: get latest 50 emails for user X
SELECT * FROM email_metadata
WHERE user_id = X
LIMIT 50;

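-- Query: get emails in specific label
-- (CONTAINS on a collection column needs a secondary index or ALLOW FILTERING;
--  prefer the email_by_label table below for this access pattern)
SELECT * FROM email_metadata
WHERE user_id = X AND labels CONTAINS 'INBOX';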

Separate table for label index (Cassandra secondary indices are slow for high cardinality, so use materialized views or separate tables):

CREATE TABLE email_by_label (
    user_id    UUID,
    label      TEXT,
    message_id TIMEUUID,
    PRIMARY KEY ((user_id, label), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);

Object Storage for Email Body

Why separate body from metadata?

Metadata query: "show me inbox" → needs subject, sender, date, labels
                                → does NOT need full body
Body fetch:     user clicks email β†’ fetch body from S3 by key

Splitting means:
  - Inbox load = fast Cassandra read (no large blobs)
  - Body fetch = on-demand S3 read (only when opened)
  - 80% of emails are never opened again after first read
    → body stays in S3 (cold storage after 30 days, Glacier after 1 year)

Storage lifecycle:

Day 0-30:    S3 Standard (fast access, ~$0.023/GB-month)
Day 30-365:  S3 Infrequent Access (~$0.0125/GB-month)
Day 365+:    S3 Glacier (archive, ~$0.004/GB-month)
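
As a rough illustration of the lifecycle boundaries above, the tier for a stored body can be derived from its age. This is only a sketch: the tier names are labels chosen for the example, and in practice object stores usually apply such transitions through bucket lifecycle rules rather than application code.

```python
from datetime import date

def storage_tier(stored_on: date, today: date) -> str:
    """Map an object's age to the tier named in the lifecycle above."""
    age_days = (today - stored_on).days
    if age_days < 30:
        return "STANDARD"            # hot: recently received mail
    if age_days < 365:
        return "INFREQUENT_ACCESS"   # warm: rarely reopened
    return "GLACIER"                 # cold: archive

# Example: a message stored on 1 Jan 2026, inspected on three later dates.
stored = date(2026, 1, 1)
for check in (date(2026, 1, 15), date(2026, 3, 1), date(2027, 1, 1)):
    print(check.isoformat(), storage_tier(stored, check))
```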

Attachment Deduplication with Content Hash

Problem: The same 5MB PDF attached to 1M emails = 5 TB of storage, almost all of it redundant copies.

Solution: Content-addressed storage using SHA-256 hash.

Sending attachment:
1. Client sends attachment
2. Server computes SHA-256 hash = "abc123..."
3. Check if S3 key "attachments/abc123..." exists
4. If exists: skip upload, just reference existing S3 object
5. If not exists: upload to S3 at key "attachments/abc123..."
6. Store attachment_metadata record with content_hash = "abc123..."

Result: Same file stored exactly once regardless of how many emails reference it.
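import hashlib

def store_attachment(file_bytes: bytes, s3) -> str:
    """Store an attachment once per unique content and return its SHA-256 hash.

    `s3` is any client object exposing exists(key) and put(key, data); both
    method names are placeholders for illustration, not a real SDK API.
    """
    content_hash = hashlib.sha256(file_bytes).hexdigest()
    s3_key = f"attachments/{content_hash}"

    # Check if already stored (dedup). A concurrent duplicate put is harmless
    # because the key is derived from the content itself.
    if not s3.exists(s3_key):
        s3.put(s3_key, file_bytes)

    return content_hash  # Store in email metadata, not the file itself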

Email Threading with conversation_id

Problem: Group related emails into a conversation (Gmail's threaded view).

How email threading works:

RFC 2822 email headers:
  Message-ID: <abc123@gmail.com>         -- unique ID of this email
  In-Reply-To: <xyz789@gmail.com>        -- ID of email being replied to
  References: <def456@gmail.com> <xyz789@gmail.com>  -- full thread chain, oldest first (parent last)

Algorithm:

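import uuid

def assign_conversation_id(email, find_email_by_message_id):
    # find_email_by_message_id maps a Message-ID header value to a stored
    # email (or None). It is passed in because the storage lookup is outside
    # the scope of this snippet.
    if email.in_reply_to:
        # Find the parent email
        parent = find_email_by_message_id(email.in_reply_to)
        if parent:
            # Join the same conversation
            return parent.conversation_id

    # No parent found or it's a new email → start new conversation
    return uuid.uuid4()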

Storage: conversation_id stored in Cassandra email_metadata. To load a thread: query WHERE conversation_id = X (needs a secondary index or materialized view in Cassandra).
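
To make the header handling concrete, here is a small sketch that extracts the threading headers with Python's standard `email` package. The sample message and both function names are made up for illustration. Falling back to `References` when the direct parent is missing is an assumption that goes beyond the algorithm above, which only checks `In-Reply-To`.

```python
from email import message_from_string

# Hypothetical raw message used only for this example.
RAW = """\
Message-ID: <abc123@gmail.com>
In-Reply-To: <xyz789@gmail.com>
References: <def456@gmail.com> <xyz789@gmail.com>
Subject: Re: Invoice Q1
From: alice@example.com

Thanks, received.
"""

def threading_headers(raw_message: str) -> dict:
    """Extract the three headers that drive conversation grouping."""
    msg = message_from_string(raw_message)
    return {
        "message_id": msg.get("Message-ID"),
        "in_reply_to": msg.get("In-Reply-To"),
        "references": (msg.get("References") or "").split(),
    }

def parent_candidates(headers: dict) -> list:
    """Message-IDs to look up, closest ancestor first."""
    candidates = []
    if headers["in_reply_to"]:
        candidates.append(headers["in_reply_to"])
    # References lists ancestors oldest first, so walk it in reverse.
    for ref in reversed(headers["references"]):
        if ref not in candidates:
            candidates.append(ref)
    return candidates

headers = threading_headers(RAW)
print(parent_candidates(headers))
```

The worker would try each candidate in order and reuse the conversation_id of the first one it finds in storage.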


Full-Text Search with Elasticsearch

Why Elasticsearch?

  • SQL LIKE '%keyword%' doesn't scale to billions of emails
  • Cassandra has no efficient full-text search
  • Elasticsearch: inverted index, sub-second search across billions of documents
Email stored in DB → Async indexer → Elasticsearch
(near-real-time: indexed within seconds of receipt)

Search query: "subject:invoice from:vendor@company.com"

Elasticsearch query:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "subject": "invoice" } },
        { "term":  { "sender": "vendor@company.com" } }
      ],
      "filter": [
        { "term": { "user_id": "user-uuid-123" } }
      ]
    }
  }
}
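
One way the API layer might translate the user's search string into that query body is sketched below. The function name and the handling of bare words are assumptions for illustration; the `body_snippet` field name follows the index document used elsewhere in this chapter. The important property is that the user_id filter is always attached server-side, never taken from the query string.

```python
def build_search_query(user_id: str, query: str) -> dict:
    """Translate 'subject:x from:y word' into an Elasticsearch bool query."""
    must = []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and field == "subject":
            must.append({"match": {"subject": value}})
        elif sep and field == "from":
            must.append({"term": {"sender": value}})
        else:
            # Bare words search the analyzed text fields.
            must.append({
                "multi_match": {
                    "query": token,
                    "fields": ["subject", "body_snippet"],
                }
            })
    return {
        "query": {
            "bool": {
                "must": must,
                # Mandatory tenant filter: a user only ever sees their own mail.
                "filter": [{"term": {"user_id": user_id}}],
            }
        }
    }

body = build_search_query("user-uuid-123", "subject:invoice from:vendor@company.com")
```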

Elasticsearch index per user vs shared index:

  • Per-user index: Strong isolation, but millions of tiny indices are expensive
  • Shared index with user_id filter: Better resource utilization
  • Tenant-based sharding: Group users into index shards (balance between isolation and efficiency)
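
A minimal sketch of the cohort idea: hash the user_id to pick one of a fixed number of shared indices. The index naming scheme and the count of 64 are assumptions for the example; a real deployment would size the cohorts from capacity planning and would need a migration plan before changing the count.

```python
import hashlib

NUM_COHORT_INDICES = 64  # assumed value, for illustration only

def index_for_user(user_id: str, num_indices: int = NUM_COHORT_INDICES) -> str:
    """Pick a stable cohort index for a user.

    A cryptographic hash is used instead of Python's built-in hash() because
    the latter is randomized per process and would route inconsistently.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    cohort = int.from_bytes(digest[:8], "big") % num_indices
    return f"emails-cohort-{cohort:03d}"

print(index_for_user("user-uuid-123"))
```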

What to index:

  • subject (analyzed: tokenized for full-text)
  • sender / recipients (keyword: exact match)
  • body snippet (analyzed, truncated to ~200 chars for performance)
  • labels (keyword)
  • date (date range queries)

What NOT to index in ES:

  • Full email body (index text snippets + reference S3 key)
  • Attachments (index filenames/MIME types, not content)

Delivery Guarantees

At-least-once delivery with deduplication:

Problem: "Exactly once" is impossible in distributed systems.
         At-most-once = may lose emails (unacceptable).
         At-least-once + dedup = practical guarantee.

How:
1. Incoming SMTP server assigns a unique message_id to every email
2. Email pushed to Kafka topic with message_id as key
3. Processing worker consumes email
4. Before storing: check if message_id already exists in Cassandra
5. If exists: skip (already processed = idempotent consumer)
6. If not: store email

The message_id (from email headers or assigned by SMTP server) acts
as the deduplication key.
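
The idempotent-consumer steps can be sketched with an in-memory stand-in for the metadata table. `InMemoryMailbox` and `consume` are hypothetical names. In Cassandra the same effect could come from a conditional `INSERT ... IF NOT EXISTS`, or simply from the fact that writing the same primary key twice overwrites the row instead of duplicating it.

```python
class InMemoryMailbox:
    """Stand-in for the email_metadata table (illustration only)."""

    def __init__(self):
        self.rows = {}

    def insert_if_absent(self, user_id, message_id, email) -> bool:
        key = (user_id, message_id)
        if key in self.rows:
            return False          # duplicate delivery: already processed
        self.rows[key] = email
        return True

def consume(batch, mailbox) -> int:
    """Process (user_id, message_id, email) records; return how many were new."""
    stored = 0
    for user_id, message_id, email in batch:
        if mailbox.insert_if_absent(user_id, message_id, email):
            stored += 1
    return stored

mailbox = InMemoryMailbox()
batch = [("u1", "m1", "hello"), ("u1", "m2", "world")]
first = consume(batch, mailbox)
replayed = consume(batch, mailbox)   # simulates redelivery after a worker crash
print(first, replayed, len(mailbox.rows))
```

Replaying the batch stores nothing new, which is what makes at-least-once delivery safe.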

Message queue durability:

Kafka topic for incoming emails:
  - Replication factor: 3 (data survives the loss of 2 brokers; with acks=all and min.insync.replicas=2, writes tolerate 1)
  - Retention: 7 days (replay if processing worker fails)
  - Consumer group offset committed after successful storage
  - If worker crashes: re-read from last committed offset

Dead Letter Queue for Undeliverable Emails

Processing pipeline:
  Kafka → Mail Worker → [process email]
                            │
                 ┌──────────┴──────────┐
                 │                     │
              Success              Failure (after 3 retries)
                 │                     │
           Store in DB           → Dead Letter Queue (DLQ)
                                        │
                                  Operations team reviews
                                  (spam, virus, corrupt data)
                                        │
                                  Manual retry or discard

Reasons email lands in DLQ:

  • Corrupt email format (can't parse)
  • Virus detected in attachment (quarantine)
  • Storage failure (rare)
  • User storage quota exceeded
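
A hedged sketch of the retry-then-DLQ path follows. It assumes three attempts in total and uses a plain list as the dead letter queue; the function and handler names are illustrative, and a production worker would add backoff between attempts and publish to a separate Kafka topic instead.

```python
def process_with_dlq(email, handler, dead_letters, max_attempts=3):
    """Try handler(email) up to max_attempts times, then park it in the DLQ."""
    last_error = None
    for _attempt in range(max_attempts):
        try:
            handler(email)
            return True
        except Exception as exc:      # broad on purpose: any failure counts
            last_error = exc
    # Never drop silently: keep the message and the reason for manual review.
    dead_letters.append({
        "email": email,
        "error": repr(last_error),
        "attempts": max_attempts,
    })
    return False

calls = {"n": 0}

def flaky_store(email):
    """Fails twice (e.g. a storage timeout), then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("storage timeout")

def corrupt_parse(email):
    """Always fails, like an unparseable message."""
    raise ValueError("cannot parse MIME structure")

dlq = []
ok = process_with_dlq("msg-1", flaky_store, dlq)
bad = process_with_dlq("msg-2", corrupt_parse, dlq)
print(ok, bad, len(dlq))
```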

Spam Filtering

Multi-layer approach:

Layer 1: IP reputation (before SMTP connection accepted)
  - Check sender IP against known spam blacklists (Spamhaus, etc.)
  - Block at network level → cheapest check

Layer 2: Email authentication (SMTP handshake)
  - SPF: Verify sender's IP is authorized for their domain
  - DKIM: Verify email signature (wasn't tampered with)
  - DMARC: Policy for how to handle SPF/DKIM failures

Layer 3: Content filtering (during processing)
  - Rule-based: Known spam phrases, suspicious links, attachment types
  - ML-based: Trained on billions of spam/ham examples
  - Collaborative filtering: If many users mark same sender as spam → block

Layer 4: User feedback
  - User clicks "Report spam" → trains the ML model
  - "Not spam" (false positive) → whitelists sender for that user
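
The ordering matters: cheap checks run first and short-circuit the expensive ones. The sketch below shows layers 1 to 3 as a single function. The blocklist, phrase list, and the rule that failing both SPF and DKIM means spam are all simplifying assumptions; a real system would apply the sender's DMARC policy and an ML score instead.

```python
from dataclasses import dataclass

@dataclass
class Inbound:
    sender_ip: str
    spf_pass: bool
    dkim_pass: bool
    body: str

BLOCKED_IPS = {"203.0.113.7"}              # placeholder blocklist entry
SPAM_PHRASES = ("free money", "act now")   # placeholder content rules

def classify(msg: Inbound) -> str:
    # Layer 1: IP reputation, the cheapest check, done before anything else.
    if msg.sender_ip in BLOCKED_IPS:
        return "REJECT"
    # Layer 2: authentication (simplified stand-in for DMARC evaluation).
    if not (msg.spf_pass or msg.dkim_pass):
        return "SPAM"
    # Layer 3: content rules (the ML model is omitted in this sketch).
    text = msg.body.lower()
    if any(phrase in text for phrase in SPAM_PHRASES):
        return "SPAM"
    return "INBOX"

print(classify(Inbound("198.51.100.1", True, True, "Lunch tomorrow?")))
```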

Label System vs Folder System

IMAP Folders (traditional email):

  • Email belongs to exactly one folder
  • Moving email = physical move between folders
  • Simple but inflexible

Gmail Labels (modern approach):

  • Email can have multiple labels: ["INBOX", "STARRED", "WORK"]
  • Labels are metadata, not physical locations
  • "Move to folder" = add label + remove INBOX label
  • Archive = remove INBOX label (email still accessible with All Mail label)

Storage: Labels stored as SET<TEXT> in Cassandra email_metadata. Label index table enables efficient "show all emails with INBOX label" queries.
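
The label operations reduce to set updates on two structures that must stay in step: the per-message label set and the per-label index. The class below is an in-memory sketch of that dual write; the class and method names are illustrative, and the two dictionaries merely stand in for email_metadata.labels and email_by_label.

```python
class LabelStore:
    def __init__(self):
        self.labels = {}      # message_id -> set of labels
        self.by_label = {}    # label -> set of message_ids

    def add_label(self, message_id, label):
        self.labels.setdefault(message_id, set()).add(label)
        self.by_label.setdefault(label, set()).add(message_id)

    def remove_label(self, message_id, label):
        self.labels.get(message_id, set()).discard(label)
        self.by_label.get(label, set()).discard(message_id)

    def archive(self, message_id):
        # Archive = drop INBOX; the message itself is untouched.
        self.remove_label(message_id, "INBOX")

    def move_to(self, message_id, label):
        # "Move to folder" = add the target label, then drop INBOX.
        self.add_label(message_id, label)
        self.remove_label(message_id, "INBOX")

store = LabelStore()
store.add_label("m1", "INBOX")
store.add_label("m1", "STARRED")
store.move_to("m1", "WORK")
print(sorted(store.labels["m1"]))
```

No message data is copied or moved in any of these operations, which is the practical advantage over folder semantics.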


Mobile Sync Protocol

Challenge: Mobile clients can be offline for hours. When they reconnect, how do they efficiently sync?

Sync Approaches:
1. Full sync: Download everything → too slow/expensive
2. Delta sync: Download only changes since last sync

Delta sync with sequence numbers:
  - Each user has a global sequence number (monotonic counter)
  - Every change (new email, read, delete, label change) increments it
  - Client stores last_sync_sequence = 12345
  - On reconnect: GET /sync?since=12345
  - Server returns: all changes with sequence > 12345
  - Client applies changes, updates last_sync_sequence
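
The sequence-number scheme can be sketched as a per-user change log. `ChangeLog` and its methods are hypothetical names, and the list scan stands in for what would be a range query on a table keyed by (user_id, sequence).

```python
class ChangeLog:
    """Per-user ordered log of mailbox changes (in-memory illustration)."""

    def __init__(self):
        self.sequence = 0
        self.changes = []          # (sequence, change) in increasing order

    def record(self, change: dict) -> int:
        self.sequence += 1         # every change bumps the counter
        self.changes.append((self.sequence, change))
        return self.sequence

    def since(self, last_seen: int):
        """Return changes newer than last_seen plus the new sync cursor."""
        delta = [change for seq, change in self.changes if seq > last_seen]
        return delta, self.sequence

log = ChangeLog()
log.record({"type": "new_email", "message_id": "m1"})
log.record({"type": "read", "message_id": "m1"})
log.record({"type": "label_added", "message_id": "m1", "label": "WORK"})

# A client that last synced at sequence 1 reconnects.
delta, cursor = log.since(1)
print(len(delta), cursor)
```

The client stores the returned cursor as its new last_sync_sequence, so a repeated request with the same cursor returns an empty delta.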

Push vs pull:

  • New email arrives → push notification via APNs (iOS) or FCM (Android)
  • Notification wakes app → app pulls delta sync from API
  • Silent push notifications trigger a background sync without waking the screen

Final Architecture

┌────────────────────────────────────────────────────────────────┐
│                         Clients                                │
│          (Web, iOS, Android) via HTTPS + WebSocket/SSE         │
└──────────────────────────┬─────────────────────────────────────┘
                           │
                           ▼
┌────────────────────────────────────────────────────────────────┐
│                      API Gateway                               │
│              Auth | Rate Limiting | Routing                    │
└──────┬────────────────┬────────────────────┬───────────────────┘
       │                │                    │
       ▼                ▼                    ▼
┌────────────┐  ┌──────────────┐  ┌───────────────────────┐
│  Webmail   │  │ SMTP Submit  │  │  Notification Service │
│  API       │  │ Server       │  │  (APNs/FCM/WebSocket) │
│ (read/     │  │ (outgoing)   │  └───────────────────────┘
│  search/   │  └──────┬───────┘
│  label)    │         │ Kafka
└────────────┘         ▼
       │        ┌──────────────┐
       │        │ Outgoing     │
       │        │ SMTP Workers │
       │        └──────────────┘
       │
┌──────▼───────────────────────────────────────────────────────┐
│                   Storage Layer                              │
│                                                              │
│  ┌────────────────┐  ┌──────────┐  ┌─────────────────┐       │
│  │   Cassandra    │  │  S3      │  │  Elasticsearch  │       │
│  │  (email meta   │  │ (bodies, │  │  (full-text     │       │
│  │   + labels)    │  │ attachm.)│  │   search index) │       │
│  └────────────────┘  └──────────┘  └─────────────────┘       │
│                                                              │
│  ┌──────────────────────────────┐                            │
│  │  Redis (unread counts,       │                            │
│  │  session, rate limits)       │                            │
│  └──────────────────────────────┘                            │
└──────────────────────────────────────────────────────────────┘

Incoming SMTP (port 25):
External Sender → Incoming SMTP Server → Kafka → Mail Processing Workers
→ [Spam filter | Attachment scan | Thread assign] → Cassandra + S3 + ES

Design Summary: Full Email Receive Flow

1. External sender sends email to user@ourdomain.com

2. Incoming SMTP server (port 25):
   └─ Validate SPF, DKIM, DMARC
   └─ Rate limit by sender IP
   └─ Assign unique message_id (if not present in headers)
   └─ Push email to Kafka topic "incoming_emails"

3. Mail processing worker (Kafka consumer):
   └─ Check idempotency: message_id already in Cassandra? → skip
   └─ Spam filter layers 1-4
   └─ Virus scan attachments
   └─ Extract attachments → compute SHA-256 → store in S3 (dedup)
   └─ Store email body in S3: "emails/2026/04/13/{message_id}.eml"
   └─ Assign conversation_id (check In-Reply-To header)
   └─ Assign labels: ["INBOX"] (or ["SPAM"] if spam)

4. Write to Cassandra:
   └─ INSERT INTO email_metadata (user_id, message_id, subject, sender,
        read=false, labels={"INBOX"}, body_s3_key, conversation_id)
   └─ INSERT INTO email_by_label (user_id, "INBOX", message_id)

5. Async index in Elasticsearch:
   └─ Index: {user_id, message_id, subject, sender, body_snippet, labels}

6. Push notification:
   └─ Notification Service → APNs (iOS) / FCM (Android) / WebSocket (web)
   └─ "New email from Alice: Re: Invoice Q1 2026"

7. User opens email:
   └─ Read metadata from Cassandra (fast)
   └─ Fetch body from S3 via signed URL
   └─ UPDATE email_metadata SET read = true WHERE ...

Interview Questions & Answers

Q: Why not use a relational database for email storage?
A: Three problems at Gmail scale: (1) Write throughput: 115K inserts/sec exceeds single-node MySQL capacity, requiring complex sharding. (2) Full-text search: SQL LIKE queries don't scale to billions of rows; a dedicated search engine is needed. (3) Storage size: 500 TB/day of email bodies stored as BLOBs in a relational DB would be extremely inefficient compared to object storage at a fraction of the cost. Instead: Cassandra for metadata (write-optimized, partitioned by user), S3 for bodies (cheap, durable), Elasticsearch for search.

Q: How do you ensure no emails are lost (delivery guarantee)?
A: At-least-once delivery with deduplication: (1) Kafka for the message queue, replicated across 3 brokers with 7-day retention. The consumer commits its offset only after a successful DB write; if a worker crashes, processing resumes from the last committed offset. (2) Each email has a unique message_id. Before storing, check if message_id already exists (idempotency). (3) Emails that still fail after 3 retries go to a Dead Letter Queue for manual review. The combination of durable queue + idempotent consumer + DLQ ensures practical exactly-once semantics.

Q: How does email threading work?
A: RFC 2822 email headers contain Message-ID (unique ID of the email) and In-Reply-To (ID of the email being replied to). On receipt, we check In-Reply-To: if we find the parent email in our storage, we assign the same conversation_id. If no parent is found, we assign a new conversation_id. All emails in the same thread share the same conversation_id, enabling efficient thread loading with a single query.

Q: How would you implement full-text email search?
A: Elasticsearch with an inverted index. After storing an email in Cassandra + S3, we asynchronously push index documents to Elasticsearch: {user_id, message_id, subject (analyzed), sender (keyword), body snippet (analyzed, ~200 chars), labels (keyword), date}. Search query includes user_id as a mandatory filter for security. Near-real-time indexing means search is available within ~1 second of email receipt. For privacy/multi-tenancy, we shard the index by user cohorts.

Q: How do you handle a user who is over their storage quota?
A: (1) Quota tracked in real-time: Redis counter per user, incremented on email receive, decremented on delete. (2) On new email: check quota before storing. If over quota, reject at the SMTP level with 552 ("exceeded storage allocation") so the sender receives a bounce notification; 452 ("insufficient system storage") is the temporary variant that tells the sender to retry later. (3) Warning notifications: push notification and email when at 80%, 90%, 95% of quota. (4) Graceful degradation: read-only mode (can read and delete existing emails but cannot receive new ones). (5) Quota updates propagated via a quota service rather than inline, to keep the critical path fast.


Key Takeaways

  1. Never store email bodies in relational DB at scale: Use object storage (S3) for bodies/attachments, wide-column store (Cassandra) for metadata
  2. Attachment deduplication with SHA-256: Same content stored once regardless of how many emails reference it, which is critical at PB scale
  3. At-least-once + idempotent consumer = practical exactly-once: message_id deduplication before DB write handles duplicates from queue retries
  4. Elasticsearch for search, Cassandra for storage: Each tool does one thing well, so don't try to make Cassandra searchable or Elasticsearch the system of record
  5. Email threading via In-Reply-To header: conversation_id assigned at processing time based on RFC 2822 headers, so no complex graph traversal is needed at query time
  6. Dead letter queue: Unprocessable emails need a holding area for manual review; never silently drop them
  7. Two-protocol reality: SMTP sends outgoing mail; IMAP/custom sync protocol handles client-side access. Understanding this split is expected in interviews


Last Updated: 2026-04-13
Status: Hard but very high-signal interview; tests storage system design, protocol knowledge, and distributed systems patterns