Chapter 15: Design Google Drive (Cloud Storage Service)

volume1 google-drive cloud-storage sync file-system

Status: 🟩 Interview ready
Difficulty: Hard
Time to complete: 45 min read + practice


Overview

Google Drive is a cloud storage service that lets users store, sync, and share files across all their devices. Designing it means solving problems around large file handling, efficient sync (don't re-upload unchanged data), conflict resolution, and durability (no data loss).

Why this matters:

  • Common hard interview question
  • Covers block storage, delta sync, consistency, conflict resolution
  • Real-world: Google Drive, Dropbox, iCloud, OneDrive, Box

Problem Statement

Design a cloud storage service that:

  • Lets users upload/download any file type
  • Syncs files across all of a user's devices
  • Allows file sharing with other users
  • Keeps revision history (previous versions)
  • Works reliably with no data loss

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

  • What file types? → Any file type (documents, photos, videos, zip files)
  • Mobile and desktop? → Yes, all platforms
  • Sharing? → Yes, with read-only or edit permissions
  • Versioning? → Yes, keep previous revisions
  • Collaboration (real-time co-edit)? → No, out of scope (focus on sync)
  • Offline access? → Yes, sync when reconnected

Scope:

  • Upload and download files from any device
  • Sync changes across a user's devices automatically
  • Share files/folders with specific users
  • View and restore previous file versions
  • Conflict detection and resolution

Non-Functional Requirements

  • Reliability: No data loss. Files must be durable across datacenter failures
  • Availability: 99.9% uptime (data sync can be slightly delayed, not lost)
  • Scalability: 50M users, 10M DAU
  • Performance: Fast sync (delta-only, not full re-upload)
  • Storage efficiency: Deduplication to reduce storage cost
  • Consistency: Strong consistency for metadata, eventual for file content

Scale Estimation

Users:
  50M total users, 10M DAU
  Each user: 10 GB free storage
  Total storage: 50M × 10 GB = 500 PB (upper bound: assumes every user fills the quota)

Read/write ratio:
  ~1:1 (users frequently upload AND download/view)

Upload requests:
  10M DAU × 2 uploads/day avg = 20M uploads/day
  ≈ 231 uploads/second

Metadata operations:
  File listing, search, version history = 10× upload ops
  ≈ 2,300 ops/second

Storage (with deduplication and compression):
  Assume 40% deduplication ratio
  500 PB raw × 0.6 = 300 PB actual storage
  With replication (3×): 900 PB total physical storage
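
The arithmetic above can be reproduced in a few lines. This is only a sanity check of the stated assumptions (decimal units, every user filling the 10 GB quota, 40% of bytes removed by deduplication); the variable names are illustrative.

```python
# Back-of-envelope capacity estimate using the figures from the text.
TOTAL_USERS = 50_000_000
DAU = 10_000_000
QUOTA_GB = 10
UPLOADS_PER_USER_PER_DAY = 2
METADATA_OPS_PER_UPLOAD = 10   # listing, search, version history
DEDUP_RATIO = 0.40             # fraction of raw bytes removed by dedup
REPLICATION_FACTOR = 3

SECONDS_PER_DAY = 86_400
GB_PER_PB = 1_000_000          # decimal units: 1 PB = 10^6 GB

raw_pb = TOTAL_USERS * QUOTA_GB / GB_PER_PB
upload_qps = DAU * UPLOADS_PER_USER_PER_DAY / SECONDS_PER_DAY
metadata_qps = upload_qps * METADATA_OPS_PER_UPLOAD
deduped_pb = raw_pb * (1 - DEDUP_RATIO)
physical_pb = deduped_pb * REPLICATION_FACTOR

print(f"raw storage:       {raw_pb:.0f} PB")
print(f"uploads/second:    {upload_qps:.0f}")
print(f"metadata ops/sec:  {metadata_qps:.0f}")
print(f"after dedup:       {deduped_pb:.0f} PB")
print(f"with replication:  {physical_pb:.0f} PB")
```

Changing a single input (for example a 20% dedup ratio) shows how sensitive the physical storage figure is to that assumption.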

Step 2: High-Level Design (10 min)

Core Flows

Flow 1: File Upload

Client → Block Servers → Cloud Storage (S3)
Client → API Servers  → Metadata DB (MySQL)

Flow 2: File Sync to Other Devices

Client A (uploads) → API Servers → Metadata DB
                                 → Notification Service
                                          ↓
                              Client B (same user, another device)
                                      receives sync notification
                                      downloads only changed blocks

Flow 3: File Download

Client → API Servers → Metadata DB (find block list)
Client → Block Servers / S3 (fetch blocks directly)
Client → Reassemble blocks → file

Component Overview

Component              | Purpose
-----------------------|----------------------------------------------------------------
Block Servers          | Split files into blocks, compute hashes, compress, upload to S3
Cloud Storage (S3)     | Durable object storage for file blocks
Cold Storage (Glacier) | Old file versions not recently accessed
Load Balancers         | Distribute API traffic
API Servers            | Handle file CRUD, share, version requests
Metadata DB (MySQL)    | Files, blocks, users, workspace, sharing info
Metadata Cache (Redis) | Cache frequently accessed file metadata
Notification Service   | Push sync events to connected clients
Offline Backup Queue   | Queue sync jobs for offline clients

High-Level Architecture Diagram

┌──────────────┐     ┌─────────────────────────────────────────────┐
│   Client A   │     │                API Layer                    │
│  (Laptop)    │     │                                             │
│  - Watcher   │     │  ┌──────────┐  ┌───────────┐  ┌─────────┐   │
│  - Chunker   │────→│  │  Load    │  │   API     │  │ Redis   │   │
│  - Indexer   │     │  │ Balancer │──│  Servers  │──│ Cache   │   │
│  - Sync      │     │  └──────────┘  └─────┬─────┘  └─────────┘   │
└──────────────┘     │                      │                      │
                     └──────────────────────┼──────────────────────┘
                                            │
                    ┌───────────────────────┼───────────────────────┐
                    ▼                       ▼                       ▼
             ┌───────────┐         ┌────────────────┐     ┌──────────────┐
             │  Metadata │         │  Block Servers │     │Notification  │
             │  DB       │         │  (hash, split, │     │  Service     │
             │ (MySQL)   │         │  compress)     │     │(Long Polling)│
             └───────────┘         └───────┬────────┘     └──────┬───────┘
                                           │                     │
                                           ▼                     ▼
                                   ┌───────────────┐    ┌────────────────┐
                                   │  S3 (blocks)  │    │   Client B     │
                                   │               │    │   (Phone)      │
                                   │  Cold Storage │    │  receives sync │
                                   │  (Glacier for │    │  notification  │
                                   │   old vers.)  │    └────────────────┘
                                   └───────────────┘

Step 3: Deep Dive (20 min)

Block Storage Design

Core idea: Instead of uploading whole files, split into fixed-size blocks and only sync changed blocks.

Why Blocks?

Without blocks:
  User edits 1 line in a 100 MB Word doc
  → Must re-upload full 100 MB ❌ (wastes bandwidth)

With blocks (4 MB each):
  100 MB file = 25 blocks
  Edit affects only 1 block
  → Upload only 1 changed block (4 MB) ✅ (25× less bandwidth)

Block Design

Block properties:
  - Fixed size: 4 MB per block (tunable)
  - Content-addressable: block identified by SHA256(content)
  - Compressed: LZ4 or zlib before upload (20-40% size reduction)
  - Encrypted: AES-256 before upload (client-side encryption option)

File → Blocks mapping:
  document.pdf (12 MB)
  → Block A: SHA256=a1b2c3... (4 MB)
  → Block B: SHA256=d4e5f6... (4 MB)
  → Block C: SHA256=g7h8i9... (4 MB)

After edit (middle section changed):
  → Block A: SHA256=a1b2c3... (unchanged, DO NOT re-upload)
  → Block B: SHA256=x9y8z7... (CHANGED, upload this block)
  → Block C: SHA256=g7h8i9... (unchanged, DO NOT re-upload)

Upload: Only 4 MB instead of 12 MB → 3× bandwidth saving
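
A minimal sketch of the split-and-hash step, assuming plain fixed-size slicing and SHA-256 from the standard library. The function names are illustrative, and the demo uses 4-byte blocks instead of 4 MB so the data stays readable.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the block size used in the text


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into consecutive fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Content address of each block: hex SHA-256 of the block bytes."""
    return [hashlib.sha256(b).hexdigest() for b in split_into_blocks(data, block_size)]


# Tiny 4-byte blocks keep the demo readable; a real client would use BLOCK_SIZE.
before = block_hashes(b"AAAABBBBCCCC", block_size=4)
after = block_hashes(b"AAAAXXXXCCCC", block_size=4)   # middle block edited in place
changed = [seq for seq, (old, new) in enumerate(zip(before, after)) if old != new]
print(changed)  # [1]
```

Note that fixed-size boundaries only stay aligned for in-place edits. Inserting bytes near the start of a file shifts every later block, so all of their hashes change; handling that case well needs a different chunking scheme, which this design does not cover.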

Block Deduplication

Same content, stored once:

User A uploads vacation_photo.jpg (3 MB, fits in a single block)
  Block hash: SHA256 = "abc123"
  Stored in S3 at key: blocks/abc123

User B also has the same photo (common: shared family photos)
  Block hash: SHA256 = "abc123"  (same!)
  S3 already has blocks/abc123 → DO NOT store again!
  Just add a reference in the metadata DB: increment ref_count for block abc123

Storage saving: 1 copy stored, N users reference it

Block deduplication in Metadata DB:

-- blocks table: one row per unique block
CREATE TABLE blocks (
    block_hash   VARCHAR(64) PRIMARY KEY,  -- SHA256
    s3_key       VARCHAR(500) NOT NULL,
    size_bytes   INT,
    ref_count    INT DEFAULT 1,            -- how many files reference this
    created_at   TIMESTAMP
);
 
-- file_blocks: which blocks make up each file version
CREATE TABLE file_blocks (
    file_id      VARCHAR(36),
    version      INT,
    block_seq    INT,    -- position in file
    block_hash   VARCHAR(64),
    PRIMARY KEY (file_id, version, block_seq)
);
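
The reference-counting idea behind the blocks table can be sketched in memory. Here `BlockStore` is a hypothetical stand-in for S3 plus the blocks table, not a real client library, and it ignores concurrency.

```python
import hashlib


class BlockStore:
    """In-memory stand-in for S3 plus the `blocks` table (illustrative only)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}   # storage key -> block content
        self.ref_count: dict[str, int] = {}   # block_hash -> number of references

    def put(self, content: bytes) -> tuple[str, bool]:
        """Store a block; returns (hash, uploaded). uploaded is False on a dedup hit."""
        block_hash = hashlib.sha256(content).hexdigest()
        if block_hash in self.ref_count:
            self.ref_count[block_hash] += 1        # same content: just add a reference
            return block_hash, False
        self.objects[f"blocks/{block_hash}"] = content
        self.ref_count[block_hash] = 1
        return block_hash, True

    def release(self, block_hash: str) -> None:
        """Drop one reference; delete the object once nothing points at it."""
        self.ref_count[block_hash] -= 1
        if self.ref_count[block_hash] == 0:
            del self.ref_count[block_hash]
            del self.objects[f"blocks/{block_hash}"]


store = BlockStore()
hash_a, uploaded_a = store.put(b"vacation photo bytes")   # user A
hash_b, uploaded_b = store.put(b"vacation photo bytes")   # user B, same content
print(uploaded_a, uploaded_b, store.ref_count[hash_a], len(store.objects))
# True False 2 1
```

In a real deployment the check and the increment would have to happen atomically (for example inside one database transaction), otherwise two concurrent uploads of the same block could race.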

Delta Sync

Delta sync = only transfer the changed portion of a file.

Delta sync flow:
  1. Client Watcher detects file_changed event
  2. Client Chunker re-splits file into blocks
  3. Client Indexer computes SHA256 for each block
  4. Client compares new block hashes vs last-known block hashes
  5. Client sends only CHANGED blocks to Block Server
  6. Block Server stores new blocks in S3
  7. Metadata DB updated with new block list for new version

Example:
  report.docx v1: [blockA, blockB, blockC, blockD]
  report.docx v2: [blockA, blockB', blockC, blockD]  (blockB changed)

  Delta upload: Only blockB' (4 MB out of 16 MB) = 75% bandwidth saved
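
Step 4 of the flow (compare new block hashes against the last-known ones) is a set difference. A minimal sketch, using the block labels from the example in place of real hashes; `blocks_to_upload` is an illustrative name.

```python
def blocks_to_upload(old_hashes: list[str], new_hashes: list[str]) -> list[tuple[int, str]]:
    """Return (position, hash) for blocks in the new version the server has not seen.

    A hash that appears several times in the new version is uploaded only once.
    """
    known = set(old_hashes)
    pending: list[tuple[int, str]] = []
    for seq, block_hash in enumerate(new_hashes):
        if block_hash not in known:
            pending.append((seq, block_hash))
            known.add(block_hash)
    return pending


v1 = ["blockA", "blockB", "blockC", "blockD"]
v2 = ["blockA", "blockB'", "blockC", "blockD"]

delta = blocks_to_upload(v1, v2)
saved = 1 - len(delta) / len(v2)
print(delta, f"{saved:.0%} bandwidth saved")  # [(1, "blockB'")] 75% bandwidth saved
```

The new version's full block list still goes to the metadata DB; only the block payloads are skipped.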

Sync Client Architecture

The client has four components working together:

┌──────────────────────────────────────────────────────────┐
│                    Sync Client                           │
│                                                          │
│  ┌─────────────┐   ┌──────────────┐   ┌──────────────┐   │
│  │   Watcher   │──→│   Chunker    │──→│   Indexer    │   │
│  │             │   │              │   │              │   │
│  │ Monitors    │   │ Splits file  │   │ Computes     │   │
│  │ local file  │   │ into 4MB     │   │ SHA256 per   │   │
│  │ system for  │   │ blocks       │   │ block        │   │
│  │ changes     │   │              │   │ Updates      │   │
│  │ (inotify /  │   │              │   │ local DB     │   │
│  │  FSEvents)  │   │              │   │              │   │
│  └─────────────┘   └──────────────┘   └───────┬──────┘   │
│                                               │          │
│                                       ┌───────▼───────┐  │
│                                       │  Sync Engine  │  │
│                                       │               │  │
│                                       │ Computes diff │  │
│                                       │ (which blocks │  │
│                                       │ changed)      │  │
│                                       │ Uploads delta │  │
│                                       │ Handles retry │  │
│                                       └───────────────┘  │
└──────────────────────────────────────────────────────────┘

Component   | Technology                                                          | Responsibility
------------|---------------------------------------------------------------------|--------------------------------------------------------------
Watcher     | inotify (Linux), FSEvents (macOS), ReadDirectoryChangesW (Windows)  | Detect file create/modify/delete/rename
Chunker     | Custom splitter                                                     | Split file into 4 MB blocks at stable boundaries
Indexer     | SHA256 hashing                                                      | Compute block hashes, maintain local block index DB
Sync Engine | Core logic                                                          | Diff old vs new block list, schedule uploads, handle conflicts

Metadata DB Design

-- Users
CREATE TABLE users (
    user_id    VARCHAR(36) PRIMARY KEY,
    email      VARCHAR(255) UNIQUE,
    storage_used_bytes  BIGINT DEFAULT 0,
    storage_limit_bytes BIGINT DEFAULT 10737418240  -- 10 GB
);
 
-- Files and Folders
CREATE TABLE files (
    file_id      VARCHAR(36) PRIMARY KEY,
    owner_id     VARCHAR(36),
    parent_id    VARCHAR(36),   -- parent folder (NULL = root)
    name         VARCHAR(500),
    is_folder    BOOLEAN DEFAULT FALSE,
    latest_version INT DEFAULT 1,
    trashed      BOOLEAN DEFAULT FALSE,
    created_at   TIMESTAMP,
    updated_at   TIMESTAMP,
    INDEX idx_owner_parent (owner_id, parent_id)
);
 
-- File Versions (revision history)
CREATE TABLE file_versions (
    file_id      VARCHAR(36),
    version      INT,
    created_at   TIMESTAMP,
    size_bytes   BIGINT,
    checksum     VARCHAR(64),   -- SHA256 of full file
    created_by   VARCHAR(36),   -- user who made this version
    PRIMARY KEY (file_id, version)
);
 
-- Sharing permissions
CREATE TABLE file_shares (
    file_id      VARCHAR(36),
    shared_with  VARCHAR(36),   -- user_id or group_id
    permission   ENUM('view', 'edit', 'owner'),
    shared_at    TIMESTAMP,
    PRIMARY KEY (file_id, shared_with)
);

File Versioning

Versioning strategy:
  Keep ALL block versions in S3 (never delete blocks in use)
  Metadata DB tracks version history (which blocks = which version)
  User can restore any previous version

Version 1: [blockA_v1, blockB_v1, blockC_v1]    ← all blocks in S3
Version 2: [blockA_v1, blockB_v2, blockC_v1]    ← only blockB changed
Version 3: [blockA_v1, blockB_v2, blockC_v3]    ← only blockC changed

Storage efficiency:
  Instead of 3 full file copies → store unique blocks only
  Blocks shared across versions are stored ONCE
  All 3 versions together = 5 unique blocks (A_v1, B_v1, C_v1, B_v2, C_v3), not 9 (3× the file)
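
The saving can be checked by counting unique blocks across the three versions listed above. A small sketch, assuming every block is a full 4 MB; the dictionary stands in for the file_blocks table.

```python
BLOCK_MB = 4  # assumes every block is a full 4 MB block

# version number -> ordered block list, as in the example above
versions = {
    1: ["blockA_v1", "blockB_v1", "blockC_v1"],
    2: ["blockA_v1", "blockB_v2", "blockC_v1"],
    3: ["blockA_v1", "blockB_v2", "blockC_v3"],
}

unique_blocks = {block for blocks in versions.values() for block in blocks}
full_copies_mb = sum(len(blocks) for blocks in versions.values()) * BLOCK_MB
stored_mb = len(unique_blocks) * BLOCK_MB


def restore(version: int) -> list[str]:
    """Restoring a version only needs its block list; the blocks are already stored."""
    return list(versions[version])


print(f"{full_copies_mb} MB as full copies vs {stored_mb} MB as unique blocks")
# 36 MB as full copies vs 20 MB as unique blocks
print(restore(1))
```

Restoring version 1 copies no data: it creates a new version whose block list equals the old one.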

Version cleanup (storage management):
  After 30 days: keep max 30 daily versions
  After 1 year: keep max 12 monthly snapshots
  Move old version blocks to S3 Glacier (cheaper)

Notification Service

When to use Long Polling vs WebSocket:

Options:
  1. Long Polling (HTTP)
  2. WebSocket (persistent TCP)
  3. Server-Sent Events (SSE)

Why long polling for Drive sync?

Long Polling:
  Client sends GET /changes?since=<timestamp>
  Server holds request open (up to 30 sec)
  When change happens: server responds immediately
  Client processes change, immediately opens next poll
  On no change: server responds after timeout, client re-polls

Why NOT WebSocket?
  - WebSockets are bidirectional (client also pushes to server)
  - Drive sync is mostly server → client (server tells client "here's a change")
  - A persistent WebSocket per device is more machinery than infrequent one-directional notifications need
  - Long polling is simpler to implement with existing HTTP infrastructure
  - Load balancers (HAProxy, NGINX) handle plain HTTP out of the box; WebSocket needs upgrade and timeout configuration

When to prefer WebSocket:
  - Real-time collaboration (Google Docs: many clients push edits simultaneously)
  - Chat applications (truly bidirectional)
  - Live dashboards with high update frequency

Long polling implementation:

Client: GET /api/changes?user_id=abc&since=1640000000&timeout=30

Server:
  1. Check if any changes since `since` timestamp
  2. If YES: return changes immediately (HTTP 200)
  3. If NO: hold connection, subscribe to change events for this user
  4. When change arrives: return immediately
  5. After 30s timeout with no change: return empty (HTTP 200, [])
  Client receives response β†’ process β†’ immediately re-issue request
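
The server-side steps can be sketched with asyncio. This is a single-process, in-memory illustration, not a production notification service: `ChangeFeed` is a hypothetical class, and it uses a sequence number as the cursor rather than the timestamp shown in the request above, to avoid clock-resolution issues.

```python
import asyncio


class ChangeFeed:
    """Change log for one user with long-poll support (in-memory sketch)."""

    def __init__(self) -> None:
        self._changes: list[tuple[int, str]] = []   # (sequence number, description)
        self._cond = asyncio.Condition()

    def _since(self, cursor: int) -> list[tuple[int, str]]:
        return [change for change in self._changes if change[0] > cursor]

    async def publish(self, description: str) -> None:
        async with self._cond:
            self._changes.append((len(self._changes) + 1, description))
            self._cond.notify_all()                 # wake every held poll

    async def poll(self, cursor: int, timeout: float) -> list[tuple[int, str]]:
        """Return changes after `cursor`, holding the request up to `timeout` seconds."""
        async with self._cond:
            try:
                await asyncio.wait_for(
                    self._cond.wait_for(lambda: bool(self._since(cursor))), timeout
                )
            except asyncio.TimeoutError:
                return []                           # step 5: empty response, client re-polls
            return self._since(cursor)


async def demo() -> tuple[list, list]:
    feed = ChangeFeed()
    # No change within the timeout: the poll comes back empty.
    empty = await feed.poll(cursor=0, timeout=0.05)
    # A change arrives while a poll is held open: the poll returns early.
    held = asyncio.create_task(feed.poll(cursor=0, timeout=5.0))
    await asyncio.sleep(0.01)
    await feed.publish("report.docx updated to v2")
    return empty, await held


empty, changes = asyncio.run(demo())
print(empty, changes)  # [] [(1, 'report.docx updated to v2')]
```

After each response the client would store the highest sequence number it received and send it as the cursor of the next request.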

Consistency Model

Strong consistency needed for:
  - Metadata (file names, folder structure, versions)
  - Why: User must see their own writes immediately
  - If metadata inconsistent β†’ sync could corrupt file structure
  - Use: MySQL with synchronous replication, read from primary

Eventual consistency acceptable for:
  - File content (block data in S3)
  - Why: Blocks are immutable once written (content-addressable)
  - A block hash identifies its content, so two writers cannot create conflicting versions of the same block
  - The design tolerates eventual consistency (a block either exists or it doesn't); S3 itself has provided strong read-after-write consistency since December 2020
  - Use: S3 Standard (designed for 99.999999999% durability)

Conflict resolution:
  Conflict: Two devices edit the same file while at least one is offline
  Strategy: the first sync keeps the file name; the later sync is saved as a "conflicted copy", with user notification
    - Both versions are saved, nothing is overwritten
    - User sees "Conflicting version" warning
    - User manually merges (like Dropbox "conflicted copy")
  No auto-merge for binary files (photos, PDFs): too complex
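
Conflict detection comes down to comparing the version a device last synced (its base) with the server's current version. A minimal sketch under that assumption; `commit_edit` and `CommitResult` are illustrative names, and the copy naming follows the Dropbox-style pattern mentioned above.

```python
import os
from dataclasses import dataclass
from datetime import date


@dataclass
class CommitResult:
    stored_as: str   # file name the uploaded content ends up under
    version: int     # version number within that file's history
    conflict: bool


def conflicted_copy_name(name: str, device: str, day: date) -> str:
    stem, ext = os.path.splitext(name)
    return f"{stem} ({device}'s conflicted copy {day.isoformat()}){ext}"


def commit_edit(name: str, server_version: int, base_version: int,
                device: str, day: date) -> CommitResult:
    """base_version is the version the device had last synced before editing."""
    if base_version == server_version:
        # Nobody else committed in the meantime: append a new version.
        return CommitResult(name, server_version + 1, False)
    # Another device committed first: keep this edit as a separate file.
    return CommitResult(conflicted_copy_name(name, device, day), 1, True)


today = date(2026, 4, 13)
laptop = commit_edit("report.docx", server_version=3, base_version=3,
                     device="Laptop", day=today)
phone = commit_edit("report.docx", server_version=laptop.version, base_version=3,
                    device="Phone", day=today)
print(laptop.stored_as, laptop.version)   # report.docx 4
print(phone.stored_as)                    # report (Phone's conflicted copy 2026-04-13).docx
```

On the server the version comparison and the write would need to be one atomic step (for example a conditional update on latest_version), otherwise two devices could both pass the check.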

Upload Flow with Pre-signed URLs

Upload flow (large file):
1. Client sends POST /upload/init
   Body: { filename, size, sha256_checksum, block_hashes: [...] }

2. API Server:
   - Creates file record in MySQL (status=UPLOADING)
   - For each block: check if block_hash exists in blocks table
   - Returns list of blocks that need uploading (others already exist!)
   - Generates pre-signed S3 URL per missing block

3. Client uploads only MISSING blocks directly to S3
   - Parallel uploads (up to 4 simultaneous)
   - Retry on failure (exponential backoff)

4. Client sends POST /upload/complete
   Body: { file_id, block_list: [{ seq, hash }] }

5. API Server:
   - Verifies all block hashes exist in S3
   - Creates file_version record
   - Updates file status → READY
   - Publishes change event to notification service

Flow diagram:
  Client ──POST /upload/init──→ API Server
  API Server ──────────────────────────────→ MySQL (check existing blocks)
  API Server ←──────────────────────────────── missing blocks list
  API Server ──sign presigned S3 URLs (computed locally, no request to S3)
  Client ←──── presigned URLs for missing blocks ──── API Server
  Client ──PUT block directly──→ S3  (4 MB per block, parallel)
  Client ──POST /upload/complete──→ API Server
  API Server ──update metadata──→ MySQL
  API Server ──publish change──→ Notification Service
  Other devices receive sync event
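
The init / upload / complete handshake can be simulated in memory to show where deduplication saves transfers. `UploadService` is a hypothetical stand-in for the API server, the blocks table, and S3; `put_block` takes the place of a PUT to a pre-signed URL, and no real AWS API is called.

```python
import hashlib


class UploadService:
    """In-memory stand-in for the API server, the blocks table, and S3."""

    def __init__(self) -> None:
        self.stored_blocks: set[str] = set()   # hashes already in object storage
        self.files: dict[str, dict] = {}       # file_id -> metadata record

    def init_upload(self, file_id: str, filename: str, hashes: list[str]) -> list[str]:
        """Step 2: record the file as UPLOADING, return the hashes the client must send."""
        self.files[file_id] = {"name": filename, "status": "UPLOADING", "blocks": []}
        # dict.fromkeys drops duplicate hashes while keeping their order.
        return [h for h in dict.fromkeys(hashes) if h not in self.stored_blocks]

    def put_block(self, content: bytes) -> str:
        """Step 3: stands in for a PUT to a pre-signed URL."""
        digest = hashlib.sha256(content).hexdigest()
        self.stored_blocks.add(digest)
        return digest

    def complete_upload(self, file_id: str, hashes: list[str]) -> None:
        """Step 5: verify every block is present, then mark the file READY."""
        missing = [h for h in hashes if h not in self.stored_blocks]
        if missing:
            raise ValueError(f"{len(missing)} block(s) not uploaded yet")
        self.files[file_id].update(status="READY", blocks=list(hashes))


def sha(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


service = UploadService()
first = [b"intro", b"body"]
second = [b"intro", b"appendix"]          # shares one block with the first file

need_1 = service.init_upload("f1", "a.pdf", [sha(b) for b in first])
for block in first:
    service.put_block(block)
service.complete_upload("f1", [sha(b) for b in first])

need_2 = service.init_upload("f2", "b.pdf", [sha(b) for b in second])
print(len(need_1), len(need_2))  # 2 1
```

The second file only needs one upload because its first block is already stored. Calling `complete_upload` before the missing block arrives raises an error, which is the check in step 5.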

Storage Tiering

Hot storage (S3 Standard):
  Current file versions, recently accessed files
  Cost: $0.023/GB/month
  Access time: milliseconds

Warm storage (S3 Standard-IA):
  Files not accessed in 30 days
  Cost: $0.0125/GB/month (46% cheaper)
  Access time: milliseconds (a per-GB retrieval fee applies)

Cold storage (S3 Glacier Instant Retrieval):
  Old file versions (not current), not accessed in 90+ days
  Cost: $0.004/GB/month (83% cheaper than standard)
  Access time: milliseconds (retrieval cost applies)

Archive (S3 Glacier Deep Archive):
  Very old versions, legal/compliance retention
  Cost: $0.00099/GB/month
  Access time: up to 12 hours (standard retrieval)

Lifecycle policy:
  Day 0: Upload → S3 Standard
  Day 30: → S3 Standard-IA (if not accessed)
  Day 90: old versions → S3 Glacier Instant
  Day 365: → S3 Glacier Deep Archive
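
The lifecycle policy can be read as a small decision function. The sketch below makes two assumptions that the text leaves open: the thresholds count days since last access, and the Day 365 move to Deep Archive applies only to non-current versions. Prices are the per-GB figures quoted above and will drift over time and by region.

```python
# Per-GB monthly prices quoted in the text (subject to change).
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Standard-IA": 0.0125,
    "S3 Glacier Instant Retrieval": 0.004,
    "S3 Glacier Deep Archive": 0.00099,
}


def storage_tier(days_since_access: int, is_current_version: bool) -> str:
    """Pick a tier following the lifecycle policy (assumptions noted in the lead-in)."""
    if not is_current_version:
        if days_since_access >= 365:
            return "S3 Glacier Deep Archive"
        if days_since_access >= 90:
            return "S3 Glacier Instant Retrieval"
    if days_since_access >= 30:
        return "S3 Standard-IA"
    return "S3 Standard"


def monthly_cost_usd(size_gb: float, tier: str) -> float:
    return size_gb * PRICE_PER_GB_MONTH[tier]


for days, current in [(5, True), (45, True), (120, False), (400, False)]:
    tier = storage_tier(days, current)
    print(f"{days:>3} days, current={current}: {tier}, "
          f"1 TB costs ${monthly_cost_usd(1000, tier):.2f}/month")
```

In practice this logic would live in the storage provider's lifecycle rules rather than in application code; the function only makes the policy's branches explicit.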

Design Summary

Final Architecture

┌────────────────────────────────────────────────────────────────────┐
│                      CLIENT (Desktop/Mobile)                       │
│                                                                    │
│  Watcher → Chunker → Indexer → Sync Engine                         │
│  (detect)   (split)  (hash)    (upload delta, handle conflicts)    │
└──────────────────────────────┬─────────────────────────────────────┘
                               │ HTTPS
              ┌────────────────▼────────────────────┐
              │           Load Balancer             │
              └────────────────┬────────────────────┘
                               │
         ┌─────────────────────┼──────────────────────┐
         ▼                     ▼                      ▼
  ┌─────────────┐      ┌──────────────┐      ┌──────────────┐
  │  API Server │      │ Block Server │      │Notification  │
  │             │      │              │      │   Server     │
  │ - File CRUD │      │ - Split      │      │              │
  │ - Share     │      │ - Hash       │      │ - Long poll  │
  │ - Versions  │      │ - Compress   │      │ - Fan out    │
  └──────┬──────┘      └──────┬───────┘      └──────┬───────┘
         │                    │                     │
         ▼                    ▼                     ▼
  ┌─────────────┐    ┌──────────────┐      ┌──────────────┐
  │  MySQL      │    │  S3          │      │  Client B    │
  │  Metadata   │    │  (blocks)    │      │  (phone,     │
  │  - Files    │    │              │      │   tablet)    │
  │  - Blocks   │    │  S3 Glacier  │      │              │
  │  - Versions │    │  (old vers.) │      │ downloads    │
  │  - Shares   │    └──────────────┘      │ changed      │
  └──────┬──────┘                          │ blocks only  │
         │                                 └──────────────┘
  ┌──────▼──────┐
  │ Redis Cache │
  │ (metadata)  │
  └─────────────┘

Key Decisions Summary

Decision       | Choice                             | Reasoning
---------------|------------------------------------|---------------------------------------------
File storage   | Block-level (4 MB chunks)          | Delta sync: only upload changed blocks
Block ID       | SHA256 hash (content-addressable)  | Deduplication, immutable blocks
Deduplication  | Block hash lookup before upload    | Store each block once, huge storage savings
Metadata DB    | MySQL (strong consistency)         | ACID for file metadata, version history
Metadata cache | Redis in front of MySQL            | Fast cache for hot data
File content   | S3 (eventual consistency OK)       | Immutable blocks: no conflicts
Notification   | Long polling                       | Mostly server→client, simpler than WebSocket
Upload method  | Pre-signed URLs + multipart        | Block servers not a bottleneck
Old versions   | S3 Glacier                         | 83% cost reduction for cold data

Interview Questions & Answers

Q: Why split files into 4 MB blocks instead of uploading the whole file?
A: Delta sync. When a user edits a large file (e.g., modifies one paragraph in a 100 MB document), only one or two 4 MB blocks change. Instead of uploading 100 MB again, only the changed blocks are uploaded, potentially 1/25th of the bandwidth. Block-level storage also enables deduplication: if two users have the same file (or a file shares blocks with another), only one copy of each unique block is stored. This saves both bandwidth and storage.

Q: How does block deduplication work? What are the risks?
A: Before uploading a block, the client computes its SHA256 hash and sends it to the API server. The server checks whether that hash exists in the blocks table. If it does, the block is already in S3: skip the upload and just record the reference. Risk 1: hash collision (two different blocks produce the same SHA256), theoretically possible but astronomically unlikely (2^256 space). Risk 2: ownership claim by hash. An attacker who learns a block's hash could "claim" the block without ever holding its content. Mitigation: for sensitive files, use client-side encryption (encrypt before hashing, so the attacker cannot read the content even if they know the hash).

Q: Why long polling for the notification service instead of WebSocket?
A: Drive sync is primarily a server-to-client flow: the server notifies clients that a change has occurred. WebSocket is optimized for bidirectional communication. Long polling is simpler to implement, works through existing HTTP infrastructure and proxies, doesn't require special load balancer configuration, and adds little overhead when notifications are infrequent. WebSocket would be preferred for real-time co-editing (Google Docs), where multiple clients are simultaneously pushing changes.

Q: How do you handle file conflicts when two devices edit the same file offline?
A: When both devices reconnect, the system detects a conflict (both have a version newer than the last synced state). Both versions are preserved: the second sync creates a "conflicted copy" with a different filename (like Dropbox does: "report (John's conflicted copy 2026-04-13).docx"). The user is notified and must manually merge. Auto-merge is only feasible for text files where a three-way merge algorithm (like git) can be applied. Binary files (photos, PDFs) cannot be auto-merged.

Q: How do you ensure strong consistency for metadata?
A: Metadata uses MySQL with synchronous replication to a hot standby. All writes go to the primary; reads go to the primary too (or to a replica with bounded replication lag). A Redis cache sits in front of MySQL for hot reads (file listings, recent files). For critical operations (sharing permissions, version creation), always read from the MySQL primary to avoid stale cache entries. Use transactions for multi-table updates (e.g., creating a version and updating latest_version on the file row atomically).

Q: How would you scale to 500 million users?
A: (1) Shard MySQL by user_id so that each shard handles a subset of users. (2) Federate by user geography (EU data stays in EU for GDPR). (3) Scale block servers horizontally; they're stateless. (4) S3 is a managed service that scales horizontally, so block storage capacity is not the bottleneck. (5) Notification service: use pub/sub (Kafka), with each user_id mapped to a partition and notification consumers fanning out to connected clients. (6) Add read replicas for metadata DB. (7) Add CDN for file downloads (popular shared files cached at edge).
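
Point (1) can be sketched as a stable hash of the user ID. This assumes simple modulo placement and a hypothetical shard count of 16; the function name is illustrative.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index that stays the same across processes and restarts."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 16  # hypothetical shard count

counts = [0] * NUM_SHARDS
for n in range(10_000):
    counts[shard_for(f"user-{n}", NUM_SHARDS)] += 1

print(min(counts), max(counts))  # spread of users per shard
```

Python's built-in `hash()` is salted per process for strings, so it is not suitable here. Modulo placement also moves most users when the shard count changes; a lookup directory or consistent hashing would reduce that cost, at the price of more moving parts.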


Key Takeaways

  1. Block-level storage (4 MB chunks) enables delta sync, the defining optimization for cloud storage
  2. Content-addressable storage (SHA256 block ID) enables deduplication: each unique block is stored exactly once
  3. Delta sync keeps sync practical on mobile and metered connections: only changed blocks are transferred
  4. Strong consistency for metadata (MySQL), eventual OK for content (S3 immutable blocks)
  5. Long polling for sync notifications — simpler than WebSocket for mostly server→client flow
  6. Pre-signed URLs keep block servers out of the data path for large file transfers
  7. Storage tiering (Standard → IA → Glacier) is essential for cost management at 500 PB scale
  8. Conflict resolution = preserve both versions + notify user; never silently overwrite data


Practice this design! Common hard interview question. Be ready to:

  1. Explain block storage and why it enables delta sync
  2. Draw the sync flow from Client A edit → Client B receives update
  3. Discuss consistency choices (strong for metadata, eventual for blocks)
  4. Handle conflict resolution and offline sync
  5. Talk through storage tiering for cost optimization

Last Updated: 2026-04-13
Status: Common hard interview question - Must know!