Chapter 15: Design Google Drive (Cloud Storage Service)

volume1 google-drive cloud-storage sync file-system

Status: 🟩 Interview ready
Difficulty: Hard
Time to complete: 45 min read + practice

Overview

Google Drive is a cloud storage service that lets users store, sync, and share files across all their devices. Designing it means solving problems around large file handling, efficient sync (don’t re-upload unchanged data), conflict resolution, and strong reliability.

Why this matters:

Common hard interview question
Covers block storage, delta sync, consistency, conflict resolution
Real-world: Google Drive, Dropbox, iCloud, OneDrive, Box

Problem Statement

Design a cloud storage service that:

Lets users upload/download any file type
Syncs files across all of a user’s devices
Allows file sharing with other users
Keeps revision history (previous versions)
Works reliably with no data loss

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

What file types? → Any file type (documents, photos, videos, zip files)
Mobile and desktop? → Yes, all platforms
Sharing? → Yes, with read-only or edit permissions
Versioning? → Yes, keep previous revisions
Collaboration (real-time co-edit)? → No, out of scope (focus on sync)
Offline access? → Yes, sync when reconnected

Scope:

Upload and download files from any device
Sync changes across a user’s devices automatically
Share files/folders with specific users
View and restore previous file versions
Conflict detection and resolution

Non-Functional Requirements

Reliability: No data loss. Files must be durable across datacenter failures
Availability: 99.9% uptime (data sync can be slightly delayed, not lost)
Scalability: 50M users, 10M DAU
Performance: Fast sync (delta-only, not full re-upload)
Storage efficiency: Deduplication to reduce storage cost
Consistency: Strong consistency for metadata, eventual for file content

Scale Estimation

Users:
  50M total users, 10M DAU
  Each user: 10 GB free storage
  Total storage: 50M × 10 GB = 500 PB

Read/write ratio:
  ~1:1 (users frequently upload AND download/view)

Upload requests:
  10M DAU × 2 uploads/day avg = 20M uploads/day
  = 231 uploads/second

Metadata operations:
  File listing, search, version history = 10× upload ops
  = 2,300 ops/second

Storage (with deduplication and compression):
  Assume 40% deduplication ratio
  500 PB raw × 0.6 = 300 PB actual storage
  With replication (3×): 900 PB total physical storage
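
The estimate above can be re-derived in a few lines. This is a minimal sketch that only restates the chapter's own inputs; the constant names are made up for illustration, and decimal units (1 PB = 1,000,000 GB) are assumed.

```python
# Back-of-envelope numbers from the estimate above (decimal units assumed).
TOTAL_USERS = 50_000_000
DAU = 10_000_000
STORAGE_PER_USER_GB = 10
UPLOADS_PER_USER_PER_DAY = 2
METADATA_MULTIPLIER = 10   # metadata ops per upload
DEDUP_RATIO = 0.40         # fraction of raw bytes removed by deduplication
REPLICATION_FACTOR = 3
SECONDS_PER_DAY = 86_400
GB_PER_PB = 1_000_000

raw_pb = TOTAL_USERS * STORAGE_PER_USER_GB / GB_PER_PB
uploads_per_sec = DAU * UPLOADS_PER_USER_PER_DAY / SECONDS_PER_DAY
metadata_ops_per_sec = uploads_per_sec * METADATA_MULTIPLIER
deduped_pb = raw_pb * (1 - DEDUP_RATIO)
physical_pb = deduped_pb * REPLICATION_FACTOR

# The chapter rounds 2,315 metadata ops/second down to ~2,300.
print(round(raw_pb), round(uploads_per_sec), round(metadata_ops_per_sec),
      round(deduped_pb), round(physical_pb))  # 500 231 2315 300 900
```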

Step 2: High-Level Design (10 min)

Core Flows

Flow 1: File Upload

Client → Block Servers → Cloud Storage (S3)
Client → API Servers  → Metadata DB (MySQL)

Flow 2: File Sync to Other Devices

Client A (uploads) → API Servers → Metadata DB
                                 → Notification Service
                                          ↓
                              Client B (same user, another device)
                                      receives sync notification
                                      downloads only changed blocks

Flow 3: File Download

Client → API Servers → Metadata DB (find block list)
Client → Block Servers / S3 (fetch blocks directly)
Client → Reassemble blocks → file

Component Overview

Component	Purpose
Block Servers	Split files into blocks, compute hashes, compress, upload to S3
Cloud Storage (S3)	Durable object storage for file blocks
Cold Storage (Glacier)	Old file versions not recently accessed
Load Balancers	Distribute API traffic
API Servers	Handle file CRUD, share, version requests
Metadata DB (MySQL)	Files, blocks, users, workspace, sharing info
Metadata Cache (Redis)	Cache frequently accessed file metadata
Notification Service	Push sync events to connected clients
Offline Backup Queue	Queue sync jobs for offline clients

High-Level Architecture Diagram

┌──────────────┐     ┌─────────────────────────────────────────────┐
│   Client A   │     │                API Layer                    │
│  (Laptop)    │     │                                             │
│  - Watcher   │     │  ┌──────────┐  ┌───────────┐  ┌─────────┐ │
│  - Chunker   │────→│  │  Load    │  │   API     │  │ Redis   │ │
│  - Indexer   │     │  │ Balancer │──│  Servers  │──│ Cache   │ │
│  - Sync      │     │  └──────────┘  └─────┬─────┘  └─────────┘ │
└──────────────┘     │                      │                     │
                     └──────────────────────┼─────────────────────┘
                                            │
                    ┌───────────────────────┼───────────────────────┐
                    ▼                       ▼                       ▼
             ┌───────────┐         ┌────────────────┐     ┌──────────────┐
             │  Metadata │         │  Block Servers │     │Notification  │
             │  DB       │         │  (hash, split, │     │  Service     │
             │ (MySQL)   │         │  compress)     │     │(Long Polling)│
             └───────────┘         └───────┬────────┘     └──────┬───────┘
                                           │                     │
                                           ▼                     ▼
                                   ┌───────────────┐    ┌────────────────┐
                                   │  S3 (blocks)  │    │   Client B     │
                                   │               │    │   (Phone)      │
                                   │  Cold Storage │    │  receives sync │
                                   │  (Glacier for │    │  notification  │
                                   │   old vers.)  │    └────────────────┘
                                   └───────────────┘

Step 3: Deep Dive (20 min)

Block Storage Design

Core idea: Instead of uploading whole files, split them into fixed-size blocks and sync only the blocks that changed.

Why Blocks?

Without blocks:
  User edits 1 line in a 100 MB Word doc
  → Must re-upload full 100 MB ❌ (wastes bandwidth)

With blocks (4 MB each):
  100 MB file = 25 blocks
  Edit affects only 1 block (an in-place change; an insertion would shift every later block boundary)
  → Upload only 1 changed block (4 MB) ✅ (25× less bandwidth)

Block Design

Block properties:
  - Fixed size: 4 MB per block (tunable)
  - Content-addressable: block identified by SHA256(content)
  - Compressed: LZ4 or zlib before upload (roughly 20-40% smaller for compressible data; already-compressed media such as JPEG or MP4 gains little)
  - Encrypted: AES-256 before upload (client-side encryption option)

File → Blocks mapping:
  document.pdf (12 MB)
  → Block A: SHA256=a1b2c3... (4 MB)
  → Block B: SHA256=d4e5f6... (4 MB)
  → Block C: SHA256=07a8b9... (4 MB)

After edit (middle section changed):
  → Block A: SHA256=a1b2c3... (unchanged, DO NOT re-upload)
  → Block B: SHA256=9f8e7d... (CHANGED, upload this block)
  → Block C: SHA256=07a8b9... (unchanged, DO NOT re-upload)

Upload: Only 4 MB instead of 12 MB → 3× bandwidth saving
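
A minimal sketch of the chunk-and-hash step described above, using only the standard library. The function names are illustrative, and the demo uses a 4-byte block size so it runs instantly; the chapter's real block size is 4 MB.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the block size used in this chapter

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of each block, in file order."""
    return [hashlib.sha256(b).hexdigest()
            for b in split_into_blocks(data, block_size)]

# Tiny demo: three 4-byte blocks, with an in-place edit to the middle one.
original = b"AAAABBBBCCCC"
edited = b"AAAAbbbbCCCC"
before = block_hashes(original, block_size=4)
after = block_hashes(edited, block_size=4)
changed = [i for i, (old, new) in enumerate(zip(before, after)) if old != new]
print(changed)  # [1]
```

Only block 1 hashes differently, so only that block would need to be uploaded.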

Block Deduplication

Same content, stored once:

User A uploads vacation_photo.jpg (5 MB)
  Block hash: SHA256 = "abc123"
  Stored in S3 at key: blocks/abc123

User B also has same photo (common: shared family photos)
  Block hash: SHA256 = "abc123"  (same!)
  S3 already has blocks/abc123 → DO NOT store again!
  Just add reference in metadata DB: block abc123 → file ref count ++

Storage saving: 1 copy stored, N users reference it

Block deduplication in Metadata DB:

-- blocks table: one row per unique block
CREATE TABLE blocks (
    block_hash   VARCHAR(64) PRIMARY KEY,  -- SHA256
    s3_key       VARCHAR(500) NOT NULL,
    size_bytes   INT,
    ref_count    INT DEFAULT 1,            -- how many files reference this
    created_at   TIMESTAMP
);
 
-- file_blocks: which blocks make up each file version
CREATE TABLE file_blocks (
    file_id      VARCHAR(36),
    version      INT,
    block_seq    INT,    -- position in file
    block_hash   VARCHAR(64),
    PRIMARY KEY (file_id, version, block_seq)
);
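
The reference-counting idea behind the blocks table can be sketched in memory. `BlockStore` is a hypothetical stand-in for S3 plus the `blocks` table, not a real client; deleting a block when its count reaches zero is an assumption about how cleanup might work.

```python
import hashlib

class BlockStore:
    """In-memory stand-in for S3 plus the blocks table (hypothetical helper)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}  # block_hash -> content ("S3")
        self.ref_count: dict[str, int] = {}  # block_hash -> number of references

    def put(self, content: bytes) -> tuple[str, bool]:
        """Store a block; return (hash, True if new bytes were stored)."""
        block_hash = hashlib.sha256(content).hexdigest()
        if block_hash in self.objects:
            self.ref_count[block_hash] += 1  # duplicate: only add a reference
            return block_hash, False
        self.objects[block_hash] = content
        self.ref_count[block_hash] = 1
        return block_hash, True

    def release(self, block_hash: str) -> None:
        """Drop one reference; delete the block once nothing references it."""
        self.ref_count[block_hash] -= 1
        if self.ref_count[block_hash] == 0:
            del self.ref_count[block_hash]
            del self.objects[block_hash]

store = BlockStore()
photo = b"same vacation photo bytes"
hash_a, stored_a = store.put(photo)  # User A uploads
hash_b, stored_b = store.put(photo)  # User B uploads the same content
print(stored_a, stored_b, len(store.objects), store.ref_count[hash_a])
# True False 1 2
```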

Delta Sync

Delta sync = only transfer the changed portion of a file.

Delta sync flow:
  1. Client Watcher detects file_changed event
  2. Client Chunker re-splits file into blocks
  3. Client Indexer computes SHA256 for each block
  4. Client compares new block hashes vs last-known block hashes
  5. Client sends only CHANGED blocks to Block Server
  6. Block Server stores new blocks in S3
  7. Metadata DB updated with new block list for new version

Example:
  report.docx v1: [blockA, blockB, blockC, blockD]
  report.docx v2: [blockA, blockB', blockC, blockD]  (blockB changed)

  Delta upload: Only blockB' (4 MB out of 16 MB) = 75% bandwidth saved
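
Step 4 of the flow above (compare new hashes against last-known hashes) might look like the following sketch. `diff_blocks` is an illustrative name, and the comparison is position-by-position, which matches fixed-size blocks.

```python
def diff_blocks(old_hashes: list[str], new_hashes: list[str]) -> list[int]:
    """Return the block positions in the new version that must be uploaded."""
    to_upload = []
    for seq, new_hash in enumerate(new_hashes):
        # A block is new if the file grew past the old length or its hash changed.
        if seq >= len(old_hashes) or old_hashes[seq] != new_hash:
            to_upload.append(seq)
    return to_upload

v1 = ["blockA", "blockB", "blockC", "blockD"]
v2 = ["blockA", "blockB'", "blockC", "blockD"]
changed = diff_blocks(v1, v2)
saved = 1 - len(changed) / len(v2)
print(changed, saved)  # [1] 0.75
```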

Sync Client Architecture

The client has four components working together:

┌──────────────────────────────────────────────────────────┐
│                    Sync Client                           │
│                                                          │
│  ┌─────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │   Watcher   │──→│   Chunker    │──→│   Indexer    │  │
│  │             │   │              │   │              │  │
│  │ Monitors    │   │ Splits file  │   │ Computes     │  │
│  │ local file  │   │ into 4MB     │   │ SHA256 per   │  │
│  │ system for  │   │ blocks       │   │ block        │  │
│  │ changes     │   │              │   │ Updates      │  │
│  │ (inotify /  │   │              │   │ local DB     │  │
│  │  FSEvents)  │   │              │   │              │  │
│  └─────────────┘   └──────────────┘   └──────┬───────┘  │
│                                               │          │
│                                       ┌───────▼───────┐  │
│                                       │  Sync Engine  │  │
│                                       │               │  │
│                                       │ Computes diff │  │
│                                       │ (which blocks │  │
│                                       │ changed)      │  │
│                                       │ Uploads delta │  │
│                                       │ Handles retry │  │
│                                       └───────────────┘  │
└──────────────────────────────────────────────────────────┘

Component	Technology	Responsibility
Watcher	inotify (Linux), FSEvents (macOS), ReadDirectoryChangesW (Windows)	Detect file create/modify/delete/rename
Chunker	Custom splitter	Split file into fixed-size 4 MB blocks at the same offsets on every run
Indexer	SHA256 hashing	Compute block hashes, maintain local block index DB
Sync Engine	Core logic	Diff old vs new block list, schedule uploads, handle conflicts
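
To make the Watcher's job concrete, here is a small sketch that detects create/modify/delete by comparing directory snapshots. Note the swap: this polls with `os.walk` instead of using inotify, FSEvents, or ReadDirectoryChangesW, purely so the example stays portable and standard-library only. `snapshot` and `detect_changes` are illustrative names, and rename detection is left out.

```python
import os
import tempfile

def snapshot(root: str) -> dict[str, tuple[int, int]]:
    """Map each file's path (relative to root) to (mtime_ns, size)."""
    state = {}
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state

def detect_changes(old: dict, new: dict) -> dict[str, list[str]]:
    """Compare two snapshots and list created, deleted, and modified paths."""
    return {
        "created": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }

with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "a.txt"), "w") as f:
        f.write("v1")
    first = snapshot(root)
    with open(os.path.join(root, "a.txt"), "w") as f:
        f.write("version 2")  # different size, so the change is always visible
    with open(os.path.join(root, "b.txt"), "w") as f:
        f.write("new")
    events = detect_changes(first, snapshot(root))

print(events)  # {'created': ['b.txt'], 'deleted': [], 'modified': ['a.txt']}
```

Each path in `modified` or `created` would then be handed to the Chunker and Indexer.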

Metadata DB Design

-- Users
CREATE TABLE users (
    user_id    VARCHAR(36) PRIMARY KEY,
    email      VARCHAR(255) UNIQUE,
    storage_used_bytes  BIGINT DEFAULT 0,
    storage_limit_bytes BIGINT DEFAULT 10737418240  -- 10 GB
);
 
-- Files and Folders
CREATE TABLE files (
    file_id      VARCHAR(36) PRIMARY KEY,
    owner_id     VARCHAR(36),
    parent_id    VARCHAR(36),   -- parent folder (NULL = root)
    name         VARCHAR(500),
    is_folder    BOOLEAN DEFAULT FALSE,
    latest_version INT DEFAULT 1,
    trashed      BOOLEAN DEFAULT FALSE,
    created_at   TIMESTAMP,
    updated_at   TIMESTAMP,
    INDEX idx_owner_parent (owner_id, parent_id)
);
 
-- File Versions (revision history)
CREATE TABLE file_versions (
    file_id      VARCHAR(36),
    version      INT,
    created_at   TIMESTAMP,
    size_bytes   BIGINT,
    checksum     VARCHAR(64),   -- SHA256 of full file
    created_by   VARCHAR(36),   -- user who made this version
    PRIMARY KEY (file_id, version)
);
 
-- Sharing permissions
CREATE TABLE file_shares (
    file_id      VARCHAR(36),
    shared_with  VARCHAR(36),   -- user_id or group_id
    permission   ENUM('view', 'edit', 'owner'),
    shared_at    TIMESTAMP,
    PRIMARY KEY (file_id, shared_with)
);
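
Creating a new version touches two tables (`file_versions` and `files.latest_version`), so it should happen in one transaction. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL and a trimmed-down copy of the schema above; `add_version` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (
        file_id TEXT PRIMARY KEY, name TEXT, latest_version INTEGER DEFAULT 1);
    CREATE TABLE file_versions (
        file_id TEXT, version INTEGER, size_bytes INTEGER,
        PRIMARY KEY (file_id, version));
    INSERT INTO files (file_id, name) VALUES ('f1', 'report.docx');
    INSERT INTO file_versions VALUES ('f1', 1, 1000);
""")

def add_version(conn: sqlite3.Connection, file_id: str, size_bytes: int) -> int:
    """Insert the next version row and bump latest_version atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        (latest,) = conn.execute(
            "SELECT latest_version FROM files WHERE file_id = ?", (file_id,)
        ).fetchone()
        new_version = latest + 1
        conn.execute(
            "INSERT INTO file_versions (file_id, version, size_bytes) VALUES (?, ?, ?)",
            (file_id, new_version, size_bytes),
        )
        conn.execute(
            "UPDATE files SET latest_version = ? WHERE file_id = ?",
            (new_version, file_id),
        )
    return new_version

print(add_version(conn, "f1", 2048))  # 2
```

If either statement fails, neither change is committed, so `latest_version` never points at a version row that does not exist.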

File Versioning

Versioning strategy:
  Keep ALL block versions in S3 (never delete blocks in use)
  Metadata DB tracks version history (which blocks = which version)
  User can restore any previous version

Version 1: [blockA_v1, blockB_v1, blockC_v1]    ← all blocks in S3
Version 2: [blockA_v1, blockB_v2, blockC_v1]    ← only blockB changed
Version 3: [blockA_v1, blockB_v2, blockC_v3]    ← only blockC changed

Storage efficiency:
  Instead of 3 full file copies → store unique blocks only
  Blocks shared across versions are stored ONCE
  Version 3 storage = blockA_v1 + blockB_v2 + blockC_v3 (not 3× the file)
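
The three-version example above can be checked with a short sketch. Block names stand in for hashes, and the "block contents" in the restore demo are fake bytes invented for illustration.

```python
versions = {
    1: ["blockA_v1", "blockB_v1", "blockC_v1"],
    2: ["blockA_v1", "blockB_v2", "blockC_v1"],
    3: ["blockA_v1", "blockB_v2", "blockC_v3"],
}

def unique_blocks(versions: dict[int, list[str]]) -> set[str]:
    """Blocks that actually need to exist in storage across all versions."""
    return {block for blocks in versions.values() for block in blocks}

def restore(versions: dict[int, list[str]], version: int,
            block_data: dict[str, bytes]) -> bytes:
    """Rebuild a file version by concatenating its blocks in order."""
    return b"".join(block_data[block] for block in versions[version])

stored = unique_blocks(versions)
naive = sum(len(blocks) for blocks in versions.values())
print(len(stored), naive)  # 5 9

# Fake block contents, just to show that any version can be rebuilt.
block_data = {name: name.encode() + b"|" for name in stored}
print(restore(versions, 2, block_data))  # b'blockA_v1|blockB_v2|blockC_v1|'
```

Five unique blocks are stored instead of nine, and every version stays restorable.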

Version cleanup (storage management):
  After 30 days: keep max 30 daily versions
  After 1 year: keep max 12 monthly snapshots
  Move old version blocks to S3 Glacier (cheaper)

Notification Service

When to use Long Polling vs WebSocket:

Options:
  1. Long Polling (HTTP)
  2. WebSocket (persistent TCP)
  3. Server-Sent Events (SSE)

Why Long Polling for Drive sync?

Long Polling:
  Client sends GET /changes?since=<timestamp>
  Server holds request open (up to 30 sec)
  When change happens: server responds immediately
  Client processes change, immediately opens next poll
  On no change: server responds after timeout, client re-polls

Why NOT WebSocket?
  - WebSockets are bidirectional (client also pushes to server)
  - Drive sync is mostly server → client (server tells client "here's a change")
  - A persistent bidirectional channel adds connection-state management that infrequent, one-directional notifications do not need
  - Long polling simpler to implement with existing HTTP infrastructure
  - Load balancers (HAProxy, NGINX) handle HTTP naturally; WebSocket needs config

When to prefer WebSocket:
  - Real-time collaboration (Google Docs: many clients push edits simultaneously)
  - Chat applications (truly bidirectional)
  - Live dashboards with high update frequency

Long polling implementation:

Client: GET /api/changes?user_id=abc&since=1640000000&timeout=30

Server:
  1. Check if any changes since `since` timestamp
  2. If YES: return changes immediately (HTTP 200)
  3. If NO: hold connection, subscribe to change events for this user
  4. When change arrives: return immediately
  5. After 30s timeout with no change: return empty (HTTP 200, [])
  Client receives response → process → immediately re-issue request
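
The server-side hold-and-respond logic can be sketched with `asyncio`. `ChangeFeed` is a hypothetical in-memory helper, not a real service; it also uses a per-user sequence number for `since` instead of a timestamp, which is an assumption made here to sidestep clock issues.

```python
import asyncio

class ChangeFeed:
    """Per-user change log with long-poll semantics (hypothetical helper)."""

    def __init__(self) -> None:
        self._changes: list[tuple[int, str]] = []  # (sequence number, description)
        self._cond = asyncio.Condition()

    async def publish(self, description: str) -> int:
        """Record a change and wake every poll that is currently waiting."""
        async with self._cond:
            seq = len(self._changes) + 1
            self._changes.append((seq, description))
            self._cond.notify_all()
            return seq

    async def poll(self, since: int, timeout: float) -> list[tuple[int, str]]:
        """Return changes newer than `since`, waiting up to `timeout` seconds."""
        def pending() -> list[tuple[int, str]]:
            return [c for c in self._changes if c[0] > since]

        async with self._cond:
            try:
                await asyncio.wait_for(
                    self._cond.wait_for(lambda: bool(pending())), timeout)
            except asyncio.TimeoutError:
                return []  # timeout with no change: empty response
            return pending()

async def demo() -> tuple[list, list]:
    feed = ChangeFeed()
    empty = await feed.poll(since=0, timeout=0.05)  # nothing happens: times out
    waiter = asyncio.create_task(feed.poll(since=0, timeout=5))
    await asyncio.sleep(0)                          # let the poll start
    await feed.publish("report.docx v2")            # wakes the held request
    return empty, await waiter

empty, delivered = asyncio.run(demo())
print(empty, delivered)  # [] [(1, 'report.docx v2')]
```

A real client would pass the highest sequence number it has seen as `since` on the next poll.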

Consistency Model

Strong consistency needed for:
  - Metadata (file names, folder structure, versions)
  - Why: User must see their own writes immediately
  - If metadata inconsistent → sync could corrupt file structure
  - Use: MySQL with synchronous replication, read from primary

Eventual consistency acceptable for:
  - File content (block data in S3)
  - Why: Blocks are immutable once written (content-addressable)
  - A block hash uniquely identifies content — no conflict possible
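  - Even an eventually consistent store is fine here (a block exists or it doesn't); S3 itself has offered strong read-after-write consistency since December 2020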
  - Use: S3 standard replication (99.999999999% durability)

Conflict resolution:
  Conflict: Two devices edit same file while one is offline
  Strategy: keep both versions; the first write to sync keeps the original name, the later one becomes a conflicted copy
    - Both versions saved as separate versions
    - User sees "Conflicting version" warning
    - User manually merges (like Dropbox "conflicted copy")
  No auto-merge for binary files (photos, PDFs) — too complex
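
One way to detect the conflict is to have each upload state which version it was based on. The sketch below assumes that convention; `resolve_upload`, the `client_base` parameter, and the device label are hypothetical, and the naming follows the Dropbox-style "conflicted copy" pattern.

```python
import os

def resolve_upload(server_latest: int, client_base: int, filename: str,
                   device: str, date: str) -> tuple[str, bool]:
    """Return (name to store the upload under, whether it is a conflict)."""
    if client_base == server_latest:
        return filename, False  # client edited the latest version: accept as-is
    # The server moved on while this device was offline: keep both copies.
    stem, ext = os.path.splitext(filename)
    return f"{stem} ({device}'s conflicted copy {date}){ext}", True

# Laptop and phone both edited version 3 while the phone was offline.
laptop = resolve_upload(server_latest=3, client_base=3, filename="report.docx",
                        device="Laptop", date="2026-04-13")
# The laptop's upload became version 4, so the phone's base is now stale.
phone = resolve_upload(server_latest=4, client_base=3, filename="report.docx",
                       device="Phone", date="2026-04-13")
print(laptop)  # ('report.docx', False)
print(phone)   # ("report (Phone's conflicted copy 2026-04-13).docx", True)
```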

Upload Flow with Pre-signed URLs

Upload flow (large file):
1. Client sends POST /upload/init
   Body: { filename, size, sha256_checksum, block_hashes: [...] }

2. API Server:
   - Creates file record in MySQL (status=UPLOADING)
   - For each block: check if block_hash exists in blocks table
   - Returns list of blocks that need uploading (others already exist!)
   - Generates pre-signed S3 URL per missing block

3. Client uploads only MISSING blocks directly to S3
   - Parallel uploads (up to 4 simultaneous)
   - Retry on failure (exponential backoff)

4. Client sends POST /upload/complete
   Body: { file_id, block_list: [{ seq, hash }] }

5. API Server:
   - Verifies all block hashes exist in S3
   - Creates file_version record
   - Updates file status → READY
   - Publishes change event to notification service

Flow diagram:
  Client ──POST /upload/init──→ API Server
  API Server ──────────────────────────────→ MySQL (check existing blocks)
  API Server ←──────────────────────────────── missing blocks list
  API Server ──generate presigned URLs──→ S3
  Client ←──── presigned URLs for missing blocks ──── API Server
  Client ──PUT block directly──→ S3  (4 MB per block, parallel)
  Client ──POST /upload/complete──→ API Server
  API Server ──update metadata──→ MySQL
  API Server ──publish change──→ Notification Service
  Other devices receive sync event
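
The server-side decisions in this flow are simple set operations. The sketch below covers which blocks still need uploading, a retry schedule, and the completeness check. All three function names are illustrative, the backoff base and cap are assumed values, and pre-signed URL generation is left out because it depends on the storage SDK.

```python
def plan_upload(block_hashes: list[str], existing: set[str]) -> list[str]:
    """Hashes the client still has to upload, in order, without duplicates."""
    missing: list[str] = []
    seen: set[str] = set()
    for block_hash in block_hashes:
        if block_hash not in existing and block_hash not in seen:
            missing.append(block_hash)
            seen.add(block_hash)
    return missing

def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule in seconds (base and cap are assumed values)."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]

def can_complete(block_list: list[str], stored: set[str]) -> bool:
    """/upload/complete should succeed only if every listed block is stored."""
    return all(block_hash in stored for block_hash in block_list)

file_blocks = ["a1", "d4", "a1", "x9"]  # the file repeats block a1
already_stored = {"a1"}
missing = plan_upload(file_blocks, already_stored)
print(missing)                                              # ['d4', 'x9']
print(backoff_delays(4))                                    # [0.5, 1.0, 2.0, 4.0]
print(can_complete(file_blocks, already_stored))            # False
print(can_complete(file_blocks, already_stored | set(missing)))  # True
```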

Storage Tiering

Hot storage (S3 Standard):
  Current file versions, recently accessed files
  Cost: $0.023/GB/month
  Access time: milliseconds

Warm storage (S3 Standard-IA):
  Files not accessed in 30 days
  Cost: $0.0125/GB/month (46% cheaper)
  Access time: milliseconds (slightly higher retrieval fee)

Cold storage (S3 Glacier Instant Retrieval):
  Old file versions (not current), not accessed in 90+ days
  Cost: $0.004/GB/month (83% cheaper than standard)
  Access time: milliseconds (retrieval cost applies)

Archive (S3 Glacier Deep Archive):
  Very old versions, legal/compliance retention
  Cost: $0.00099/GB/month
  Access time: 12 hours

Lifecycle policy:
  Day 0: Upload → S3 Standard
  Day 30: → S3 Standard-IA (if not accessed)
  Day 90: old versions → S3 Glacier Instant
  Day 365: → S3 Glacier Deep Archive
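
The lifecycle policy can be read as a small decision function. This is one interpretation, not a definitive rule set: it assumes only non-current versions ever move to the Glacier tiers, and it reuses the per-GB prices quoted above. `pick_tier` and `saving_vs_standard` are illustrative names.

```python
PRICE_PER_GB_MONTH = {  # prices as quoted in this section
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER_IR": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}

def pick_tier(days_since_access: int, is_current_version: bool) -> str:
    """Choose a storage tier (assumes only old versions go to Glacier tiers)."""
    if not is_current_version:
        if days_since_access >= 365:
            return "DEEP_ARCHIVE"
        if days_since_access >= 90:
            return "GLACIER_IR"
    if days_since_access >= 30:
        return "STANDARD_IA"
    return "STANDARD"

def saving_vs_standard(tier: str) -> int:
    """Percentage saved per GB-month compared with S3 Standard, rounded."""
    ratio = PRICE_PER_GB_MONTH[tier] / PRICE_PER_GB_MONTH["STANDARD"]
    return round((1 - ratio) * 100)

print(pick_tier(120, is_current_version=False), saving_vs_standard("GLACIER_IR"))
# GLACIER_IR 83
print(pick_tier(120, is_current_version=True), saving_vs_standard("STANDARD_IA"))
# STANDARD_IA 46
```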

Design Summary

Final Architecture

┌────────────────────────────────────────────────────────────────────┐
│                      CLIENT (Desktop/Mobile)                       │
│                                                                    │
│  Watcher → Chunker → Indexer → Sync Engine                        │
│  (detect)   (split)  (hash)    (upload delta, handle conflicts)   │
└──────────────────────────────┬─────────────────────────────────────┘
                               │ HTTPS
              ┌────────────────▼────────────────────┐
              │           Load Balancer             │
              └────────────────┬────────────────────┘
                               │
         ┌─────────────────────┼──────────────────────┐
         ▼                     ▼                      ▼
  ┌─────────────┐      ┌──────────────┐      ┌──────────────┐
  │  API Server │      │ Block Server │      │Notification  │
  │             │      │              │      │   Server     │
  │ - File CRUD │      │ - Split      │      │              │
  │ - Share     │      │ - Hash       │      │ - Long poll  │
  │ - Versions  │      │ - Compress   │      │ - Fan out    │
  └──────┬──────┘      └──────┬───────┘      └──────┬───────┘
         │                   │                      │
         ▼                   ▼                      ▼
  ┌─────────────┐    ┌──────────────┐      ┌──────────────┐
  │  MySQL      │    │  S3          │      │  Client B    │
  │  Metadata   │    │  (blocks)    │      │  (phone,     │
  │  - Files    │    │              │      │   tablet)    │
  │  - Blocks   │    │  S3 Glacier  │      │              │
  │  - Versions │    │  (old vers.) │      │ downloads    │
  │  - Shares   │    └──────────────┘      │ changed      │
  └──────┬──────┘                          │ blocks only  │
         │                                 └──────────────┘
  ┌──────▼──────┐
  │ Redis Cache │
  │ (metadata)  │
  └─────────────┘

Key Decisions Summary

Decision	Choice	Reasoning
File storage	Block-level (4 MB chunks)	Delta sync: only upload changed blocks
Block ID	SHA256 hash (content-addressable)	Deduplication, immutable blocks
Deduplication	Block hash lookup before upload	Store each block once, huge storage savings
Metadata DB	MySQL (strong consistency)	ACID for file metadata, version history
Metadata cache	Redis in front of MySQL	Fast cache for hot data
File content	S3 (eventual consistency OK)	Immutable blocks: no conflicts
Notification	Long polling	Mostly server→client, simpler than WebSocket
Upload method	Pre-signed URLs + multipart	Block servers not a bottleneck
Old versions	S3 Glacier	83% cost reduction for cold data

Interview Questions & Answers

Q: Why split files into 4 MB blocks instead of uploading the whole file?
A: Delta sync. When a user edits a large file (e.g., modifies one paragraph in a 100 MB document), only one or two 4 MB blocks change. Instead of uploading 100 MB again, only the changed blocks are uploaded — potentially 1/25th of the bandwidth. Block-level storage also enables deduplication: if two users have the same file (or a file shares blocks with another), only one copy of each unique block is stored. This saves both bandwidth and storage.

Q: How does block deduplication work? What are the risks?
A: Before uploading a block, the client computes its SHA256 hash and sends it to the API server. The server checks whether that hash exists in the blocks table. If it does, the block is already in S3: skip the upload and just record the reference. Risk 1: hash collision (two different blocks produce the same SHA256), theoretically possible but astronomically unlikely in a 2^256 space. Risk 2: hash-claim attack, where an attacker who knows only a block’s hash “claims” the block and gains access to content they never possessed. Mitigation: for sensitive files, use client-side encryption (encrypt before hashing, so an attacker cannot read the content even if they know the hash).

Q: Why long polling for the notification service instead of WebSocket?
A: Drive sync is primarily a server-to-client flow: the server notifies clients that a change has occurred. WebSocket is optimized for bidirectional communication. Long polling is simpler to implement, works through existing HTTP infrastructure and proxies, doesn’t require special load balancer configuration, and is sufficient when notifications are infrequent. WebSocket would be preferred for real-time co-editing (Google Docs) where multiple clients are simultaneously pushing changes.

Q: How do you handle file conflicts when two devices edit the same file offline?
A: When both devices reconnect, the system detects a conflict (both have a version newer than the last synced state). Both versions are preserved: the second sync creates a “conflicted copy” with a different filename (like Dropbox does: “report (John’s conflicted copy 2026-04-13).docx”). The user is notified and must manually merge. Auto-merge is only feasible for text files where a three-way merge algorithm (like git) can be applied. Binary files (photos, PDFs) cannot be auto-merged.

Q: How do you ensure strong consistency for metadata?
A: Metadata uses MySQL with synchronous replication to a hot standby. All writes go to the primary; reads go to the primary too (or a replica with bounded staleness lag). Redis cache in front of MySQL for hot reads (file listings, recent files). For critical operations (sharing permissions, version creation), always read from MySQL primary to avoid stale cache. Use transactions for multi-table updates (e.g., creating a version + updating latest_version on the file row atomically).

Q: How would you scale to 500 million users?
A: (1) Shard MySQL by user_id, so each shard handles a subset of users. (2) Federate by user geography (EU data stays in EU for GDPR). (3) Scale block servers horizontally, since they’re stateless. (4) S3 is a managed service that scales without sharding work on our side. (5) Notification service: use pub/sub (Kafka), where each user_id maps to a partition and notification consumers fan out to connected clients. (6) Add read replicas for metadata reads that can tolerate bounded staleness. (7) Add a CDN for file downloads (popular shared files cached at the edge).
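
Point (1) of the scaling answer needs a stable mapping from user_id to shard. A minimal sketch, assuming plain hash-modulo routing; `shard_for` is an illustrative name. Modulo routing moves most users when the shard count changes, so a real deployment might prefer consistent hashing or a lookup table.

```python
import hashlib

def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user_id to a shard index in [0, num_shards), stable across processes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Take the first 8 bytes as an unsigned integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards

shard = shard_for("user-123", num_shards=16)
print(0 <= shard < 16, shard == shard_for("user-123", num_shards=16))  # True True
```

A cryptographic hash is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would route the same user to different shards on different servers.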

Key Takeaways

Block-level storage (4 MB chunks) enables delta sync — the defining optimization for cloud storage
Content-addressable storage (SHA256 block ID) enables deduplication — each unique block stored exactly once
Delta sync keeps sync practical on mobile and metered connections: only changed blocks are transferred
Strong consistency for metadata (MySQL), eventual OK for content (S3 immutable blocks)
Long polling for sync notifications — simpler than WebSocket for mostly server→client flow
Pre-signed URLs keep block servers out of the data path for large file transfers
Storage tiering (Standard → IA → Glacier) is essential for cost management at 500 PB scale
Conflict resolution = preserve both versions + notify user; never silently overwrite data

Related Resources

distributed-system-components - Blob storage, message queues, CDN
key-patterns - Content-addressable storage, delta sync, deduplication
ch04-rate-limiter - Rate limit upload API
ch14-youtube - Similar blob storage and CDN patterns

Practice this design! Common hard interview question. Be ready to:

Explain block storage and why it enables delta sync
Draw the sync flow from Client A edit → Client B receives update
Discuss consistency choices (strong for metadata, eventual for blocks)
Handle conflict resolution and offline sync
Talk through storage tiering for cost optimization

Last Updated: 2026-04-13
Status: Common hard interview question - Must know!

Study Notes by Niladri & AI
