Chapter 15: Design Google Drive (Cloud Storage Service)
volume1 google-drive cloud-storage sync file-system
Status: 🟩 Interview ready
Difficulty: Hard
Time to complete: 45 min read + practice
Overview
Google Drive is a cloud storage service that lets users store, sync, and share files across all their devices. Designing it means solving problems around large file handling, efficient sync (don't re-upload unchanged data), conflict resolution, and strong reliability.
Why this matters:
- Common hard interview question
- Covers block storage, delta sync, consistency, conflict resolution
- Real-world: Google Drive, Dropbox, iCloud, OneDrive, Box
Problem Statement
Design a cloud storage service that:
- Lets users upload/download any file type
- Syncs files across all of a user's devices
- Allows file sharing with other users
- Keeps revision history (previous versions)
- Works reliably with no data loss
Step 1: Requirements & Scope (5 min)
Functional Requirements
Clarifying questions:
- What file types? → Any file type (documents, photos, videos, zip files)
- Mobile and desktop? → Yes, all platforms
- Sharing? → Yes, with read-only or edit permissions
- Versioning? → Yes, keep previous revisions
- Collaboration (real-time co-edit)? → No, out of scope (focus on sync)
- Offline access? → Yes, sync when reconnected
Scope:
- Upload and download files from any device
- Sync changes across a user's devices automatically
- Share files/folders with specific users
- View and restore previous file versions
- Conflict detection and resolution
Non-Functional Requirements
- Reliability: No data loss. Files must be durable across datacenter failures
- Availability: 99.9% uptime (data sync can be slightly delayed, not lost)
- Scalability: 50M users, 10M DAU
- Performance: Fast sync (delta-only, not full re-upload)
- Storage efficiency: Deduplication to reduce storage cost
- Consistency: Strong consistency for metadata, eventual for file content
Scale Estimation
Users:
50M total users, 10M DAU
Each user: 10 GB free storage
Total storage: 50M × 10 GB = 500 PB
Read/write ratio:
~1:1 (users frequently upload AND download/view)
Upload requests:
10M DAU × 2 uploads/day avg = 20M uploads/day
≈ 231 uploads/second
Metadata operations:
File listing, search, version history = 10× upload ops
≈ 2,300 ops/second
Storage (with deduplication and compression):
Assume 40% deduplication ratio
500 PB raw × 0.6 = 300 PB actual storage
With replication (3×): 900 PB total physical storage
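The arithmetic above can be re-derived in a few lines, which makes it easy to change one assumption and see the effect. This is a minimal sketch: the constant names are illustrative, decimal units (1 PB = 1,000,000 GB) are assumed, and every input is one of the estimates stated above rather than a measured figure.

```python
# Back-of-envelope capacity estimate. All inputs are the chapter's assumptions.
TOTAL_USERS = 50_000_000
DAU = 10_000_000
FREE_QUOTA_GB = 10
UPLOADS_PER_USER_PER_DAY = 2
METADATA_OPS_MULTIPLIER = 10      # metadata ops assumed to be 10x upload ops
DEDUP_RATIO = 0.40                # fraction of raw data removed by deduplication
REPLICATION_FACTOR = 3
SECONDS_PER_DAY = 86_400
GB_PER_PB = 1_000_000             # decimal units (assumption)

raw_storage_pb = TOTAL_USERS * FREE_QUOTA_GB / GB_PER_PB
uploads_per_second = DAU * UPLOADS_PER_USER_PER_DAY / SECONDS_PER_DAY
metadata_ops_per_second = uploads_per_second * METADATA_OPS_MULTIPLIER
deduped_storage_pb = raw_storage_pb * (1 - DEDUP_RATIO)
physical_storage_pb = deduped_storage_pb * REPLICATION_FACTOR

if __name__ == "__main__":
    print(f"raw storage:      {raw_storage_pb:.0f} PB")
    print(f"uploads/second:   {uploads_per_second:.0f}")
    print(f"metadata ops/s:   {metadata_ops_per_second:.0f}")
    print(f"after dedup:      {deduped_storage_pb:.0f} PB")
    print(f"with replication: {physical_storage_pb:.0f} PB")
```

Note that 500 PB is an upper bound: it assumes every user fills the full 10 GB quota.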
Step 2: High-Level Design (10 min)
Core Flows
Flow 1: File Upload
Client → Block Servers → Cloud Storage (S3)
Client → API Servers → Metadata DB (MySQL)
Flow 2: File Sync to Other Devices
Client A (uploads) → API Servers → Metadata DB
                                 → Notification Service
                                          ↓
                        Client B (same user, another device)
                          receives sync notification
                          downloads only changed blocks
Flow 3: File Download
Client → API Servers → Metadata DB (find block list)
Client → Block Servers / S3 (fetch blocks directly)
Client → Reassemble blocks → file
Component Overview
| Component | Purpose |
|---|---|
| Block Servers | Split files into blocks, compute hashes, compress, upload to S3 |
| Cloud Storage (S3) | Durable object storage for file blocks |
| Cold Storage (Glacier) | Old file versions not recently accessed |
| Load Balancers | Distribute API traffic |
| API Servers | Handle file CRUD, share, version requests |
| Metadata DB (MySQL) | Files, blocks, users, workspace, sharing info |
| Metadata Cache (Redis) | Cache frequently accessed file metadata |
| Notification Service | Push sync events to connected clients |
| Offline Backup Queue | Queue sync jobs for offline clients |
High-Level Architecture Diagram
┌──────────────┐      ┌─────────────────────────────────────────────┐
│   Client A   │      │                  API Layer                  │
│   (Laptop)   │      │                                             │
│  - Watcher   │      │  ┌──────────┐  ┌───────────┐  ┌─────────┐   │
│  - Chunker   │◄────►│  │   Load   │  │    API    │  │  Redis  │   │
│  - Indexer   │      │  │ Balancer │─►│  Servers  │─►│  Cache  │   │
│  - Sync      │      │  └──────────┘  └─────┬─────┘  └─────────┘   │
└──────────────┘      │                      │                      │
                      └──────────────────────┼──────────────────────┘
                                             │
                     ┌───────────────────────┼───────────────────────┐
                     ▼                       ▼                       ▼
               ┌───────────┐         ┌────────────────┐       ┌──────────────┐
               │ Metadata  │         │ Block Servers  │       │ Notification │
               │    DB     │         │ (hash, split,  │       │   Service    │
               │  (MySQL)  │         │   compress)    │       │(Long Polling)│
               └───────────┘         └───────┬────────┘       └──────┬───────┘
                                             │                       │
                                             ▼                       ▼
                                     ┌───────────────┐       ┌────────────────┐
                                     │  S3 (blocks)  │       │    Client B    │
                                     │               │       │    (Phone)     │
                                     │ Cold Storage  │       │ receives sync  │
                                     │ (Glacier for  │       │  notification  │
                                     │  old vers.)   │       └────────────────┘
                                     └───────────────┘
Step 3: Deep Dive (20 min)
Block Storage Design
Core idea: Instead of uploading whole files, split into fixed-size blocks and only sync changed blocks.
Why Blocks?
Without blocks:
User edits 1 line in a 100 MB Word doc
→ Must re-upload full 100 MB (wastes bandwidth)
With blocks (4 MB each):
100 MB file = 25 blocks
Edit affects only 1 block (true for an in-place change; an insertion that shifts later bytes changes every following fixed-size block)
→ Upload only 1 changed block (4 MB)
(25× less bandwidth)
Block Design
Block properties:
- Fixed size: 4 MB per block (tunable)
- Content-addressable: block identified by SHA256(content)
- Compressed: LZ4 or zlib before upload (20-40% size reduction is plausible for text-like data; already-compressed media such as JPEG or MP4 gains little)
- Encrypted: AES-256 before upload (client-side encryption option)
File → Blocks mapping:
document.pdf (12 MB)
├── Block A: SHA256=a1b2c3... (4 MB)
├── Block B: SHA256=d4e5f6... (4 MB)
└── Block C: SHA256=0a7b8c... (4 MB)
After edit (middle section changed):
├── Block A: SHA256=a1b2c3... (unchanged, DO NOT re-upload)
├── Block B: SHA256=9f8e7d... (CHANGED, upload this block)
└── Block C: SHA256=0a7b8c... (unchanged, DO NOT re-upload)
Upload: Only 4 MB instead of 12 MB → 3× bandwidth saving
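The split-and-hash step described above can be sketched with the standard library alone. This is an illustration, not the actual client: `split_into_blocks` and `block_hashes` are hypothetical names, and the demo uses a 4-byte block size so the effect of an in-place edit is easy to see.

```python
import hashlib
from typing import Iterator

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the tunable block size used in this chapter


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield consecutive fixed-size blocks; the last block may be shorter."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of every block, in file order."""
    return [hashlib.sha256(block).hexdigest()
            for block in split_into_blocks(data, block_size)]


# Demo with a tiny block size: an in-place edit in the middle block.
original = b"AAAA" + b"BBBB" + b"CCCC"
edited = b"AAAA" + b"BxBB" + b"CCCC"
before = block_hashes(original, block_size=4)
after = block_hashes(edited, block_size=4)
changed = [seq for seq, (old, new) in enumerate(zip(before, after)) if old != new]

if __name__ == "__main__":
    print("changed block positions:", changed)
```

Only the middle block's hash differs, so only that block would need to be uploaded. The same code also shows the limitation of fixed-size blocks: inserting a byte near the start shifts every later boundary and changes every later hash.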
Block Deduplication
Same content, stored once:
User A uploads vacation_photo.jpg (3 MB, fits in one block)
Block hash: SHA256 = "abc123"
Stored in S3 at key: blocks/abc123
User B also has same photo (common: shared family photos)
Block hash: SHA256 = "abc123" (same!)
S3 already has blocks/abc123 → DO NOT store again!
Just add reference in metadata DB: block abc123 → file ref count ++
Storage saving: 1 copy stored, N users reference it
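A minimal in-memory sketch of that behaviour follows. `BlockStore` is a hypothetical stand-in for S3 plus the reference count kept in the metadata DB; a real system would do the lookup and the increment inside a database transaction.

```python
import hashlib


class BlockStore:
    """In-memory stand-in for S3 (objects) and the blocks table (ref_counts)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.ref_counts: dict[str, int] = {}

    def put(self, content: bytes) -> str:
        """Store a block once; later puts of identical content only add a reference."""
        block_hash = hashlib.sha256(content).hexdigest()
        if block_hash in self.objects:
            self.ref_counts[block_hash] += 1
        else:
            self.objects[block_hash] = content
            self.ref_counts[block_hash] = 1
        return block_hash

    def release(self, block_hash: str) -> None:
        """Drop one reference; delete the block when nothing references it."""
        self.ref_counts[block_hash] -= 1
        if self.ref_counts[block_hash] == 0:
            del self.ref_counts[block_hash]
            del self.objects[block_hash]


if __name__ == "__main__":
    store = BlockStore()
    key_a = store.put(b"vacation photo bytes")   # User A
    key_b = store.put(b"vacation photo bytes")   # User B, same content
    print(key_a == key_b, len(store.objects), store.ref_counts[key_a])
```

The reference count is what makes deletion safe: a block is removed only after the last file version pointing at it is gone.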
Block deduplication in Metadata DB:
-- blocks table: one row per unique block
CREATE TABLE blocks (
block_hash VARCHAR(64) PRIMARY KEY, -- SHA256
s3_key VARCHAR(500) NOT NULL,
size_bytes INT,
ref_count INT DEFAULT 1, -- how many files reference this
created_at TIMESTAMP
);
-- file_blocks: which blocks make up each file version
CREATE TABLE file_blocks (
file_id VARCHAR(36),
version INT,
block_seq INT, -- position in file
block_hash VARCHAR(64),
PRIMARY KEY (file_id, version, block_seq)
);
Delta Sync
Delta sync = only transfer the changed portion of a file.
Delta sync flow:
1. Client Watcher detects file_changed event
2. Client Chunker re-splits file into blocks
3. Client Indexer computes SHA256 for each block
4. Client compares new block hashes vs last-known block hashes
5. Client sends only CHANGED blocks to Block Server
6. Block Server stores new blocks in S3
7. Metadata DB updated with new block list for new version
Example:
report.docx v1: [blockA, blockB, blockC, blockD]
report.docx v2: [blockA, blockB', blockC, blockD] (blockB changed)
Delta upload: Only blockB' (4 MB out of 16 MB) = 75% bandwidth saved
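Step 4 of the flow (compare new hashes against last-known hashes) reduces to a set lookup. The sketch below is an assumption about how a client might do it; `blocks_to_upload` is a hypothetical helper, and the block names stand in for SHA256 digests.

```python
def blocks_to_upload(old_hashes: list[str], new_hashes: list[str]) -> list[int]:
    """Return positions in the new version whose block content was not in the old version."""
    known = set(old_hashes)
    return [seq for seq, block_hash in enumerate(new_hashes) if block_hash not in known]


v1 = ["blockA", "blockB", "blockC", "blockD"]
v2 = ["blockA", "blockB'", "blockC", "blockD"]

to_upload = blocks_to_upload(v1, v2)
bandwidth_saved = 1 - len(to_upload) / len(v2)

if __name__ == "__main__":
    print("upload positions:", to_upload)
    print(f"bandwidth saved: {bandwidth_saved:.0%}")
```

Comparing against the set of known hashes rather than position by position means a block that merely moved within the file is not uploaded again.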
Sync Client Architecture
The client has four components working together:
┌──────────────────────────────────────────────────────────┐
│                       Sync Client                        │
│                                                          │
│  ┌─────────────┐    ┌──────────────┐    ┌──────────────┐ │
│  │   Watcher   │───►│   Chunker    │───►│   Indexer    │ │
│  │             │    │              │    │              │ │
│  │ Monitors    │    │ Splits file  │    │ Computes     │ │
│  │ local file  │    │ into 4MB     │    │ SHA256 per   │ │
│  │ system for  │    │ blocks       │    │ block        │ │
│  │ changes     │    │              │    │ Updates      │ │
│  │ (inotify /  │    │              │    │ local DB     │ │
│  │ FSEvents)   │    │              │    │              │ │
│  └─────────────┘    └──────────────┘    └──────┬───────┘ │
│                                                │         │
│                                        ┌───────▼───────┐ │
│                                        │  Sync Engine  │ │
│                                        │               │ │
│                                        │ Computes diff │ │
│                                        │ (which blocks │ │
│                                        │ changed)      │ │
│                                        │ Uploads delta │ │
│                                        │ Handles retry │ │
│                                        └───────────────┘ │
└──────────────────────────────────────────────────────────┘
| Component | Technology | Responsibility |
|---|---|---|
| Watcher | inotify (Linux), FSEvents (macOS), ReadDirectoryChangesW (Windows) | Detect file create/modify/delete/rename |
| Chunker | Custom splitter | Split file into 4 MB blocks at stable boundaries |
| Indexer | SHA256 hashing | Compute block hashes, maintain local block index DB |
| Sync Engine | Core logic | Diff old vs new block list, schedule uploads, handle conflicts |
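The Watcher row names OS-level APIs (inotify, FSEvents, ReadDirectoryChangesW). As a portable stand-in, the sketch below approximates the same events by polling with the standard library: record (mtime, size) per path and diff two snapshots. Polling is a swapped-in technique for illustration, not what a production client would likely use, and `snapshot` and `diff_snapshots` are hypothetical names. A rename shows up here as a delete plus a create.

```python
import os

Snapshot = dict[str, tuple[int, int]]  # path -> (mtime in ns, size in bytes)


def snapshot(root: str) -> Snapshot:
    """Record modification time and size for every file under root."""
    state: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            state[path] = (info.st_mtime_ns, info.st_size)
    return state


def diff_snapshots(old: Snapshot, new: Snapshot) -> dict[str, list[str]]:
    """Classify paths as created, deleted, or modified between two snapshots."""
    return {
        "created": sorted(p for p in new if p not in old),
        "deleted": sorted(p for p in old if p not in new),
        "modified": sorted(p for p in new if p in old and new[p] != old[p]),
    }


if __name__ == "__main__":
    before = snapshot(".")
    # ... files change on disk ...
    after = snapshot(".")
    print(diff_snapshots(before, after))
```

Each path reported as created or modified would be handed to the Chunker and Indexer; deleted paths go straight to the Sync Engine.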
Metadata DB Design
-- Users
CREATE TABLE users (
user_id VARCHAR(36) PRIMARY KEY,
email VARCHAR(255) UNIQUE,
storage_used_bytes BIGINT DEFAULT 0,
storage_limit_bytes BIGINT DEFAULT 10737418240 -- 10 GB
);
-- Files and Folders
CREATE TABLE files (
file_id VARCHAR(36) PRIMARY KEY,
owner_id VARCHAR(36),
parent_id VARCHAR(36), -- parent folder (NULL = root)
name VARCHAR(500),
is_folder BOOLEAN DEFAULT FALSE,
latest_version INT DEFAULT 1,
trashed BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP,
updated_at TIMESTAMP,
INDEX idx_owner_parent (owner_id, parent_id)
);
-- File Versions (revision history)
CREATE TABLE file_versions (
file_id VARCHAR(36),
version INT,
created_at TIMESTAMP,
size_bytes BIGINT,
checksum VARCHAR(64), -- SHA256 of full file
created_by VARCHAR(36), -- user who made this version
PRIMARY KEY (file_id, version)
);
-- Sharing permissions
CREATE TABLE file_shares (
file_id VARCHAR(36),
shared_with VARCHAR(36), -- user_id or group_id
permission ENUM('view', 'edit', 'owner'),
shared_at TIMESTAMP,
PRIMARY KEY (file_id, shared_with)
);
File Versioning
Versioning strategy:
Keep ALL block versions in S3 (never delete blocks in use)
Metadata DB tracks version history (which blocks = which version)
User can restore any previous version
Version 1: [blockA_v1, blockB_v1, blockC_v1] ← all blocks in S3
Version 2: [blockA_v1, blockB_v2, blockC_v1] ← only blockB changed
Version 3: [blockA_v1, blockB_v2, blockC_v3] ← only blockC changed
Storage efficiency:
Instead of 3 full file copies → store unique blocks only
Blocks shared across versions are stored ONCE
Version 3 storage = blockA_v1 + blockB_v2 + blockC_v3 (not 3× the file)
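The saving can be checked by counting unique blocks across the three versions. This sketch assumes every block is a full 4 MB; the dictionary layout is illustrative and mirrors what the file_blocks table would hold.

```python
BLOCK_MB = 4  # assumed size of every block in this example

versions = {
    1: ["blockA_v1", "blockB_v1", "blockC_v1"],
    2: ["blockA_v1", "blockB_v2", "blockC_v1"],
    3: ["blockA_v1", "blockB_v2", "blockC_v3"],
}


def unique_blocks(all_versions: dict[int, list[str]]) -> set[str]:
    """Blocks that must physically exist to serve every retained version."""
    stored: set[str] = set()
    for block_list in all_versions.values():
        stored.update(block_list)
    return stored


def restore(version: int) -> list[str]:
    """Restoring a version only needs its ordered block list, no data copy."""
    return list(versions[version])


naive_mb = sum(len(blocks) for blocks in versions.values()) * BLOCK_MB
actual_mb = len(unique_blocks(versions)) * BLOCK_MB

if __name__ == "__main__":
    print(f"three full copies: {naive_mb} MB, unique blocks only: {actual_mb} MB")
```

Three full copies would take 36 MB, while the five unique blocks take 20 MB, and any version can still be rebuilt from its block list.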
Version cleanup (storage management):
After 30 days: keep max 30 daily versions
After 1 year: keep max 12 monthly snapshots
Move old version blocks to S3 Glacier (cheaper)
Notification Service
When to use Long Polling vs WebSocket:
Options:
1. Long Polling (HTTP)
2. WebSocket (persistent TCP)
3. Server-Sent Events (SSE)
Why Long Polling for Drive sync?
Long Polling:
Client sends GET /changes?since=<timestamp>
Server holds request open (up to 30 sec)
When change happens: server responds immediately
Client processes change, immediately opens next poll
On no change: server responds after timeout, client re-polls
Why NOT WebSocket?
- WebSockets are bidirectional (client also pushes to server)
- Drive sync is mostly server → client (server tells client "here's a change")
- WebSocket overhead is higher for one-directional notifications
- Long polling simpler to implement with existing HTTP infrastructure
- Load balancers (HAProxy, NGINX) handle HTTP naturally; WebSocket needs config
When to prefer WebSocket:
- Real-time collaboration (Google Docs: many clients push edits simultaneously)
- Chat applications (truly bidirectional)
- Live dashboards with high update frequency
Long polling implementation:
Client: GET /api/changes?user_id=abc&since=1640000000&timeout=30
Server:
1. Check if any changes since `since` timestamp
2. If YES: return changes immediately (HTTP 200)
3. If NO: hold connection, subscribe to change events for this user
4. When change arrives: return immediately
5. After 30s timeout with no change: return empty (HTTP 200, [])
Client receives response → process → immediately re-issue request
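The server-side steps above can be sketched with `asyncio`. This is an in-memory illustration, not a real HTTP handler: `ChangeFeed` is a hypothetical class holding one user's changes, and it uses a monotonically increasing sequence number as the `since` cursor instead of a timestamp (an assumption made here to sidestep clock skew between clients and servers).

```python
import asyncio


class ChangeFeed:
    """One user's change log with long-poll support (in-memory sketch)."""

    def __init__(self) -> None:
        self._changes: list[tuple[int, str]] = []   # (sequence number, description)
        self._cond = asyncio.Condition()

    async def publish(self, description: str) -> int:
        """Append a change and wake every waiting poll."""
        async with self._cond:
            seq = len(self._changes) + 1
            self._changes.append((seq, description))
            self._cond.notify_all()
            return seq

    async def poll(self, since: int, timeout: float = 30.0) -> list[tuple[int, str]]:
        """Return changes after `since`, waiting up to `timeout` seconds if there are none."""
        async with self._cond:
            try:
                await asyncio.wait_for(
                    self._cond.wait_for(lambda: len(self._changes) > since),
                    timeout,
                )
            except asyncio.TimeoutError:
                return []                            # step 5: empty response
            return self._changes[since:]             # steps 2 and 4


async def demo() -> tuple[list[tuple[int, str]], list[tuple[int, str]]]:
    feed = ChangeFeed()
    waiter = asyncio.create_task(feed.poll(since=0, timeout=1.0))
    await asyncio.sleep(0.01)                        # let the poll start waiting
    await feed.publish("report.docx v2")
    delivered = await waiter                         # released by the publish
    idle = await feed.poll(since=1, timeout=0.05)    # nothing new: times out
    return delivered, idle


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The client loop is the mirror image: call `poll` with the highest sequence number seen so far, process the result, and call it again.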
Consistency Model
Strong consistency needed for:
- Metadata (file names, folder structure, versions)
- Why: User must see their own writes immediately
- If metadata inconsistent → sync could corrupt file structure
- Use: MySQL with synchronous replication, read from primary
Eventual consistency acceptable for:
- File content (block data in S3)
- Why: Blocks are immutable once written (content-addressable)
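- A block hash uniquely identifies content → no conflict possible
- Eventual consistency would be fine (a block either exists or it doesn't); S3 itself now provides strong read-after-write consistency, but the design does not depend on it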
- Use: S3 standard replication (99.999999999% durability)
Conflict resolution:
Conflict: Two devices edit same file while one is offline
Strategy: first sync wins; the later sync is kept as a conflicted copy, with user notification
- Both versions saved (nothing is overwritten)
- User sees "Conflicting version" warning
- User manually merges (like Dropbox "conflicted copy")
No auto-merge for binary files (photos, PDFs): too complex
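One way to detect the conflict is for each upload to carry the version it was based on; if the server has moved past that version, the upload is stored under a conflicted-copy name instead of replacing the file. The sketch below assumes that scheme. `resolve_upload` and `SyncResult` are hypothetical, and the naming pattern imitates the Dropbox-style example used later in this chapter.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    stored_as: str    # file name the upload ends up under
    conflict: bool    # True when the upload was diverted to a conflicted copy


def resolve_upload(filename: str, base_version: int, server_version: int,
                   device_owner: str, date: str) -> SyncResult:
    """Accept the upload if it was based on the server's latest version, else keep both."""
    if base_version == server_version:
        return SyncResult(stored_as=filename, conflict=False)
    # The server changed since this device last synced: preserve both edits.
    stem, dot, ext = filename.rpartition(".")
    if not dot:                       # no extension in the name
        stem, ext = filename, ""
    copy_name = f"{stem} ({device_owner}'s conflicted copy {date}){dot}{ext}"
    return SyncResult(stored_as=copy_name, conflict=True)


if __name__ == "__main__":
    # Laptop and phone both edited version 1 while offline; the laptop syncs first.
    print(resolve_upload("report.docx", 1, 1, "Ana", "2026-04-13"))
    print(resolve_upload("report.docx", 1, 2, "John", "2026-04-13"))
```

The first device to reconnect advances the server version, so the second device's upload no longer matches and is preserved as a separate file.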
Upload Flow with Pre-signed URLs
Upload flow (large file):
1. Client sends POST /upload/init
Body: { filename, size, sha256_checksum, block_hashes: [...] }
2. API Server:
- Creates file record in MySQL (status=UPLOADING)
- For each block: check if block_hash exists in blocks table
- Returns list of blocks that need uploading (others already exist!)
- Generates pre-signed S3 URL per missing block
3. Client uploads only MISSING blocks directly to S3
- Parallel uploads (up to 4 simultaneous)
- Retry on failure (exponential backoff)
4. Client sends POST /upload/complete
Body: { file_id, block_list: [{ seq, hash }] }
5. API Server:
- Verifies all block hashes exist in S3
- Creates file_version record
- Updates file status → READY
- Publishes change event to notification service
Flow diagram:
Client ──POST /upload/init──► API Server
API Server ──────────────────► MySQL (check existing blocks)
API Server ◄────────────────── missing blocks list
API Server ──generate presigned URLs──► S3
Client ◄──── presigned URLs for missing blocks ──── API Server
Client ──PUT block directly──► S3 (4 MB per block, parallel)
Client ──POST /upload/complete──► API Server
API Server ──update metadata──► MySQL
API Server ──publish change──► Notification Service
Other devices receive sync event
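The init and complete steps can be sketched in memory to show where deduplication happens. Everything here is a stand-in: `UploadService` is hypothetical, a set replaces the blocks table and S3, no pre-signed URLs are generated, and the init call is assumed to carry the list of block hashes so the server can tell which blocks are missing.

```python
class UploadService:
    """In-memory sketch of POST /upload/init and POST /upload/complete."""

    def __init__(self) -> None:
        self.stored_blocks: set[str] = set()    # stand-in for blocks table + S3
        self.files: dict[str, dict] = {}        # file_id -> {"status", "blocks"}

    def init_upload(self, file_id: str, block_hashes: list[str]) -> list[str]:
        """Create the file record and return only the hashes that still need uploading."""
        self.files[file_id] = {"status": "UPLOADING", "blocks": []}
        # dict.fromkeys removes repeats within the file while keeping order.
        return [h for h in dict.fromkeys(block_hashes) if h not in self.stored_blocks]

    def put_block(self, block_hash: str) -> None:
        """Stand-in for the client's direct PUT to storage."""
        self.stored_blocks.add(block_hash)

    def complete_upload(self, file_id: str, block_hashes: list[str]) -> bool:
        """Mark the file READY only if every referenced block is present."""
        if any(h not in self.stored_blocks for h in block_hashes):
            return False
        self.files[file_id] = {"status": "READY", "blocks": list(block_hashes)}
        return True


if __name__ == "__main__":
    service = UploadService()
    service.put_block("hashA")                          # uploaded earlier by someone else
    missing = service.init_upload("file-1", ["hashA", "hashB", "hashC"])
    print("must upload:", missing)
    for block_hash in missing:
        service.put_block(block_hash)
    print("ready:", service.complete_upload("file-1", ["hashA", "hashB", "hashC"]))
```

Refusing to complete while any block is missing is what keeps a half-finished upload from ever becoming a visible file version.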
Storage Tiering
Hot storage (S3 Standard):
Current file versions, recently accessed files
Cost: $0.023/GB/month
Access time: milliseconds
Warm storage (S3 Standard-IA):
Files not accessed in 30 days
Cost: $0.0125/GB/month (46% cheaper)
Access time: milliseconds (slightly higher retrieval fee)
Cold storage (S3 Glacier Instant Retrieval):
Old file versions (not current), not accessed in 90+ days
Cost: $0.004/GB/month (83% cheaper than standard)
Access time: milliseconds (retrieval cost applies)
Archive (S3 Glacier Deep Archive):
Very old versions, legal/compliance retention
Cost: $0.00099/GB/month
Access time: within 12 hours (standard retrieval)
Lifecycle policy:
Day 0: Upload → S3 Standard
Day 30: → S3 Standard-IA (if not accessed)
Day 90: old versions → S3 Glacier Instant Retrieval
Day 365: → S3 Glacier Deep Archive
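The lifecycle policy can be written as a small decision function, with the savings percentages recomputed from the prices listed above. The tier keys and the `choose_tier` helper are hypothetical, and the rule that only non-current versions move to the Glacier tiers is an assumption drawn from the "old versions" wording in this section.

```python
# Prices per GB per month, copied from the tier list above.
PRICE_PER_GB_MONTH = {
    "standard": 0.023,
    "standard_ia": 0.0125,
    "glacier_instant": 0.004,
    "deep_archive": 0.00099,
}


def choose_tier(days_since_access: int, is_current_version: bool) -> str:
    """Pick a storage tier from age and whether this is the file's current version."""
    if not is_current_version and days_since_access >= 365:
        return "deep_archive"
    if not is_current_version and days_since_access >= 90:
        return "glacier_instant"
    if days_since_access >= 30:
        return "standard_ia"
    return "standard"


def percent_cheaper(tier: str) -> int:
    """Storage price reduction relative to S3 Standard, rounded to a whole percent."""
    saving = 1 - PRICE_PER_GB_MONTH[tier] / PRICE_PER_GB_MONTH["standard"]
    return round(saving * 100)


if __name__ == "__main__":
    for days, current in [(0, True), (45, True), (120, False), (400, False)]:
        tier = choose_tier(days, current)
        print(f"{days:>3} days, current={current}: {tier} ({percent_cheaper(tier)}% cheaper)")
```

These figures cover storage price only; retrieval fees and minimum storage durations on the colder tiers would also need to be weighed.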
Design Summary
Final Architecture
┌────────────────────────────────────────────────────────────────────┐
│                      CLIENT (Desktop/Mobile)                       │
│                                                                    │
│  Watcher → Chunker → Indexer → Sync Engine                         │
│  (detect)  (split)   (hash)    (upload delta, handle conflicts)    │
└──────────────────────────────┬─────────────────────────────────────┘
                               │ HTTPS
              ┌────────────────▼────────────────────┐
              │            Load Balancer            │
              └────────────────┬────────────────────┘
                               │
         ┌─────────────────────┼──────────────────────┐
         ▼                     ▼                      ▼
  ┌─────────────┐       ┌──────────────┐       ┌──────────────┐
  │ API Server  │       │ Block Server │       │ Notification │
  │             │       │              │       │    Server    │
  │ - File CRUD │       │ - Split      │       │              │
  │ - Share     │       │ - Hash       │       │ - Long poll  │
  │ - Versions  │       │ - Compress   │       │ - Fan out    │
  └──────┬──────┘       └──────┬───────┘       └──────┬───────┘
         │                     │                      │
         ▼                     ▼                      ▼
  ┌─────────────┐       ┌──────────────┐       ┌──────────────┐
  │    MySQL    │       │      S3      │       │   Client B   │
  │  Metadata   │       │   (blocks)   │       │   (phone,    │
  │ - Files     │       │              │       │   tablet)    │
  │ - Blocks    │       │  S3 Glacier  │       │              │
  │ - Versions  │       │ (old vers.)  │       │  downloads   │
  │ - Shares    │       └──────────────┘       │   changed    │
  └──────┬──────┘                              │ blocks only  │
         │                                     └──────────────┘
  ┌──────▼──────┐
  │ Redis Cache │
  │ (metadata)  │
  └─────────────┘
Key Decisions Summary
| Decision | Choice | Reasoning |
|---|---|---|
| File storage | Block-level (4 MB chunks) | Delta sync: only upload changed blocks |
| Block ID | SHA256 hash (content-addressable) | Deduplication, immutable blocks |
| Deduplication | Block hash lookup before upload | Store each block once, huge storage savings |
| Metadata DB | MySQL (strong consistency) | ACID for file metadata, version history |
| Metadata cache | Redis in front of MySQL | Fast reads for hot metadata (file listings, recent files) |
| File content | S3 (eventual consistency OK) | Immutable blocks: no conflicts |
| Notification | Long polling | Mostly server→client, simpler than WebSocket |
| Upload method | Pre-signed URLs + multipart | Block servers not a bottleneck |
| Old versions | S3 Glacier | 83% cost reduction for cold data |
Interview Questions & Answers
Q: Why split files into 4 MB blocks instead of uploading the whole file?
A: Delta sync. When a user edits a large file (e.g., modifies one paragraph in a 100 MB document), only one or two 4 MB blocks change. Instead of uploading 100 MB again, only the changed blocks are uploaded, potentially 1/25th of the bandwidth. Block-level storage also enables deduplication: if two users have the same file (or a file shares blocks with another), only one copy of each unique block is stored. This saves both bandwidth and storage.
Q: How does block deduplication work? What are the risks?
A: Before uploading a block, the client computes its SHA256 hash and sends it to the API server. The server checks if that hash exists in the blocks table. If it does, the block is already in S3: skip the upload and just record the reference. Risk: hash collision (two different blocks produce the same SHA256), theoretically possible but astronomically unlikely (2^256 space). Risk: hash-claim attack, where an attacker who knows only the hash of a block "claims" it and gains access to content they never possessed; the skipped upload also reveals that someone already stored that block. Mitigation: require proof that the client holds the actual bytes before adding a reference, and for sensitive files use client-side encryption with a per-user key (encrypt before hashing, so an attacker cannot read the content even if they know the hash), at the cost of cross-user deduplication for those files.
Q: Why long polling for the notification service instead of WebSocket?
A: Drive sync is primarily a server-to-client flow: the server notifies clients that a change has occurred. WebSocket is optimized for bidirectional communication. Long polling is simpler to implement, works through existing HTTP infrastructure and proxies, doesn't require special load balancer configuration, and has lower per-connection overhead for infrequent notifications. WebSocket would be preferred for real-time co-editing (Google Docs) where multiple clients are simultaneously pushing changes.
Q: How do you handle file conflicts when two devices edit the same file offline?
A: When both devices reconnect, the system detects a conflict (both have a version newer than the last synced state). Both versions are preserved: the second sync creates a "conflicted copy" with a different filename (like Dropbox does: "report (John's conflicted copy 2026-04-13).docx"). The user is notified and must manually merge. Auto-merge is only feasible for text files where a three-way merge algorithm (like git) can be applied. Binary files (photos, PDFs) cannot be auto-merged.
Q: How do you ensure strong consistency for metadata?
A: Metadata uses MySQL with synchronous replication to a hot standby. All writes go to the primary; reads go to the primary too (or a replica with bounded staleness lag). Redis cache in front of MySQL for hot reads (file listings, recent files). For critical operations (sharing permissions, version creation), always read from MySQL primary to avoid stale cache. Use transactions for multi-table updates (e.g., creating a version + updating latest_version on the file row atomically).
Q: How would you scale to 500 million users?
A: (1) Shard MySQL by user_id, so each shard handles a subset of users. (2) Federate by user geography (EU data stays in EU for GDPR). (3) Scale block servers horizontally, since they're stateless. (4) S3 scales horizontally on its own, so object storage capacity is not the bottleneck. (5) Notification service: use pub/sub (Kafka), where each user_id maps to a partition and notification consumers fan out to connected clients. (6) Add read replicas for metadata DB. (7) Add CDN for file downloads (popular shared files cached at edge).
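Point (1) needs a stable mapping from user_id to shard. A minimal sketch, assuming simple hash-modulo placement: `shard_for_user` is a hypothetical helper, and a real deployment would more likely use consistent hashing or a lookup table so that adding shards does not move most users.

```python
import hashlib


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user_id to a shard index that is stable across processes and restarts."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for user_id in ["user-001", "user-002", "user-003"]:
        print(user_id, "->", shard_for_user(user_id, num_shards=16))
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the built-in is randomized per process for strings and would route the same user to different shards on different servers.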
Key Takeaways
- Block-level storage (4 MB chunks) enables delta sync, the defining optimization for cloud storage
- Content-addressable storage (SHA256 block ID) enables deduplication: each unique block stored exactly once
- Delta sync keeps sync practical on mobile and metered connections: only changed blocks are transferred
- Strong consistency for metadata (MySQL), eventual OK for content (S3 immutable blocks)
- Long polling for sync notifications: simpler than WebSocket for a mostly server→client flow
- Pre-signed URLs keep block servers out of the data path for large file transfers
- Storage tiering (Standard → IA → Glacier) is essential for cost management at 500 PB scale
- Conflict resolution = preserve both versions + notify user; never silently overwrite data
Related Resources
- distributed-system-components - Blob storage, message queues, CDN
- key-patterns - Content-addressable storage, delta sync, deduplication
- ch04-rate-limiter - Rate limit upload API
- ch14-youtube - Similar blob storage and CDN patterns
Practice this design! Common hard interview question. Be ready to:
- Explain block storage and why it enables delta sync
- Draw the sync flow from Client A edit → Client B receives update
- Discuss consistency choices (strong for metadata, eventual for blocks)
- Handle conflict resolution and offline sync
- Talk through storage tiering for cost optimization
Last Updated: 2026-04-13
Status: Common hard interview question - Must know!