Chapter 5 Designing Dropbox Flashcards
1
Q
Requirements
A
- Upload files from any device
- Share with other users
- Automatic sync between devices
- Support for large files
- ACID compliant
- Offline operations
Extended:
Snapshots
2
Q
Design considerations
A
- Huge write and read volume
- Read and write volumes are expected to be roughly equal (about a 1:1 ratio)
- Internally files can be stored as chunks of fixed size.
- Reduce data traffic by updating chunks
- Dedup the chunks
- Keeping a copy of metadata with client can save a lot of round trips to the server
- For small changes, client can just upload the diff.
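The chunking points above can be illustrated with a minimal sketch. It assumes a 4 MB chunk size and SHA-256 as the chunk fingerprint; neither value comes from these notes, and the function names are hypothetical. Comparing the per-chunk hashes of two versions shows which chunks actually need to be uploaded.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size (4 MB); the notes do not fix a value


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a byte string into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of every chunk, in order."""
    return [hashlib.sha256(c).hexdigest() for c in split_into_chunks(data, chunk_size)]


def changed_chunk_indexes(old: list[str], new: list[str]) -> list[int]:
    """Indexes of chunks in the new version that differ from the old version."""
    return [i for i, h in enumerate(new) if i >= len(old) or old[i] != h]


# Tiny chunk size so the example is easy to follow.
old_version = b"a" * 10 + b"b" * 10 + b"c" * 10
new_version = b"a" * 10 + b"x" * 10 + b"c" * 10
to_upload = changed_chunk_indexes(
    chunk_hashes(old_version, chunk_size=10),
    chunk_hashes(new_version, chunk_size=10),
)
print(to_upload)  # only the middle chunk changed
```

Note that fixed-size chunking only localizes in-place edits; an insertion shifts every later chunk boundary, which is one reason uploading a diff is listed separately for small changes.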
3
Q
Capacity Estimation
A
- 500M users, 100M daily active
- Each user uses 3 different devices
- About 100 billion files in total
- 10 PB of storage
- 1 million active connections per minute
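The figures above imply an average number of files per user and an average file size. A short calculation makes that explicit; it assumes decimal units (1 PB = 10^15 bytes), and the variable names are illustrative only.

```python
total_users = 500_000_000
total_files = 100_000_000_000
total_storage_bytes = 10 * 10**15  # 10 PB, assuming decimal petabytes

# Derived averages implied by the estimates above.
files_per_user = total_files // total_users
avg_file_size_bytes = total_storage_bytes // total_files

print(files_per_user)               # 200 files per user
print(avg_file_size_bytes // 1000)  # 100 KB per file
```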
4
Q
High level design
A
- Block server
- Metadata server
- Synchronization server
5
Q
Client design
A
- Metadata saved in client as well.
- Chunk size is chosen based on factors such as average file size, cloud storage block size, and IOPS.
- Only modified chunks are synced to backend.
- Metadata changes are synced to sync server.
- Updates from sync server are applied.
- Internal metadata database
- Chunker
- Watcher
- Indexer
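As a rough illustration of the Watcher's job, the sketch below detects created, modified, and deleted files by comparing two snapshots of a folder. Polling with `os.walk` is an assumption made for simplicity (a real client would more likely subscribe to OS file-change notifications), and `snapshot` and `diff_snapshots` are hypothetical names.

```python
import os


def snapshot(root: str) -> dict[str, tuple[int, int]]:
    """Map every file path under root to (modification time in ns, size)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(before: dict, after: dict) -> dict[str, list[str]]:
    """Classify paths as created, modified, or deleted between two snapshots."""
    return {
        "created": sorted(p for p in after if p not in before),
        "modified": sorted(p for p in after if p in before and after[p] != before[p]),
        "deleted": sorted(p for p in before if p not in after),
    }
```

The Watcher would hand the resulting events to the Indexer, which updates the internal metadata database and asks the Chunker to re-split the affected files.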
6
Q
Metadata Database
A
- Chunks
- Files
- User
- Devices
- Workspace
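One way to picture these five entities is as plain records. The sketch below is only illustrative: every field name is an assumption, since the notes list the entities but not their attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_hash: str  # content hash, assumed to double as the chunk ID
    size: int


@dataclass
class User:
    user_id: str
    name: str


@dataclass
class Device:
    device_id: str
    user_id: str  # owner of the device


@dataclass
class Workspace:
    workspace_id: str
    owner_id: str
    member_ids: set[str] = field(default_factory=set)  # users the workspace is shared with


@dataclass
class FileRecord:
    file_id: str
    workspace_id: str
    path: str
    version: int = 1
    chunk_hashes: list[str] = field(default_factory=list)  # ordered chunks of the file


# Example: one shared workspace containing a two-chunk file.
ws = Workspace("ws-1", owner_id="u-1", member_ids={"u-2"})
doc = FileRecord("f-1", ws.workspace_id, "/notes.txt", chunk_hashes=["h1", "h2"])
```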
7
Q
Sync service
A
- Metadata synchronization from clients.
- Pushing updates to clients.
- Message queue
- Dedup blocks
- Sync blocks
8
Q
Message queueing service
A
- Request queue
- Response queue
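A minimal in-process sketch of the two queue types follows. It assumes one shared request queue and one response queue per client, and uses Python's `queue.Queue` as a stand-in for a real message broker; the function names are hypothetical.

```python
from queue import Queue

request_queue: Queue = Queue()           # shared: every client publishes updates here
response_queues: dict[str, Queue] = {}   # one per client: updates to deliver


def register_client(client_id: str) -> None:
    """Create the response queue for a newly connected client."""
    response_queues[client_id] = Queue()


def publish_update(sender_id: str, update: str) -> None:
    """A client puts its update on the shared request queue."""
    request_queue.put((sender_id, update))


def dispatch_one() -> None:
    """Take one update off the request queue and fan it out to every other client."""
    sender_id, update = request_queue.get()
    for client_id, q in response_queues.items():
        if client_id != sender_id:
            q.put(update)
```

Keeping a separate response queue per client lets an offline client pick up its pending updates when it reconnects.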
9
Q
Storage service
A
Save chunks in cloud
10
Q
File processing workflow
A
- Client A uploads chunks to the cloud.
- Client A updates the metadata and commits the changes.
- Notifications are sent to clients B and C.
- B and C receive metadata changes and download chunks
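The four steps can be traced with a toy in-memory simulation. Dictionaries stand in for cloud storage, the metadata database, and the notification channel; all names here are assumptions made for illustration.

```python
import hashlib

cloud_chunks: dict[str, bytes] = {}                  # chunk hash -> chunk bytes
metadata: dict[str, list[str]] = {}                  # file path -> ordered chunk hashes
inboxes: dict[str, list[str]] = {"B": [], "C": []}   # pending notifications per client


def upload(path: str, chunks: list[bytes]) -> None:
    """Client A's side of the workflow."""
    hashes = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        cloud_chunks[digest] = chunk      # step 1: upload chunks to the cloud
        hashes.append(digest)
    metadata[path] = hashes               # step 2: update metadata and commit
    for inbox in inboxes.values():        # step 3: notify the other clients
        inbox.append(path)


def download_pending(client_id: str) -> dict[str, bytes]:
    """Step 4: a notified client reads the metadata and downloads the chunks."""
    files = {}
    for path in inboxes[client_id]:
        files[path] = b"".join(cloud_chunks[h] for h in metadata[path])
    inboxes[client_id].clear()
    return files


upload("notes.txt", [b"hello ", b"world"])
print(download_pending("B"))  # {'notes.txt': b'hello world'}
```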
11
Q
Deduplication
A
- Post-process deduplication
- Inline deduplication
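Inline deduplication checks for a duplicate at write time, before the chunk is stored, while post-process deduplication stores everything first and removes duplicates later. Below is a minimal sketch of the inline variant, assuming SHA-256 content hashes as chunk IDs; the class and its fields are hypothetical.

```python
import hashlib


class InlineDedupStore:
    """Chunk store that skips writing chunks it already holds."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}     # chunk hash -> chunk bytes
        self.ref_counts: dict[str, int] = {}  # chunk hash -> number of references

    def put(self, chunk: bytes) -> str:
        """Store the chunk only if its hash is new; always count the reference."""
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = chunk
        self.ref_counts[digest] = self.ref_counts.get(digest, 0) + 1
        return digest


store = InlineDedupStore()
first = store.put(b"same bytes")
second = store.put(b"same bytes")
print(first == second, len(store.blobs))  # True 1
```

If the client sends the hash before the chunk, the same check also saves the network transfer, not just the storage.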
12
Q
Metadata partitioning
A
Use consistent hashing
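A minimal consistent-hash ring is sketched below. The use of virtual nodes, the count of 100 per server, and SHA-256 for ring positions are all assumptions; the notes only name the technique. Each key is owned by the first virtual node at or after its position on the ring.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place vnodes virtual copies of the node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get_node(self, key: str) -> str:
        """Return the node owning the key: first ring entry clockwise from it."""
        if not self._ring:
            raise ValueError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around at the end
```

The benefit for metadata partitioning is that adding a server moves only the keys that land on the new server, rather than reshuffling every key as plain `hash(key) % n` would.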
13
Q
Caching
A
- Block storage cache for hot files and chunks
- Metadata cache
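Either cache needs an eviction policy once it fills up. The sketch below assumes least-recently-used (LRU) eviction, which the notes do not specify, and the class name is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss; a hit refreshes recency."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value) -> None:
        """Insert or update an entry, evicting the oldest one if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)
```

For the block cache the key would be a chunk hash and the value the chunk bytes; for the metadata cache the key could be a file or user ID.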
14
Q
Load balancer
A
- Between clients and block servers
- Between clients and metadata servers
15
Q
Security
A
- Store metadata with permissions