Chapter 5 Designing Dropbox Flashcards

1
Q

Requirements

A
  1. Upload files from any device
  2. Share with other users
  3. Automatic sync between devices
  4. Large Files
  5. ACID compliant
  6. Offline operations

Extended:
Snapshots

2
Q

Design considerations

A
  1. Huge read and write volumes.
  2. The read-to-write ratio is expected to be roughly 1:1.
  3. Internally, files can be stored as fixed-size chunks.
  4. Reduce data traffic by transferring only the updated chunks.
  5. Deduplicate the chunks.
  6. Keeping a copy of the metadata on the client saves many round trips to the server.
  7. For small changes, the client can upload just the diff.
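
Fixed-size chunking and chunk deduplication can be sketched as below. This is a minimal illustration, not the actual Dropbox implementation: `ChunkStore`, `store_file`, the 4-byte chunk size, and the choice of SHA-256 as the chunk identifier are all assumptions made for the demo.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny on purpose for the demo (assumed value)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class ChunkStore:
    """Hypothetical content-addressed store: one copy per distinct chunk."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()
        self.blobs.setdefault(digest, chunk)  # a duplicate chunk is stored only once
        return digest


def store_file(store: ChunkStore, data: bytes) -> list[str]:
    """Return the ordered list of chunk hashes that describes the file."""
    return [store.put(chunk) for chunk in split_into_chunks(data)]


store = ChunkStore()
data = b"AAAABBBBAAAACC"
recipe = store_file(store, data)
print(len(recipe), len(store.blobs))  # 4 chunks referenced, 3 distinct chunks stored
```

The file is described by four chunk hashes, but the repeated `AAAA` chunk is stored once, so only three blobs exist.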
3
Q

Capacity Estimation

A
  1. 500M users, 100M daily active
  2. Each user uses 3 different devices
  3. About 100 billion files in total
  4. 10 PB of storage
  5. 1 million active connections per minute
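
The averages implied by these figures can be checked with a few lines of arithmetic. The files-per-user and average-file-size values below are derived from the totals, not stated on the card, and 10 PB is assumed to be decimal (10^16 bytes).

```python
TOTAL_USERS = 500_000_000
DAILY_ACTIVE_USERS = 100_000_000
TOTAL_FILES = 100_000_000_000
TOTAL_STORAGE_BYTES = 10 * 10**15  # 10 PB, decimal units (assumption)
CONNECTIONS_PER_MINUTE = 1_000_000

# Derived values (not given on the card)
files_per_user = TOTAL_FILES // TOTAL_USERS
avg_file_size_kb = TOTAL_STORAGE_BYTES / TOTAL_FILES / 1000
daily_active_share = DAILY_ACTIVE_USERS / TOTAL_USERS
connections_per_second = CONNECTIONS_PER_MINUTE / 60

print(files_per_user)                 # 200
print(avg_file_size_kb)               # 100.0
print(daily_active_share)             # 0.2
print(round(connections_per_second))  # 16667
```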
4
Q

High level design

A
  1. Block server
  2. Metadata server
  3. Synchronization server
5
Q

Client design

A
  1. Metadata is stored on the client as well.
  2. Files are chunked; the chunk size is chosen based on average file size, cloud storage block size, IOPS, etc.
  3. Only modified chunks are synced to the backend.
  4. Metadata changes are synced to the sync server.
  5. Updates from the sync server are applied.
  6. Internal metadata database
  7. Chunker
  8. Watcher
  9. Indexer
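
How a client might decide which chunks to sync can be sketched by comparing the chunk hashes in its local metadata with the hashes of the edited file. The function names, the 4-byte chunk size, and SHA-256 are assumptions for illustration only.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int) -> list[str]:
    """Hash each fixed-size chunk of the file, in order."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def modified_chunk_indexes(old: list[str], new: list[str]) -> list[int]:
    """Indexes in `new` whose hash differs from `old` or that did not exist before."""
    return [i for i, h in enumerate(new) if i >= len(old) or old[i] != h]


# Hashes the client already holds in its local metadata
local_metadata = chunk_hashes(b"aaaabbbbcccc", 4)
# Hashes after the user edits the second chunk and appends two bytes
after_edit = chunk_hashes(b"aaaaBBBBccccdd", 4)

to_upload = modified_chunk_indexes(local_metadata, after_edit)
print(to_upload)  # [1, 3]
```

Only chunks 1 and 3 need to be uploaded. Note that with fixed-size chunks, inserting bytes in the middle of a file shifts every later chunk boundary, so all following chunks would be reported as modified.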
6
Q

Metadata Database

A
  1. Chunks
  2. Files
  3. User
  4. Devices
  5. Workspace
7
Q

Sync service

A
  1. Metadata synchronization from clients.
  2. Pushing updates to clients.
  3. Message queue
  4. Dedup blocks
  5. Sync blocks
8
Q

Message queueing service

A
  1. Request queue
  2. Response queue
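
One way the two queues could fit together is sketched below, assuming a single shared request queue and one response queue per client. The card does not specify that layout; `SyncQueues` and its methods are hypothetical names.

```python
from collections import deque


class SyncQueues:
    """Hypothetical in-memory stand-in for the message queueing service."""

    def __init__(self, client_ids):
        self.request_queue = deque()  # shared by all clients (assumption)
        self.response_queues = {cid: deque() for cid in client_ids}  # one per client (assumption)

    def submit(self, sender, update):
        """A client enqueues an update request."""
        self.request_queue.append((sender, update))

    def process_one(self):
        """Take one request and fan the update out to every other client."""
        sender, update = self.request_queue.popleft()
        for cid, responses in self.response_queues.items():
            if cid != sender:
                responses.append(update)


queues = SyncQueues(["A", "B", "C"])
queues.submit("A", "report.txt changed")
queues.process_one()
print(list(queues.response_queues["B"]))  # ['report.txt changed']
```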

9
Q

Storage service

A

Stores the file chunks in cloud storage

10
Q

File processing workflow

A
  1. Client A uploads chunks to cloud storage.
  2. Client A updates the metadata and commits the changes.
  3. Notifications are sent to Clients B and C.
  4. Clients B and C receive the metadata changes and download the updated chunks.
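
The four steps can be walked through with an in-memory simulation. All names here (`cloud_chunks`, `metadata`, `inboxes`, `upload_and_commit`, `sync`) are hypothetical, and plain dictionaries stand in for the real storage, metadata, and notification services.

```python
cloud_chunks: dict[str, bytes] = {}           # chunk id -> chunk bytes
metadata: dict[str, list[str]] = {}           # file name -> ordered chunk ids
inboxes: dict[str, list[str]] = {"B": [], "C": []}  # pending notifications per client


def upload_and_commit(name: str, chunks: dict[str, bytes]) -> None:
    # Step 1: client A uploads the chunks
    for chunk_id, data in chunks.items():
        cloud_chunks[chunk_id] = data
    # Step 2: the metadata is updated and committed
    metadata[name] = list(chunks)
    # Step 3: the other clients are notified
    for inbox in inboxes.values():
        inbox.append(name)


def sync(client: str) -> dict[str, bytes]:
    # Step 4: the client reads the metadata changes and downloads the chunks
    files = {}
    while inboxes[client]:
        name = inboxes[client].pop(0)
        files[name] = b"".join(cloud_chunks[cid] for cid in metadata[name])
    return files


upload_and_commit("notes.txt", {"c1": b"hello ", "c2": b"world"})
synced_b = sync("B")
print(synced_b)  # {'notes.txt': b'hello world'}
```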
11
Q

Deduplication

A
  1. Post-process deduplication
  2. Inline deduplication
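
The difference between the two approaches is when the duplicate check happens: inline deduplication checks the hash before a chunk is written, while post-process deduplication writes everything first and removes duplicates later. A minimal sketch, with hypothetical function names and SHA-256 assumed as the chunk hash:

```python
import hashlib


def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def inline_write(index: dict[str, bytes], chunk: bytes) -> bool:
    """Inline: check the hash before writing. Returns True if new bytes were stored."""
    key = digest(chunk)
    if key in index:
        return False  # duplicate, nothing is written
    index[key] = chunk
    return True


def post_process(raw_blocks: list[bytes]) -> dict[str, bytes]:
    """Post-process: blocks were already written; collapse duplicates afterwards."""
    index: dict[str, bytes] = {}
    for chunk in raw_blocks:
        index.setdefault(digest(chunk), chunk)
    return index


incoming = [b"x", b"y", b"x"]

inline_index: dict[str, bytes] = {}
written = [inline_write(inline_index, c) for c in incoming]
print(written)  # [True, True, False]

raw_blocks = list(incoming)            # all three blocks are stored first
deduplicated = post_process(raw_blocks)
print(len(raw_blocks), len(deduplicated))  # 3 2
```

Both end with the same two unique chunks; the post-process path temporarily holds the duplicate, while the inline path never writes it.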

12
Q

Metadata partitioning

A

Use consistent hashing
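
A minimal consistent-hash ring is sketched below to show why it suits partitioning: removing a node only remaps the keys that node owned. The `HashRing` class, the node names, the use of SHA-256 for ring positions, and 100 virtual nodes per server are all assumptions.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        self.vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node is placed at several positions to even out the load
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_position(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        # First ring entry at or after the key's position, wrapping around
        idx = bisect.bisect_left(self._ring, (ring_position(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["meta-1", "meta-2", "meta-3"])
keys = [f"file-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("meta-3")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(all(before[k] == "meta-3" for k in moved))  # True
```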

13
Q

Caching

A
  1. Hot files/chunks: block storage cache
  2. Metadata cache
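
A cache for hot chunks needs an eviction policy once it fills up. The card does not name one; the sketch below assumes least-recently-used (LRU) eviction, and `LRUCache` is a hypothetical helper built on `collections.OrderedDict`.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical chunk cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict[str, bytes] = OrderedDict()

    def get(self, key: str):
        if key not in self._items:
            return None  # cache miss: the caller falls back to block storage
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("chunk-1", b"aaaa")
cache.put("chunk-2", b"bbbb")
cache.get("chunk-1")           # chunk-1 is now the most recently used
cache.put("chunk-3", b"cccc")  # evicts chunk-2
print(cache.get("chunk-2"))    # None
```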

14
Q

Load balancer

A
  1. Between clients and block servers
  2. Between clients and metadata servers

15
Q

Security

A
  1. Store each file's permissions together with its metadata