Distributed Data Flashcards
1
Q
Read-after-write
A
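“Read-after-write” (also called read-your-writes) is a consistency guarantee that works around replication lag in an eventually consistent distributed database: a client always sees its own writes.
Imagine you post something to a forum, refresh the page, and don’t see your post. The read was served by a secondary that had not yet received the write; this is eventual consistency in action.
To avoid this, when a client writes a record, the primary returns the record (and its version #) in the response. When the client reads again, it passes this version #, and a secondary replies only if it has that version or higher. Otherwise the read must wait or be served by another node, such as the primary.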
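The version-token idea above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real database client: the class names (Replica, Primary, Secondary), the replicate_from helper, and the choice to return None for a stale read are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node's copy of the data: key -> (version, value)."""
    store: dict = field(default_factory=dict)

    def version_of(self, key):
        # Version 0 means "this node has never seen the key".
        return self.store.get(key, (0, None))[0]


class Primary(Replica):
    def write(self, key, value):
        # Bump the version and hand it back to the client as a token.
        version = self.version_of(key) + 1
        self.store[key] = (version, value)
        return version, value


class Secondary(Replica):
    def replicate_from(self, primary, key):
        # Hypothetical stand-in for asynchronous replication catching up.
        self.store[key] = primary.store[key]

    def read(self, key, min_version=0):
        # Reply only if this node has the client's version or newer.
        version, value = self.store.get(key, (0, None))
        if version < min_version:
            return None  # too stale; the client should retry or go elsewhere
        return version, value


primary, secondary = Primary(), Secondary()
token, _ = primary.write("post:1", "hello")

stale = secondary.read("post:1", min_version=token)  # replica is still behind
secondary.replicate_from(primary, "post:1")
fresh = secondary.read("post:1", min_version=token)  # replica has caught up
```

Here the first read is refused because the secondary has not yet seen version 1, and the second succeeds once replication has caught up. A real system would typically retry, block briefly, or route the read to the primary instead of returning None.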
2
Q
Reasons for SQL vs NoSQL
Sample data for NoSQL
A
SQL:
- Structured data
- Strict schema
- Transactions
- Complex joins
- Clear patterns for scaling
- More established ecosystem: developers, community, code, etc.
- Index lookups are very fast
NoSQL:
- Semi-structured data
- Dynamic or flexible schema
- Non-relational data
- No need for complex joins
- Store many TB/PB of data
- Very data-intensive workload
- Very high IOPS
Sample data for NoSQL:
- Rapid ingest of clickstream or log data
- Leaderboard / scoring data
- Temporary data such as shopping cart
- Frequently accessed (“hot”) tables
- Metadata/lookup tables