Distributed Data Flashcards

Question 1

Q

Read-after-write

Answer

A

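“Read-after-write” consistency (also called “read-your-writes”) is a strategy for working around the stale reads that eventual consistency allows in a distributed database: a client is guaranteed to see its own writes.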

Imagine you post something to a forum, refresh the page, and don’t see your post. That is eventual consistency at work: the refresh was served by a replica that had not yet received your write.

With read-after-write, when a client writes a record, the primary returns the record and its version number in the response. On its next read, the client passes that version number, and a secondary only replies if it holds that version or a newer one.
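
The version-number handshake above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not any particular database's API: the names Replica, Cluster, min_version, and replicate are hypothetical, and falling back to the primary when no secondary is fresh enough is an assumed policy (a real system might instead wait or retry).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory node holding versioned records."""
    data: dict = field(default_factory=dict)  # key -> (version, value)

    def apply(self, key, version, value):
        self.data[key] = (version, value)

    def read(self, key, min_version=0):
        version, value = self.data.get(key, (0, None))
        if version < min_version:
            return None  # too stale for this client: decline to answer
        return version, value


class Cluster:
    """One primary plus secondaries that lag until replicate() is called."""

    def __init__(self, n_secondaries=2):
        self.primary = Replica()
        self.secondaries = [Replica() for _ in range(n_secondaries)]
        self.next_version = 0

    def write(self, key, value):
        self.next_version += 1
        self.primary.apply(key, self.next_version, value)
        return self.next_version  # the client keeps this as its read token

    def replicate(self):
        # Stand-in for asynchronous replication catching up.
        for secondary in self.secondaries:
            secondary.data.update(self.primary.data)

    def read(self, key, min_version=0):
        for secondary in self.secondaries:
            result = secondary.read(key, min_version)
            if result is not None:
                return result
        # Assumed policy: no secondary is fresh enough, so ask the primary.
        return self.primary.read(key, min_version)
```

A client that calls write() and then passes the returned version as min_version to read() always gets its own post back, even before replicate() has run. A read without the token may still be served stale data by a lagging secondary.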

Question 2

Q

Reasons for SQL vs NoSQL

Sample data for NoSQL

Answer

A

SQL:

Structured data
Strict schema
Transactions
Complex joins
Clear patterns for scaling
More established: developers, community, code, etc.
Index lookups are very fast

NoSQL:

Semi-structured data
Dynamic or flexible schema
Non-relational data
No need for complex joins
Store many TB/PB of data
Very data-intensive workloads
Very high IOPS

Sample data for NoSQL:

Rapid ingest of clickstream or log data
Leaderboard / scoring data
Temporary data such as shopping cart
Frequently accessed (“hot”) tables
Metadata/lookup tables
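
To make “semi-structured” concrete, here is a small sketch of clickstream-style records. It assumes a plain Python dict as a stand-in for a key-value or document store; the field names (event_id, sku, qty, and so on) and the get_event helper are made up for illustration.

```python
import json

# Hypothetical clickstream events: same collection, different fields per record.
events = [
    {"event_id": "e1", "type": "page_view", "url": "/home", "ts": 1700000000},
    {"event_id": "e2", "type": "add_to_cart", "sku": "A-42", "qty": 2, "ts": 1700000005},
]

# A plain dict standing in for a key-value / document store:
# each record is serialized whole and stored under its primary key.
store = {event["event_id"]: json.dumps(event) for event in events}


def get_event(event_id):
    """Look up a record by primary key only; no joins are involved."""
    raw = store.get(event_id)
    return json.loads(raw) if raw is not None else None
```

The two events share no fixed schema beyond their key, and every access is a single key lookup. That is the access pattern the NoSQL list above describes.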

(2 cards)