concepts Flashcards
CAP theorem
Consistency
Availability
Partition tolerance
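Only two of the three can be guaranteed at the same time. Since network partitions cannot be ruled out in a distributed system, the practical choice during a partition is between consistency and availability.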
ACID properties
Atomicity
Consistency
Isolation
Durability
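Atomicity is the easiest of the four to see in code. The following is a minimal sketch using Python's built-in sqlite3 module; the accounts table, the transfer helper, and the CHECK constraint are assumptions made up for this illustration.

```python
import sqlite3

# In-memory database with a hypothetical accounts table.
# The CHECK constraint stands in for a consistency rule: no negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False
```

If alice holds 100 and a transfer of 150 is attempted, the credit to bob is rolled back together with the failed debit, so both balances stay as they were.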
File system distribution/replication
- Replicate at block level or file level
- GlusterFS is an example of a distributed FS
- A file's master copy can move to another location if the user travels
- Collaboration works like in version control systems
- GlusterFS, for example, gives the option to either stay consistent OR keep accepting writes during a partition (in the latter case there is a merge when connectivity is restored)
- Another nice middle ground is a three-replica setup: if 2 of the nodes can still talk to each other, their data takes precedence over the single isolated node’s data (if there is a conflict)
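The three-replica rule can be sketched as a majority check. This is only an illustration of the idea, not how GlusterFS implements it; the winning_value function and the way partitions are modelled (a set of node names plus the value that group agrees on) are assumptions.

```python
def winning_value(partitions, cluster_size=3):
    """Return the value held by the partition that has a majority of nodes.

    partitions: list of (set_of_node_names, value) pairs, one pair per group
    of nodes that can still talk to each other.
    Returns None when no group reaches a majority.
    """
    quorum = cluster_size // 2 + 1  # 2 out of 3
    for nodes, value in partitions:
        if len(nodes) >= quorum:
            return value
    return None  # nobody has a majority, so no side may win on its own


# Nodes a and b can still talk; node c is cut off with conflicting data.
partitions = [({"a", "b"}, "v2"), ({"c"}, "v1")]
print(winning_value(partitions))  # v2
```

With three fully isolated nodes no group reaches the quorum of 2, so the function returns None and the conflict has to be resolved some other way.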
What is data striping?
When a processing device needs data faster than a single storage device can deliver it, spread consecutive segments of the data across different physical devices and read them in parallel.
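A minimal sketch of striping, assuming round-robin placement and with in-memory lists standing in for physical devices. The stripe and read_striped names are hypothetical, and the thread pool only illustrates that the per-device reads are independent of each other.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe(data: bytes, num_devices: int, stripe_size: int):
    """Cut data into fixed-size segments and deal them round-robin onto devices."""
    devices = [[] for _ in range(num_devices)]
    for offset in range(0, len(data), stripe_size):
        index = (offset // stripe_size) % num_devices
        devices[index].append(data[offset:offset + stripe_size])
    return devices


def read_striped(devices):
    """Read all devices in parallel, then interleave segments in original order."""
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        # list() is a stand-in for a real, slow device read.
        per_device = list(pool.map(list, devices))
    segments = []
    for row in range(max(len(d) for d in per_device)):
        for device in per_device:
            if row < len(device):
                segments.append(device[row])
    return b"".join(segments)
```

For example, stripe(b"abcdefghij", 3, 2) places ab and gh on the first device, cd and ij on the second, and ef on the third; read_striped then reassembles the original bytes.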
How are throughput and latency connected?
Pipe analogy: the width of the pipe defines throughput, its length defines latency. A throughput limit indirectly increases latency because it adds ‘queueing delay’ (waiting to enter the pipe).
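The queueing effect can be made concrete with a deliberately simple model, assuming all requests arrive at the same instant and the pipe admits them one at a time at a fixed rate. The completion_times function and its parameters are made up for this sketch.

```python
def completion_times(num_requests, throughput_per_s, base_latency_s):
    """Completion time of each request when all of them arrive at t = 0.

    throughput_per_s: how many requests may enter the pipe per second (width).
    base_latency_s:   time one request spends inside the pipe (length).
    """
    entry_interval = 1.0 / throughput_per_s
    # Request i waits i * entry_interval at the entrance (queueing delay),
    # then spends base_latency_s travelling through the pipe.
    return [i * entry_interval + base_latency_s for i in range(num_requests)]


print(completion_times(4, throughput_per_s=4, base_latency_s=0.5))
# [0.5, 0.75, 1.0, 1.25]
```

The pipe's own latency is 0.5 s for every request, yet the last request sees 1.25 s; the extra 0.75 s comes purely from the throughput limit.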
Disaggregated compute
Split the storage and compute layers apart so that each can be scaled separately. Avoids over-provisioning one resource just to get more of the other, which happens when the two are tightly bound.
Erasure coded pools?
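Data storage technique in distributed environments. Instead of keeping full replicas, split the data into k data chunks plus m coded (parity) chunks, so that any k of the k+m chunks are enough to rebuild it. Survives up to m chunk failures with less storage overhead than replication. Supported by Ceph and by HDFS (3.x).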
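A minimal sketch of the simplest possible erasure code, assuming two data chunks and one XOR parity chunk. Real pools use stronger codes such as Reed-Solomon; the encode and recover names are hypothetical.

```python
def encode(data: bytes):
    """Split data into two equal data chunks plus one XOR parity chunk."""
    if len(data) % 2:
        # Pad to an even length; a real system would record the true length.
        data += b"\x00"
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    parity = bytes(a ^ b for a, b in zip(d0, d1))
    return [d0, d1, parity]


def recover(chunks):
    """Rebuild the data when at most one of the three chunks is missing (None)."""
    d0, d1, parity = chunks
    if d0 is None:
        d0 = bytes(a ^ b for a, b in zip(d1, parity))
    elif d1 is None:
        d1 = bytes(a ^ b for a, b in zip(d0, parity))
    return d0 + d1
```

The three chunks together take 1.5x the size of the data and tolerate the loss of any one chunk; plain replication needs 2x to tolerate the same single loss.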
BASE properties of a system?
For NoSQL-based systems: basically available, soft state, eventually consistent
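Shared-nothing architecture?
Each node has its own CPU, memory, and storage and shares none of them with other nodes. Scales out the furthest because there is no bottleneck on shared resources.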
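A toy sketch of the idea, assuming keys are hash-partitioned so that every key is owned by exactly one node with its own private store. The Node and Cluster classes are hypothetical, and a production system would more likely use consistent hashing so that adding a node moves fewer keys.

```python
import zlib


class Node:
    """One node with a private store; nothing here is shared with other nodes."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class Cluster:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]

    def owner(self, key: str) -> Node:
        # Deterministic hash routing: each key maps to exactly one node.
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

    def put(self, key: str, value):
        self.owner(key).store[key] = value

    def get(self, key: str):
        return self.owner(key).store.get(key)
```

Because a request for a key only ever touches the node that owns it, adding nodes adds capacity without adding contention on any common disk, lock, or memory.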