Select the appropriate storage option Flashcards
Overview
1) Understanding data access patterns
2) Selecting a data storage solution
3) Evaluating data storage qualities
Read/write patterns
1) Tradeoffs between performance and consistency often need to be made so that these readers and writers can work together in harmony.
2) Depending on the number of readers and writers, the frequency of data operations, and concurrency requirements, you’ll need different strategies.
Immediate Consistency
1) After an entity is updated, all observers of this entity will have a consistent view of the updated entity.
2) to ensure strong consistency, some sort of locking method needs to be used, preventing the application from accessing the data for the duration of data updates.
3) when there are multiple replicas of data, maintaining consistency across the replicas needs extra processing cycles before the update is available for reading across all of the replicas.
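The locking and replica points above can be sketched in a few lines. This is a toy in-memory model, not any real database's API: the class name StronglyConsistentStore and its methods are made up for illustration, and the "replicas" are plain dictionaries in one process.

```python
import threading


class StronglyConsistentStore:
    """Toy store (hypothetical): a write returns only after every replica is updated."""

    def __init__(self, replica_count=3):
        self._replicas = [{} for _ in range(replica_count)]
        self._lock = threading.Lock()

    def write(self, key, value):
        # Readers are blocked while the lock is held, so nobody sees a
        # half-applied update. The loop is the extra work that multiple
        # replicas add to each write.
        with self._lock:
            for replica in self._replicas:
                replica[key] = value

    def read(self, key, replica_index=0):
        with self._lock:
            return self._replicas[replica_index].get(key)


store = StronglyConsistentStore()
store.write("price", 10)
values = [store.read("price", i) for i in range(3)]
```

Every replica returns the same value as soon as write returns, at the cost of holding the lock for the whole update.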
Eventual Consistency
1) Allows certain replicas of data to be updated ahead of others; updates are eventually propagated across the entire replica set.
2) Propagation is very fast in some systems, such as a well-connected Redis cluster.
3) Propagation takes longer in others, such as DNS.
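A minimal sketch of the same idea, assuming an in-memory model in which one replica accepts the write and the others catch up later. The names EventuallyConsistentStore and propagate are hypothetical; real systems propagate in the background rather than on an explicit call.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy store (hypothetical): replica 0 is updated first, the rest later."""

    def __init__(self, replica_count=3):
        self._replicas = [{} for _ in range(replica_count)]
        self._pending = deque()  # updates not yet applied to other replicas

    def write(self, key, value):
        # The write is acknowledged after a single replica has it.
        self._replicas[0][key] = value
        for index in range(1, len(self._replicas)):
            self._pending.append((index, key, value))

    def read(self, key, replica_index):
        return self._replicas[replica_index].get(key)

    def propagate(self):
        # Stand-in for background replication.
        while self._pending:
            index, key, value = self._pending.popleft()
            self._replicas[index][key] = value


store = EventuallyConsistentStore()
store.write("price", 10)
stale = store.read("price", 2)   # replica 2 has not seen the update yet
store.propagate()
fresh = store.read("price", 2)   # now it has
```

Between write and propagate, a reader on a lagging replica sees old data; how long that window lasts is what differs between systems.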
Pessimistic concurrency
1) Holds a pessimistic view of possible update conflicts: it assumes they are likely.
2) So, it asks the database to lock the item during its update operations.
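As a rough illustration, the lock-during-update idea looks like this with an in-process lock standing in for a database lock. LockedAccount is a hypothetical name, and a real database would lock a row or document rather than a Python object.

```python
import threading


class LockedAccount:
    """Toy item (hypothetical) that is locked for the duration of each update."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # Only one writer at a time can run the read-modify-write sequence.
        with self._lock:
            current = self.balance
            self.balance = current + amount


account = LockedAccount(0)
threads = [threading.Thread(target=account.deposit, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No update is lost, but writers queue up behind the lock, which is the price of the pessimistic approach.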
Optimistic concurrency
1) considers the probability of conflicts being relatively low.
2) When it reads data for update, it keeps a revision number of the original data but it doesn’t hold a lock.
3) Then, as it writes back, the update succeeds only if the latest revision number in the database is still the same as the one it read.
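The revision-number check can be sketched as follows. VersionedStore and ConflictError are made-up names, and the sketch is single-threaded; a real store would perform the compare and the write as one atomic operation.

```python
class ConflictError(Exception):
    """Raised when the stored revision no longer matches the one that was read."""


class VersionedStore:
    """Toy store (hypothetical) mapping key -> (revision, value)."""

    def __init__(self):
        self._items = {}

    def read(self, key):
        # Returns (revision, value); revision 0 means "not stored yet".
        return self._items.get(key, (0, None))

    def write(self, key, value, expected_revision):
        current_revision, _ = self._items.get(key, (0, None))
        if current_revision != expected_revision:
            raise ConflictError(f"{key}: expected {expected_revision}, found {current_revision}")
        self._items[key] = (current_revision + 1, value)
        return current_revision + 1


store = VersionedStore()
store.write("cart", ["book"], expected_revision=0)

# Two clients read the same revision without taking a lock.
rev_a, cart_a = store.read("cart")
rev_b, cart_b = store.read("cart")

store.write("cart", cart_a + ["pen"], expected_revision=rev_a)  # succeeds
try:
    store.write("cart", cart_b + ["mug"], expected_revision=rev_b)  # stale revision
    conflict = False
except ConflictError:
    conflict = True
```

The second client's write fails because the revision moved on; it would typically re-read and retry.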
Application type and concurrency
1) Large number of writes: use pessimistic concurrency, because optimistic concurrency will lead to many failed operations.
2) Mostly reads: you can use optimistic concurrency.
3) A simpler method: last write wins.
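Last-write-wins can be sketched as a merge rule on timestamped values. The function name, the (timestamp, value) tuple layout, and the tie-breaking rule (the incoming write wins on equal timestamps) are assumptions made for this example.

```python
def last_write_wins(current, incoming):
    """Return whichever (timestamp, value) pair carries the later timestamp."""
    return incoming if incoming[0] >= current[0] else current


stored = (100, "draft A")

# A later write simply replaces the stored value, with no conflict check.
stored = last_write_wins(stored, (105, "draft B"))

# A write that arrives late but carries an older timestamp is discarded.
stored = last_write_wins(stored, (102, "draft C"))
```

There are no locks and no failed operations, but the discarded write is silently lost, so this only suits data where that is acceptable.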
Dynamic schema, or schema-less, or NoSQL
1) Don’t have an enforced schema.
2) Data is saved in the database as key–value pairs.
3) Complex queries are often difficult on such databases.
4) Most data retrievals are done by providing specific keys or key ranges.
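Retrieval by key and by key range can be illustrated with a small in-memory store that keeps its keys sorted. SortedKeyValueStore and the key naming scheme are hypothetical; the range is half-open (start included, end excluded) by choice.

```python
import bisect


class SortedKeyValueStore:
    """Toy key-value store (hypothetical) that supports key-range scans."""

    def __init__(self):
        self._keys = []     # kept sorted so ranges are cheap to find
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def range(self, start, end):
        # All pairs with start <= key < end, in key order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = SortedKeyValueStore()
store.put("user:002", {"name": "Bob"})
store.put("order:001", {"total": 30})
store.put("user:001", {"name": "Ada"})
store.put("user:010", {"name": "Cy"})

one = store.get("user:002")
some_users = store.range("user:001", "user:010")
```

Both lookups are driven purely by the key; asking "which users are named Ada" would require scanning every value, which is why complex queries are hard here.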
To enable complex queries on NoSQL databases
1) use advanced indexing techniques such as full-text indexing.
2) In some cases, schemas can be inferred from data to support complex queries.
3) For example, you can infer a schema from JSON documents by treating each property as a field.
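A minimal sketch of inferring a schema from JSON documents, treating each top-level property as a field. The function name infer_schema and the sample documents are made up for illustration; nested objects are not expanded here.

```python
import json


def infer_schema(documents):
    """Map each top-level property name to the set of value types seen for it."""
    schema = {}
    for raw in documents:
        doc = json.loads(raw)
        for name, value in doc.items():
            schema.setdefault(name, set()).add(type(value).__name__)
    return schema


docs = [
    '{"name": "Ada", "age": 36}',
    '{"name": "Bob", "email": "bob@example.com"}',
]
schema = infer_schema(docs)
```

With the fields known, a query layer could build an index per field even though the store itself never enforced a schema.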
Combination of data stores
1) SQL Database for your transactional data,
2) Azure Blob for large binary files,
3) DocumentDB for loosely structured data, and
4) Azure Search for indexing free-text files.
Selecting data store solutions
1) Combination of data stores
2) Keep data close to compute
3) Cost matters
Actor Pattern, or Actor Model
1) A good example of an entity keeping its state close to itself. An actor can access and update its state quickly because its state is kept local. There are no external service calls or additional network hops to go through. Moreover, because an actor is the only writer to its own state, you can achieve maximum parallelism without locking.
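The single-writer idea can be sketched with a thread and a mailbox queue. CounterActor and its message names are hypothetical, and this is not a particular actor framework; it only shows that state touched by one thread needs no lock.

```python
import queue
import threading


class CounterActor:
    """Toy actor (hypothetical): only its own thread ever touches _count."""

    def __init__(self):
        self._count = 0                 # local state, never shared directly
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        # Messages are handled one at a time, so no lock guards _count.
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            kind, reply = message
            if kind == "increment":
                self._count += 1
            elif kind == "get":
                reply.put(self._count)

    def send(self, kind):
        self._mailbox.put((kind, None))

    def ask(self, kind):
        reply = queue.Queue()
        self._mailbox.put((kind, reply))
        return reply.get()

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()


actor = CounterActor()
senders = [threading.Thread(target=actor.send, args=("increment",)) for _ in range(50)]
for s in senders:
    s.start()
for s in senders:
    s.join()
total = actor.ask("get")
actor.stop()
```

Many senders run in parallel, yet the count is exact because every update is applied by the actor itself.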
Reduce storage cost
1) Layered storage and periodic data reduction can help you control your storage costs.
2) Reduce data size by using compression and periodic trimming.
3) Layered storage keeps “hot” data in a more capable storage to provide rich interactions, such as complex queries and faster accesses, whereas it keeps “cold” data in cheaper storage to reduce overall costs.
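A rough sketch of hot and cold tiers with compression, assuming a tiny in-memory model. TieredStore, its capacity rule (the oldest hot entry is demoted first), and the use of zlib are choices made for this example, not a description of any specific service.

```python
import zlib


class TieredStore:
    """Toy two-tier store (hypothetical): recent items hot, older items compressed."""

    def __init__(self, hot_capacity=2):
        self._hot = {}    # plain text, fast to use; dict keeps insertion order
        self._cold = {}   # zlib-compressed bytes, cheaper to keep
        self._hot_capacity = hot_capacity

    def put(self, key, text):
        self._hot.pop(key, None)
        self._cold.pop(key, None)
        self._hot[key] = text
        # Demote the oldest hot entries once the hot tier is full.
        while len(self._hot) > self._hot_capacity:
            oldest = next(iter(self._hot))
            self._cold[oldest] = zlib.compress(self._hot.pop(oldest).encode("utf-8"))

    def get(self, key):
        if key in self._hot:
            return self._hot[key]
        if key in self._cold:
            return zlib.decompress(self._cold[key]).decode("utf-8")
        return None

    def tier_of(self, key):
        if key in self._hot:
            return "hot"
        return "cold" if key in self._cold else None

    def cold_size(self, key):
        return len(self._cold[key])


store = TieredStore(hot_capacity=2)
report = "status=ok;" * 100          # 1000 characters of repetitive text
store.put("jan", report)
store.put("feb", report)
store.put("mar", report)             # pushes "jan" into the cold tier
```

The cold copy of "jan" is still readable through get, but it occupies far fewer bytes than the original text.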
Evaluating data storage qualities
1) Most applications require data stores to be scalable and reliable
Reliability
1) use replicas
2) keeps three copies of your data automatically (third-party data stores such as Redis, MongoDB, and MySQL).