Select the appropriate storage option Flashcards

1
Q

Overview

A

1) Understanding data access patterns
2) Selecting a data storage solution
3) Evaluating data storage qualities

2
Q

Read/write patterns

A

1) Tradeoffs between performance and consistency often need to be made so that readers and writers of the same data can work together in harmony.
2) Depending on the number of readers and writers, the frequency of data operations, and concurrency requirements, you’ll need different strategies.

3
Q

Immediate Consistency

A

1) All observers of an entity have a consistent view of the updated entity.
2) To ensure strong consistency, some sort of locking method needs to be used, preventing the application from accessing the data for the duration of data updates.
3) When there are multiple replicas of data, maintaining consistency across the replicas needs extra processing cycles before the update is available for reading across all of the replicas.

4
Q

Eventual Consistency

A

1) Allows certain replicas of data to be updated ahead of others; updates are eventually propagated across the entire replica set.
2) Propagation is very fast in some systems, such as a well-connected Redis cluster.
3) Propagation takes longer in others, such as DNS.
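
The propagation behavior described above can be sketched with a toy replica set. This is only an illustration under stated assumptions: `ReplicaSet`, `write`, `read`, and `propagate` are hypothetical names, the replicas are plain dictionaries, and propagation is triggered by hand rather than by a real replication protocol.

```python
class ReplicaSet:
    """Toy model: replica 0 is updated first, the others catch up later."""

    def __init__(self, count):
        self.replicas = [{} for _ in range(count)]
        self._pending = []  # queued (replica_index, key, value) updates

    def write(self, key, value):
        # The first replica sees the update immediately.
        self.replicas[0][key] = value
        # The remaining replicas only get a queued update for now.
        for index in range(1, len(self.replicas)):
            self._pending.append((index, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].get(key)

    def propagate(self):
        # Apply queued updates in order; afterwards all replicas agree.
        for index, key, value in self._pending:
            self.replicas[index][key] = value
        self._pending.clear()
```

Between `write` and `propagate`, a reader that hits replica 1 still sees the old value; how long that window lasts is what differs between systems.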

5
Q

Pessimistic concurrency

A

1) holds a quite pessimistic view of possible update conflicts.
2) So, it asks the database to lock the item during its update operations.
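
A minimal sketch of the locking idea, assuming an in-memory item in place of a database row. `LockedItem` and `run_writers` are hypothetical names used only for this illustration; a real database would provide its own locking mechanism.

```python
import threading


class LockedItem:
    """Hypothetical in-memory item; every update holds the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Acquire the lock before reading and keep it until the write is done,
        # so no other writer can interleave with this read-modify-write.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_writers(item, writers=4, updates_each=1000):
    """Start several writer threads and wait for all of them to finish."""

    def work():
        for _ in range(updates_each):
            item.increment()

    threads = [threading.Thread(target=work) for _ in range(writers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return item.value
```

Because each writer waits for the lock, no update is lost, at the cost of writers blocking one another.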

6
Q

Optimistic concurrency

A

1) considers the probability of conflicts being relatively low.
2) When it reads data for update, it keeps a revision number of the original data but it doesn’t hold a lock.
3) Then, as it writes back, as long as the latest revision number on the database is still the same as it has read, the update will succeed.
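
The revision check described above might look like the following sketch. `VersionedStore` and `ConflictError` are hypothetical names, and the store is a plain dictionary standing in for a database that tracks a revision number per item.

```python
class ConflictError(Exception):
    """Raised when the stored revision no longer matches the one that was read."""


class VersionedStore:
    def __init__(self):
        self._items = {}  # key -> (revision, value)

    def read(self, key):
        # Returns (revision, value); no lock is taken or held.
        return self._items.get(key, (0, None))

    def write(self, key, value, expected_revision):
        current_revision, _ = self._items.get(key, (0, None))
        # The update succeeds only if nobody else wrote since our read.
        if current_revision != expected_revision:
            raise ConflictError(
                f"expected revision {expected_revision}, found {current_revision}"
            )
        new_revision = current_revision + 1
        self._items[key] = (new_revision, value)
        return new_revision
```

A writer that receives `ConflictError` would typically re-read the item and retry, which is why this approach suits workloads where conflicts are rare.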

7
Q

Application type and concurrency

A

1) Large number of writes: use pessimistic concurrency, because optimistic concurrency will lead to many failed operations.
2) Mostly reads: you can use optimistic concurrency.
3) Simpler method: last write wins.

8
Q

Dynamic schema, or schema-less, or NoSQL

A

1) don’t have enforced schema
2) Data is saved in the database as key–value pairs.
3) Complex queries are often difficult on such databases.
4) Most data retrievals are done by providing specific keys or key ranges.

9
Q

To enable complex queries on NoSQL databases

A

1) use advanced indexing techniques such as full-text indexing.
2) In some cases, schemas can be inferred from data to support complex queries.
3) For example, you can infer a schema from JSON documents by treating each property as a field.
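
As a rough illustration of treating each JSON property as a field, the sketch below collects top-level property names and the Python types seen for them. `infer_schema` is a hypothetical helper, not part of any database product, and it ignores nested documents.

```python
import json


def infer_schema(documents):
    """Map each top-level property name to the set of value types observed."""
    schema = {}
    for raw in documents:
        document = json.loads(raw)
        for name, value in document.items():
            # Each property becomes a field; record every type it has taken.
            schema.setdefault(name, set()).add(type(value).__name__)
    return schema
```

Fields that appear in only some documents, or with more than one type, show up directly in the result, which is the information a query layer would need before offering them as queryable columns.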

10
Q

Combination of data stores

A

1) SQL Database for your transactional data,
2) Azure Blob for large binary files,
3) DocumentDB for loosely structured data, and
4) Azure Search for indexing free-text files.

11
Q

Selecting data storage solutions

A

1) Combination of data stores
2) Keep data close to compute
3) Cost matters

12
Q

Actor Pattern, or Actor Model

A

An actor is a good example of an entity keeping its state close to itself. An actor can access and update its state quickly because that state is kept local. There are no external service calls or additional network hops to go through. Moreover, because an actor is the only writer to its own state, you can achieve maximum parallelism without locking.
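
A minimal sketch of the single-writer idea, assuming a thread with a mailbox stands in for an actor. `CounterActor`, `send`, and `stop` are hypothetical names for this illustration and do not come from any actor framework.

```python
import queue
import threading


class CounterActor:
    """Toy actor: only its own thread ever touches self._count."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # local state, never written by other threads
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        # Process messages one at a time; no lock around the state is needed
        # because this loop is the only writer.
        while True:
            message = self._mailbox.get()
            if message is None:  # sentinel that asks the actor to stop
                break
            self._count += message

    def send(self, amount):
        self._mailbox.put(amount)

    def stop(self):
        # Messages queued before the sentinel are processed first.
        self._mailbox.put(None)
        self._thread.join()
        return self._count
```

Callers never read or write the count directly; they only post messages, so many senders can run in parallel without coordinating with each other.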

13
Q

Reduce storage cost

A

1) Layered storage and periodic data reduction can help you control your storage costs.
2) reduce data size by using compression and periodic trimming
3) Layered storage keeps “hot” data in a more capable storage to provide rich interactions, such as complex queries and faster accesses, whereas it keeps “cold” data in cheaper storage to reduce overall costs
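
The hot/cold split and compression can be sketched together as follows. `TieredStore` and its methods are hypothetical, both tiers are in-memory dictionaries, and `zlib` stands in for whatever compression the cheaper storage tier would actually use.

```python
import zlib


class TieredStore:
    """Toy layered store: hot data uncompressed, cold data compressed."""

    def __init__(self):
        self.hot = {}   # key -> raw bytes, fast to access
        self.cold = {}  # key -> zlib-compressed bytes, cheaper to keep

    def put(self, key, data):
        self.hot[key] = data

    def demote(self, key):
        # Move an item from the hot tier to the cold tier, compressing it.
        self.cold[key] = zlib.compress(self.hot.pop(key))

    def get(self, key):
        if key in self.hot:
            return self.hot[key]
        # Cold reads pay the decompression cost.
        return zlib.decompress(self.cold[key])
```

A periodic job could call `demote` for items that have not been read recently, which is one way to combine layering with periodic data reduction.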

14
Q

Evaluating data storage qualities

A

1) Most applications require data stores to be scalable and reliable

15
Q

Reliability

A

1) use replicas

2) keeps three copies of your data automatically (third-party data stores such as Redis, MongoDB, and MySQL)

16
Q

Scalability

A

1) data sharding
2) Shard map management (SMM)
3) Data dependent routing (DDR)
4) Multishard query (MSQ)

17
Q

Shard map management (SMM)

A

Defines groups of shards for your application and manages mapping of routing keys to shards. With Elastic Scale, you can dynamically reallocate tenants to different shards as loads from particular tenants change. You can monitor the performance of each shard and split busy shards, or merge idle shards dynamically as needed.

18
Q

Data dependent routing (DDR)

A

Routes incoming requests to the correct shard (such as routing by tenant ID) and ensures correct routing as tenants move. DDR makes writing queries against sharded databases easy.

19
Q

DDR usage

A

DDR helps you to route the query to the appropriate shard where the tenant data resides. To avoid repetitive queries to SMM, the Elastic Scale client library also provides a route map cache so that the client application doesn’t need to continuously utilize SMM to look up shards.
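
The interplay between a shard map, routing, and a client-side cache can be sketched as below. This is not the Elastic Scale client library API: `ShardMap`, `Router`, and their methods are hypothetical stand-ins written only to show the idea, and the `lookups` counter exists just to make the cache's effect visible.

```python
class ShardMap:
    """Stand-in for shard map management: routing key -> shard name."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)
        self.lookups = 0  # counts how often the map is consulted

    def lookup(self, tenant_id):
        self.lookups += 1
        return self._mapping[tenant_id]

    def move(self, tenant_id, shard):
        # Reallocate a tenant to a different shard.
        self._mapping[tenant_id] = shard


class Router:
    """Stand-in for data dependent routing with a route map cache."""

    def __init__(self, shard_map):
        self._shard_map = shard_map
        self._cache = {}

    def shard_for(self, tenant_id):
        # Consult the shard map only on a cache miss.
        if tenant_id not in self._cache:
            self._cache[tenant_id] = self._shard_map.lookup(tenant_id)
        return self._cache[tenant_id]

    def invalidate(self, tenant_id):
        # Drop a cached route, for example after a tenant has moved.
        self._cache.pop(tenant_id, None)
```

In this sketch the caller has to invalidate a stale route explicitly after a move; how a real client library detects and refreshes stale routes is outside what the sketch assumes.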

20
Q

Multishard query (MSQ)

A

Interactively processes data across multiple shards. For example, you can execute the same statement on all shards and get results in a T-SQL UNION ALL semantic.
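
To illustrate running one statement on every shard and concatenating the rows, the sketch below uses in-memory SQLite databases as stand-in shards. The `orders` table, `make_shard`, and `multishard_query` are assumptions made for the example; the real feature runs against SQL Database shards, not SQLite.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory database that plays the role of one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (tenant_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def multishard_query(shards, sql, params=()):
    """Run the same statement on each shard and append all result rows.

    Rows are concatenated without de-duplication, mirroring UNION ALL.
    """
    results = []
    for conn in shards:
        results.extend(conn.execute(sql, params).fetchall())
    return results
```

Calling `multishard_query(shards, "SELECT tenant_id, total FROM orders WHERE total >= ?", (10,))` returns matching rows from every shard in a single list, in shard order.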

21
Q

Adopting Multitenancy

A

Adapting a single-tenant application for multitenancy is a serious architectural change that should not be taken lightly. Such a change often needs modifications across all application layers, so it’s a high-risk change. More importantly, multitenancy often has profound impacts on business processes, such as sales strategy, support workflow, and version management.