Data Management Patterns Flashcards

Question

What are the strategies for implementing the Index Table pattern?

Answer 1

* **Complete denormalization** - Duplicate the data in each index table and organize it by different keys
* **Create normalized index tables** organized by different **keys derived from the frequently accessed fields** and **reference original data by primary key** rather than duplicating it
* Create **partially normalized index tables** that are organized by different keys and **duplicate frequently accessed fields**
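
A minimal sketch of the normalized strategy, assuming an in-memory dictionary stands in for the data store. The `customers` data and the `build_town_index` and `find_by_town` helpers are hypothetical names used only for illustration.

```python
# Hypothetical primary store, keyed by customer ID (the primary key).
customers = {
    1: {"name": "Ada", "town": "Leeds"},
    2: {"name": "Bo", "town": "York"},
    3: {"name": "Cy", "town": "Leeds"},
}


def build_town_index(store):
    """Normalized index table: town -> list of primary keys (no data duplicated)."""
    index = {}
    for primary_key, record in store.items():
        index.setdefault(record["town"], []).append(primary_key)
    return index


def find_by_town(store, index, town):
    """Look up primary keys in the index, then fetch the records from the store."""
    return [store[primary_key] for primary_key in index.get(town, [])]


town_index = build_town_index(customers)
leeds_names = [c["name"] for c in find_by_town(customers, town_index, "Leeds")]
```

Each query costs two lookups (index, then store), which is the trade-off for not duplicating data.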

Answer 2

* Populate views over data in one or more data stores when the data isn't ideally formatted for required query operations

Answer 3

When storing data, the primary focus is on the format of the data, how to store it, and its integrity. Priority consideration isn't given to how the data will be read, and this negatively affects query performance

Answer 4

* Data is formatted to suit the result set of the query
* Contains only the data needed for the query
* Can include current values of data or values of calculated columns
* Is completely disposable and can be rebuilt from the data store
* Is never updated directly by an application, so it's a specialized cache
* When the source data changes, the view must be updated
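
A minimal sketch of these properties, assuming a list of order lines stands in for the data store. The `order_lines` data and `build_sales_view` are hypothetical. The view holds only what the query needs, includes a calculated value, and is regenerated from the source rather than edited in place.

```python
from collections import defaultdict

# Hypothetical source data: order lines as they might sit in the data store.
order_lines = [
    {"order_id": 1, "product": "pen", "quantity": 2, "unit_price": 1.50},
    {"order_id": 1, "product": "ink", "quantity": 1, "unit_price": 4.00},
    {"order_id": 2, "product": "pen", "quantity": 4, "unit_price": 1.50},
]


def build_sales_view(lines):
    """Build the view from scratch: total sales value per product."""
    totals = defaultdict(float)
    for line in lines:
        totals[line["product"]] += line["quantity"] * line["unit_price"]
    return dict(totals)


sales_view = build_sales_view(order_lines)

# The source changes, so the view is regenerated instead of being updated directly.
order_lines.append({"order_id": 3, "product": "ink", "quantity": 2, "unit_price": 4.00})
sales_view = build_sales_view(order_lines)
```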

Answer 5

* How and when to update the view
* A primary use case is with the Event Sourcing pattern, where the view can only be generated by replaying events
* Tends to be used where only a few queries exist
* With large numbers of queries, storage costs will increase
* Data in the view may be inconsistent with the data in the data store
* Data is transient and only used to improve query performance or scalability by reflecting the current state of the data
* The view can be rebuilt, so it can be stored in a less reliable location
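
A minimal sketch of the event sourcing case, assuming a simple list of account events. The event shapes and the `replay` function are hypothetical. The current state exists only as the result of replaying events, so the view is rebuilt by running the replay again.

```python
# Hypothetical event log: the only stored record of what happened.
events = [
    {"type": "deposited", "account": "acc-1", "amount": 100},
    {"type": "withdrawn", "account": "acc-1", "amount": 30},
    {"type": "deposited", "account": "acc-2", "amount": 50},
]

# How each assumed event type changes a balance.
SIGNS = {"deposited": 1, "withdrawn": -1}


def replay(event_log):
    """Generate the current-balance view by replaying every event in order."""
    balances = {}
    for event in event_log:
        sign = SIGNS[event["type"]]  # raises KeyError for unknown event types
        account = event["account"]
        balances[account] = balances.get(account, 0) + sign * event["amount"]
    return balances


balance_view = replay(events)
```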

Answer 6

* Provides **view** for data that is **difficult to query** directly
* Where **improved** query **performance** is needed
* **Connection** to data store **isn't always** available
* **Abstract** away how data is **stored** where there are **different** sources of **data** needed **to be combined** to retrieve **relevant** info
* Provide **access** to **subsets of data** that **shouldn't** be **readily accessible** due to privacy or security concerns
* **Bridge** data **stores** to take **advantage** of their unique **capabilities**
* Provide **consolidated views** from data retrieved from **different** **microservices**

Answer 7

* Data source is **simple** and easy **to query**
* Data **changes** very quickly
* Data **consistency** between source and view is **high priority**

Answer 8

Dividing a data store into a set of horizontal partitions [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 9

* When there's a need to avoid the vertical upper bound limits of a single storage node by scaling out **horizontally**
* When there's a need to split data across multiple **data stores** to avoid hitting the upper bound on how much data a single data store can hold
* When large numbers of concurrent users need access and the **network and computing resources** of a single data store are limited
* When data compliance regulations may require storing information in the region where the data was entered
* When there's a need to reduce latency to data that's accessed from different regions [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 10

* When a data store will likely need to scale beyond the resources available to a single storage node
* When you need to improve performance by reducing contention in a data store
* A byproduct is higher availability due to data being separated into separate partitions [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 11

* Subsets of data are partitioned across multiple shards.
* Each Shard represents a data store.
* All Shard data stores have the same schema [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 12

* The data contains items that fall within a specified range determined by attributes of the data [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 13

* It's a static value, derived from attributes of the data, that identifies the range of data stored in the data store associated with the Shard. [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 14

* Range
* Lookup
* Hash - logic implements a hash based on attributes of the data to route data requests to the appropriate Shard [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 15

* Logic implements a hash based on attributes of the data to route data requests to the appropriate Shard [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source
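
A minimal sketch of hash-based routing, assuming a fixed number of shards and a string shard key. `SHARD_COUNT` and `shard_for` are hypothetical names. A stable hash from `hashlib` is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib

SHARD_COUNT = 4  # assumed number of shards, for illustration only


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Route a shard key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a shard index.
    return int.from_bytes(digest[:8], "big") % shard_count


shard_index = shard_for("tenant-55")
```

No map is needed to find the Shard, but changing `shard_count` moves most keys to a different Shard, which is why rebalancing is difficult with this strategy.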

Answer 16

* Logic implements a map that routes a request for data to the Shard that contains the data [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source
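
A minimal sketch of the map, assuming tenant IDs are the shard keys and shards are identified by name. `shard_map` and `shard_for_tenant` are hypothetical. A real deployment would likely keep the map in a shared, cacheable store rather than in a module-level dictionary.

```python
# Hypothetical shard map: shard key (tenant ID) -> name of the Shard holding its data.
shard_map = {
    "tenant-a": "shard-1",
    "tenant-b": "shard-2",
    "tenant-c": "shard-1",
}


def shard_for_tenant(tenant_id, mapping=shard_map):
    """Return the Shard that holds the tenant's data, according to the map."""
    try:
        return mapping[tenant_id]
    except KeyError:
        raise LookupError(f"no shard registered for {tenant_id!r}") from None


# Rebalancing: after moving tenant-c's data, only the map entry has to change.
shard_map["tenant-c"] = "shard-3"
```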

Answer 17

* More control over the way Shards are configured and used
* Impact is reduced when rebalancing data. Can add physical partitions to even out the workload [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 18

* Looking up Shard locations can impose an additional overhead
* Requires state to be highly cacheable and replica friendly [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 19

* Related items are grouped together and ordered by Shard key [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source
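
A minimal sketch of range-based routing, assuming the shard key is a `YYYY-MM` string and each Shard holds every key up to and including an upper bound. `UPPER_BOUNDS`, `SHARDS`, and `shard_for_month` are hypothetical.

```python
import bisect

# Assumed layout: sorted inclusive upper bounds, one per Shard, in the same order.
UPPER_BOUNDS = ["2023-12", "2024-06", "2024-12"]
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for_month(month: str) -> str:
    """Find the first Shard whose upper bound is greater than or equal to the key."""
    position = bisect.bisect_left(UPPER_BOUNDS, month)
    if position == len(SHARDS):
        raise ValueError(f"no shard covers {month!r}")
    return SHARDS[position]


spring_shard = shard_for_month("2024-03")
```

Because neighbouring keys land on the same Shard, a query over a range of months touches few Shards, which is why this strategy works well with range queries.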

Answer 20

* Easy to implement
* Works well with range queries [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 21

* Optimal balancing across Shards is hard to achieve
* Shards are hard to rebalance, and rebalancing may not solve performance problems
* Data movement or scaling must be done while all or part of the data is offline
* State may have to be maintained that maps ranges to partitions [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 22

* Offers a better chance of distributing load and data evenly
* Routing can be achieved directly using the hash. Don't need a map [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 23

* Difficult to rebalance Shards
* Computing hash may impose additional overhead [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 24

* It's difficult to maintain referential integrity and consistency across Shards [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding) to source

Answer 25

* Deploy static content to a 3rd party cloud-based storage service [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source

Answer 26

* Allows you to offload the server from having to render static content.
* Cloud storage is much cheaper than compute resources [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source

Answer 27

* Deployment of the application
* Securing access to data that isn't meant to be available for anonymous user usage [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source

Answer 28

* Use a CDN (content delivery network) to cache the storage contents in data centers around the world
* Securing access to data that isn't meant to be available for anonymous user usage [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source

Answer 29

* Use a URL as opposed to an IP address. The IP address may change due to availability of Storage [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source

Answer 30

* You might have to perform separate deployments
* You might have to version the content to manage it more easily. This is especially important when scripts are involved [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source

Answer 31

* Storage may not support custom domain names
* Application maintenance is more difficult if custom domain names aren't supported, because you'll need to provide the full URL (which will be in a different domain) to the resource.
* Storage must be configured for public read access, and it's vital to ensure that write access is disabled to prevent unauthorized uploads. [link](https://docs.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) to source


(55 cards)