Explore Cosmos DB Flashcards

1
Q

What is Cosmos DB?

A
  • Fully managed NoSQL database designed to provide low latency, elastic scalability of throughput, well-defined semantics for data consistency, and high availability
2
Q

What are the benefits of Cosmos DB global distribution?

A
  • can achieve low latency by placing data in the region closest to users
  • can add or remove regions associated with an account at any time
  • the app doesn’t need to be paused or redeployed to add or remove a region
  • with multi-region writes enabled, every region supports both reads and writes, with 99.999% read and write availability
  • reads and writes are guaranteed to be served in less than 10 ms at the 99th percentile
  • Cosmos DB internally handles data replication between regions, with consistency-level guarantees
  • if one region goes down, the others pick up the load
3
Q

What is a Cosmos DB account?

A
  • Fundamental unit of global distribution and high availability
  • has a unique DNS name
  • managed via the Azure portal, the Azure CLI, or the SDKs
  • can add or remove regions to your account at any time
  • can create up to 50 accounts under a subscription by default (a soft limit that can be raised via a support request)
4
Q

What is a Cosmos DB container?

A
  • fundamental unit of scalability
  • you can have virtually unlimited provisioned throughput (RU/s) and storage on a container
  • Cosmos DB transparently partitions the container using the logical partition key that you specify, to scale your provisioned throughput and storage elastically
  • a container is a schema-agnostic container of items
5
Q

What is the Cosmos DB hierarchy?

A
  • Accounts -> databases -> containers -> items (containers also hold stored procedures, user-defined functions, triggers, etc.)
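The hierarchy can be pictured as nested objects. The following is a minimal sketch in plain Python, not the Cosmos DB SDK; the class names, field names, and sample data (`demo-account`, `shop`, `products`) are all hypothetical and only illustrate how each level contains the next.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Container:
    name: str
    partition_key_path: str  # hypothetical example path, e.g. "/category"
    items: List[dict] = field(default_factory=list)


@dataclass
class Database:
    name: str
    containers: Dict[str, Container] = field(default_factory=dict)


@dataclass
class Account:
    name: str
    databases: Dict[str, Database] = field(default_factory=dict)


# Build account -> database -> container -> items.
account = Account("demo-account")
db = account.databases.setdefault("shop", Database("shop"))
products = db.containers.setdefault("products", Container("products", "/category"))

# Items in the same container may have different shapes (schema-agnostic).
products.items.append({"id": "1", "category": "books", "title": "Dune"})
products.items.append({"id": "2", "category": "games", "players": 4})
```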
6
Q

What is the definition of a Cosmos DB database?

A
  • unit of management for a set of Azure Cosmos DB containers
7
Q

How is a container partitioned?

A
  • Horizontally partitioned and then replicated across multiple regions
  • items you add to it are automatically grouped into logical partitions, which are distributed across physical partitions based on the partition key
  • throughput is evenly distributed across physical partitions
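The grouping described above can be illustrated with a toy model. This is a sketch under stated assumptions: the SHA-256-modulo placement is a stand-in chosen for determinism, not the hashing scheme the service uses, and all function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_into_logical_partitions(items, partition_key):
    """Group items by partition key value (one logical partition per value)."""
    logical = defaultdict(list)
    for item in items:
        logical[item[partition_key]].append(item)
    return dict(logical)


def physical_partition_for(partition_key_value, physical_partition_count):
    """Map a logical partition to a physical one (illustrative hash only)."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % physical_partition_count


def throughput_per_physical_partition(provisioned_ru_per_s, physical_partition_count):
    """Provisioned throughput is split evenly across physical partitions."""
    return provisioned_ru_per_s / physical_partition_count


items = [
    {"id": "1", "city": "Oslo"},
    {"id": "2", "city": "Lima"},
    {"id": "3", "city": "Oslo"},
]
logical = group_into_logical_partitions(items, "city")
placement = {key: physical_partition_for(key, 4) for key in logical}
```

Because the split is even, a partition key that concentrates traffic on one value can exhaust that physical partition's share while the others sit idle.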
8
Q

How is throughput on a container configured?

A
  • Dedicated provisioned throughput mode: the throughput provisioned on a container is exclusively reserved for that container and is backed by the SLAs
  • Shared provisioned throughput mode: containers share the provisioned throughput with other containers in the same database
9
Q

What is a Cosmos DB item?

A
  • depending on which API you use, an item can be a document in a collection, a row in a table, or a node or edge in a graph
  • can have arbitrary schemas
  • by default, all items that you add to a container are automatically indexed without requiring explicit index or schema management
10
Q

How does Cosmos DB approach data consistency?

A
  • As a spectrum of choices
  • strong consistency and eventual consistency are at the ends of the spectrum
  • the further you move away from strong consistency, the higher the availability and throughput and the lower the latency
  • consistency levels are region-agnostic
  • Cosmos DB guarantees that 100% of read requests meet the consistency guarantee for the chosen consistency level
11
Q

What are the levels of data consistency?

A
  • strong
  • bounded staleness
  • session
  • consistent prefix
  • eventual
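Because the five levels form an ordered spectrum, they can be sketched as an ordered enumeration. This is an illustrative model only; the `ConsistencyLevel` class and `is_weaker` helper are hypothetical names, not SDK types.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    # Ordered from strongest (0) to weakest (4).
    STRONG = 0
    BOUNDED_STALENESS = 1
    SESSION = 2
    CONSISTENT_PREFIX = 3
    EVENTUAL = 4


def is_weaker(a, b):
    """Return True if level `a` sits further from STRONG than level `b`."""
    return a > b


# Levels sorted from the weakest end of the spectrum to the strongest.
weakest_first = sorted(ConsistencyLevel, reverse=True)
```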
12
Q

How can consistency models be used?

A
  • each one can be used for specific real-world scenarios
  • you can configure the default consistency level on your Azure Cosmos DB account at any time
  • it applies to all Cosmos DB databases and containers under that account
13
Q

What is the strong consistency level of data consistency?

A
  • offers a linearizability guarantee (linearizability refers to serving requests concurrently)
  • reads are guaranteed to return the most recent committed version of an item
  • client never sees an uncommitted or partial write
14
Q

What is the bounded staleness level of data consistency?

A
  • reads are guaranteed to honour the consistent-prefix guarantee
  • reads might lag behind writes by at most X versions (updates) of an item or by a time interval Y, whichever is reached first
  • X and Y are the configurable staleness bounds
  • for a single-region account, the minimum values of X and Y are 10 write operations or 5 seconds
  • for a multi-region account, the minimum values are 100,000 write operations or 300 seconds
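The X/Y bound can be expressed as a small check. This is a sketch, not service code: the minimums come from the card above, while `StalenessBound`, `validate_bound`, and `read_is_within_bound` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Minimum (X versions, Y seconds) per account topology, as quoted in the card.
MIN_BOUNDS = {
    "single_region": (10, 5),
    "multi_region": (100_000, 300),
}


@dataclass(frozen=True)
class StalenessBound:
    max_versions: int  # X: how many versions a read may lag behind
    max_seconds: int   # Y: how many seconds a read may lag behind


def validate_bound(bound, topology):
    """Check a configured bound against the minimums for the topology."""
    min_versions, min_seconds = MIN_BOUNDS[topology]
    return bound.max_versions >= min_versions and bound.max_seconds >= min_seconds


def read_is_within_bound(bound, versions_behind, seconds_behind):
    """A read is acceptable only while neither limit has been exceeded."""
    return (versions_behind <= bound.max_versions
            and seconds_behind <= bound.max_seconds)


bound = StalenessBound(max_versions=100_000, max_seconds=300)
```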
15
Q

What is the session consistency level of data consistency?

A
  • within a single client session reads are guaranteed to honour the consistent-prefix, monotonic reads, monotonic writes, read-your-writes, and write-follows-reads guarantees
  • assumes a single writer session or sharing the session token for multiple writers
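The read-your-writes guarantee and the role of the session token can be sketched with a toy model. It assumes the token is a single sequence number, which is a simplification; `Replica` and `Session` are hypothetical classes, not part of any SDK.

```python
class Replica:
    """Toy replica that tracks the last sequence number it has applied."""

    def __init__(self):
        self.lsn = 0
        self.data = {}

    def apply(self, lsn, key, value):
        self.data[key] = value
        self.lsn = lsn


class Session:
    """Toy client session that carries a token between requests."""

    def __init__(self):
        self.token = 0  # highest sequence number this session has seen

    def write(self, primary, key, value):
        new_lsn = primary.lsn + 1
        primary.apply(new_lsn, key, value)
        self.token = max(self.token, new_lsn)

    def read(self, replica, key):
        # Refuse replicas that have not caught up with this session.
        if replica.lsn < self.token:
            raise LookupError("replica is behind this session; retry elsewhere")
        self.token = max(self.token, replica.lsn)  # keeps reads monotonic
        return replica.data.get(key)


primary, lagging = Replica(), Replica()
session = Session()
session.write(primary, "cart", ["book"])
```

Sharing `session.token` with a second writer is what lets multiple clients behave as one session in this model.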
16
Q

What is the consistent-prefix level of data consistency?

A
  • updates made as single doc writes see eventual consistency
  • updates made as a batch within a transaction are returned consistent to the transaction in which they were committed
  • write operations within a transaction of multiple docs are visible together
  • assume two write operations are performed on DOC1 and DOC2, within transactions T1 and T2
  • when the client reads, it will see either "DOC1 v1 and DOC2 v1" or "DOC1 v2 and DOC2 v2", never "DOC1 v1 and DOC2 v2" or "DOC1 v2 and DOC2 v1"
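The T1/T2 example can be checked with a toy replica that applies each transaction as a whole. This is a sketch of the idea, not of the replication protocol; `replica_views` is a hypothetical helper that lists every state a reader could observe.

```python
def replica_views(transaction_log):
    """Return every state a replica passes through when it applies the log
    in order, one whole transaction at a time."""
    state = {}
    views = [dict(state)]  # before any transaction has been applied
    for txn in transaction_log:
        state.update(txn)  # the whole transaction becomes visible together
        views.append(dict(state))
    return views


log = [
    {"DOC1": "v1", "DOC2": "v1"},  # T1
    {"DOC1": "v2", "DOC2": "v2"},  # T2
]
views = replica_views(log)
```

Every observable state is a prefix of the log, so a mixed state such as DOC1 v1 with DOC2 v2 never appears in `views`.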
17
Q

What is the eventual level of data consistency?

A
  • no ordering guarantees for reads
  • replicas eventually converge
  • weakest form of consistency, because a client may read values that are older than the ones it read before
  • ideal when app doesn’t require ordering guarantees
  • e.g. retweets, likes or nonthreaded comments
18
Q

What APIs does Cosmos DB offer?

A
  • NoSQL
  • MongoDB
  • PostgreSQL
  • Apache Cassandra
  • Table
  • Apache Gremlin
19
Q

What are the benefits of Cosmos DB offering multiple APIs?

A
  • allows us to model real-world data using document, key/value, graph, and column-family data models
  • allows apps to treat Cosmos DB as if it were various other database technologies, without the overhead of management and scaling approaches
20
Q

Which of the APIs are native to Cosmos DB?

A
  • API for NoSQL
  • the rest implement the wire protocols of open-source database engines and are best suited if:
    – you have existing apps using those technologies
    – you don’t want to rewrite your entire data access layer
    – you want to use the open-source developer ecosystem
21
Q

What does the API for NoSQL provide?

A
  • stores data in document format
  • offers the best end-to-end experience, because Microsoft has full control over the interface, service, and SDKs
  • any new features rolled out for Cosmos DB are available here first
22
Q

What does the API for MongoDB provide?

A
  • stores data in a document structure, in BSON format
  • doesn’t use any native MongoDB-related code
  • combines the MongoDB ecosystem with Cosmos DB features
23
Q

What does the API for PostgreSQL provide?

A
  • managed service for running PostgreSQL at any scale
  • stores data either on a single node or distributed in a multi-node configuration
24
Q

What does the API for Cassandra provide?

A
  • stores data in a column-oriented schema
  • highly distributed, horizontally scaling approach to storing large volumes of data, while offering a flexible approach to a column-oriented schema
25
Q

What does the API for Gremlin provide?

A
  • allows users to make graph queries and stores data as edges and vertices
  • useful for scenarios involving dynamic data, data with complex relations, or a desire to use Gremlin
26
Q

What does the API for Table provide?

A
  • stores data in key/value format
  • if you use Azure Table storage, you may see some limitations in latency, scaling, and throughput
  • this API overcomes these issues
27
Q

What do we pay for in Cosmos DB?

A
  • the throughput you provision and the storage you consume, billed on an hourly basis
  • throughput must be provisioned to ensure sufficient system resources are available for your database at all times
28
Q

What is a Request Unit (RU)?

A
  • in Cosmos DB, the cost of all database operations is normalised and expressed in request units (RUs)
  • RUs represent the system resources, such as CPU, IOPS, and memory, that are required to perform the database operations supported by Cosmos DB
29
Q

How much do operations cost in terms of RUs?

A
  • the cost of a point read (fetching a single item by its ID and partition key value) for a 1 KB item is 1 RU
  • all other database operations are similarly assigned a cost in RUs
  • the other CRUD operations (create, update, delete) consume a variable number of RUs, depending on the complexity of the operation
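The RU arithmetic can be worked through in a few lines. Only the 1 RU point-read cost comes from the card; the write cost of 6 RU in the usage example is a made-up placeholder, and the function names are hypothetical.

```python
POINT_READ_RU = 1  # from the card: a point read of a 1 KB item costs 1 RU


def max_operations_per_second(provisioned_ru_per_s, ru_per_operation):
    """How many operations of a given cost fit into the provisioned RU/s."""
    return int(provisioned_ru_per_s // ru_per_operation)


def total_ru(operation_counts, ru_costs):
    """Total RU consumed by a mix of operations.

    operation_counts: {operation name: number of calls}
    ru_costs:         {operation name: RU per call}
    """
    return sum(count * ru_costs[name] for name, count in operation_counts.items())


# 400 RU/s supports up to 400 point reads of 1 KB items per second.
reads_per_second = max_operations_per_second(400, POINT_READ_RU)

# Hypothetical mix: the write cost (6 RU) is an assumption for illustration.
mix_cost = total_ru(
    {"point_read": 100, "write": 10},
    {"point_read": POINT_READ_RU, "write": 6},
)
```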
30
Q

What are the modes we can create a Cosmos DB account in?

A
  • provisioned throughput mode
  • serverless mode
  • autoscale mode
31
Q

What is the provisioned throughput mode for a Cosmos DB account?

A
  • you provision the number of RUs for your app on a per-second basis, in increments of 100 RUs per second
  • you can increase or decrease the number of RUs at any time to scale the provisioned throughput for your app
  • you can make the change programmatically or via the Azure portal
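Since throughput is provisioned in steps of 100 RU/s, an estimated requirement has to be rounded up to the next step. A minimal sketch, with `provisionable_throughput` as a hypothetical helper name; it does not model any minimum throughput the service may enforce.

```python
import math

RU_INCREMENT = 100  # from the card: throughput is provisioned in steps of 100 RU/s


def provisionable_throughput(required_ru_per_s, increment=RU_INCREMENT):
    """Round an estimated requirement up to the next provisionable value."""
    return math.ceil(required_ru_per_s / increment) * increment


estimate = provisionable_throughput(430)
```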
32
Q

What is the serverless mode for a Cosmos DB account?

A
  • you don’t have to provision any throughput when creating resources in your Cosmos DB account
  • at the end of the billing period, you get billed for the number of RUs that your database operations consumed
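Consumption-based billing amounts to summing the RU charge of every request in the period. The sketch below assumes a price quoted per million RUs; the price value in the usage line is a placeholder, not a real rate, and both function names are hypothetical.

```python
def serverless_ru_consumed(request_charges):
    """Total RUs consumed across all requests in a billing period."""
    return sum(request_charges)


def serverless_cost(request_charges, price_per_million_ru):
    """Cost for the period, given an assumed price per million RUs."""
    return serverless_ru_consumed(request_charges) / 1_000_000 * price_per_million_ru


# Placeholder price of 0.25 per million RUs, chosen only for the arithmetic.
period_cost = serverless_cost([500_000, 500_000], price_per_million_ru=0.25)
```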
33
Q

What is the autoscale mode for a Cosmos DB account?

A
  • you can automatically and instantly scale the throughput (RU/s) of your database or container based on usage
  • this doesn’t affect the availability, latency, throughput, or performance of the workload
  • well suited for mission-critical workloads that have variable or unpredictable traffic patterns and require SLAs on high performance and scale
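Scaling with usage can be sketched as clamping demand between a floor and a configured maximum. The 10% floor is an assumption not stated in the card (it reflects how autoscale is generally documented), and `autoscale_throughput` is a hypothetical function, so treat this as an illustration only.

```python
def autoscale_throughput(demand_ru_per_s, max_ru_per_s):
    """Throughput the resource would scale to for a given demand.

    Assumption: throughput scales between 10% of the configured maximum
    and the maximum itself.
    """
    floor = max_ru_per_s / 10
    return min(max_ru_per_s, max(floor, demand_ru_per_s))


quiet = autoscale_throughput(50, 4000)    # demand below the floor
busy = autoscale_throughput(2500, 4000)   # demand inside the range
peak = autoscale_throughput(9000, 4000)   # demand above the maximum
```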