Azure Storage, Cosmos DB Flashcards

1
Q

What is an Azure Cosmos DB account?

A

An account is required to begin using Azure Cosmos DB.

The API determines the type of account to create. Azure Cosmos DB provides five APIs: Core (SQL) and MongoDB for document data, Gremlin for graph data, Azure Table, and Cassandra.

Currently, you must create a separate account for each API.

Has a unique DNS name: the account name is prefixed to documents.azure.com.

Acts as an organizational entity for your databases. Azure Cosmos account –> Databases –> Containers (tables) –> Stored procedures, triggers, user-defined functions, etc.

2
Q

True or False? Azure Cosmos DB provides different data models

A

True:

SQL API, MongoDB, Cassandra, Azure Table and Gremlin (graph)

3
Q

To minimize Cosmos DB latency, what can you do when creating an Azure Cosmos DB account?

A

Select a location that’s close to your customers.

4
Q

True or False? In Azure Cosmos DB, you provision throughput for your containers to run writes, reads, updates, deletes, and queries

A

True

5
Q

How does Azure Cosmos DB measure throughput?

A

With request units (RUs). Request unit usage is measured per second, so the unit of measure is request units per second (RU/s).

6
Q

True or False? You must reserve the number of RU/s you want Azure Cosmos DB to provision in advance, so it can handle the load you’ve estimated, and you can scale your RU/s up or down at any time to meet current demand.

A

True

7
Q

True or false: The number of RUs used for a given database operation over the same data varies over time.

A

False. Azure Cosmos DB guarantees that the same query on the same data always consumes the same number of RUs.

8
Q

Which of the following options affects the number of request units it takes to write a document?

Size of the document

Item property count

Indexing policy

All of the above

A

All of the above

9
Q

Which of the following statements is false about Request Units (RUs) in Azure Cosmos DB?

The cost to read a 1 KB item is approximately one Request Unit (or 1 RU).

Requests are rate-limited if you exceed the number of provisioned RU.

Once you set the number of request units, it’s impossible to modify this number.

If you provision ‘R’ RUs on an Azure Cosmos container (or a database), Azure Cosmos DB ensures that ‘R’ RUs are available in each region associated with your account.

A

Once you set the number of request units, it’s impossible to modify this number.

It’s always possible to change the value, for example between 400 and 250,000 RU/s.

10
Q

In Azure Cosmos DB, what is a partition strategy?

A

As your provisioned throughput or data size grows, Azure Cosmos DB will automatically create new physical partitions by splitting existing ones.

The throughput you select gets evenly distributed across physical partitions.

You want to choose a partition key that evenly distributes consumption across the physical partitions.

11
Q

True or False? A partition key defines the partition strategy. It’s set when you create a container and can’t be changed.

A

True

12
Q

In Azure Cosmos DB, what is a partition key?

A

A partition key is the value by which Azure organizes your data into logical partitions. All the items in a logical partition have the same partition key value.

13
Q

Read Again:
When you’re trying to determine the right partition key and the solution isn’t obvious, here are a few tips to keep in mind.

Don’t be afraid of choosing a partition key that has a large number of values. The more values your partition key has, the more scalability you have.

To determine the best partition key for a read-heavy workload, review the top three to five queries you plan on using.

The value most frequently included in the WHERE clause is a good candidate for the partition key.

For write-heavy workloads, you’ll need to understand the transactional needs of your workload, because the partition key is the scope of multi-document transactions.

A

OK

14
Q

True or false: You can add a partition key to an Azure Cosmos DB container after it has been created.

A

False. You can set the partition key only when the container is created.

15
Q

Your organization is planning to use Azure Cosmos DB to store vehicle telemetry data generated from millions of vehicles every second. Which of the following options for your Partition Key will optimize storage distribution?

Vehicle Model

Vehicle Identification Number (VIN) which looks like WDDEJ9EB6DA032037

A

Vehicle Identification Number (VIN)
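
A minimal sketch of why the high-cardinality VIN spreads storage better than the vehicle model. The hash function, the partition count of 4, the two model names, and the generated VIN-like strings are all assumptions for illustration; the hashing Azure Cosmos DB uses internally is not shown here.

```python
import hashlib
from collections import Counter

PHYSICAL_PARTITIONS = 4  # hypothetical count, chosen for illustration


def partition_for(key: str) -> int:
    """Map a partition key value to a partition using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PHYSICAL_PARTITIONS


def distribution(keys):
    """Count how many items land on each partition."""
    return Counter(partition_for(k) for k in keys)


# 10,000 telemetry items: two distinct models vs. one VIN per vehicle.
models = ["sedan", "suv"]  # hypothetical low-cardinality values
by_model = distribution(models[i % 2] for i in range(10_000))
by_vin = distribution(f"VIN{i:014d}" for i in range(10_000))

print("by model:", dict(by_model))
print("by VIN:  ", dict(by_vin))
```

With only two distinct model values, at most two partitions ever receive data, while the VIN values spread across all of them.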

16
Q

True or False? Core (SQL) API is the default API for Azure Cosmos DB.

A

True

17
Q

True or False? Core (SQL) API: You can query hierarchical JSON documents with a SQL-like language

A

True

18
Q

True or False? At the lowest level, Cosmos DB stores data in ARS (atom-record-sequence) format, and the APIs are just ways to access and modify that data.

A

True

19
Q

Which API is best for creating projects from scratch using Cosmos DB?

A

Core (SQL)

20
Q

What is the best approach when there is an existing database?

A

Use the matching technology, such as MongoDB, Cassandra, or Gremlin, whichever fits best in the project environment.

21
Q

Which API fits best for Azure Cosmos DB when the data consists of key-value pairs?

A

Previously Redis or the Table API; today the best fit is Core (SQL) because it offers a richer query experience with improved indexing.

22
Q

What should be the key requirement when deciding on Gremlin?

A

Gremlin (graph) is a good fit when relationships between items must be modeled and queried, such as “How is this item related to that item?”

A good example is a shop recommendation engine.

23
Q

How do you determine the best-fitting Cosmos DB API based on a problem description?

A

1) Is the data unstructured?
2) Is an existing technology already in use? If so, it is probably best to keep using it, even if there is a better choice, to reuse code and reduce migration time.
3) Is migration downtime an issue? If not, prefer the better choice over the existing technology in 2).

24
Q

What is CQL referring to?

A

Cassandra Query Language

25
Q

When to use Azure Table?

A

When migrating from earlier APIs like Azure Table Storage or Redis. In that case it is preferred over Core (SQL) if downtime must be avoided.

26
Q

The e-commerce application has a requirement to support a shopping basket. Customers can add and remove products, and any discounts (like buy one get one free) need to be kept in the basket. The sales team wants the flexibility to offer different kinds of discounts, and to add or remove different product categories.

A

This type of data is modeled best by documents. Core (SQL) is the best choice for a new system.

27
Q

The risk department has asked if the new project could implement some form of fraud detection and prevention. The guidance is that the fraud system would need to be able to track the relationship between customers, payment types, billing and delivery addresses, IP address, geolocation, and past purchase history. Anything that doesn’t fit into normal behavior should be flagged.

A

Complex relationships, and the need to store metadata against them, are best supported by a graph model of the data.

28
Q

The sales team would like to offer a chat feature for customers. Messages will have a fixed number of characters and be simple. The schema is fixed, and the sales team has an existing chat app for which they have built up many CQL statements for creating reports. They would like to reuse them if possible.

A

The need to reuse existing CQL queries means that Cassandra is the best choice in this scenario.

29
Q

True or False? Azure Cosmos DB provides the Data Explorer tool in the Azure portal that you can use to perform all these operations: adding data, modifying data, and creating and running stored procedures.

A

True

30
Q

True or False? The database account name must be unique across all Azure Cosmos DB instances.

A

True

31
Q

What is the only required clause in a SQL query?

SELECT

FROM

WHERE

A

SELECT is the only required clause in a query.

32
Q

Which query will get an ordered list of product IDs and descriptions, starting with the largest product ID and ending with the smallest product ID?

1) SELECT p.productId FROM Products p ORDER BY p.productId ASC
2) SELECT p.id FROM Products p ORDER BY p.productId DESC
3) SELECT p.productId, p.description FROM Products p ORDER BY p.productId DESC

A

3) SELECT p.productId, p.description FROM Products p ORDER BY p.productId DESC

33
Q

What does ACID mean?

A

Atomicity, Consistency, Isolation, Durability.

Transaction in a typical database can be defined as a sequence of operations performed as a single logical unit of work. Each transaction provides ACID property guarantees.

Atomicity: guarantees that all the operations done inside a transaction are treated as a single unit

Consistency: data is always in a valid state across transactions

Isolation: no two transactions interfere with each other

Durability: any change that is committed will always be present.
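
Atomicity can be sketched as "apply every operation or none". The example below is only an in-memory illustration under that assumption; the `run_transaction` and `remove_stock` helpers and the inventory data are hypothetical and are not how Azure Cosmos DB implements transactions.

```python
import copy


class InsufficientStock(Exception):
    """Raised when an operation would make stock negative."""


def run_transaction(store: dict, operations) -> bool:
    """Apply all operations or none of them (atomicity)."""
    working = copy.deepcopy(store)    # changes go to a private working copy
    try:
        for op in operations:
            op(working)
    except Exception:
        return False                  # roll back: the original store is untouched
    store.clear()
    store.update(working)             # commit every change together
    return True


def remove_stock(item, qty):
    """Build an operation that removes qty units of item."""
    def op(state):
        if state[item] < qty:
            raise InsufficientStock(item)
        state[item] -= qty
    return op


inventory = {"apples": 5, "pears": 1}

# Second operation fails, so the first one must not be visible either.
failed = run_transaction(inventory, [remove_stock("apples", 2), remove_stock("pears", 3)])
after_failure = dict(inventory)

# Both operations succeed, so both changes are committed.
committed = run_transaction(inventory, [remove_stock("apples", 2), remove_stock("pears", 1)])
```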

34
Q

True or false? Stored procedures are written in JavaScript and are stored in a container on Azure Cosmos DB

A

True

35
Q

True or false? Stored procedures are the only way to achieve atomic transactions within Azure Cosmos DB; the client-side SDKs do not support transactions.

A

True

Stored procedures and triggers are intended to support transactional writes, whereas read-only logic is best implemented as application-side logic and queries using the Azure Cosmos DB SQL API SDKs.

36
Q

True or False? UDFs are used to extend the Azure Cosmos DB SQL query language grammar and implement custom business logic, such as calculations on properties and documents. UDFs can be called only from inside queries and, unlike stored procedures, they do not have access to the context object, so they cannot read or write documents.

A

True

37
Q

True or False? In Cosmos DB, a graph is a structure that’s composed of vertices and edges. Both vertices and edges can have an arbitrary number of properties.

A

True

38
Q

Explain what a vertex (node) means in a Cosmos DB graph.

A

Vertices represent objects. For example: a person, a place, or a product.

39
Q

Explain what an edge (relationship) means in a Cosmos DB graph.

A

Edges denote relationships between vertices. For example: a person might know another person, or have visited a place.

40
Q

Explain what a Property means in a Cosmos DB Graph

A

Properties express information about the vertices and edges.

For example:
Vertex properties might include the name and age of a person.
Edge properties might include a time stamp of a purchase or a hierarchical affiliation between coworkers.
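
A small sketch of vertices, edges, and properties as plain Python data, with a two-hop traversal. The people, the product, the edge labels, and the helper names are made up for illustration; a real Cosmos DB graph would be queried with Gremlin rather than with this code.

```python
# Vertices represent objects; each carries its own properties.
vertices = {
    "ann": {"label": "person", "name": "Ann", "age": 34},
    "bob": {"label": "person", "name": "Bob", "age": 41},
    "tent": {"label": "product", "name": "Tent"},
}

# Edges represent relationships and can carry properties too.
edges = [
    {"from": "ann", "to": "bob", "label": "knows", "since": 2019},
    {"from": "bob", "to": "tent", "label": "purchased", "timestamp": "2023-05-01"},
]


def neighbors(vertex_id, edge_label):
    """Follow outgoing edges with the given label from one vertex."""
    return [e["to"] for e in edges if e["from"] == vertex_id and e["label"] == edge_label]


def friends_purchases(person_id):
    """Names of products bought by the people this person knows."""
    result = []
    for friend in neighbors(person_id, "knows"):
        for product in neighbors(friend, "purchased"):
            result.append(vertices[product]["name"])
    return result


recommended = friends_purchases("ann")
```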

41
Q

In which of the following scenarios would a graph database, as opposed to another database model, be the best fit?

1) Where data is disconnected and relationships do not matter
2) A data model where the ability to expand the model to add new data or relationships is necessary
3) If your queries are doing table scans to find a match or searching for data

A

2) A data model where the ability to expand the model to add new data or relationships is necessary

This is a perfect fit for graph databases: with a graph data model, changes to the data model can be made with little or no impact on the application.

42
Q

Which of these answers is a primary strength of graph databases, as compared to other database models?

1) Performance
2) Storage
3) Where data models stay consistent and the structure is fixed and tabular

A

1) Performance

A graph database has superior performance for querying related data. Graph databases are designed to traverse stored data quickly and retrieve results in milliseconds

43
Q

True or False? An entity, in a NoSQL table, is the equivalent of a row in a relational database table.

A

True

44
Q

True or False? It’s possible to migrate tables located in Azure Storage accounts over to Cosmos DB.

A

True

45
Q

Suppose you are using Visual Studio Code to develop a .NET Core application that accesses Azure Cosmos DB. You need to include the connection string for your database in your application configuration. What is the most convenient way to get this information into your project?

1) Directly from Visual Studio Code
2) From the Azure portal
3) Using the Azure CLI

A

1) Directly from Visual Studio Code

46
Q

When working with Azure Cosmos DB’s SQL API, which of these can be used to perform CRUD operations?

1) LINQ
2) Apache Cassandra client libraries
3) Azure Table Storage libraries

A

1) LINQ

47
Q

When working with the Azure Cosmos DB Client SDK’s DocumentClient class, you use a NoSQL model. How would you use this class to change the FirstName field of a Person document from ‘Ann’ to ‘Fran’?

1) Call UpdateDocumentAsync with FirstName=Fran
2) Call UpsertDocumentAsync with an updated Person object
3) Call ReplaceDocumentAsync with an updated Person object

A

3) Call ReplaceDocumentAsync with an updated Person object

ReplaceDocumentAsync will replace the existing document with the new one. In this case we’d intend the old and new to be the same other than FirstName.

48
Q

True or False? The partitioning configuration can be changed after a collection is provisioned.

A

False. Partitioning is fixed when the collection is created.

49
Q

True or False? In Cosmos DB / Core (SQL), you can change the index for a collection at any time.

A

True. Unlike partitioning, indexes are not fixed and can be changed at any time.

50
Q

True or False? When executing a SQL query in the portal’s Data Explorer, it’s possible to see the request charge (RUs) for the query that was executed.

A

True. Select Query Stats.

51
Q

True or false? Querying a document by its partition key is cheaper than retrieving a document by other properties.

A

True

52
Q

What causes a hot partition in a distributed NoSQL database collection?

1) A partition key with a large number of values.
2) A partition key that doesn’t distribute requests evenly over storage and time.
3) A read-heavy workload.
4) Querying a large volume of data.

A

2) A partition key that doesn’t distribute requests evenly over storage and time.

Access that’s concentrated to fewer values of the partition key can cause bottlenecks on a single partition.

53
Q

What’s the best way to maximize the efficiency of an individual database query?

1) Minimize the number of queries to the database.
2) Spread your write workload across partitions.
3) Design a partition key strategy so that your most frequent queries don’t cross partitions.

A

3) Design a partition key strategy so that your most frequent queries don’t cross partitions.

Querying within a partition places far fewer demands on Azure Cosmos DB than querying across partitions.
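
The difference can be sketched by counting how many partitions a query has to visit. This is an in-memory illustration only: the `customerId` partition key, the order items, and the `query` helper are hypothetical, and real fan-out happens across physical partitions inside the service.

```python
from collections import defaultdict

partitions = defaultdict(list)  # partition key value -> items stored there


def insert(item, partition_key):
    """Store the item in the partition named by its partition key value."""
    partitions[item[partition_key]].append(item)


def query(predicate, partition_key_value=None):
    """Return (matching items, number of partitions scanned)."""
    if partition_key_value is not None:
        # Partition key known: only that partition is visited.
        targets = [partition_key_value] if partition_key_value in partitions else []
    else:
        # Partition key unknown: every partition must be visited.
        targets = list(partitions)
    matches = [i for t in targets for i in partitions[t] if predicate(i)]
    return matches, len(targets)


orders = [("c1", "open"), ("c1", "shipped"), ("c2", "open"), ("c3", "shipped")]
for n, (customer, status) in enumerate(orders):
    insert({"id": str(n), "customerId": customer, "status": status}, "customerId")

in_partition, scanned_one = query(lambda i: i["status"] == "open", partition_key_value="c1")
cross_partition, scanned_all = query(lambda i: i["status"] == "open")
```

The first query touches one partition; the second has to touch all three to find its two matches.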

54
Q

Read again

Partition design considerations:
https://docs.microsoft.com/en-us/learn/modules/monitor-and-scale-cosmos-db/5-partition-lesson

A

Ok

55
Q

Indexing all properties in a collection results in:

1) Higher-demand writes to the database and lower-demand queries.
2) Lower-demand writes to the database and higher-demand queries.
3) Higher-demand writes and queries.

A

1) Higher-demand writes to the database and lower-demand queries.

More resources are consumed when Azure Cosmos DB is updating the index. But using the index to find documents is more efficient.
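
The trade-off can be sketched by counting work on each side. The `Collection` class, its counters, and the sample items are assumptions for illustration and do not reflect how Azure Cosmos DB charges RUs.

```python
class Collection:
    """Toy collection that counts index maintenance and query work."""

    def __init__(self, indexed_properties):
        self.items = []
        self.indexes = {p: {} for p in indexed_properties}
        self.index_updates = 0    # extra work done on writes
        self.items_examined = 0   # work done by queries

    def write(self, item):
        self.items.append(item)
        for prop, index in self.indexes.items():
            if prop in item:
                index.setdefault(item[prop], []).append(item)
                self.index_updates += 1

    def find(self, prop, value):
        if prop in self.indexes:
            hits = self.indexes[prop].get(value, [])
            self.items_examined += len(hits)      # index lookup
            return hits
        self.items_examined += len(self.items)    # full scan
        return [i for i in self.items if i.get(prop) == value]


colors = ["red", "green", "blue", "black"]
index_all = Collection(["color", "size"])
index_none = Collection([])
for n in range(100):
    item = {"id": n, "color": colors[n % 4], "size": n % 5}
    index_all.write(item)
    index_none.write(item)

red_indexed = index_all.find("color", "red")
red_scanned = index_none.find("color", "red")
```

Both collections return the same 25 items, but the indexed one paid 200 index updates on write and examined 25 items on read, while the unindexed one paid nothing on write and examined all 100 items.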

56
Q

For read-heavy workloads with an unknown query pattern, the best starting indexing strategy is:

1) Set the indexing policy to none.
2) Index all properties.
3) Index only the specific properties that will be queried.
4) Index only the id property of your documents.

A

2) Index all properties.

This is a safe choice because you can perform queries efficiently. It’s also the Azure Cosmos DB default.

57
Q

True or False? Collections are distributed across partitions based on the value of a collection’s partition key.

A

True

58
Q

True or False? The partition key is NOT a document property.

A

False, its always a document property.

59
Q

In Azure Cosmos DB, explain what a hot partition means.

A

A hot partition is accessed more than the other partitions. The result is an inefficient use of the total configured throughput. If the demand on the hot partition is high enough, the partition becomes overloaded and traffic to the database is rate-limited.
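
A rough sketch of the effect, assuming throughput is split evenly across partitions and every request costs 1 RU within a single one-second window. The 400 RU/s, the 4 partitions, and the request patterns are hypothetical numbers.

```python
from collections import Counter


def throttled_requests(requests, provisioned_rus, partition_count, cost=1):
    """Count requests rejected in one second.

    requests: iterable of partition indexes, one entry per request.
    """
    budget = provisioned_rus / partition_count  # even share per partition
    used = Counter()
    throttled = 0
    for partition in requests:
        if used[partition] + cost > budget:
            throttled += 1          # this partition's share is exhausted
        else:
            used[partition] += cost
    return throttled


even = [i % 4 for i in range(400)]        # 100 requests per partition
skewed = [0] * 250 + [1, 2, 3] * 50       # 250 requests hit partition 0

even_throttled = throttled_requests(even, provisioned_rus=400, partition_count=4)
skewed_throttled = throttled_requests(skewed, provisioned_rus=400, partition_count=4)
```

Both workloads send 400 requests against 400 RU/s, yet the skewed one has 150 requests rate-limited because partition 0 can only serve its 100 RU share.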

60
Q

In Cosmos DB, how do you identify a partition strategy?

A

Estimate the scale of your data needs: size of documents, reads and writes per second, and the number of queries.

Understand the workload: is it read-heavy or write-heavy? If read-heavy, what are the top five queries? If write-heavy, are transactions required?

61
Q

In Cosmos DB, propose some partition key options.

A

Does the key choice have a large number of possible values or large cardinality?

Do the values have a consistent spread across the data?

Are some values accessed more than others?
For read-heavy workloads, can the query be within a single partition?

For write-heavy transactional workloads, can the transaction be within a single partition?

62
Q

In Cosmos DB, name two common scenarios for replicating data in two or more regions

A

1) Delivering low-latency data access to end users no matter where they are located around the globe
2) Adding regional resiliency for business continuity and disaster recovery (BCDR)

63
Q

True or false: the benefits of writing to multiple regions are decreased latency, unlimited scaling potential, and improved availability.

A

True

Writing to multiple regions has many performance benefits. For example, the latency for write operations is less than in non-multi-master accounts.

64
Q

What is the default conflict-resolution policy in a multi-master account?

1) Last-Writer-Wins
2) A custom user-defined procedure

A

1) Last-Writer-Wins
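
A minimal sketch of the idea: among conflicting versions of the same item, keep the one with the highest timestamp. The sample items and the helper name are hypothetical, and using `_ts` as the comparison property is an assumption in this sketch.

```python
def resolve_last_writer_wins(versions, path="_ts"):
    """Pick the version with the highest value at the conflict-resolution path."""
    return max(versions, key=lambda v: v[path])


# Two regions wrote the same item at nearly the same time.
conflict = [
    {"id": "42", "region": "West US", "qty": 3, "_ts": 1700000010},
    {"id": "42", "region": "North Europe", "qty": 5, "_ts": 1700000025},
]

winner = resolve_last_writer_wins(conflict)
```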

65
Q

Which consistency level is most appropriate for the user data in an e-commerce database? Users need to ensure that their orders contain all the items they placed in their basket.

1) Strong
2) Bounded Staleness
3) Session
4) Consistent Prefix
5) Eventual

A

3) Session

Session is the best consistency setting for user data that contains shopping basket information. Session consistency will ensure that every item the user put in their basket is displayed when they review their basket.

66
Q

Which consistency level consumes the least amount of request units per operation?

1) Strong
2) Bounded Staleness
3) Session
4) Consistent Prefix
5) Eventual

A

5) Eventual

67
Q

In Cosmos DB, what is required to change the partition key?

A

In general, it’s not possible to change it. If the partition key must change, the data must be moved to a new container.

68
Q

Read again:
For all containers, your partition key should do what?
(Name at least two)

A

Be a property that has a value which does not change.

Have a wide range of possible values (high cardinality).

Spread request unit (RU) consumption and data storage evenly across all logical partitions.

69
Q

How do you distribute data globally for Azure Cosmos DB?

A

In the Settings, select replicate data globally and then choose the datacenters from the map. Choose those that are close to your users.

70
Q

In Cosmos DB, what is the following code doing, and what is its purpose?

ConnectionPolicy connectionPolicy = new ConnectionPolicy();

//Setting read region selection preference
connectionPolicy.PreferredLocations.Add(LocationNames.WestUS); // first preference
connectionPolicy.PreferredLocations.Add(LocationNames.EastUS); // second preference
connectionPolicy.PreferredLocations.Add(LocationNames.NorthEurope); // third preference

A

With the connection policy, the client developer can specify an ordered list of preferred regions to take advantage of global distribution.
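
The intent of an ordered preference list can be sketched as "use the first preferred region that is currently available". This is a plain Python illustration with a hypothetical helper and region names; what the real SDK does when no preferred region is reachable is not modeled here.

```python
def pick_read_region(preferred, available):
    """Return the first preferred region that is currently available, else None."""
    for region in preferred:
        if region in available:
            return region
    return None


preferred = ["West US", "East US", "North Europe"]

all_up = pick_read_region(preferred, {"West US", "East US", "North Europe"})
west_down = pick_read_region(preferred, {"East US", "North Europe"})
none_up = pick_read_region(preferred, set())
```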

71
Q

Name the 5 consistency levels for Azure Cosmos DB.

A

Strong, Bounded Staleness, Session, Consistent Prefix, Eventual

72
Q

What is a Cosmos DB change feed?

A

Change feed in Azure Cosmos DB is a persistent record of changes to a container in the order they occur.

You can read the change feed as far back as the origin of your container but if an item is deleted, it will be removed from the change feed.

The changes are persisted, can be processed asynchronously and incrementally, and the output can be distributed across one or more consumers for parallel processing.
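
Incremental processing can be sketched as an append-only log that a consumer reads from its last position. The `ChangeFeed` class and the integer continuation value are simplifications invented for this sketch; they are not the Azure Cosmos DB change feed API, and deletes are not modeled.

```python
class ChangeFeed:
    """Toy ordered log of item changes."""

    def __init__(self):
        self._log = []

    def record(self, item):
        """Append a copy of the changed item, preserving the order of changes."""
        self._log.append(dict(item))

    def read(self, continuation=0):
        """Return (changes since the continuation point, new continuation point)."""
        changes = self._log[continuation:]
        return changes, len(self._log)


feed = ChangeFeed()
feed.record({"id": "1", "qty": 1})
feed.record({"id": "2", "qty": 7})

batch1, token = feed.read()        # first read sees everything so far

feed.record({"id": "1", "qty": 2})
batch2, token = feed.read(token)   # later read sees only what is new
```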

73
Q

How is a container scaled?

A

By distributing data and throughput across physical partitions.

Physical partitions hold logical partitions.

Each individual physical partition can store up to 50 GB of data.

As your provisioned throughput or data size grows, Azure Cosmos DB will automatically create new physical partitions by splitting existing ones.

74
Q

What does data consistency refer to?

A

The freshness of the data. Is the data that you’re reading the most recent version of the data?

75
Q

What is an RU?

A

Request Unit.

No matter which API you use to interact with your Azure Cosmos container, costs are always measured by RUs. Whether the database operation is a write, point read, or query, costs are always measured in RUs.

The cost to do a point read (i.e. fetching a single item by its ID and partition key value) for a 1 KB item is 1 Request Unit (or 1 RU). All other database operations are similarly assigned a cost using RUs.

76
Q

What is throughput?

A

The number of RUs per second

77
Q

In a stored procedure, what is the context object?

A

The context object provides access to all operations that can be performed in Azure Cosmos DB, as well as access to the request and response objects.

Triggers also have access.

UDFs do not have access to the context object.

78
Q

What is an example of a UDF?

A

A UDF to calculate income tax for various income brackets. This user-defined function would then be used inside a query.
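
The tax logic might look like the sketch below. It is written in Python only to illustrate the calculation; the brackets and rates are made up, and in Azure Cosmos DB the function would be written in JavaScript and called from inside a query.

```python
def income_tax(income: int) -> float:
    """Flat rate per bracket; thresholds and rates are invented for illustration."""
    if income < 1_000:
        rate_percent = 10
    elif income < 10_000:
        rate_percent = 20
    else:
        rate_percent = 40
    return income * rate_percent / 100


# Stand-in for a query that projects the function's result per document.
people = [{"name": "Ann", "income": 500}, {"name": "Fran", "income": 20_000}]
taxes = [{"name": p["name"], "tax": income_tax(p["income"])} for p in people]
```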

79
Q

How does Cosmos DB store data for all data models?

A

In ARS format, then it provides a view of that data in the style of the model you choose.

80
Q

When should you use the Table API?

A

This API should only be used to allow existing apps that are based on the Table API access to Azure Cosmos DB. However, new projects should always choose Core (SQL).

81
Q

When would you choose MongoDB over Core (SQL)?

A

Both store data in document format, but if you need to continue using existing MongoDB client SDKs, drivers, and tools to interact with the data transparently, use the MongoDB API in Cosmos DB.

82
Q

How is Core (SQL) ideal when product categories need to be frequently updated?

A

Core (SQL) is a schemaless document store, so its data model is flexible. As a result of this architecture, bringing a new product category online is as simple as adding a document for the new product. Changes to the schema or taking the database offline are not required.

83
Q

Denormalization is a concept related to containers and …

XML settings
App settings
Partitions

A

Partitions

84
Q

True or false: you can have up to 2 free tier Azure Cosmos DB accounts per subscription.

A

False. Only one free tier account per subscription.

85
Q

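True or False? Azure Cosmos DB supports wire protocol-compatible APIs for popular databases.

A

True. Existing drivers and tools for those databases (for example MongoDB or Cassandra) can communicate directly with Azure Cosmos DB; no Cosmos DB-specific client library is needed.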

86
Q

When you enable autoscale on an existing database or container, the starting value for max RU/s is determined by the system, based on your current manual provisioned throughput settings and storage. After the operation completes, you can change the max RU/s if needed. True or false.

A

True

87
Q

True or False? You can enable autoscale on a single container, or provision autoscale throughput on a database and share it among all the containers in the database.

A

True