Chapter 11&12 Knowledge Testers Flashcards

1
Q

Do you understand how collections in a document store generalize the concept of a relational table?

A

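A collection in a document store is a collection of trees (nested documents); a relational table is a collection of flat tuples (rows). A flat tuple is just a tree of depth one, so collections generalize tables.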
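To make the tree-versus-tuple contrast concrete, here is a minimal Python sketch. The `depth` helper and the sample data are illustrative assumptions, not part of the course material: a relational row behaves like a tree of depth one, while a document may nest deeper.

```python
def depth(node):
    """Nesting depth of a JSON-like value; scalars have depth 0."""
    if isinstance(node, dict):
        return 1 + max((depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((depth(v) for v in node), default=0)
    return 0

# A relational row: every field holds an atomic value (a flat tuple).
row = {"id": 1, "name": "Ada", "city": "London"}

# A document: fields may hold nested objects and arrays (a tree).
document = {
    "id": 1,
    "name": {"first": "Ada", "last": "Lovelace"},
    "addresses": [{"city": "London", "zip": "N1"}],
}

row_depth = depth(row)
document_depth = depth(document)
```

Here `row_depth` is 1 and `document_depth` is 3, since the address object sits inside an array inside the document.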

2
Q

Can you explain what document stores can do that relational databases cannot?

A

e.g., heterogeneous collections, schema-less collections, data denormalized into trees…

3
Q

Can you explain how, in a document store, the documents can be sharded and replicated? Can you contrast the architecture with that of HDFS or HBase?

A

Shards are determined by selecting one or several fields (the shard key), similar to regions in HBase. Shards are stored in different physical locations. The shard-key fields are organized in a tree index structure. Each shard is assigned to exactly one replica set, and the nodes within a replica set each hold a copy of the same data. This differs from HDFS, where the replicas are spread over the entire cluster, there is no notion of walls between groups of nodes, and no two datanodes hold the same set of block replicas. It also differs from HBase, where some nodes are responsible for handling regions but do not necessarily store them physically (storage is on HDFS). For a write to succeed, a minimum number of nodes must have successfully written the data (similar to the W parameter in Dynamo).
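The sharding and replication scheme above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not MongoDB's implementation: the class names `ReplicaSet` and `ShardedCollection`, the `split_points` parameter, and the `down` argument used to simulate failed nodes are all hypothetical.

```python
import bisect

class ReplicaSet:
    """A group of nodes that each hold a full copy of one shard."""
    def __init__(self, name, n_nodes):
        self.name = name
        self.nodes = [dict() for _ in range(n_nodes)]

    def write(self, key, doc, w, down=()):
        """Write to every reachable node; succeed if at least w nodes acknowledge.
        Simplification: a failed write is not rolled back on the nodes that took it."""
        acks = 0
        for i, node in enumerate(self.nodes):
            if i in down:          # simulate an unreachable node
                continue
            node[key] = doc
            acks += 1
        return acks >= w

class ShardedCollection:
    """Range sharding: split_points [10, 20] gives key < 10, 10 <= key < 20, key >= 20."""
    def __init__(self, split_points, replica_sets):
        assert len(replica_sets) == len(split_points) + 1
        self.split_points = split_points
        self.replica_sets = replica_sets

    def shard_for(self, key):
        # Each shard-key value maps to exactly one replica set.
        return self.replica_sets[bisect.bisect_right(self.split_points, key)]

cluster = ShardedCollection([10, 20], [ReplicaSet(f"rs{i}", 3) for i in range(3)])
target = cluster.shard_for(15)                                # the middle shard, rs1
ok = target.write(15, {"_id": 15}, w=2, down={2})             # 2 of 3 nodes acknowledge
failed = target.write(16, {"_id": 16}, w=2, down={1, 2})      # only 1 acknowledgement
```

With `w=2`, the first write succeeds even though one node is down, while the second fails because only one node acknowledged it.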

4
Q

Do you understand how indices can make queries faster, like in relational databases?

A

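An index maps field values to the locations of the matching documents (for example, by keeping the values sorted), so the machine knows where to look instead of scanning the whole collection. This makes querying much faster.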

5
Q

Do you know what kinds of indices there are? Do you know how efficient they are, and what each of them can and cannot do?

A

Hash indices - Map each value to an integer with a hash function; a pointer from the hashed integer to the document gives access in O(1) time. Building the index takes time, though, and a big array is needed to avoid collisions, which can take up a lot of space. The index can be maintained synchronously or asynchronously, but both have downsides. No range queries.
Tree indices - Support range queries by storing the indexed values in a tree. O(log n) lookup time.
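The difference between the two index kinds can be illustrated with a small Python sketch. It is only an analogy: a `dict` stands in for the hash index and a sorted list searched with `bisect` stands in for the tree index (a real store would typically use a B-tree). The sample documents and the `range_lookup` helper are made up for illustration.

```python
import bisect
from collections import defaultdict

docs = [
    {"_id": 0, "age": 31},
    {"_id": 1, "age": 25},
    {"_id": 2, "age": 42},
    {"_id": 3, "age": 25},
]

# Hash index: value -> positions of matching documents. O(1) point lookups,
# but the keys have no order, so range queries are not supported.
hash_index = defaultdict(list)
for pos, d in enumerate(docs):
    hash_index[d["age"]].append(pos)

# Stand-in for a tree index: (value, position) pairs kept in sorted order.
sorted_index = sorted((d["age"], pos) for pos, d in enumerate(docs))

def range_lookup(lo, hi):
    """Positions of documents with lo <= age <= hi, found by binary search."""
    start = bisect.bisect_left(sorted_index, (lo, -1))
    end = bisect.bisect_right(sorted_index, (hi, len(docs)))
    return [pos for _, pos in sorted_index[start:end]]

point_hits = hash_index.get(25, [])   # documents with age == 25
range_hits = range_lookup(25, 31)     # documents with 25 <= age <= 31
```

The point lookup touches one hash bucket, while the range lookup finds both ends of the interval in O(log n) and then reads the entries in between.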

6
Q

Are you capable of telling if an index is useful to a given query, for simple settings?

A

useful: an index on a single field and a query that selects on that field
not useful: index on field 1, query on field 2

7
Q

Do you know that, and why, a compound index (e.g., on keys a and b) can also be used as an index on any prefix of the compound key (e.g., on key a only) "for free"?

A

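If the index is sorted by a and then by b, all entries with the same value of a are contiguous, so we can also search on a alone. We cannot use the index for b alone, though, because entries with the same value of b are scattered across the index.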
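The prefix property can be checked with a short Python sketch. As an assumption for illustration, the compound index is modeled as a sorted list of `(a, b, position)` tuples; the sample rows and the helper names `find_by_a` and `find_by_b` are hypothetical.

```python
import bisect

rows = [
    ("CH", "Zurich"),
    ("DE", "Berlin"),
    ("CH", "Bern"),
    ("FR", "Paris"),
    ("DE", "Munich"),
]

# Compound index on (a, b): entries sorted by a first, then by b.
index = sorted((a, b, pos) for pos, (a, b) in enumerate(rows))

def find_by_a(a):
    """Use the index on the prefix a: binary search, then read the contiguous run."""
    i = bisect.bisect_left(index, (a,))   # (a,) sorts before every (a, b, pos)
    hits = []
    while i < len(index) and index[i][0] == a:
        hits.append(index[i][2])
        i += 1
    return hits

def find_by_b(b):
    """No prefix to search on: matching entries are scattered, so scan everything."""
    return [pos for _, entry_b, pos in index if entry_b == b]

swiss = find_by_a("CH")     # positions of rows with a == "CH"
bern = find_by_b("Bern")    # found only by a full scan of the index
```

`find_by_a` costs O(log n) plus the number of matches, whereas `find_by_b` has to visit every entry, which is no better than scanning the collection.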

8
Q

Can you explain the limitations of a document store like MongoDB?

A

No joins

9
Q

Can you name various kinds of items in the JDM?

A

Items in the JSONiq Data Model (JDM) include atomic items (e.g., strings, numbers, booleans, null), arrays, and objects.

10
Q

Can you name a few query languages in the XML/JSON ecosystem?

A

XPath, XQuery, JSONiq, and JAQL.

11
Q

Do you understand how to navigate nested structures in JSONiq (object lookup, array lookup, array unboxing, filtering predicates)?

A

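Yes. JSONiq uses . for object lookup (e.g., $data.key), double square brackets for array lookup (e.g., $data.array[[1]], with positions starting at 1), empty square brackets for array unboxing (e.g., $data.array[], which yields the members as a sequence), and single square brackets for filtering predicates, with $$ as the context item (e.g., $data.array[][$$.key eq "value"]).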

12
Q

Do you understand how FLWOR expressions work and describe what they return?

A

Yes, FLWOR expressions use clauses like for, let, where, order by, and return to iterate over, filter, sort, and transform data, returning sequences of items.
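As a rough analogy only, the clauses of a FLWOR expression can be mirrored step by step in Python. The sample data and the `flwor` function are invented for illustration, and the JSONiq query in the comment is a sketch of what the Python code imitates, not an excerpt from the course.

```python
# JSONiq expression being imitated (assuming $people is bound to the sequence below):
#   for $p in $people
#   let $full := $p.first || " " || $p.last
#   where $p.age ge 18
#   order by $p.age descending
#   return { "name": $full, "age": $p.age }

people = [
    {"first": "Ada", "last": "Lovelace", "age": 36},
    {"first": "Tim", "last": "Lee", "age": 12},
    {"first": "Alan", "last": "Turing", "age": 41},
]

def flwor(people):
    tuples = []
    for p in people:                                   # for: one tuple per item
        full = p["first"] + " " + p["last"]            # let: bind a derived value
        if p["age"] >= 18:                             # where: keep matching tuples
            tuples.append((p, full))
    tuples.sort(key=lambda t: t[0]["age"], reverse=True)   # order by ... descending
    # return: evaluated once per surviving tuple, results concatenated into a sequence
    return [{"name": full, "age": p["age"]} for p, full in tuples]

result = flwor(people)
```

The result is a list of two objects, Alan Turing (41) followed by Ada Lovelace (36), which plays the role of the sequence of items a FLWOR expression returns.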

13
Q

Can you name and describe the first-class citizen of the JSONiq Data Model: a sequence of items?

A

The first-class citizen of the JSONiq Data Model is a sequence of items, which is an ordered collection of zero or more items. Each item in the sequence can be an atomic value (e.g., number, string) or a structured value (e.g., object, array), and sequences allow flexible representation and manipulation of data.
