XQueries Flashcards

1
Q

XQuery

A

A basic definition: XQuery is like SQL for XML files, and it is an extension of XPath.

2
Q

FLWR

A

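The general form of an XQuery expression, named after its clauses (also written FLWOR when an order by clause is included).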

F -> for
L -> let
W -> where
R -> return

3
Q

FLWR Example

A

let $doc := doc("mydoc.xml")
for $s in $doc/university/student
where $s/module = "Computer Science"
return $s/name

4
Q

let $doc := doc("mydoc.xml")

A

The let clause, where we bind the variable $doc to the document we are going to query.

5
Q

for $s in $doc/university/student

A

The for clause, where we define the loop: $s is bound to each student element in turn.

6
Q

where $s/module = "Computer Science"

A

The where clause, specifying the condition an item must satisfy to be returned.

7
Q

return $s/name

A

The return clause, i.e. what is returned for each item that satisfies the where clause.

8
Q

Return clause pairs

A

return <pair>{$variableone},{$variabletwo}</pair>

9
Q

What is returned

A

If we were running this example:
let $doc := doc("mydoc.xml")
for $s in $doc/university/student
where $s/module = "Computer Science"
return $s/name
one possible result would be:

<name>Anna</name>
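
As a rough illustration of what the query computes, here is a Python sketch using the standard library's xml.etree.ElementTree. The sample document is invented for illustration (the student Ben and the module Mathematics are assumptions); only the element names come from the query.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names come from the query.
SAMPLE = """
<university>
  <student><name>Anna</name><module>Computer Science</module></student>
  <student><name>Ben</name><module>Mathematics</module></student>
</university>
"""

def computer_science_names(xml_text):
    root = ET.fromstring(xml_text)             # let: root is the <university> element
    results = []
    for s in root.findall("student"):          # for $s in $doc/university/student
        modules = [m.text for m in s.findall("module")]
        if "Computer Science" in modules:      # where $s/module = "Computer Science"
            for name in s.findall("name"):     # return $s/name
                results.append(ET.tostring(name, encoding="unicode").strip())
    return results

print(computer_science_names(SAMPLE))
```

The where test is written as a membership check because the XQuery = comparison succeeds if any module of the student matches.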

10
Q

Order by

A

let $doc := doc("mydoc.xml")
for $s in $doc/university/student
where $s/module = "Computer Science"
order by $s/name
return $s/name

11
Q

Group by

A

let $doc := doc("mydoc.xml")
for $m in $doc/university/student/module
group by $mod := $m
return <pair>{$mod},{count($m)}</pair>
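
To show what the grouping produces, here is a Python sketch that counts modules in the same way. The sample document is an assumption made up for illustration; collections.Counter stands in for the group by clause.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names come from the query.
SAMPLE = """
<university>
  <student><name>Anna</name><module>Computer Science</module></student>
  <student><name>Ben</name><module>Mathematics</module></student>
  <student><name>Cara</name><module>Computer Science</module></student>
</university>
"""

def module_counts(xml_text):
    root = ET.fromstring(xml_text)
    # for $m in .../student/module, grouped by the module's text value
    return Counter(m.text for m in root.findall("student/module"))

counts = module_counts(SAMPLE)
for mod, n in counts.items():
    # One <pair> per distinct module, with the number of occurrences.
    print(f"<pair>{mod},{n}</pair>")
```

The keys of the counter are also the distinct module values, which is what distinct-values would return for the same path.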

12
Q

Distinct values

A

distinct-values(let $doc := doc("mydoc.xml")
for $m in $doc/university/student/module
return $m)

13
Q

Three ways of storing XML in a relational (SQL) database

A
  • Store XML documents as entries of a table.
  • Store XML documents in schema-independent form.
  • Store XML documents in shredded form across a number of attributes and relations.
14
Q

Storing XML as an attribute

A

The raw XML is stored in serialised form as the value of an attribute, which makes it efficient to insert documents into the database and to retrieve them in their original form.

15
Q

Storing XML as a schema-independent representation

A

In simple terms, this means storing the document's tree structure (nodes and their parent nodes) in our database.

16
Q

Recursive nature of schema-independent representation

A

Can cause problems: if we have to retrieve something three layers deep, that takes three recursive steps, i.e. three queries.

17
Q

Denormalised index

A

Overcomes the recursion problem by storing combinations of path expressions together with a link to the node and its parent node.
Essentially, it's a table which gives us the index of what we want instead of having to traverse the tree itself.
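
A minimal sketch of the difference, using Python's built-in sqlite3 module. The table names (node, path_index), their columns and the sample rows are all assumptions chosen for illustration, not a standard layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Schema-independent form: one row per tree node, pointing at its parent.
CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, label TEXT, value TEXT);
INSERT INTO node VALUES
  (1, NULL, 'university', NULL),
  (2, 1,    'student',    NULL),
  (3, 2,    'name',       'Anna');
-- Denormalised index: one row per path, linking to the node and its parent.
CREATE TABLE path_index (path TEXT, node INTEGER, parent INTEGER);
INSERT INTO path_index VALUES
  ('/university', 1, NULL),
  ('/university/student', 2, 1),
  ('/university/student/name', 3, 2);
""")

# Without the index: one self-join per level of the tree.
by_joins = conn.execute("""
    SELECT n3.value
    FROM node n1
    JOIN node n2 ON n2.parent = n1.id
    JOIN node n3 ON n3.parent = n2.id
    WHERE n1.label = 'university' AND n2.label = 'student' AND n3.label = 'name'
""").fetchall()

# With the index: a single lookup by path, then fetch the node.
by_index = conn.execute("""
    SELECT n.value
    FROM path_index p JOIN node n ON n.id = p.node
    WHERE p.path = '/university/student/name'
""").fetchall()

print(by_joins, by_index)
```

Both queries return the same row; the second one does not grow with the depth of the path.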

18
Q

Storing XML in shredded form

A

In other words, putting information extracted from the XML into the database.
The XML is decomposed into its constituent elements, and the data is distributed over a number of attributes in one or more relations.
This makes it easier to index the values of some elements, and the hierarchy can also be preserved.
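
A small sketch of shredding, again with Python's sqlite3 and xml.etree.ElementTree. The relation student(name, module), the index name and the sample document are assumptions for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SAMPLE = (
    "<university>"
    "<student><name>Anna</name><module>Computer Science</module></student>"
    "</university>"
)

def shred(xml_text, conn):
    """Decompose each <student> element into one row of a relation."""
    conn.execute("CREATE TABLE student (name TEXT, module TEXT)")
    # Element values are now ordinary attributes, so they can be indexed.
    conn.execute("CREATE INDEX student_module ON student (module)")
    for s in ET.fromstring(xml_text).findall("student"):
        conn.execute(
            "INSERT INTO student VALUES (?, ?)",
            (s.findtext("name"), s.findtext("module")),
        )

conn = sqlite3.connect(":memory:")
shred(SAMPLE, conn)
rows = conn.execute(
    "SELECT name FROM student WHERE module = ?", ("Computer Science",)
).fetchall()
print(rows)
```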

19
Q

NoSQL

A

Stands for “not only SQL”; also read as “not relational”.

20
Q

NoSQL Advantages

A
  • Faster access to data.
  • Fault-tolerance.
  • Flexibility in data storage.
  • Full ACID compliance can sometimes be relaxed.
21
Q

Web Service database split

A

NoSQL Database:
Used for slightly less important things, like user profiles or shopping carts. This data doesn’t always need to be 100% correct; if something is not updated properly, it isn’t devastating. The way it is stored is slightly less strict, and the queries are very simple.

Relational Database:
Used for very important data like credit card transactions, which requires compliance and security.

22
Q

NoSQL database setup

A

Typically distributed, with nodes running on commodity hardware connected by a standard network.

23
Q

NoSQL database setup advantages

A
  • availability
  • consistency (not the same as the C in ACID)
  • scalability
  • high performance
  • partition tolerance
24
Q

Availability

A

Every non-failing node always executes queries.

25
Q

Consistency

A

Every read receives the most recent write, or an error.

26
Q

Scalability

A

More capacity by adding new nodes.

27
Q

High Performance

A

Often achieved by offering only a very simple interface.

28
Q

Partition Tolerance

A

Even if nodes or network links fail and the network splits, the remaining subnetworks can continue their work.

29
Q

CAP Theorem

A

States that a distributed system cannot guarantee consistency, availability and partition tolerance all at the same time; at most two of the three can be achieved.

30
Q

Consistency and Availability

A

Single node DBMS.

31
Q

Consistency and partition tolerance

A

NoSQL databases.

32
Q

Availability and partition tolerance

A

NoSQL databases.

33
Q

BASE

A

An alternative to ACID that compensates for the CAP theorem. Stands for Basically Available, Soft state, Eventually consistent.

34
Q

Basically Available

A

Rather than enforcing consistency, it will ensure availability of data instead.

35
Q

Soft state and Eventual consistency

A

The database state might occasionally be inconsistent but will eventually be made consistent.

36
Q

Common classification for NoSQL databases

A
  • key-value stores
  • document stores
  • column stores
  • graph databases
37
Q

Key-value stores

A

Given a key and value, we can insert data into a database.
Given a key, we can find a value in a database.

38
Q

Available systems for key-value stores

A
  • Apache Cassandra
  • Amazon DynamoDB
  • Project Voldemort
  • Memcached
  • Redis
  • Riak
39
Q

Distributed Storage

A

Each key-value pair (k, v) is stored at some node.

40
Q

Distributed Storage Steps

A
  • Use a hash function to map each key k, and with it the pair (k,v), to an integer between 0 and (2^n)-1. This range provides the positions at which pairs, nodes and copies of the pairs can be placed.
  • Distribute the nodes to some of the integers (typically at random).
  • If (k,v) is assigned to integer i, then store it at the node following i.
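
The steps above can be sketched in Python as follows. Several details are assumptions made for the sketch: n = 16, SHA-256 as the hash function, nodes placed by hashing their names (instead of at random, to keep the example deterministic), "following i" read as "the first node at or after i, wrapping around", and the class names Ring and KeyValueStore.

```python
import bisect
import hashlib

N_BITS = 16  # assumed n, giving positions 0 .. (2^16)-1

def ring_position(text):
    """Hash a string to an integer between 0 and (2^N_BITS)-1."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** N_BITS)

class Ring:
    def __init__(self, node_names):
        # Step 2: place every node at some integer (here: the hash of its name).
        self.positions = sorted((ring_position(n), n) for n in node_names)

    def node_for(self, key):
        # Step 1: map the key to an integer i.
        i = ring_position(key)
        # Step 3: pick the first node at or after i, wrapping past the end.
        idx = bisect.bisect_left(self.positions, (i, ""))
        return self.positions[idx % len(self.positions)][1]

class KeyValueStore:
    def __init__(self, node_names):
        self.ring = Ring(node_names)
        self.data = {name: {} for name in node_names}  # one dict per node

    def put(self, key, value):
        self.data[self.ring.node_for(key)][key] = value

    def get(self, key):
        return self.data[self.ring.node_for(key)].get(key)

store = KeyValueStore(["A", "B", "C"])
store.put("student:57904", "Anna")
print(store.get("student:57904"))  # prints Anna
```

Reads and writes both need only one hash computation to find the responsible node, which is where the high performance of key-value stores comes from.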
41
Q

Adding new nodes

A

Can be done easily thanks to horizontal fragmentation, i.e. the data is split horizontally and stored at different locations.
When a new node C is added, any key-value pairs that belong to C but are currently stored at A are transferred over to C.

42
Q

Replication

A

Ensures availability by storing copies of the key-value pairs on multiple nodes.
For example, if a key-value pair is stored at node A and a duplicate is required, the duplicate is stored at the next node, B.
If multiple duplicates are required, we store one at each consecutive node.
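
A minimal sketch of this placement rule in Python, assuming the nodes sit at fixed integer positions that wrap around. The positions 2, 7 and 12 and the function name replica_nodes are invented for illustration.

```python
import bisect

# Assumed layout: three nodes at hypothetical positions, sorted by position.
NODES = [(2, "A"), (7, "B"), (12, "C")]

def replica_nodes(position, copies):
    """Nodes holding a pair hashed to `position`: the first node at or
    after it, followed by the next consecutive nodes (wrapping around)."""
    start = bisect.bisect_left(NODES, (position, ""))
    count = min(copies, len(NODES))  # never more copies than nodes
    return [NODES[(start + j) % len(NODES)][1] for j in range(count)]

print(replica_nodes(1, 2))   # ['A', 'B']
print(replica_nodes(13, 2))  # wraps around: ['A', 'B']
print(replica_nodes(8, 3))   # ['C', 'A', 'B']
```

If node A fails, the copy at B can still answer queries for the pair, which is how replication provides availability and fault-tolerance.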

43
Q

Properties for key-value stores

A
  • Scalability -> nodes are simple to add via horizontal fragmentation.
  • Availability and fault-tolerance -> via replication.
  • High performance -> apply a hash function to determine a node, then ask the node. Same with writing.
44
Q

Eventual consistency for key-value stores

A

One of the problems with key-value stores is that, because of the CAP theorem, we cannot ensure consistency while also staying available and partition tolerant.
Therefore, we provide eventual consistency, which allows multiple versions of a data item to be present at the same time (versioning).
If the newer version is not available, the older one is used instead and is updated later.
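
One way to picture versioning is the Python sketch below. It is an illustration under assumptions, not how any particular system works: each replica keeps a version number next to the value, and the function synchronise stands in for whatever background process propagates updates.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def write(self, version, value):
        # Keep only the newest version seen so far.
        if version > self.version:
            self.version, self.value = version, value

def synchronise(replicas):
    """Propagate the newest version to every replica."""
    newest = max(replicas, key=lambda r: r.version)
    for r in replicas:
        r.write(newest.version, newest.value)

a, b = Replica(), Replica()
a.write(1, "Anna")
b.write(1, "Anna")
a.write(2, "Anna Smith")   # the update has reached replica a only
stale = b.value            # a read at b still sees the older version
synchronise([a, b])        # later, b is brought up to date
print(stale, "->", b.value)
```

Between the write and the synchronisation, two versions of the item exist at the same time; afterwards all replicas agree again.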

45
Q

Document Stores

A

A database which stores a collection of documents.
A document is essentially semi-structured data associated with an object id.
These documents are typically represented in JSON.

46
Q

JSON vs XML

A

To keep things simple, the difference is mainly one of syntax.

XML:

<students>
<student>
<name>Anna</name>
<number>57904</number>
</student>
</students>

JSON:
{"students": [
{"name": "Anna",
"number": 57904}
]}
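
To check that the two notations carry the same data, here is a Python sketch that parses both with the standard library. The helper names students_from_xml and students_from_json are made up for this example.

```python
import json
import xml.etree.ElementTree as ET

XML_TEXT = (
    "<students><student>"
    "<name>Anna</name><number>57904</number>"
    "</student></students>"
)
JSON_TEXT = '{"students": [{"name": "Anna", "number": 57904}]}'

def students_from_xml(text):
    # Element text is always a string in XML, so the number is converted explicitly.
    return [
        {"name": s.findtext("name"), "number": int(s.findtext("number"))}
        for s in ET.fromstring(text).findall("student")
    ]

def students_from_json(text):
    # JSON already distinguishes strings from numbers.
    return json.loads(text)["students"]

print(students_from_xml(XML_TEXT) == students_from_json(JSON_TEXT))  # True
```

One practical difference does show up here: JSON has a number type, while in XML every value is text until the application converts it.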

47
Q

MongoDB Document Store commands

A
  • creating/managing collections
  • insert/update/delete documents
  • finding documents
  • indexing documents
48
Q

Creating/managing collections in MongoDB

A

db.createCollection("students")

49
Q

Insert/update/delete documents in MongoDB

A

db.students.insert({name: "Anna"})

50
Q

Finding documents in MongoDB

A

db.students.find({name: "Anna"})

51
Q

Indexing documents in MongoDB

A

db.students.createIndex({name: 1})

52
Q

Techniques in MongoDB

A
  • horizontal fragmentation -> collections are split into horizontal fragments based on the shard key, which is an indexed field present in all documents.
  • replication -> horizontal fragments of collections are replicated.
53
Q

Column Stores

A

Store data in tables whose rows are identified by a row key and whose columns are grouped into column families. Examples include Google Bigtable and Apache HBase.

54
Q

HBase commands

A
  • creating tables
  • inserting rows
  • finding rows
55
Q

Creating tables in HBase

A

create 'STUDENT', 'Name', 'ID'

56
Q

Inserting rows in HBase

A

put 'STUDENT', 'row1', 'Name:Fname', 'Anna'

57
Q

Finding rows in HBase

A

get 'STUDENT', 'row1'
scan 'STUDENT'

58
Q

Techniques in HBase

A

Fragmentation happens at two levels:
- Top level -> rows are divided into regions.
- Bottom level -> regions store different column families on different nodes.

59
Q

Timestamps in HBase

A

Each item has a timestamp, and one can access past versions of the data if the table is set up to keep them.

60
Q

Graph Databases

A

Simply put, stores data as a graph (nodes and edges).
Data is accessed using an SQL-like path query language.
Also supports indexes.