Chapter 8 - XQUERY Flashcards
1
Q
Why was it necessary to introduce a new query language for XML?
A
- XML data is highly variable and complex: hierarchical with arbitrary nesting
- XML data is semi-structured (schema is optional) and self-describing (markup)
- Queries access meta-data or just values, and require severe structural transformations
- Order is relevant, and data may be sparse
2
Q
Overview of the XQuery data model (XDM)
A
- Tree model of XML data: document is a tree, nodes are elements, attributes, text
- XDM is an extension: supports sequences of items (collection of nodes/atomic values)
- > intermediate and final results of query evaluation (closure property)
- > unnested, heterogeneous, typed items (XML schema)
- Nodes: have identity, 7 types, document order
3
Q
Path Expressions, their components and evaluation mechanism
A
- maps a context node to a sequence of nodes (an initial / implies the document root)
- consist of a sequence of steps, each one containing:
- > AXIS: direction of navigation (target nodes) ->result in document order
- > Node test: type/name of qualifying nodes
- name test: element, attribute name, wildcard, namespace (ns:elem)
- type test: only nodes of specific type -> element( ), attribute( ), text( ), node( )
- element( ) and attribute( ) tests also accept (name) and (name, type)
- Predicate (optional): filters the set of qualifying nodes (syntax: [ ])
- > boolean expression: selects nodes for which it evaluates to true
- > numeric expression: selects the node in a specific position
- > existence test: nodes for which expression does not evaluate to empty seq
- does not test value! person[@married] returns persons with attr “married”, regardless of its value
- Evaluation: step by step, from left to right, starting from external ctx or doc root
- > sequence of each step becomes context for next step
- iterate over input: sets each node as context and evaluates step
- > sort by document order, eliminating duplicates (distinct-document-order)
- empty sequence as result allowed
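A minimal sketch of step-wise evaluation and the three predicate kinds, emulated in Python with the standard library's xml.etree.ElementTree. ElementTree supports only a small subset of XPath 1.0, so this is an approximation of the behaviour, not XQuery itself. The sample document and all variable names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this illustration.
doc = ET.fromstring(
    "<people>"
    "<person married='no'><name>Ann</name></person>"
    "<person><name>Bob</name></person>"
    "<person married='yes'><name>Cy</name></person>"
    "</people>"
)

# Step by step, left to right: the result of one step is the context of the next.
persons = doc.findall("person")                                # step 1: child::person
names = [n.text for p in persons for n in p.findall("name")]   # step 2: child::name

# Existence test: [@married] keeps persons that HAVE the attribute,
# regardless of its value (so married='no' qualifies too).
with_attr = [p.find("name").text for p in doc.findall("person[@married]")]

# Numeric predicate: [2] selects the node at position 2 (positions start at 1).
second = doc.find("person[2]/name").text

# A path that matches nothing yields an empty result, not an error.
nothing = doc.findall("person/address")

print(names, with_attr, second, nothing)
```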
4
Q
Overview: other functions and expressions
A
- context functions: fn:last( ), fn:position( ), fn:current-date( )
- dereference: fn:id( ) = IDREF -> NODE
- comparison with =, !=, >: existential semantics; >>, <<: document order
- boolean expressions: early out semantics, effective boolean value = false for ( ), 0, ‘’
- constructors for elements, attributes…
- > validation: in-scope schema references (type annotation)
- > strict (element must be defined), lax (type must match), skip (ignore)
- > context: schema path in which current node is validated
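The existential semantics of general comparisons and the effective boolean value can be sketched in Python as below. This is a simplified emulation that assumes sequences are plain Python lists of atomic values (nodes are not modelled); the function names general_eq, general_ne and ebv are hypothetical.

```python
def general_eq(left, right):
    """Emulate '=' on two sequences: true if ANY pair of items is equal."""
    return any(a == b for a in left for b in right)


def general_ne(left, right):
    """Emulate '!=' on two sequences: true if ANY pair of items differs."""
    return any(a != b for a in left for b in right)


def ebv(seq):
    """Simplified effective boolean value for sequences of atomic values."""
    if len(seq) == 0:
        return False                      # ( ) is false
    if len(seq) == 1:
        item = seq[0]
        if isinstance(item, float) and item != item:
            return False                  # NaN is false as well
        return bool(item)                 # 0 and '' are false
    # Several atomic items have no effective boolean value.
    raise TypeError("no effective boolean value for this sequence")


# Both comparisons can hold at once, because each only needs one witness pair.
print(general_eq([1, 2], [2, 3]), general_ne([1, 2], [2, 3]))
# With an empty operand there is no pair at all, so both are false.
print(general_eq([], [1]), general_ne([], [1]))
```

Note that = and != are therefore not each other's negation on sequences.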
5
Q
What is the structure of FLWOR expressions and how are they evaluated?
A
- FOR and LET bind expressions to variables, creating a tuple stream
- > FOR iterates over the items of the sequence (one tuple per item); LET binds the whole sequence at once
- WHERE filters the stream by evaluating an expression on the bound vars
- ORDER BY sorts the filtered stream
- RETURN applies an expression to construct the desired output
- > each tuple in the stream becomes an item in the result sequence
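The clause-by-clause evaluation above can be emulated as a pipeline over a tuple stream. The sketch below is plain Python, not XQuery; the book data and the names tuples, filtered, ordered and result are assumptions made up for the example.

```python
# Hypothetical input sequence.
books = [
    {"title": "XQuery Basics", "price": 30, "authors": ["Lee"]},
    {"title": "Advanced XML", "price": 80, "authors": ["Kim", "Roy"]},
    {"title": "Data on the Web", "price": 45, "authors": ["Abi", "Bun", "Suc"]},
]

# FOR binds b to each item in turn; LET binds n to a whole value per tuple.
tuples = ({"b": b, "n": len(b["authors"])} for b in books)

# WHERE keeps only the tuples for which the condition holds.
filtered = (t for t in tuples if t["b"]["price"] < 50)

# ORDER BY sorts the remaining tuples.
ordered = sorted(filtered, key=lambda t: t["b"]["title"])

# RETURN is evaluated once per tuple; each evaluation contributes to the result.
result = [f'{t["b"]["title"]} ({t["n"]})' for t in ordered]

print(result)
```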
6
Q
What is the general FLWOR pattern to invert a hierarchy of nodes?
A
- bind the FOR variable to the distinct values of the deepest level
- return nested element constructors with variable at the top
- > use parent/ancestor axes or subquery to insert nodes of upper levels
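A rough Python emulation of the inversion pattern, turning a hypothetical book/author hierarchy into author/title. It uses xml.etree.ElementTree instead of XQuery element constructors, and the sample data is invented. The order of first appearance used here is a choice of this sketch; fn:distinct-values itself does not guarantee an order.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: authors are the deepest level, books the upper level.
bib = ET.fromstring(
    "<bib>"
    "<book><title>T1</title><author>Ann</author><author>Bob</author></book>"
    "<book><title>T2</title><author>Ann</author></book>"
    "</bib>"
)

# Bind the loop variable to the distinct values of the deepest level.
authors = list(dict.fromkeys(a.text for a in bib.iter("author")))

# Construct the result with the former leaf level on top.
inverted = ET.Element("authors")
for name in authors:
    author_el = ET.SubElement(inverted, "author", {"name": name})
    # Subquery: fetch the upper-level nodes that belong to this value.
    for book in bib.findall("book"):
        if any(x.text == name for x in book.findall("author")):
            ET.SubElement(author_el, "title").text = book.findtext("title")

print(ET.tostring(inverted, encoding="unicode"))
```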
7
Q
Overview: additional XQuery operations
A
- order insignificance fn:unordered( ): provides optimization opportunity
- aggregate functions over sequences (bound by LET)
- JOIN: multiple for’s = cartesian product, where = predicate
- outer join: iterate over the outer side only and nest the query on the other side in the return clause, so unmatched items still appear (with an empty result)
- > full outer join requires separate queries and further composition
- quantified expressions: existential (some…in…satisfies…), universal (every… in… satisfies…)
- defining functions: declare function ns:f($x as xs:integer) {expr}
- > no overloading on argument types; arguments are cast to the declared parameter types (argument conditioning)
- overloading can be simulated with typeswitch in the function body
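The join and quantifier bullets of this card can be illustrated with Python comprehensions. This is an analogy under the assumption that sequences are lists of dicts; the customers/orders data is invented for the example.

```python
# Hypothetical input sequences.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"cust": 1, "item": "pen"}, {"cust": 1, "item": "ink"}]

# Inner join: two for's form the cartesian product, where is the join predicate.
inner = [
    (c["name"], o["item"])
    for c in customers
    for o in orders
    if c["id"] == o["cust"]
]

# Left outer join: iterate over customers only and nest the order lookup in
# the returned value, so a customer without orders still shows up.
outer = [
    (c["name"], [o["item"] for o in orders if o["cust"] == c["id"]])
    for c in customers
]

# Quantified expressions: some ~ any(), every ~ all().
some_pen = any(o["item"] == "pen" for o in orders)
every_pen = all(o["item"] == "pen" for o in orders)

# Over an empty sequence, 'every' is true and 'some' is false.
every_on_empty = all(x > 0 for x in [])
some_on_empty = any(x > 0 for x in [])

print(inner, outer, some_pen, every_pen, every_on_empty, some_on_empty)
```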
8
Q
What are the steps of XQuery processing?
A
- static analysis (optional): depends only on the query itself
- > static typing (catch errors early, guarantees result type, helps optimizing)
- dynamic evaluation: depends on input, computes result value (sequence)
- queries rejected by static typing might still execute correctly at run time (e.g. node atomization)
9
Q
What is the general purpose and the different operators of XQuery Update?
A
- Potentially modifies the state of existing nodes (side effects)
- full generality: can occur whenever an expression is expected
- snapshot semantics: collect a pending updates list (primitives: target node + operation)
- > applied only after the outermost expression is evaluated
- operations:
-> insert: insert copies of one or more nodes into designated position (target node)
insert node(s) SourceExpr ([as (first|last)] into | after | before) TargetExpr
-> delete: delete zero or more nodes
-> replace: replace node with a new seq. of nodes, or value of target node
replace [value of] node TargetExpr with Expr
-> rename: rename node TargetExpr as NewNameExpr
-> transform: creates modified copy of existing nodes (-> not an updating expr!!), no side effects
copy $var := Expr (, $var := Expr)* modify UpdatingExpr return Expr
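Snapshot semantics can be sketched as a two-phase process: evaluation only records primitives, and nothing changes until the list is applied. The Python below emulates this with xml.etree.ElementTree; the tuple layout of a primitive and the sample document are assumptions of this sketch, not the actual XQuery Update data structures.

```python
import xml.etree.ElementTree as ET

# Hypothetical document to update.
doc = ET.fromstring("<items><item>a</item><item>b</item><item>c</item></items>")

# Phase 1: evaluate against the unchanged snapshot and only RECORD primitives
# (operation + target node) in the pending update list.
pending = []
for item in doc.findall("item"):
    if item.text in ("a", "b"):
        pending.append(("delete", item))
    else:
        pending.append(("rename", item, "kept"))

# The document is still untouched while the expression is being evaluated.
count_during_evaluation = len(doc.findall("item"))

# Phase 2: after the outermost expression is done, apply the pending list.
for prim in pending:
    if prim[0] == "delete":
        doc.remove(prim[1])       # works here because targets are direct children
    elif prim[0] == "rename":
        prim[1].tag = prim[2]

after = ET.tostring(doc, encoding="unicode")
print(count_during_evaluation, after)
```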
10
Q
How does XQuery evaluate multiple updating expressions?
A
- conflicting primitives in pending list: forbids the same node as target of multiple renames/replaces
(replace of upper nodes wins over descendants)
- performs update primitives in a defined order:
- unordered inserts, replace values, rename, delete (just mark)
- ordered inserts
- replace node
- replace element content
- actual delete
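A minimal sketch of the conflict check and the five-phase application order listed above. Primitives are modelled as (operation, target id) pairs with string ids; this representation, the operation labels and the function names are hypothetical. Treating "replace value" and "replace element content" as one conflict class is a simplifying assumption, and the rule that replacing an upper node wins over its descendants is not modelled.

```python
from collections import Counter

# Phase numbers follow the order given on this card.
PHASE = {
    "insert into": 1, "replace value": 1, "rename": 1,
    "insert before": 2, "insert after": 2,
    "insert as first": 2, "insert as last": 2,
    "replace node": 3,
    "replace element content": 4,
    "delete": 5,
}

# Operations that may hit the same target at most once per snapshot.
CONFLICT_CLASS = {
    "rename": "rename",
    "replace node": "replace node",
    "replace value": "replace value/content",
    "replace element content": "replace value/content",
}


def find_conflicts(pending):
    """Return (conflict class, target) pairs that occur more than once."""
    counts = Counter(
        (CONFLICT_CLASS[op], target)
        for op, target in pending
        if op in CONFLICT_CLASS
    )
    return sorted(key for key, n in counts.items() if n > 1)


def application_order(pending):
    """Sort primitives by phase; sorted() is stable within a phase."""
    return sorted(pending, key=lambda p: PHASE[p[0]])


pending = [
    ("delete", "n1"), ("replace node", "n2"), ("insert after", "n3"),
    ("rename", "n1"), ("insert into", "n4"),
]
print(find_conflicts(pending))
print(application_order(pending))
```

The delete of n1 ends up last even though it was issued first, which is why a node can be renamed and deleted in the same snapshot without an ordering problem.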