XML Flashcards
Explain SSD.
Semi-structured data (SSD)
• More flexible data model than the
relational model.
– Think of an object structure, but with the type
of each object its own business.
– Labels to indicate meanings of substructures.
• Semi-structured: it is structured, but not
everything is structured the same way!
Explain SSD Graphs.
SSD Graphs
• Nodes = “objects”, “entities”
• Edges with labels represent attributes or
relationships.
• Leaf nodes hold atomic values.
• Flexibility: no restriction on
– Number of edges out from a node.
– Number of edges with the same label
– Label names
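A minimal sketch of such a graph in Python. The `Node` class, the labels and the course codes are hypothetical and serve only as illustration. Each node keeps a list of (label, target) edges, and a leaf carries an atomic value:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

Atomic = Union[str, int, float]


@dataclass
class Node:
    """Hypothetical SSD node: a leaf with a value, or an inner node with edges."""
    value: Optional[Atomic] = None
    edges: List[Tuple[str, "Node"]] = field(default_factory=list)

    def add(self, label: str, target: "Node") -> "Node":
        # No restriction on label names or on how often a label repeats.
        self.edges.append((label, target))
        return self

    def children(self, label: str) -> List["Node"]:
        # Follow every outgoing edge that carries the given label.
        return [target for (edge_label, target) in self.edges if edge_label == label]


# Two "course" nodes with different shapes: same label, different structure.
db = Node()
db.add("course", Node()
       .add("code", Node("A1"))
       .add("givenIn", Node(2))
       .add("givenIn", Node(4)))
db.add("course", Node()
       .add("code", Node("B2"))
       .add("teacher", Node("unknown")))

first, second = db.children("course")
print([n.value for n in first.children("givenIn")])
print(len(second.children("givenIn")))
```

The first course has two `givenIn` edges and the second has none, which a fixed relational row could not express without extra tables or NULLs.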
Structure of SSD graphs.
Explain Schemas for SSD.
Schemas for SSD
• Inherently, semi-structured data does not
have schemas.
– The type of an object is its own business.
– The schema is given by the data.
• We can of course restrict graphs in any
way we like, to form a kind of “schema”.
– Example: All “course” nodes must have a
“code” attribute.
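A sketch of how the example restriction could be checked by hand, assuming the data is available as XML and using Python's standard `xml.etree.ElementTree`. The document content and the helper name `courses_missing_code` are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Illustrative data: the second course lacks the required "code" attribute.
doc = ET.fromstring(
    '<courses>'
    '<course code="A1"><givenin>2</givenin></course>'
    '<course><givenin>3</givenin></course>'
    '</courses>'
)


def courses_missing_code(root):
    """Return every "course" element that has no "code" attribute."""
    return [c for c in root.iter("course") if "code" not in c.attrib]


violations = courses_missing_code(doc)
print(len(violations))
```

The data itself does not forbid the second course. The restriction only exists because we chose to check for it.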
SSD examples.
SSD Examples
• XML
– 90’s
– Case Sensitive
– <open_tag>...</open_tag> or <tag/>
– <!-- comments -->
• JSON
– 2000’s
– Collection of key/value pairs (hash table, associative
array)
– Begins with { and ends with }
– Each key is followed by : (colon) and the key/value
pairs are separated by , (comma)
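To compare the two notations, the sketch below reads the same made-up course record once as XML and once as JSON, using only Python's standard library. The record and its field names are assumptions for illustration:

```python
import json
import xml.etree.ElementTree as ET

xml_text = '<course code="A1"><!-- a comment --><givenin>2</givenin></course>'
json_text = '{"code": "A1", "givenin": [2]}'

course_xml = ET.fromstring(xml_text)   # the default parser drops comments
course_json = json.loads(json_text)    # a JSON object becomes a Python dict

xml_code = course_xml.get("code")
json_code = course_json["code"]

# XML is case sensitive: "givenin" and "GivenIn" are different tags.
found_lower = course_xml.find("givenin") is not None
found_mixed = course_xml.find("GivenIn") is not None

print(xml_code, json_code, found_lower, found_mixed)
```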
Describe XML
XML
• XML = eXtensible Markup Language
• Derives from document markup
languages.
– Compare with HTML: HTML uses “tags” for
formatting a document, XML uses “tags” to
describe semantics.
• Key idea: create tag sets for a domain,
and translate data into properly tagged
XML documents.
Explain XML structure.
Explain code structure of XML.
XML explained
• An XML element is denoted by surrounding tags:
<course>...</course>
• Child elements are written as elements between the tags
of its parent, as is simple string content:
<course><givenin>2</givenin></course>
• Attributes are given as name-value pairs inside the
starting tag:
<course code="…">…</course>
• Elements with no children can be written using a shorthand:
<course/>
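The pieces named above can be inspected with Python's standard `xml.etree.ElementTree`. This is a small sketch on a made-up document. The `code` attribute values are illustrative:

```python
import xml.etree.ElementTree as ET

text = (
    '<courses>'
    '<course code="A1"><givenin>2</givenin></course>'
    '<course code="B2"/>'
    '</courses>'
)
root = ET.fromstring(text)
first, second = list(root)

print(first.tag)                    # element name, taken from the surrounding tags
print(first.attrib)                 # name-value pairs from the starting tag
print(first.find("givenin").text)   # string content of a child element
print(len(second))                  # number of children of the shorthand element
```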
Where are starting tags, attributes, child elements and string content placed? Is XML case sensitive?
Explain XML namespaces.
Well-formed and valid XML?
Explain DTD.
DTDs
• DTD = Document Type Definition
• A DTD is a schema that specifies what
elements may occur in a document, where
they may occur, what attributes they may
have, etc.
• Essentially a context-free grammar for
describing XML tags and their nesting
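A small sketch of what such a grammar looks like, embedded as an internal DTD in a made-up document. Python's standard library does not validate against a DTD, so the code below only uses the `xml.parsers.expat` declaration handlers to list what the DTD declares. The element and attribute names are assumptions:

```python
import xml.parsers.expat

# Illustrative document with an internal DTD subset.
doc = """<?xml version="1.0"?>
<!DOCTYPE Courses [
  <!ELEMENT Courses (Course*)>
  <!ELEMENT Course (GivenIn+)>
  <!ELEMENT GivenIn (#PCDATA)>
  <!ATTLIST Course code ID #REQUIRED>
]>
<Courses>
  <Course code="A1"><GivenIn>2</GivenIn></Course>
</Courses>
"""

declared_elements = []
declared_attributes = {}


def on_element_decl(name, model):
    # Called once per <!ELEMENT ...> declaration.
    declared_elements.append(name)


def on_attlist_decl(elname, attname, type_, default, required):
    # Called once per attribute declared in an <!ATTLIST ...>.
    declared_attributes[(elname, attname)] = type_


parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(doc, True)

print(declared_elements)
print(declared_attributes)
```

Each `<!ELEMENT ...>` line reads like a grammar rule: `Courses` contains any number of `Course` elements, and each `Course` contains at least one `GivenIn`.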
Explain ID & IDREF.
What’s wrong with DTDs?
What’s wrong with DTDs?
• Only one base type – CDATA.
• No way to specify constraints on data other than
keys and references.
• No way to specify what elements references may
point to – if something is a reference then it may
point to any key anywhere.
• A DTD is not itself written in XML!
Explain XML schema.
XML Schema
• Basic idea: why not use XML to define schemas
of XML documents?
• XML Schema instances are XML documents
specifying schemas of other XML documents.
• XML Schema is much more flexible than DTDs,
and solves all the problems listed and more!
• DTDs are still the standard – but XML Schema is
the recommendation (by the W3C)!
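A sketch illustrating that a schema is itself an XML document: the made-up schema below is parsed with the ordinary XML parser from Python's standard library. The standard library cannot validate against XML Schema, so this only reads the schema. The element and attribute names are assumptions:

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Illustrative schema: a Course has one or more integer GivenIn children
# and a required "code" attribute.
schema_text = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="Course">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="GivenIn" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="code" type="xs:ID" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(schema_text)
declared = [e.get("name") for e in schema.iter(f"{{{XS}}}element")]

print(schema.tag)
print(declared)
```

Note the `xs:integer` type on `GivenIn`, a kind of typing that DTDs cannot express.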
Explain XPath.
Explain Axes.
Axes
• The various directions we can follow in a
graph are called axes (sing. axis).
• General syntax for following an axis is axis::label
– Example: /Courses/child::Course
• Only giving a label is shorthand for
child::label, while @ is short for
attribute::
Some other useful axes are:
– parent:: = parent of the current node.
• Shorthand is ..
– descendant-or-self:: = the current node(s) and all
descendants (i.e. children, their children, …) down
through the tree.
• Shorthand is //
– ancestor::, ancestor-or-self:: = up through the tree
– following-sibling:: = any elements with the same parent
that come after this one.
– …
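A sketch of the abbreviated forms in action, using Python's standard `xml.etree.ElementTree`. That module supports only a subset of XPath: the shorthands are available, but the long `axis::` spelling is not. The document is made up, and since its root is `Courses`, the path `./Course` here corresponds to `/Courses/child::Course`:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<Courses>'
    '<Course code="A1"><GivenIn>2</GivenIn><GivenIn>4</GivenIn></Course>'
    '<Course code="B2"><GivenIn>3</GivenIn></Course>'
    '</Courses>'
)

# child axis (abbreviated): a bare label means child::label
child_codes = [c.get("code") for c in root.findall("./Course")]

# descendant-or-self (abbreviated //): every GivenIn anywhere below the root
periods = [g.text for g in root.findall(".//GivenIn")]

# parent axis (abbreviated ..): step from each GivenIn up to its Course
parents = [c.get("code") for c in root.findall(".//GivenIn/..")]

# attribute test: @ stands for attribute::
b2 = root.findall("./Course[@code='B2']")

print(child_codes, periods, parents, len(b2))
```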
Write an XPath expression that gives the
courses that are given in period 2, but with
only the GivenIn element for period 2 as a
child!
It can’t be done!
XPath is not a full query language, it only allows us
to specify paths to elements or groups of elements.
We can restrict in the path using [] notation, but we
cannot restrict further down in the tree than what
the path points to.
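A sketch of the limitation on a made-up document, using the XPath subset in Python's `xml.etree.ElementTree`. The path can pick the right courses, but trimming their children requires building new elements, which is done here in plain Python because it lies outside what a path can express:

```python
import copy
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<Courses>'
    '<Course code="A1"><GivenIn>2</GivenIn><GivenIn>4</GivenIn></Course>'
    '<Course code="B2"><GivenIn>3</GivenIn></Course>'
    '</Courses>'
)

# The [] restriction decides WHICH courses are selected ...
selected = root.findall("./Course[GivenIn='2']")
# ... but each selected course still carries all of its GivenIn children.
all_children = [g.text for g in selected[0].findall("GivenIn")]

# Producing a trimmed copy needs element construction, not just a path.
trimmed = []
for course in selected:
    clone = copy.deepcopy(course)
    for given in clone.findall("GivenIn"):
        if given.text != "2":
            clone.remove(given)
    trimmed.append(clone)
trimmed_children = [g.text for g in trimmed[0].findall("GivenIn")]

print(all_children, trimmed_children)
```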
Explain XQuery.
XQuery
• XQuery is a full-fledged querying language
for XML documents.
– Cf. SQL queries for relational data.
• XQuery is built on top of XPath, and uses
XPath to point out element sets.
• XQuery is a W3C recommendation.
Explain FLWOR
FLWOR
• Basic structure of an XQuery expression is:
– FOR-LET-WHERE-ORDER BY-RETURN.
– Called FLWOR expressions (pronounce as flower).
• A FLWOR expression has one or more
FOR (iterate) and LET (assign) clauses, possibly
mixed, followed by possibly a WHERE clause
and possibly an ORDER BY clause.
• The only other required part is RETURN.
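To show the role of each clause, the sketch below mirrors a FLWOR expression step by step in Python. It is an analogy, not XQuery itself. The XQuery text in the comment is an illustrative counterpart that is not executed, and the document and the `Result` element are made up:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<Courses>'
    '<Course code="A1"><GivenIn>2</GivenIn><GivenIn>4</GivenIn></Course>'
    '<Course code="B2"><GivenIn>3</GivenIn></Course>'
    '<Course code="C3"/>'
    '</Courses>'
)

# Rough XQuery counterpart (illustrative, not run here):
#   for $c in /Courses/Course
#   let $n := count($c/GivenIn)
#   where $n >= 1
#   order by $c/@code descending
#   return <Result code="{$c/@code}" periods="{$n}"/>
rows = []
for c in root.findall("./Course"):        # FOR: iterate over a sequence
    n = len(c.findall("GivenIn"))         # LET: bind a name to a value
    if n >= 1:                            # WHERE: keep only matching bindings
        rows.append((c.get("code"), n))
rows.sort(key=lambda row: row[0], reverse=True)   # ORDER BY ... descending

# RETURN: construct one new element per remaining binding
output = [ET.Element("Result", code=code, periods=str(n)) for code, n in rows]

print([(e.get("code"), e.get("periods")) for e in output])
```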
Comparing items in XQuery
• The comparison operators eq, ne, lt, gt, le and
ge can be used to compare single items.
• If either operand is a sequence of more than one item,
the comparison raises an error.
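A rough Python model of this behaviour, with lists standing in for XQuery sequences. The function `value_eq` is hypothetical, and modelling the empty sequence as `None` is a simplification:

```python
def value_eq(left, right):
    """Rough model of XQuery's `eq`: each operand must be a single item."""
    operands = []
    for operand in (left, right):
        # Treat a bare value as a one-item sequence.
        seq = operand if isinstance(operand, list) else [operand]
        if len(seq) > 1:
            raise TypeError("eq expects a single item, got a longer sequence")
        operands.append(seq)
    if not operands[0] or not operands[1]:
        return None  # an empty operand yields the empty sequence (modelled as None)
    return operands[0][0] == operands[1][0]


print(value_eq(2, 2))
print(value_eq([2], 3))
try:
    value_eq([2, 4], 2)
except TypeError as exc:
    print("error:", exc)
```

In practice this matters when a path such as a course's GivenIn children yields several items: the single-item operators cannot be applied to it directly.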
Summary XML
Summary XML
• XML is used to describe data organized as documents.
– Semi-structured data model.
– Elements, tags, attributes, children.
– Namespaces.
• XML can be valid with respect to a schema.
– DTD: ELEMENT, ATTLIST, CDATA, ID, IDREF
– XML Schema: Use XML for the schema domain to describe your
schema.
• XML can be queried for information:
– XPath: Paths, axes, selection
– XQuery: FLWOR.
Cartesian product in XQuery.
Explain Aggregations in XQuery.
Joins in XQuery
Sorting in XQuery
Quantification in XQuery
Updating XML
XQuery Update