3/6 - XML Flashcards
What does XML stand for?
eXtensible Markup Language
What does XML do?
- Used to add lightweight semantic info (what the text means) to plain text
- It can describe complex structured data
- It is not a programming language, it’s just data
What does DTD stand for?
Document Type Definition
What is DTD?
- Just a general structure for a family of XML documents
- We can have one DTD (which specifies the grammar/syntax) and many XML documents that obey that DTD
- This is similar to having one HTML syntax and many HTML pages that obey that syntax
XPath
- Compact syntax for grabbing fragments of XML data
- Vaguely similar to regular expressions
- It is a way of picking out a piece of an XML document
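A minimal sketch of picking out pieces of a document with path expressions. It uses Python's standard xml.etree.ElementTree, which supports only a limited subset of XPath; the sample document and its element and attribute names (garden, herb, kind, tool) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; all names here are invented for the example.
doc = """
<garden>
  <herb kind="annual">basil</herb>
  <herb kind="perennial">mint</herb>
  <tool>trowel</tool>
</garden>
"""
root = ET.fromstring(doc)

# ".//herb" selects every <herb> element anywhere below the root.
all_herbs = [e.text for e in root.findall(".//herb")]

# A predicate in square brackets filters on an attribute value.
perennials = [e.text for e in root.findall(".//herb[@kind='perennial']")]

print(all_herbs)
print(perennials)
```

Full XPath has many more features (axes, functions, and so on) than this subset shows.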
What does XSLT stand for?
eXtensible Stylesheet Language Transformations
What is XSLT?
- It is similar to a programming language, since it has conditionals, loops, etc.
- It is a language for transforming XML
- Uses XPath
What are Markup Languages?
- Markup languages give structure and meaning to plain text
- The idea behind markup languages is to add some semantic info (meaning or structure) to plain text
- They are like a lightweight overlay on top of your plain text
Creating Your Own Markup Language
- HTML is one markup language. It is used for describing web pages
- If web pages are not the things you are interested in, then you can make your own markup vocabulary.
- XML provides a generic syntax for defining your own custom markup language
XML start and end tags
Tags are written in angle brackets: a start tag <name> and a matching end tag </name> around the content (e.g. the text basil)
XML how to delimit elements
- Tags delimit elements.
* The whole thing from the start tag (<name>) to the end tag (</name>) defines an element
Where are XML attributes placed, and are they required?
Attributes are placed in the start tag and are optional. An attribute name may not appear more than once in the same start tag.
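A small sketch of these attribute rules, using Python's standard xml.etree.ElementTree parser. The element name herb and attribute name kind are invented for the example.

```python
import xml.etree.ElementTree as ET

# Attributes live inside the start tag, and they are optional.
with_attr = ET.fromstring('<herb kind="annual">basil</herb>')
without_attr = ET.fromstring('<herb>basil</herb>')

# Repeating an attribute name in one start tag is a syntax error,
# so the parser refuses the document.
try:
    ET.fromstring('<herb kind="annual" kind="perennial">basil</herb>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True

print(with_attr.attrib, without_attr.attrib, duplicate_rejected)
```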
XML shorter syntax for empty elements
A single tag with a slash before the closing bracket, e.g. <name/> instead of <name></name>
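As a quick check that the short and long forms of an empty element mean the same thing, here is a sketch with Python's standard xml.etree.ElementTree. The element name pot and attribute size are hypothetical.

```python
import xml.etree.ElementTree as ET

# Short form: one self-closing tag.
short_form = ET.fromstring('<pot size="small"/>')
# Long form: a start tag immediately followed by its end tag.
long_form = ET.fromstring('<pot size="small"></pot>')

# Both parse to an element with the same name, the same attributes,
# and no children.
same_element = (
    short_form.tag == long_form.tag
    and short_form.attrib == long_form.attrib
    and len(short_form) == len(long_form) == 0
)
print(same_element)
```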
XML element and attribute name
We can make up any element and attribute names, as long as they follow XML naming rules (e.g. no spaces, and a name cannot start with a digit)
How to use XML special characters?
Special characters cannot be used in XML unless you escape them
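• “<” has to be written as &lt;
• “>” has to be written as &gt;
• “&” has to be written as &amp;
• Characters not in ASCII can be written as numeric character references based on their Unicode code point (e.g. &#233; for é), or typed directly if the document's encoding supports them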
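The escaping rules can be tried out with Python's standard library. This is a sketch: the sample text and the element names note and word are made up, and escape() from xml.sax.saxutils is just one convenient way to do the replacement.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = "salt & pepper < 5 > 3"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
print(escaped)

# Once escaped, the text can sit safely inside an element,
# and the parser turns the entities back into the original characters.
parsed_text = ET.fromstring(f"<note>{escaped}</note>").text

# Without escaping, the bare "&" makes the document invalid.
try:
    ET.fromstring(f"<note>{raw}</note>")
    unescaped_accepted = True
except ET.ParseError:
    unescaped_accepted = False

# A numeric character reference names a character by its Unicode code point.
accented = ET.fromstring("<word>caf&#233;</word>").text
print(parsed_text, unescaped_accepted, accented)
```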
Nesting XML elements
Elements must be strictly nested and explicitly closed
Think of elements as nested matching parentheses.
root element
the top level element (there is exactly ONE)
XML docs as trees
Every well-formed XML document is a tree
Strict nesting determines the parent-child relationships
What are the nodes in the tree?
Elements are nodes
• May have zero or more ordered children
What are the leaf nodes?
Runs of original text become leaf nodes (they cannot have any children)
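To see the tree behind a document, here is a sketch that walks a small, made-up recipe document with Python's standard xml.etree.ElementTree and prints an indented outline. Note one library-specific detail: ElementTree stores a run of text on its parent element (as .text) rather than as a separate node object, so the helper below (a hypothetical name, outline) prints it as a leaf itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are invented for the example.
doc = (
    "<recipe>"
    "<title>Pesto</title>"
    "<ingredients><item>basil</item><item>pine nuts</item></ingredients>"
    "</recipe>"
)
root = ET.fromstring(doc)

def outline(element, depth=0):
    """Return indented lines: element names, with text runs shown as leaves."""
    lines = ["  " * depth + element.tag]
    if element.text and element.text.strip():
        lines.append("  " * (depth + 1) + repr(element.text.strip()))
    for child in element:  # children come back in document order
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))
```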
What purpose do attributes serve, in the context of a tree?
Attributes are extra info on elements
• Collection of key value pairs like {name: value}
Are attributes ordered?
Attributes are unordered, unlike child nodes
Extra parsing of attribute values
No extra parsing or interpretation of attribute values
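A short sketch of how attributes look from a program's point of view, using Python's standard xml.etree.ElementTree. The element name pot and the attributes size and glazed are hypothetical.

```python
import xml.etree.ElementTree as ET

a = ET.fromstring('<pot size="12" glazed="true"/>')
b = ET.fromstring('<pot glazed="true" size="12"/>')

# Attributes behave like a dictionary of name/value pairs,
# so the order they were written in does not matter.
same_attributes = a.attrib == b.attrib

# Values are plain strings: "12" is not turned into a number
# and "true" is not turned into a boolean.
size = a.get("size")

# Any interpretation is up to the application.
size_as_int = int(size)
print(same_attributes, repr(size), size_as_int)
```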
Well Formed XML
Requires only proper syntax, nesting, entity escaping
• Sufficient to ensure that one can construct an unambiguous tree
• Every XML parser will interpret your XML in the same way
• XML is used to exchange data between different computers, so every parser must have a consistent interpretation.
• If XML is not well formed then it is not XML
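One way to check well-formedness is simply to hand the text to a parser and see whether it is accepted. The sketch below does this with Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

samples = {
    "properly nested": "<a><b>text</b></a>",
    "overlapping tags": "<a><b>text</a></b>",
    "unclosed element": "<a><b>text</b>",
    "two root elements": "<a/><b/>",
    "unescaped ampersand": "<a>salt & pepper</a>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}

for name, ok in results.items():
    print(f"{name}: {ok}")
```

Only the first sample passes; each of the others breaks one of the rules above (nesting, explicit closing, a single root, entity escaping).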