Data formats Flashcards
What types of data are there?
- Unstructured
- Semi-structured
- Structured
What are the characteristics of structured data?
- Easy to analyse, query and store
- Easy to clean, maintain consistency and security of data
What are the characteristics of unstructured data?
- Hard to index
- Hard to organise
- Lacks regularity and decomposable internal structure
What are some examples of semi-structured data?
- CSV
- HTML
- XML
- JSON
What are the characteristics of CSV files?
- Stores tabular data
- Just a delimited text file
- Lacks format information
- Contains no formulas or macros
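A minimal sketch of what "lacks format information" means in practice, using Python's standard `csv` module. The sample data and column names are made up for illustration; a real file would be opened with `open(path, newline="")`.

```python
import csv
import io

# Hypothetical sample data standing in for a file on disk.
raw = "name,age,joined\nAda,36,2021-03-01\nLinus,28,2022-11-15\n"

# DictReader uses the first row as column names.
rows = list(csv.DictReader(io.StringIO(raw)))

# Every value comes back as a string: CSV carries no type or format metadata.
age_as_text = rows[0]["age"]

# Converting to a number (or a date) is the reader's responsibility.
age_as_number = int(age_as_text)
```

The file itself cannot say that `age` is an integer or that `joined` is a date, so any typing has to be applied by the program that reads it.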
What are the characteristics of HTML?
- Marked up with elements delineated by start and end tags
- Elements correspond to logical units
- Tags are keywords enclosed in pairs of angle brackets
- Browser determines how to display logical units
- Not all elements need both a start and end tag
- Elements have attributes
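The points above can be sketched with Python's standard `html.parser` module. The `TagLogger` class and the HTML snippet are illustrative assumptions, not part of the original cards.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records every start and end tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


# <br> is an element with no end tag; <p> and <a> carry attributes.
snippet = '<p class="intro">Hello<br>world <a href="https://example.com">link</a></p>'

parser = TagLogger()
parser.feed(snippet)
parser.close()
```

After parsing, `parser.events` holds a start event for `br` but no matching end event, while `p` and `a` each have both, along with their attributes.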
What are the limitations of HTML?
- Designed for presentation purposes
- Not concerned with meaning, just formatting
- Not extensible
- Inconsistently applied
What are the characteristics of XML?
- Meta markup language
- User defined tags
- Facilitate better encoding of semantics
How are XML elements structured and what syntax is used?
- One root element
- Appropriate nesting of elements
- Start and end tags
- Attributes in quotes
- Case sensitive
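A short sketch of these rules using Python's standard `xml.etree.ElementTree`. The element names and the `is_well_formed` helper are hypothetical, chosen only to illustrate each rule.

```python
import xml.etree.ElementTree as ET

# One root element, properly nested children, quoted attribute value.
well_formed = '<library><book id="b1"><title>Dune</title></book></library>'
root = ET.fromstring(well_formed)
book_id = root.find("book").get("id")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


bad_nesting = "<a><b></a></b>"       # elements overlap instead of nesting
bad_case = "<Title>Dune</title>"     # tag names are case sensitive
two_roots = "<a></a><b></b>"         # more than one root element
unquoted = "<book id=b1></book>"     # attribute value not in quotes
```

Each of the four bad strings breaks exactly one rule from the list, and the parser rejects all of them.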
How must an XML document begin?
- With an XML declaration, e.g. <?xml version="1.0"?>
How do you comment in XML?
- <!-- This is a comment -->
Are all chars available to use in XML?
No. Some characters, such as < and &, have special meaning in XML, but they can be encoded in an alternative way, e.g. with entity references such as &lt; and &amp;
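As a sketch of how special characters are encoded, the standard `xml.sax.saxutils.escape` function replaces `&`, `<` and `>` with their predefined entity references. The sample text and the `code` element name are assumptions for illustration.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

text = "if a < b && b > c"

# escape() rewrites &, < and > as &amp;, &lt; and &gt;.
encoded = escape(text)

# The encoded text can now sit safely inside an element.
element = ET.fromstring(f"<code>{encoded}</code>")

# The parser decodes the entities back to the original characters.
decoded_by_parser = element.text
decoded_by_hand = unescape(encoded)
```

Placing the raw text inside the element without escaping would make the document ill-formed, because the parser would read `<` as the start of a tag.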
How do you insert large amounts of text in XML?
Using a CDATA section: <![CDATA[ ... ]]>
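One common way to embed a large block of text is a CDATA section, which the parser treats as literal character data rather than markup. A minimal sketch with Python's standard `xml.etree.ElementTree`; the `note` and `body` element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, < and & need no escaping.
doc = (
    "<note><body>"
    "<![CDATA[Use <b>bold</b> & <i>italic</i> freely here.]]>"
    "</body></note>"
)

# The parser returns the CDATA content as plain text, not as child elements.
body = ET.fromstring(doc).find("body")
body_text = body.text
child_count = len(body)
```

The `<b>` and `<i>` markers inside the section are not parsed as elements, so `body` has no children and its text is returned verbatim.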
What is JSON?
A data interchange format that is built for lightweight data storage
What are the advantages of JSON over XML?
- JSON is more streamlined, lightweight and compact
- Is generally easier to parse
- Used to read and display data from a webserver
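A rough illustration of the comparison, serialising the same record as JSON and as XML with Python's standard library. The record and the XML element names are made up for the example, and the size difference will vary with the data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used for both encodings.
record = {"name": "Ada", "age": 36, "languages": ["Python", "C"]}

# JSON: one call to serialise, one call to parse; types survive the round trip.
as_json = json.dumps(record, separators=(",", ":"))
restored = json.loads(as_json)

# XML: the structure has to be built element by element, and every value is text.
person = ET.Element("person")
ET.SubElement(person, "name").text = record["name"]
ET.SubElement(person, "age").text = str(record["age"])
languages = ET.SubElement(person, "languages")
for language in record["languages"]:
    ET.SubElement(languages, "language").text = language
as_xml = ET.tostring(person, encoding="unicode")

json_length = len(as_json)
xml_length = len(as_xml)
```

For this record the JSON text is shorter than the XML text, and `json.loads` returns the age as an integer, whereas the XML version stores it as a string that the reader must convert.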