Databases Flashcards
Entity-relationship model
Graphical description of a database, represented in UML notation. An entity type is a group of objects with the same properties and independent existence. An entity occurrence is a uniquely identifiable instance of an entity type.
Candidate and primary keys
A candidate key is a minimal set of attributes that uniquely identifies each entity occurrence. The primary key is the candidate key that has been chosen to identify occurrences.
Simple and composite keys
Simple key: a candidate key that consists of a single attribute
Composite key: a candidate key that consists of two or more attributes
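The key definitions above can be sketched as a check over sample rows. This is only an illustration: the `viewings` data and both function names are hypothetical, and uniqueness in one sample only shows that a key is not ruled out, since a key is a constraint on every possible occurrence.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """A superkey is a candidate key if no proper subset is also a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Hypothetical sample data, not taken from the flashcards.
viewings = [
    {"client_no": "C1", "property_no": "P1", "date": "2024-01-05"},
    {"client_no": "C1", "property_no": "P2", "date": "2024-01-05"},
    {"client_no": "C2", "property_no": "P1", "date": "2024-01-06"},
]

print(is_candidate_key(viewings, ("client_no",)))                # simple key attempt
print(is_candidate_key(viewings, ("client_no", "property_no")))  # composite key attempt
```

Here no single attribute is unique on its own, so only a composite key can identify a viewing.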
Advantages of EER model over ER
Avoids describing similar concepts more than once
Can have relationships that involve a subclass but not the superclass as a whole
More semantic information in the design (e.g. manager IS-A member of staff)
How does the EER model work?
A subclass is a distinct subgrouping of occurrences of an entity type that needs to be represented separately. A superclass is an entity type that has one or more distinct subclasses. Example: Staff contains Manager and Secretary.
All attributes of the superclass are also attributes of its subclasses, and a subclass may have additional attributes of its own.
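Attribute inheritance in the Staff example maps loosely onto class inheritance. A minimal sketch, assuming hypothetical attribute names (`staff_no`, `name`, `bonus`, `typing_speed`) that do not come from the flashcards:

```python
from dataclasses import dataclass, fields

@dataclass
class Staff:               # superclass
    staff_no: str
    name: str

@dataclass
class Manager(Staff):      # subclass: inherits staff_no and name, adds bonus
    bonus: float

@dataclass
class Secretary(Staff):    # subclass: inherits staff_no and name, adds typing_speed
    typing_speed: int

# Superclass attributes come first, followed by the subclass's own.
manager_attrs = [f.name for f in fields(Manager)]
print(manager_attrs)

# IS-A: every Manager occurrence is also a Staff occurrence.
print(isinstance(Manager("S1", "Ann", 500.0), Staff))
```

The analogy is imperfect: an EER entity occurrence can belong to several subclasses at once, which a single Python object of one class does not model.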
Specialisation
Top-down process of maximising the differences between entity occurrences by identifying their distinguishing characteristics. Given superclasses, it leads to identifying subclasses.
Generalisation
Bottom-up process of minimising the differences between entity occurrences, by identifying their common characteristics.
Participation constraint
Determines whether every member in the superclass must participate as a member of one of the subclasses (either mandatory or optional)
Disjoint constraint
Determines whether a member of the superclass may belong to only one subclass or to several. Disjoint (Or): a member belongs to at most one subclass. Non-disjoint (And): a member may belong to more than one subclass.
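The participation and disjoint constraints can be read as two independent checks on subclass membership. A rough sketch, where `check_constraints`, its parameters, and the sample ids are all hypothetical:

```python
def check_constraints(superclass, subclasses, mandatory, disjoint):
    """Return a list of constraint violations.

    superclass: set of member ids
    subclasses: dict mapping subclass name -> set of member ids
    mandatory:  True if every member must be in at least one subclass
    disjoint:   True if a member may be in at most one subclass
    """
    violations = []
    for member in sorted(superclass):
        count = sum(member in members for members in subclasses.values())
        if mandatory and count == 0:
            violations.append(f"{member}: in no subclass (participation is mandatory)")
        if disjoint and count > 1:
            violations.append(f"{member}: in {count} subclasses (constraint is disjoint)")
    return violations

# Hypothetical sample: S1 is in both subclasses, S3 is in neither.
staff = {"S1", "S2", "S3"}
roles = {"Manager": {"S1"}, "Secretary": {"S1", "S2"}}

for problem in check_constraints(staff, roles, mandatory=True, disjoint=True):
    print(problem)
```

With `mandatory=False` and `disjoint=False` (optional, non-disjoint) the same data produces no violations.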
Structured data
This is data represented in a strict format, such as the relational data model. The database management system checks to ensure that the data follows the structures and referential constraints specified in the schema.
Semi-structured data
This is data that may be irregular or incomplete and may have a structure that changes rapidly or unpredictably. It may have some structure, but not all parts need share the same structure, and each data object can have different attributes that are not known in advance. We end up with such data when we collect it ad hoc. Information that usually belongs to the schema is mixed in with the data itself.
Advantages of XML
Simple standard, human-readable, extensible (you can define your own tags), platform-independent, separates content from presentation (allowing custom views of the data), allows repeated (multi-valued) elements
Well-formed XML
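Single root element; every start tag has a matching end tag and elements are properly nested; attribute values are quoted. The document may open with an XML declaration giving the XML version number and, optionally, the encoding and standalone status (whether the document depends on external markup declarations such as an external DTD). The declaration is recommended but optional in XML 1.0.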
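Well-formedness can be tested with any conforming XML parser, since a parser must reject a document that is not well-formed. A small sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are hypothetical:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = '<?xml version="1.0"?><bib><book><title>Book A</title></book></bib>'
bad_nesting = "<bib><book><title>Book A</book></title></bib>"  # tags overlap
two_roots = "<book/><book/>"                                   # no single root

for sample in (good, bad_nesting, two_roots):
    print(is_well_formed(sample))
```

This checks well-formedness only. `ElementTree` does not validate against a DTD or an XML Schema, so it says nothing about type-validity or schema-validity.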
Type-valid XML
Well-formed, and its elements and attributes follow the structure declared in the DTD.
Schema-valid XML document
Well-formed, and conforms to an XML Schema (XSD).
DTD attribute syntax
<!ATTLIST BAR topic (sushi | sports) "sushi">
This means that a BAR element has an attribute topic whose value must be either sushi or sports; if the attribute is omitted, it defaults to sushi.
XQuery syntax
for $variableName in <XPath>
where <condition>
return <expression>
XQuery keywords are lowercase and case-sensitive. Braces { } are needed only to embed an expression inside a constructed element.
Write doc("bib.xml")/bib/book[year > 1995]/title in XQuery
for $x in doc("bib.xml")/bib/book
where $x/year > 1995
return $x/title
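For comparison, the three clauses of the query above map onto a Python list comprehension. This is a sketch, not an XQuery engine: the contents of bib.xml are not given, so `BIB_XML` is a hypothetical stand-in, and it assumes year is a child element of book, as the path suggests.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml; titles and years are made up.
BIB_XML = """
<bib>
  <book><title>Book A</title><year>1994</year></book>
  <book><title>Book B</title><year>2000</year></book>
  <book><title>Book C</title><year>1999</year></book>
</bib>
"""

root = ET.fromstring(BIB_XML)  # root is the <bib> element

titles = [
    book.findtext("title")                 # return $x/title
    for book in root.findall("book")       # for $x in doc("bib.xml")/bib/book
    if int(book.findtext("year")) > 1995   # where $x/year > 1995
]
print(titles)
```

One difference worth noting: the XQuery returns the title elements themselves, whereas this sketch returns only their text.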