Big Data Lecture 02 Lessons Learnt Flashcards
Explain data independence
The logical model of the data (the interface used for querying and displaying it) is independent of the physical storage, so the storage layer can be swapped without changing the queries.
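A minimal Python sketch of this idea, assuming two hypothetical storage layouts (`RowStore` and `ColumnStore` are illustrative names, not from the lecture). The query function only uses the logical interface `scan()`, so it returns the same answer whichever layout sits underneath.

```python
class RowStore:
    """Hypothetical physical layout: one dict per record."""

    def __init__(self, rows):
        self._rows = list(rows)

    def scan(self):
        # Logical interface: yield every record as a dict.
        yield from self._rows


class ColumnStore:
    """Hypothetical physical layout: one list per attribute."""

    def __init__(self, rows):
        rows = list(rows)
        names = list(rows[0].keys()) if rows else []
        self._columns = {name: [r[name] for r in rows] for name in names}
        self._length = len(rows)

    def scan(self):
        # Same logical interface, rebuilt from the column lists.
        for i in range(self._length):
            yield {name: col[i] for name, col in self._columns.items()}


def names_in_city(store, city):
    """A query written only against the logical model."""
    return sorted(r["name"] for r in store.scan() if r["city"] == city)


people = [
    {"name": "Ada", "city": "London"},
    {"name": "Alan", "city": "Manchester"},
    {"name": "Rosalind", "city": "London"},
]

row_result = names_in_city(RowStore(people), "London")
column_result = names_in_city(ColumnStore(people), "London")
```

Swapping `RowStore` for `ColumnStore` changes how the data is laid out but not the query or its result.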
What 4 pieces constitute the architecture of data storage?
<ul><li>Language (how you query),</li><li>model (representation, driver of independence),</li><li>compute (execution of computation),</li><li>storage (physical hardware).</li></ul>
What does the data model describe? (2)
<ul><li>What the data looks like,</li><li>what you can do with it (manipulation primitives).</li></ul>
What is a table?
A collection of rows (records), each holding one value for every attribute (column) of the table.
What is a row?
One record in the table.
What is an attribute?
One column in a table.
What is a primary key?
An attribute (or set of attributes) whose value uniquely identifies each record in a table.
What is a value?
One entry at the intersection of a row and a column of a table.
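The terms table, row, attribute, primary key, and value can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `person` table and its contents are made up for illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Table with three attributes; `id` is the primary key.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO person (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Alan", "Manchester")],
)

# Row: one record of the table.
row = conn.execute("SELECT id, name, city FROM person WHERE id = 1").fetchone()

# Attributes: the column names (second field of each table_info entry).
attributes = [col[1] for col in conn.execute("PRAGMA table_info(person)")]

# Value: the entry at one row and one column.
value = conn.execute("SELECT city FROM person WHERE id = 2").fetchone()[0]

# Primary key: a second record with id = 1 is rejected.
try:
    conn.execute("INSERT INTO person (id, name, city) VALUES (1, 'Grace', 'New York')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```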
What is relational algebra?
An algebra whose operators (such as selection, projection, and join) take tables as input and produce tables as output.
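As a rough illustration, two relational algebra operators, selection and projection, can be sketched in Python over a table stored as a list of dicts. The function names `select` and `project` and the sample data are assumptions made for this example.

```python
def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def project(rows, attributes):
    """Projection: keep only the given attributes, dropping duplicate rows."""
    result = []
    for r in rows:
        projected = {a: r[a] for a in attributes}
        if projected not in result:  # set semantics: no duplicates
            result.append(projected)
    return result


people = [
    {"name": "Ada", "city": "London"},
    {"name": "Alan", "city": "Manchester"},
    {"name": "Rosalind", "city": "London"},
]

londoners = select(people, lambda r: r["city"] == "London")
cities = project(people, ["city"])
```

Both operators return a table again, so they can be composed, for example `project(select(people, ...), ["name"])`.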
How is a relation (table) expressed formally in relational algebra?<br></br><br></br>What are its two components?
Each attribute has a domain. The relation is a subset of the Cartesian product of these domains, and its tuples become the rows of the table.<br></br><br></br>Components:<br></br>1. set of attributes (schema),<br></br>2. set/bag/list of tuples.
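The formal definition can be checked on a tiny example. In this sketch the attribute names and domains are invented, and the relation uses set semantics.

```python
from itertools import product

# Each attribute has its own domain (tiny finite domains for illustration).
domains = {"name": {"Ada", "Alan"}, "year": {1, 2}}

# Component 1: the schema, i.e. the attributes in a fixed order.
schema = tuple(domains)

# Cartesian product of the domains: every possible tuple.
cross_product = set(product(*(domains[a] for a in schema)))

# Component 2: the tuples actually stored, here as a set.
relation = {("Ada", 1), ("Alan", 2)}

# A relation is valid if it is a subset of the Cartesian product.
is_valid = relation <= cross_product
```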
Explain: set, list, and bag.
<ul><li>Set: unordered collection without duplicates,</li><li>list: ordered collection, can have duplicates,</li><li>bag: unordered collection, with duplicates.</li></ul>
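In Python the three collections map roughly onto `set`, `list`, and `collections.Counter` (used here as a stand-in for a bag). A small sketch of the differences:

```python
from collections import Counter

values = ["a", "b", "a"]

as_list = list(values)    # ordered, duplicates kept
as_set = set(values)      # unordered, duplicates removed
as_bag = Counter(values)  # unordered, duplicates counted

# Order matters for lists but not for sets or bags.
lists_equal = as_list == ["a", "a", "b"]
sets_equal = as_set == {"b", "a"}
bags_equal = as_bag == Counter(["a", "a", "b"])
```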
How can a tuple be seen as a function?
It maps each attribute of the table to a value from that attribute's domain.
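One way to picture this is a Python dict, which is a finite function from keys to values. The attribute names and the helper `value_of` are hypothetical.

```python
# A tuple as a mapping from attribute names to values.
row = {"id": 1, "name": "Ada", "city": "London"}


def value_of(record, attribute):
    """Apply the tuple, seen as a function, to one attribute."""
    return record[attribute]


# The set of attributes on which the function is defined.
defined_on = set(row)

city = value_of(row, "city")
```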
What is relational integrity?
All tuples in the table have the same set of attributes, namely those of the schema. Not to be confused with referential integrity, which requires foreign keys to point to valid records in other tables.
What is atomic integrity?
Every value is atomic: there are no tables nested inside a table.
When is a table in first normal form (1NF)?
When it satisfies atomic integrity.
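A rough check for atomic integrity, assuming rows are Python dicts and that strings, numbers, booleans, and `None` count as atomic (the choice of `ATOMIC_TYPES` is an assumption for this sketch):

```python
# Types treated as atomic in this sketch (an assumption, not a standard).
ATOMIC_TYPES = (str, int, float, bool, type(None))


def is_atomic(value):
    """True if the value is a single, non-nested value."""
    return isinstance(value, ATOMIC_TYPES)


def in_first_normal_form(rows):
    """True if every value in every row is atomic."""
    return all(is_atomic(v) for row in rows for v in row.values())


# Nested: one value is itself a collection, so atomic integrity fails.
nested = [{"id": 1, "phones": ["111", "222"]}]

# Flattened: one row per phone number, every value atomic.
flat = [{"id": 1, "phone": "111"}, {"id": 1, "phone": "222"}]

nested_ok = in_first_normal_form(nested)
flat_ok = in_first_normal_form(flat)
```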