Big Data Lecture 15 Final Quiz Flashcards
Which one of these languages is not declarative?<br></br><ul><li>PGQL</li><li>Java</li><li>SQL</li><li>JSONiq<br></br></li></ul>
Java
SQL was tailor-made for relations fulfilling tabular integrity, domain integrity and atomic integrity. JSONiq relaxes these conditions, except:<br></br><ul><li>Tabular integrity</li><li>Domain integrity</li><li>Atomic integrity</li><li>None of them: JSONiq relaxes all three.</li></ul>
None of them: JSONiq relaxes all three.
What shape of data is most adequate to search for relevant textual content?<br></br><ul><li>Trees</li><li>Tables</li><li>Graphs</li><li>Vectors</li><li>Cubes</li></ul>
<div>Vectors.</div>
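As a rough illustration of why vectors suit text search, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional embeddings and the document names are made up for the example; a real system would obtain the vectors from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: these numbers are invented for illustration only.
documents = {
    "doc_spark": [0.9, 0.1, 0.0],
    "doc_cooking": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

# The most relevant document is the one whose vector points
# in the direction closest to the query vector.
best = max(documents, key=lambda name: cosine_similarity(query, documents[name]))
```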
Which one(s) of the following data shapes are supported in extensions (data types, additional clauses, etc.) to the SQL language (standardized or not)?<br></br><ul><li>Trees (JSON, XML)</li><li>Graphs (pattern matching)</li><li>Cubes (group by/roll up)</li><li>Vectors (vector data type, cosine proximity)</li></ul>
All of them!
Which one of the following clauses does not exist in PostgreSQL?<br></br><ul><li>GROUP BY CUBE</li><li>GROUP BY ROLLUP</li><li>GROUP BY GROUPING SETS</li><li>GROUP BY GRID</li></ul>
GROUP BY GRID
What is the maximum (order of magnitude) size of data that you can store in a single HBase cell without causing performance issues?<br></br><ul><li>100 kB</li><li>10 MB</li><li>5 GB</li><li>5 TB</li></ul>
10 MB
What is the current system used by Google, which succeeded BigTable and supports exabytes of data?<br></br><ul><li>Spanner</li><li>Spinnstduwohl</li><li>Kubernetes</li><li>HBase</li></ul>
Spanner
Which of the following is not an atomic data type?<br></br><ul><li>String,</li><li>array,</li><li>date,</li><li>hexBinary.</li></ul>
Array (is structural).
Which of the following is not a physical evaluation pattern in a query engine?<br></br><ul><li>Materialized execution</li><li>Streamed execution</li><li>Parallel execution</li><li>Dematerialized execution</li></ul>
Dematerialized execution
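To make the contrast between two of the real patterns concrete, here is a minimal sketch of a filter operator written twice: once materialized (the whole result is built in memory before it is returned) and once streamed (rows are produced one at a time on demand). The function names are hypothetical and do not come from any particular engine.

```python
def materialized_filter(rows, predicate):
    # Materialized: the full result list exists in memory before returning.
    return [row for row in rows if predicate(row)]


def streamed_filter(rows, predicate):
    # Streamed: each matching row is yielded as soon as it is found,
    # so the consumer can start working before the input is exhausted.
    for row in rows:
        if predicate(row):
            yield row


def is_even(n):
    return n % 2 == 0


rows = range(10)
materialized = materialized_filter(rows, is_even)
streamed = streamed_filter(rows, is_even)
first_streamed = next(streamed)  # only the first match has been computed so far
```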
In 2014, Spark broke the GraySort record. How much time did it take to sort 100 TB of data?<br></br><ul><li>34 seconds</li><li>1.4 minutes</li><li>23 minutes</li><li>3 hours</li></ul>
23 minutes!
Why is Hadoop called Hadoop?<br></br><ul><li>It was randomly produced with a password generator.</li><li>It is a distortion of “add up”, which refers to the aggregation MapReduce uses most while reducing.</li><li>It is a reference to an episode of The Simpsons with a lot of doughnuts.</li><li>It is named after the elephant toy of Doug Cutting’s son.</li></ul>
It is named after the elephant toy of Doug Cutting’s son.
Is the document {"foo": [1, 2, 3]} valid JSON?<br></br><ul><li>Yes</li><li>No</li><li>The question is not precise enough.</li><li>It depends on whether duplicate keys are allowed or not.</li></ul>
The question is not precise enough. (There is no schema given.)
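The distinction behind this answer is well-formedness versus validity. A minimal sketch, using Python's standard `json` module: parsing only shows that the text is well-formed, while validity can only be judged against a schema. The `conforms` function below is a hand-written stand-in for a hypothetical schema ("an object whose `foo` is an array of integers"), not a real schema language.

```python
import json

text = '{"foo": [1, 2, 3]}'

# Well-formedness: json.loads raises json.JSONDecodeError on malformed text.
document = json.loads(text)


def conforms(doc):
    """Check against a hypothetical schema: {"foo": array of integers}."""
    return (
        isinstance(doc, dict)
        and isinstance(doc.get("foo"), list)
        # bool is a subclass of int in Python, so exclude it explicitly.
        and all(isinstance(x, int) and not isinstance(x, bool) for x in doc["foo"])
    )


is_valid_for_this_schema = conforms(document)
```

With a different schema (say, one requiring `foo` to be a string), the same well-formed document would be invalid.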
What is index-free adjacency?<br></br><ul><li>Instead of joins, the data is structured and linked using native pointers in memory.</li><li>The adjacency matrix is indexed on rows and columns for efficient traversal.</li><li>Relational joins are used instead of adjacency matrices.</li><li>It is a technique with which documents are sorted along a given field, which allows binary search instead of an index lookup.</li></ul>
Instead of joins, the data is structured and linked using native pointers in memory.
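As a toy illustration of the idea, the sketch below stores each node's neighbors as direct object references, so a traversal simply follows pointers without any index lookup or join. The `Node` class and the sample graph are hypothetical and far simpler than what a real graph database does.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references to other Node objects

    def connect(self, other):
        self.neighbors.append(other)


def reachable(start):
    """Names of all nodes reachable from start, found by following references."""
    seen = {id(start)}
    order = [start.name]
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in node.neighbors:  # pointer hop, no lookup by key
            if id(neighbor) not in seen:
                seen.add(id(neighbor))
                order.append(neighbor.name)
                stack.append(neighbor)
    return order


alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.connect(bob)
bob.connect(carol)
names = reachable(alice)
```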
Do document stores have schemas for their documents?<br></br><ul><li>Yes, like relational databases. It is a fundamental design principle.</li><li>No. There are no schemas at all in NoSQL; this is incompatible with documents.</li><li>This is optional. It is possible to associate a schema with a collection but it must be done before populating data.</li><li>This is optional. It is possible to associate a schema with a collection and this can also be done after populating the data.</li></ul>
This is optional. It is possible to associate a schema with a collection and this can also be done after populating the data.
These words are (on a high level) synonyms except for one. Which one?<br></br><ul><li>Blocks,</li><li>chunks,</li><li>shards,</li><li>slots.</li></ul>
Slots. The rest all relate to partitioning of the data.