Big Data Lecture 15 Final Quiz Flashcards

1
Q

Which one of these languages is not declarative?<br></br><ul><li>PGQL</li><li>Java</li><li>SQL</li><li>JSONiq</li></ul>

A

Java

2
Q

SQL was tailor-made for relations fulfilling relational integrity, domain integrity and atomic integrity. JSONiq relaxes these conditions, except:<br></br><ul><li>Relational integrity</li><li>Domain integrity</li><li>Atomic integrity</li><li>None of them: JSONiq relaxes all three.</li></ul>

A

None of them: JSONiq relaxes all three.

3
Q

What shape of data is most adequate to search for relevant textual content?<br></br><ul><li>Trees</li><li>Tables</li><li>Graphs</li><li>Vectors</li><li>Cubes</li></ul>

A

<div>Vectors.</div>
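
A minimal Python sketch of why vectors suit textual search, assuming each text has already been turned into an embedding vector. The three-dimensional vectors and the names `query`, `doc_a`, `doc_b` and `doc_c` are made up for illustration; real embeddings have far more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, not produced by any real model.
query = [1.0, 0.0, 1.0]
documents = {
    "doc_a": [1.0, 0.0, 1.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}

# Rank documents from most to least similar to the query.
ranking = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
```

Here `ranking` puts `doc_a` first (same direction as the query) and `doc_b` last (orthogonal to it).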

4
Q

Which one(s) of the following data shapes were supported in extensions (data types, additional clauses, etc.) to the SQL language (standardized or not)?<br></br><ul><li>Trees (JSON, XML)</li><li>Graphs (pattern matching)</li><li>Cubes (group by/roll up)</li><li>Vectors (vector data type, cosine proximity)</li></ul>

A

All of them!

5
Q

Which one of the following clauses does not exist in PostgreSQL?<br></br><ul><li>GROUP BY CUBE</li><li>GROUP BY ROLLUP</li><li>GROUP BY GROUPING SETS</li><li>GROUP BY GRID</li></ul>

A

GROUP BY GRID
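
As a rough illustration of how the three existing clauses relate, the Python sketch below lists the grouping sets that CUBE and ROLLUP expand to; GROUPING SETS is the form where these sets are spelled out explicitly. The column names `year` and `country` are hypothetical, and the code only enumerates sets; it does not run any SQL.

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """CUBE aggregates over every subset of the listed columns."""
    return [
        combo
        for size in range(len(columns), -1, -1)
        for combo in combinations(columns, size)
    ]

def rollup_grouping_sets(columns):
    """ROLLUP aggregates over every prefix of the listed columns."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]

columns = ["year", "country"]  # hypothetical column names
cube_sets = cube_grouping_sets(columns)
rollup_sets = rollup_grouping_sets(columns)
```

With two columns, CUBE yields four grouping sets and ROLLUP three; the empty tuple stands for the grand total.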

6
Q

What is the maximum (order of magnitude) size of data that you can store in a single HBase cell without causing performance issues?<br></br><ul><li>100 kB</li><li>10 MB</li><li>5 GB</li><li>5 TB</li></ul>

A

10 MB

7
Q

What is the current system used by Google, which succeeded BigTable with support for exabytes of data?<br></br><ul><li>Spanner</li><li>Spinnstduwohl</li><li>Kubernetes</li><li>HBase</li></ul>

A

Spanner

8
Q

Which of the following is not an atomic data type?<br></br><ul><li>String,</li><li>array,</li><li>date,</li><li>hexBinary.</li></ul>

A

Array (is structural).

9
Q

Which of the following is not a physical evaluation pattern in a query engine?<br></br><ul><li>Materialized execution</li><li>Streamed execution</li><li>Parallel execution</li><li>Dematerialized execution</li></ul>

A

Dematerialized execution

10
Q

In 2014, Spark broke the GraySort record. How much time did it take for them to sort 100TB of data?<br></br><ul><li>34 seconds</li><li>1.4 minutes</li><li>23 minutes</li><li>3 hours</li></ul>

A

23 minutes!

11
Q

Why is Hadoop called Hadoop?<br></br><ul><li>It was randomly produced with a password generator.</li><li>It is a distortion of “add up”, which refers to the aggregation MapReduce uses most often when reducing.</li><li>It is a reference to an episode of The Simpsons with a lot of doughnuts.</li><li>It is named after the elephant toy of Doug Cutting’s son.</li></ul>

A

It is named after the elephant toy of Doug Cutting’s son.

12
Q

Is the document {"foo": [1, 2, 3]} valid JSON?<br></br><ul><li>Yes</li><li>No</li><li>The question is not precise enough.</li><li>It depends on whether duplicate keys are allowed or not.</li></ul>

A

The question is not precise enough. (The document is well-formed, but validity can only be checked against a schema, and no schema is given.)
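
A minimal Python sketch of the distinction, assuming the question relies on the usual split between well-formedness (syntax only) and validity (conformance to a schema). The `is_valid` checker and both schemas are hypothetical stand-ins for a real schema language, not an actual JSON Schema implementation.

```python
import json

text = '{"foo": [1, 2, 3]}'

# Well-formedness: the text parses as JSON at all.
doc = json.loads(text)

def is_valid(document, schema):
    """Hypothetical mini-validator: schema maps each required key to a Python type."""
    return (
        isinstance(document, dict)
        and set(document) == set(schema)
        and all(isinstance(document[key], expected) for key, expected in schema.items())
    )

schema_a = {"foo": list}  # hypothetical schema: "foo" must be an array
schema_b = {"foo": str}   # hypothetical schema: "foo" must be a string

valid_under_a = is_valid(doc, schema_a)
valid_under_b = is_valid(doc, schema_b)
```

The same well-formed document is valid under one schema and invalid under the other, which is why the question cannot be answered without naming a schema.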

13
Q

What is index-free adjacency?<br></br><ul><li>Instead of joins, the data is structured and linked using native pointers in memory.</li><li>The adjacency matrix is indexed on rows and columns for efficient traversal.</li><li>Relational joins are used instead of adjacency matrices.</li><li>It is a technique with which documents are sorted along a given field, which allows binary search instead of an index lookup.</li></ul>

A

Instead of joins, the data is structured and linked using native pointers in memory.
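
The idea can be sketched in Python, where object references play the role of native pointers. The `Node` class, the `reachable` helper and the three example nodes are hypothetical and only illustrate the traversal pattern, not how any particular graph database is implemented.

```python
class Node:
    """A graph node that stores direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # references to other Node objects, no lookup table

    def link(self, other):
        self.neighbors.append(other)

# Hypothetical example graph: alice -> bob -> carol
alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.link(bob)
bob.link(carol)

def reachable(start):
    """Depth-first traversal that only follows references; no join or index lookup."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        if node.name not in seen:
            seen.append(node.name)
            stack.extend(node.neighbors)
    return seen

names = reachable(alice)
```

Each hop costs one pointer dereference, independent of the total number of nodes, which is the property the flashcard answer describes.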

14
Q

Do document stores have schemas for their documents?<br></br><ul><li>Yes, like relational databases. It is a fundamental design principle.</li><li>No. There are no schemas at all in NoSQL; this is incompatible with documents.</li><li>This is optional. It is possible to associate a schema with a collection but it must be done before populating data.</li><li>This is optional. It is possible to associate a schema with a collection and this can also be done after populating the data.</li></ul>

A

This is optional. It is possible to associate a schema with a collection and this can also be done after populating the data.

15
Q

These words are (on a high level) synonyms except for one. Which one?<br></br><ul><li>Blocks,</li><li>chunks,</li><li>shards,</li><li>slots.</li></ul>

A

Slots. The other three all refer to partitions of the data.

16
Q

What is a vector clock?<br></br><ul><li>It is a versioning and conflict-resolution scheme for the key-value store Amazon Dynamo.</li><li>It is a timestamp that provides a universal time scale to keep a key-value store like Amazon Dynamo consistent.</li><li>It is a part of the cubic model in OLAP; it provides a time dimension to organize values by year and quarter.</li><li>It is a device used to confirm relativity theory with Airbus A380s.</li></ul>

A

It is a versioning and conflict-resolution scheme for the key-value store Amazon Dynamo.
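
A minimal Python sketch of how vector clocks can order versions and detect conflicts. It is a simplified illustration, not Dynamo's actual implementation; the node names `X`, `Y` and `Z` and the helper functions are hypothetical.

```python
def increment(clock, node):
    """Return a copy of the clock with the counter of `node` advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def descends(a, b):
    """True if version a has seen every update that version b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a, b):
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "conflict"  # concurrent versions: neither has seen the other

v1 = increment({}, "X")  # written via node X
v2 = increment(v1, "X")  # overwritten via node X
v3 = increment(v2, "Y")  # one client updates v2 via node Y
v4 = increment(v2, "Z")  # another client updates v2 via node Z

ordered = compare(v2, v1)
concurrent = compare(v3, v4)
```

`v2` simply supersedes `v1`, whereas `v3` and `v4` are concurrent siblings of `v2`, so the conflict has to be reconciled rather than resolved by timestamp.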

17
Q

How many bytes are there in a quettabyte?<br></br><ul><li>10^9</li><li>10^33</li><li>10^30</li><li>10^27</li></ul>

A

10^30

18
Q

How many petabytes are there in a quettabyte?<br></br><ul><li>1, these are synonyms,</li><li>one trillion,</li><li>one quadrillion,</li><li>one quintillion.</li></ul>

A

One quadrillion!
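
The arithmetic behind the last two cards can be checked with a few lines of Python, using the decimal SI prefixes and the short-scale meaning of "quadrillion" (10^15). The dictionary name `SI_EXPONENTS` is only an illustrative choice.

```python
# Power of ten for each decimal SI prefix.
SI_EXPONENTS = {
    "kilo": 3, "mega": 6, "giga": 9, "tera": 12, "peta": 15,
    "exa": 18, "zetta": 21, "yotta": 24, "ronna": 27, "quetta": 30,
}

bytes_per_quettabyte = 10 ** SI_EXPONENTS["quetta"]
bytes_per_petabyte = 10 ** SI_EXPONENTS["peta"]

# 10^30 / 10^15 = 10^15, i.e. one quadrillion on the short scale.
petabytes_per_quettabyte = bytes_per_quettabyte // bytes_per_petabyte
```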

19
Q

Which of the following is not a cloud storage system offered as a service?<br></br><ul><li>Azure Blob Storage</li><li>Google Cloud Storage</li><li>HDFS</li><li>S3</li></ul>

A

HDFS