E10 Flashcards

Question 1

Q

The main idea of the Lambda Architecture is…

Answer

A

To build Big Data systems as a series of layers

Each layer satisfies a subset of the desired properties and builds upon the functionality provided by the layers beneath it

Question 2

Q

Batch Layer

Answer

A

Recomputes the batch views from all the data at once

Question 3

Q

Speed Layer

Answer

A

Accommodates all requests that are subject to low-latency requirements
Does incremental computation instead of the recomputation done in the batch layer
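
The contrast between recomputation and incremental computation can be sketched as follows. This is a minimal illustration, assuming a page-view count as the view; the function names `batch_recompute` and `incremental_update` are hypothetical and not taken from any library.

```python
from collections import Counter

def batch_recompute(all_events):
    """Batch style: rebuild the whole view from every event (hypothetical helper)."""
    return Counter(all_events)

def incremental_update(view, new_event):
    """Speed style: adjust the existing view using only the new event (hypothetical helper)."""
    view[new_event] += 1
    return view

events = ["home", "about", "home"]

# The batch layer reads the full dataset each time it runs.
batch_view = batch_recompute(events)

# The speed layer touches the view once per incoming event.
realtime_view = Counter()
for event in events:
    incremental_update(realtime_view, event)
```

Both approaches arrive at the same counts here; the difference is how much data each one has to read per update.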

Question 4

Q

Speed Layer Goal

Answer

A

To have updated information on what happened since the last batch view was generated

Question 5

Q

Serving Layer Goal

Answer

A

To merge views created by the batch layer with views created by the speed layer
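
A minimal sketch of such a merge, assuming both views are page-view counts keyed by URL. The function `query_pageviews` and the sample data are hypothetical and only illustrate the idea.

```python
def query_pageviews(url, batch_view, realtime_view):
    """Combine the (older) batch count with the (recent) speed-layer count."""
    return batch_view.get(url, 0) + realtime_view.get(url, 0)

# Assumed example data: the batch view covers everything up to the last
# batch run, the realtime view covers only what arrived since then.
batch_view = {"home": 100, "about": 40}
realtime_view = {"home": 3, "contact": 1}

total_home = query_pageviews("home", batch_view, realtime_view)
```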

Question 6

Q

Difference between Batch and Speed Layer

Answer

A

One big difference is that the speed layer only looks at recent data, whereas the batch layer looks at all the data at once

Question 7

Q

Serving Layer

Answer

A

Indexes batch views so that they can be queried with low latency
The serving layer is a specialized distributed database that loads in a batch view and makes it possible to do random reads on it
When new batch views are available, the serving layer automatically swaps those in so that more up-to-date results are available
It does not need to support specific record updates
-> This is a very important point, as random writes cause most of the complexity in databases
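
The behaviour described above can be sketched as a tiny in-memory store. This is an illustration only, not a real distributed database; the class name `ServingStore` and its methods are hypothetical.

```python
from types import MappingProxyType

class ServingStore:
    """Read-only store: random reads on a batch view, whole-view swaps, no record updates."""

    def __init__(self, batch_view):
        # MappingProxyType gives a read-only wrapper around a private copy.
        self._view = MappingProxyType(dict(batch_view))

    def get(self, key, default=None):
        """Random read on the currently loaded batch view."""
        return self._view.get(key, default)

    def swap(self, new_batch_view):
        """Replace the entire view when a new batch view is available."""
        self._view = MappingProxyType(dict(new_batch_view))

store = ServingStore({"home": 100})
before = store.get("home")
store.swap({"home": 150, "about": 5})
after = store.get("home")
```

There is deliberately no method for updating a single record: the only write path is replacing the whole view.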

Question 8

Q

Storing data in raw format has many advantages:

Answer

A

Data is always true (or correct): a record stays correct once written, so there is no need to go back and rewrite existing records; you simply append new data
You can always go back to the data and perform queries you did not anticipate when building the system
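
Both advantages can be illustrated with a small append-only log. This is a sketch under assumed data: the event fields (`user`, `url`, `ts`) and the helper names are hypothetical.

```python
raw_events = []

def record(event):
    """Append only; existing records are never modified."""
    raw_events.append(event)

record({"user": "alice", "url": "home", "ts": 1})
record({"user": "bob", "url": "home", "ts": 2})
record({"user": "alice", "url": "about", "ts": 3})

def pageviews_per_url(events):
    """A query anticipated when the system was built."""
    counts = {}
    for e in events:
        counts[e["url"]] = counts.get(e["url"], 0) + 1
    return counts

def urls_per_user(events):
    """A query added later; it works because the raw events were kept."""
    visited = {}
    for e in events:
        visited.setdefault(e["user"], set()).add(e["url"])
    return visited
```

Had only the per-URL counts been stored, the per-user question could not be answered afterwards.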

Question 9

Q

Data stored in raw format should be

Answer

A

- Kept forever

(9 cards)