Lecture 4-6-7 Flashcards

1
Q

Optimizations to make models smaller or reduce the computational needs

A

* Pruning: throw away weights close to 0, then retrain the remaining network.
* Standard quantization: after training the model in FP32, switch to fixed-point numbers.
* Mixed-precision training: compute gradients in FP16 but keep the weights in FP32.
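
The first two bullets can be sketched in plain Python. This is only an illustration: the threshold of 0.05, the symmetric 8-bit scheme, and the function names are assumptions, not something the lecture prescribes, and the retraining step is omitted.

```python
def prune(weights, threshold=0.05):
    """Magnitude pruning: zero out weights whose absolute value is below threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize_int8(weights):
    """Symmetric post-training quantization of float weights to 8-bit integers."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Map the 8-bit integers back to approximate float values."""
    return [q * scale for q in quantized]

weights = [0.8, -0.01, 0.03, -0.5, 0.002, 0.25]
pruned = prune(weights)                 # small weights become exactly 0
quantized, scale = quantize_int8(pruned)
restored = dequantize(quantized, scale) # close to `pruned`, within scale / 2
```

Each restored weight differs from the pruned one by at most half a quantization step, which is the price paid for storing one byte per weight instead of four.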

2
Q

Approaches for parallel training:

A

* Hyper-parameter search: build independent models.
* Data-parallel training: each worker trains a copy of the model on its own shard of the data.
* Model-parallel training: split the model. Works well for convolution.
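
A minimal sketch of the data-parallel idea, assuming a one-parameter model y = w * x, equally sized shards, and a simple gradient average standing in for the all-reduce step. The function names and the toy data are illustrative only; on real hardware the per-shard gradients would be computed concurrently.

```python
def gradient(w, shard):
    """Gradient of the mean squared error of y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, data, num_workers, lr=0.01):
    # Assumes len(data) is divisible by num_workers so shards are equal in size.
    size = len(data) // num_workers
    shards = [data[i * size:(i + 1) * size] for i in range(num_workers)]
    grads = [gradient(w, shard) for shard in shards]  # one gradient per worker
    avg_grad = sum(grads) / num_workers               # averaging = "all-reduce"
    return w - lr * avg_grad

data = [(x, 3.0 * x) for x in range(1, 9)]  # samples of y = 3x
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, num_workers=4)
```

With equal shard sizes the averaged gradient equals the full-batch gradient, so the workers jointly take the same step a single machine would.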

3
Q

streaming: windows

A

A mechanism for extracting a finite relation from an infinite stream.
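
As a small illustration, a count-based tumbling window can be sketched with generators. The helper name `count_windows` and the window size are assumptions made for this example.

```python
from itertools import count, islice

def count_windows(stream, size):
    """Yield consecutive, non-overlapping windows of `size` elements."""
    it = iter(stream)
    while True:
        window = list(islice(it, size))
        if not window:   # the stream ended (never happens for an infinite one)
            return
        yield window

readings = count(1)  # infinite stream 1, 2, 3, ...
first_three = list(islice(count_windows(readings, 4), 3))
sums = [sum(window) for window in first_three]
```

Each window is an ordinary finite list, so any relational-style aggregate (here a sum) can be computed over it even though the stream itself never ends.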

4
Q

window types

A

count-based, ordering-based, punctuation-based
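
The other two window types can be sketched in the same style. This assumes that "ordering-based" means grouping by an ordering attribute such as a timestamp, and that punctuation is a special marker inserted into the stream; the `END` marker and both function names are hypothetical.

```python
def time_windows(events, width):
    """Ordering-based: group (timestamp, value) events into windows of
    `width` time units. Assumes events arrive in timestamp order."""
    current, bucket = None, []
    for ts, value in events:
        key = ts // width
        if current is not None and key != current:
            yield current * width, bucket   # close the previous window
            bucket = []
        current = key
        bucket.append(value)
    if bucket:
        yield current * width, bucket

END = object()  # hypothetical punctuation marker embedded in the stream

def punctuated_windows(stream):
    """Punctuation-based: a window closes whenever the END marker arrives."""
    bucket = []
    for item in stream:
        if item is END:
            yield bucket
            bucket = []
        else:
            bucket.append(item)
```

Note that the punctuation-based variant never emits items that arrive after the last marker, since nothing has yet signalled that their window is complete.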

5
Q

What are the problems with RDBMSs?

A

* Must design from the beginning
* Requires two-phase commit: slow
* Expensive
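
To see why two-phase commit is slow, here is a minimal in-memory sketch of the protocol. The `Participant` class and its fields are illustrative assumptions; a real implementation also needs logging, timeouts, and failure recovery, which are left out.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes (and hold locks) or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all voted yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

Every transaction costs two rounds of messages to all participants, and the coordinator has to wait for the slowest vote before anyone may commit.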

6
Q

What are the features of NoSQL?

A

* Horizontal scaling
* Replicate/distribute data over many servers
* Simple call interface
* Flexible schemas
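
These features can be illustrated with a toy hash-partitioned key-value store. The class `ShardedStore` is hypothetical and the "servers" are plain dictionaries; replication is omitted for brevity.

```python
import hashlib

class ShardedStore:
    """Toy key-value store that spreads keys over several 'servers'."""

    def __init__(self, num_servers):
        self.servers = [{} for _ in range(num_servers)]

    def _server_for(self, key):
        # Hash partitioning: the key alone decides which server holds it.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def put(self, key, value):
        self._server_for(key)[key] = value

    def get(self, key, default=None):
        return self._server_for(key).get(key, default)

store = ShardedStore(num_servers=3)
# Flexible schema: the two records do not share the same fields.
store.put("user:1", {"name": "Ada", "langs": ["en", "fr"]})
store.put("user:2", {"name": "Bob", "age": 41})
```

The call interface is just `put` and `get`. Capacity grows by adding servers, although this naive modulo scheme would move most keys when the server count changes; real systems typically use consistent hashing to limit that.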

7
Q

What is ACID?

A

* Atomicity: Either the task (or all tasks) within a transaction are performed or none of them are. This is the all-or-none principle. If one element of a transaction fails the entire transaction fails.
* Consistency: The transaction must meet all protocols or rules defined by the system at all times. The transaction does not violate those protocols and the database must remain in a consistent state at the beginning and end of a transaction; there are never any half-completed transactions.
* Isolation: No transaction has access to any other transaction that is in an intermediate or unfinished state. Thus, each transaction is independent unto itself. This is required for both performance and consistency of transactions within a database.
* Durability: Once the transaction is complete, it will persist as complete and cannot be undone; it will survive system failure, power loss and other types of system breakdowns.
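
Atomicity and consistency can be observed with Python's built-in `sqlite3` module. The `accounts` table, the `CHECK` rule, and the `transfer` helper are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failing transfer the credit to bob is applied first, but it is rolled back when the debit violates the balance rule, so no half-completed transfer is ever visible.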

8
Q

What is BASE?

A

* *Basically Available:* This constraint states that the system does guarantee the availability of the data as regards the CAP theorem; there will be a response to any request. But that response could still be ‘failure’ to obtain the requested data, or the data may be in an inconsistent or changing state, much like waiting for a check to clear in your bank account.
* *Soft state:* The state of the system could change over time, so even during times without input there may be changes going on due to ‘eventual consistency’; thus the state of the system is always ‘soft’.
* *Eventual consistency:* The system will eventually become consistent once it stops receiving input. The data will propagate to everywhere it should sooner or later, but the system will continue to receive input and is not checking the consistency of every transaction before it moves on to the next one.
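
A toy simulation of soft state and eventual consistency. It assumes versioned writes with a "highest version wins" rule and a single synchronous propagation pass; the `Replica` class and `anti_entropy` function are illustrative names, not a real system's API.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Accept a write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def anti_entropy(replicas):
    """Push every replica's entries to all replicas; the newest version wins."""
    for source in replicas:
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, version)

a, b, c = Replica(), Replica(), Replica()
a.write("cart", ["book"], version=1)
b.write("cart", ["book", "pen"], version=2)
before = [r.read("cart") for r in (a, b, c)]  # replicas disagree
anti_entropy([a, b, c])
after = [r.read("cart") for r in (a, b, c)]   # replicas agree
```

Every read gets an answer (basically available), but before propagation the answer depends on which replica is asked; once input stops and propagation finishes, all replicas return the same value.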

9
Q

Compare NoSQL and RDBMS

Data format

Scalability

Querying

Storage mechanism

A
10
Q

What is NoSQL, what are its features?

A

When compared to relational databases, NoSQL databases are more scalable and provide superior performance, and their data model addresses several issues that the relational model is not designed to address:

* Large volumes of structured, semi-structured, and unstructured data
* Agile sprints, quick iteration, and frequent code pushes
* Object-oriented programming that is easy to use and flexible
* Efficient, scale-out architecture instead of expensive, monolithic architecture

11
Q

When is NoSQL mostly used?

A

NoSQL systems are typically used to keep user state (say, the OLTP database) for many users, in cases where key-value lookups and (key, value) updates are sufficient. Their role in Big Data pipelines is therefore often at the entry stage, capturing events as they happen: these events are logged, accumulated, and periodically stored, e.g. on HDFS, for further analysis. However, as explained in the Lambda Architecture (see the streams lecture), NoSQL systems are also often used to serve precomputed views, where the computation is some kind of Big Data analysis pipeline. Hence they also act as the final station that serves, e.g., recommendations, statistics, preferences, and suggestions, thereby closing the loop of Big Data architectures.
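
The two roles described above can be sketched with plain dictionaries standing in for key-value stores and a list standing in for the accumulated event log. All names here (`capture`, `batch_job`, `top_items`) are hypothetical and only illustrate the flow.

```python
from collections import Counter

event_log = []     # stands in for events accumulated on, e.g., HDFS
user_state = {}    # key-value store holding per-user state (entry stage)
served_views = {}  # key-value store serving precomputed results (final stage)

def capture(user, item):
    """Entry stage: update the user's state and log the event."""
    user_state[user] = item         # simple (key, value) update
    event_log.append((user, item))  # kept for later batch analysis

def batch_job():
    """Analysis stage: precompute a view and publish it for serving."""
    counts = Counter(item for _, item in event_log)
    served_views["top_items"] = [item for item, _ in counts.most_common(2)]

for user, item in [("u1", "a"), ("u2", "b"), ("u1", "b"),
                   ("u3", "b"), ("u2", "a"), ("u3", "c")]:
    capture(user, item)
batch_job()
```

After the batch job runs, serving a recommendation is again a single key lookup in `served_views`, which is the sense in which the NoSQL store closes the loop.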
