11 - Scaling Up: Big Data Flashcards

Question 1

Q

Parallel processing

Answer

A

parallelism: work on separate pieces at the same time
– challenges = coordination, mutability, blocking
distributed computing: same idea, but spread across different CPUs or machines
– challenges = sending instructions, fault tolerance, data storage and retrieval
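
A minimal sketch of the "separate pieces at the same time" idea, using Python's standard concurrent.futures. The names chunk_sum and parallel_sum are made up for illustration. Threads are used only to keep the example short; for CPU-bound work in CPython a process pool would likely be the better fit.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_sum(chunk):
    """Work on one independent piece; touches no shared mutable state."""
    return sum(chunk)


def parallel_sum(numbers, n_workers=4):
    """Sum a list by splitting it into pieces handled by separate workers."""
    # Split the input into roughly equal, independent pieces.
    size = max(1, -(-len(numbers) // n_workers))  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    # Each worker handles one piece; collecting the results blocks
    # until every worker has finished.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(chunk_sum, chunks))
    # Coordination step: combine the partial results.
    return sum(partials)
```

Because each piece is independent and nothing is mutated in place, the only coordination needed here is the final combine. The mutability and blocking challenges show up once workers share state.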

Question 2

Q

Programming: imperative, declarative

Answer

A

imperative = give direct orders; manual scheduling and data control; performance optimization is possible (C, C++, Java, Matlab)
declarative = state goals; data is managed and stored automatically; scheduling is automatic but not necessarily efficient (SQL, R; Python can be)
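
To make the contrast concrete, here is a small sketch that computes the same total twice: once with an explicit loop (imperative) and once by stating the goal in SQL through Python's built-in sqlite3 module (declarative). The table t, its columns k and v, and both function names are assumptions made up for this example.

```python
import sqlite3

rows = [("a", 3), ("b", 5), ("a", 4)]


def total_imperative(rows, key):
    # Imperative: spell out every step (iterate, test, accumulate).
    total = 0
    for k, v in rows:
        if k == key:
            total += v
    return total


def total_declarative(rows, key):
    # Declarative: state the goal in SQL; the engine decides how to run it.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (k TEXT, v INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(v), 0) FROM t WHERE k = ?", (key,)
        ).fetchone()
        return total
    finally:
        conn.close()
```

Both functions return the same total for a given key. The difference is who decides the execution steps: the programmer in the first, the database engine in the second.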

Question 3

Q

Queue computing

Answer

A

Question 4

Q

Databases

Answer

A

- NoSQL (beyond SQL)

Question 5

Q

Big data

Answer

A

Question 6

Q

Further big data

Answer

A

- cloud computing: Spark