Big Data Lecture 01 Introduction Flashcards
What is the main learning objective of the course?
Learn to query gigantic amounts of data even when it is a bit messy.
How is Data Science similar to Physics?
It is an epistemic science of artificial data: Data Science relates to Computer Science the way Physics relates to Mathematics.
What was the first way humans transmitted data, what was its problem, and how was it solved?
People would speak or sing; however, the message would get distorted over time. This was solved by writing.
What was the first data storage format? What was its problem, and how was it solved?
Tables on clay tablets; tables are the most natural form of storing data. The problem was that copying was difficult; this was solved by the printing press.
How has data been stored in computers over time?
- 1960s: file-based systems
- 1970s: relational databases
- 2000s: NoSQL era
Three Vs of Big Data
- Volume
- Variety
- Velocity
- (Veracity)
Why do we store more data?
- We can: storage is cheap.
- It carries value.
- Combined data is worth more than the sum of its parts.
- We need data totality: some sites only operate well if they have all the data.
Name prefix for unit: 1 000 (3 zeros)
kilo (k)
Name prefix for unit: 1 000 000 (6 zeros)
Mega (M)
Name prefix for unit: 1 000 000 000 (9 zeros)
Giga (G)
Name prefix for unit: 1 000 000 000 000 (12 zeros)
Tera (T)
Name prefix for unit: 1 000 000 000 000 000 (15 zeros)
Peta (P)
Name prefix for unit: 1 000 000 000 000 000 000 (18 zeros)
Exa (E)
Name prefix for unit: 1 000 000 000 000 000 000 000 (21 zeros)
Zetta (Z)
Name prefix for unit: 1 000 000 000 000 000 000 000 000 (24 zeros)
Yotta (Y)
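As a memory aid for the prefix cards above, here is a minimal Python sketch that maps each prefix to its power of ten and formats a byte count accordingly. The names `SI_PREFIXES` and `format_bytes` are hypothetical, and using bytes (`B`) as the unit is an assumption made for illustration; the cards themselves do not name a unit.

```python
# Decimal (SI) prefixes from the cards, with the number of zeros as the exponent.
SI_PREFIXES = [
    ("kilo", "k", 3),
    ("mega", "M", 6),
    ("giga", "G", 9),
    ("tera", "T", 12),
    ("peta", "P", 15),
    ("exa", "E", 18),
    ("zetta", "Z", 21),
    ("yotta", "Y", 24),
]


def format_bytes(n: int) -> str:
    """Format a non-negative byte count using the largest prefix with n >= 10**exp.

    Hypothetical helper: the unit 'B' (bytes) is an assumption for illustration.
    """
    if n < 0:
        raise ValueError("byte count must be non-negative")
    # Walk from the largest prefix down and stop at the first one that fits.
    for _name, symbol, exp in reversed(SI_PREFIXES):
        if n >= 10 ** exp:
            return f"{n / 10 ** exp:.1f} {symbol}B"
    # Smaller than 1 000: no prefix needed.
    return f"{n} B"


if __name__ == "__main__":
    for value in (999, 1_500, 2_500_000_000_000_000, 10 ** 24):
        print(value, "->", format_bytes(value))
```

Note that these are decimal prefixes (powers of 1 000), not the binary ones (powers of 1 024) such as kibi and mebi.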