4.11 Big Data Flashcards
1
Q
Big Data Must Be Big in Which Three Ways? (3x2)
A
- VOLUME
- Must require more than one server to store all the data
- Large amount of data
- VELOCITY
- How fast is data arriving?
- How quickly do we need to process it?
- VARIETY
- How much does the data vary?
- Does the data vary widely in form and values?
2
Q
Fact Based Modelling (2)
A
- Allows you to deconstruct data into atomic units called facts
- Facts are placed within a master dataset
3
Q
Master Dataset (definition)
A
An ever-growing list of immutable, atomic facts
4
Q
Principles of Fact Based Modelling (5)
A
- Raw data stored as atomic facts
- Each fact must be identifiable
- Each fact captures a single piece of info
- Facts are immutable
- Facts are eternally true because each one is timestamped
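Example (not part of the original card): a minimal Python sketch of these principles, assuming a fact can be modelled as a frozen dataclass. The names Fact, MasterDataset, fact_id, entity, attribute and value are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from uuid import uuid4


@dataclass(frozen=True)  # frozen: a fact cannot be changed after creation
class Fact:
    fact_id: str         # identifiable: every fact has its own id (hypothetical field)
    entity: str          # what the fact is about, e.g. "user:1"
    attribute: str       # atomic: one attribute per fact
    value: object
    timestamp: datetime  # "this was true at this moment"


class MasterDataset:
    """Append-only list of facts: items are added, never updated in place."""

    def __init__(self):
        self._facts = []

    def append(self, entity, attribute, value, timestamp=None):
        fact = Fact(
            fact_id=str(uuid4()),
            entity=entity,
            attribute=attribute,
            value=value,
            timestamp=timestamp or datetime.now(timezone.utc),
        )
        self._facts.append(fact)
        return fact

    def facts(self):
        # Hand back a tuple so callers cannot mutate the stored list.
        return tuple(self._facts)


dataset = MasterDataset()
first = dataset.append("user:1", "city", "London")
second = dataset.append("user:1", "city", "Leeds")  # a new fact, not an update

try:
    first.value = "Paris"  # attempting to mutate a stored fact
except FrozenInstanceError:
    print("facts are immutable")
```

Moving city is recorded as a second fact rather than an overwrite, so the dataset only ever grows.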
5
Q
Advantages of Fact Based Modelling (6)
A
- Simple - no indexing
- New items are simply appended
- Data is always true
- Historical queries are easy to run
- Facts are immutable and time stamped
- Errors are easy to correct by returning to earlier facts
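Example (not part of the original card): a rough Python sketch of how timestamped facts might support historical queries and error correction. The Fact fields, the value_as_of helper and the sample data are all hypothetical, and "correcting" is modelled here simply as discarding the bad fact so that the earlier one applies again.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    entity: str
    attribute: str
    value: str
    timestamp: datetime


def value_as_of(facts, entity, attribute, when):
    """Return the most recent value recorded at or before `when`, or None."""
    matching = [
        f for f in facts
        if f.entity == entity and f.attribute == attribute and f.timestamp <= when
    ]
    if not matching:
        return None
    return max(matching, key=lambda f: f.timestamp).value


# Hypothetical sample data: the third fact is a data-entry error.
facts = [
    Fact("user:1", "city", "London", datetime(2020, 1, 1)),
    Fact("user:1", "city", "Leeds", datetime(2021, 6, 1)),
    Fact("user:1", "city", "Leedz", datetime(2022, 3, 1)),
]

# Historical query: where did the user live at the end of 2020?
print(value_as_of(facts, "user:1", "city", datetime(2020, 12, 31)))  # London

# Error correction: drop the bad fact and the earlier fact is current again.
corrected = [f for f in facts if f.value != "Leedz"]
print(value_as_of(corrected, "user:1", "city", datetime(2023, 1, 1)))  # Leeds
```

No index or update logic is needed: the query is a filter over timestamps, and the correction never touches the earlier facts.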