Big Data Flashcards
1
Q
What is volume?
A
- Organisations now store information about online communications, purchases and transactions, which produces large amounts of data that may be kept indefinitely
- Unstructured data streaming in from social media is a further source
- Now that information can be stored digitally, the cost of storing it has fallen
- As a result, other issues have emerged, such as deciding which data is relevant
2
Q
What is velocity?
A
Data streams in at unprecedented speed and must be processed promptly
3
Q
What is variety?
A
- Data today comes in all kinds of formats
- These include structured, numeric data in traditional databases as well as unstructured text documents, email, video, audio, stock market data and financial transactions
4
Q
What is variability?
A
- Data flows can be highly inconsistent
- E.g., when an event or idea starts trending on social media, it suddenly becomes popular and high profile
- Daily, seasonal, and event-triggered peak data loads can be challenging to manage
- This is even more challenging with unstructured data
5
Q
What is complexity?
A
- Today’s data comes from multiple sources
- Linking, matching, sorting and transforming data across systems is still a major undertaking, as is connecting and correlating the relationships within it
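
To make "link, match, sort and transform" concrete, here is a minimal Python sketch. It is an illustration only: the two source systems, the field names and the sample values are all hypothetical, and real cross-system matching is usually far messier than matching on a cleaned-up email address.

```python
# Hypothetical records from two separate systems (all names and values are illustrative).
crm_customers = [
    {"customer_id": "C-002", "email": "Bo@Example.com "},
    {"customer_id": "C-001", "email": "ana@example.com"},
]
web_orders = [
    {"order_id": 17, "contact": "ANA@example.com", "total": "19.99"},
    {"order_id": 18, "contact": "bo@example.com", "total": "5.00"},
    {"order_id": 19, "contact": "unknown@example.com", "total": "7.50"},
]


def normalise_email(raw: str) -> str:
    """Transform: bring both systems' email formats to one canonical form."""
    return raw.strip().lower()


def link_orders_to_customers(customers, orders):
    """Match orders to customers on the normalised email, then sort the result."""
    by_email = {normalise_email(c["email"]): c["customer_id"] for c in customers}
    linked, unmatched = [], []
    for order in orders:
        customer_id = by_email.get(normalise_email(order["contact"]))
        if customer_id is None:
            # No counterpart in the other system: keep track of it instead of dropping it.
            unmatched.append(order["order_id"])
        else:
            linked.append(
                {
                    "customer_id": customer_id,
                    "order_id": order["order_id"],
                    "total": float(order["total"]),  # transform text amount to a number
                }
            )
    linked.sort(key=lambda row: (row["customer_id"], row["order_id"]))
    return linked, unmatched


linked, unmatched = link_orders_to_customers(crm_customers, web_orders)
```

Even in this toy case the two systems disagree on field names, letter case, whitespace and data types, and one order has no matching customer at all. Those are the kinds of mismatches that make linking data across many real sources hard.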