Lecture 3/4: Big Data Fundamentals Flashcards
1
Q
5 V’s of Big Data
A
- Volume
- Variety
- Velocity
- Veracity
- Value
2
Q
Distributed Storage Systems
A
- HDFS (Hadoop Distributed File System)
- Cloud storage
3
Q
Data Processing Frameworks
A
- MapReduce
- Apache Spark
4
Q
Real-Time Streaming Technologies
A
- Apache Kafka
- Apache Storm
- Apache Flink
5
Q
ETL Pipelines
A
- Extract, transform, load (ETL) processes integrate data from different sources into a centralized repository
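The three ETL stages can be sketched in a few lines of Python. This is a minimal illustration and not material from the lecture: the two sources (`CRM_CSV`, `BILLING_ROWS`), the `customer_spend` table, and all function names are hypothetical, and an in-memory SQLite database stands in for the centralized repository.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export with inconsistent name formatting.
CRM_CSV = "customer_id,name\n1, Alice \n2,bob\n"

# Hypothetical source 2: payment records with amounts stored as strings.
BILLING_ROWS = [
    {"customer_id": 1, "amount": "19.99"},
    {"customer_id": 2, "amount": "5.00"},
    {"customer_id": 1, "amount": "0.01"},
]


def extract():
    """Extract: read raw records from each source."""
    customers = list(csv.DictReader(io.StringIO(CRM_CSV)))
    return customers, BILLING_ROWS


def transform(customers, payments):
    """Transform: clean names, convert amounts to cents, join the sources."""
    totals = {}
    for payment in payments:
        cents = round(float(payment["amount"]) * 100)
        key = payment["customer_id"]
        totals[key] = totals.get(key, 0) + cents
    rows = []
    for customer in customers:
        customer_id = int(customer["customer_id"])
        name = customer["name"].strip().title()
        rows.append((customer_id, name, totals.get(customer_id, 0)))
    return rows


def load(rows, conn):
    """Load: write the unified rows into the central repository."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_spend ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, total_cents INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO customer_spend VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run all three stages and return the loaded table contents."""
    customers, payments = extract()
    load(transform(customers, payments), conn)
    query = "SELECT customer_id, name, total_cents FROM customer_spend ORDER BY customer_id"
    return conn.execute(query).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` runs the whole pipeline against a throwaway database. Keeping each stage in its own function means a source or the target store can be swapped without touching the other stages.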
6
Q
Challenges with Big Data (Lecture)
A
- Scalability
- Integration
- Quality and consistency
- Security and privacy
- Real-time processing