Cloud Dataflow Flashcards
1
Q
What is Cloud Dataflow?
A
A fully managed service for building batch and streaming data processing pipelines, in which data is collected, transformed, and then output
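The collect, transform, output shape of a pipeline can be sketched in plain Python. This is only an illustration of the three stages, not the Dataflow or Apache Beam API; the function names and the sample records are hypothetical.

```python
from collections import defaultdict

# Hypothetical input records: (store, sale amount as text).
RAW_EVENTS = [
    ("north", "12.50"),
    ("south", "7.25"),
    ("north", "3.00"),
    ("south", "bad"),  # malformed amount, dropped by the transform stage
]


def collect(events):
    """Source stage: hand records to the pipeline one at a time."""
    yield from events


def transform(records):
    """Transform stage: parse amounts and skip rows that fail to parse."""
    for store, amount in records:
        try:
            yield store, float(amount)
        except ValueError:
            continue


def output(records):
    """Sink stage: aggregate a total per store and return it."""
    totals = defaultdict(float)
    for store, amount in records:
        totals[store] += amount
    return dict(totals)


# Stages are chained lazily, each consuming the previous one's output.
totals = output(transform(collect(RAW_EVENTS)))
print(totals)
```

In a real pipeline each stage would be a managed source, transform, or sink, and the service would spread the work across machines; here everything runs in one process.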
2
Q
What are the key features of Cloud Dataflow? (7)
A
- Based on Apache Beam
- Processes data on multiple machines in parallel
- Handles streaming data, such as messages from Cloud Pub/Sub
- Handles batch or archived data, such as data from BigQuery
- Serverless
- Templates for ease of replication
- Best choice when you are not already using Apache Hadoop or Spark
3
Q
Where does Cloud Dataflow deliver its output?
A
BigQuery, Cloud Machine Learning, Cloud Bigtable
4
Q
3 example use cases for Cloud Dataflow
A
- Analytical dashboards
- Forecasting sales trends
- ETL (extract, transform, load)