Big Data Flashcards
BigQuery
- serverless, highly scalable and cost-effective cloud data warehouse
- analyze large datasets quickly and easily
Cloud Dataproc
- managed Hadoop and Spark service to process large datasets
- good choice for running batch and streaming data processing jobs
Cloud Dataflow
- fully-managed, real-time data processing service
- for batch and streaming Big Data processing
- good choice for running real-time data processing jobs
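Example (a toy sketch, not the Dataflow or Apache Beam API): the "batch and streaming" point can be pictured as one transform applied to both a bounded and an unbounded source. The function and event fields below (parse_and_filter, endless_events, "user", "amount") are made up for illustration.

```python
import itertools


def parse_and_filter(events):
    """Keep only events with a positive amount. Works on any iterable."""
    for event in events:
        if event["amount"] > 0:
            yield {"user": event["user"], "amount": event["amount"]}


# Batch: a bounded, in-memory dataset processed in one pass.
batch = [
    {"user": "a", "amount": 5},
    {"user": "b", "amount": 0},
    {"user": "c", "amount": 2},
]
batch_result = list(parse_and_filter(batch))


def endless_events():
    """Hypothetical unbounded source standing in for a live event stream."""
    for i in itertools.count():
        yield {"user": f"u{i}", "amount": i % 3}


# Streaming: the same transform over an unbounded source; results are
# consumed as they arrive, so only the first three are taken here.
stream_result = list(itertools.islice(parse_and_filter(endless_events()), 3))
```

The transform does not need to know whether its input ever ends, which is the idea behind using one service for both kinds of job.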
Cloud Data Fusion
- cloud-native, fully-managed, enterprise data integration service
- good choice for integrating data from a variety of sources
Cloud Pub/Sub
- fully-managed, real-time messaging service between apps
- good choice to send and receive messages reliably and efficiently
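Example (an in-memory toy model, not the Pub/Sub client library): messages are published to a topic, each subscription receives its own copy, and a message that is not acknowledged is delivered again. The class and method names below (ToyBroker, pull, ack, nack) are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class ToyBroker:
    """In-memory stand-in for a messaging service (hypothetical)."""

    def __init__(self):
        # topic name -> {subscription name -> queue of pending messages}
        self._subscriptions = defaultdict(dict)
        # ack id -> (topic, subscription, message) awaiting acknowledgment
        self._unacked = {}
        self._next_id = 0

    def subscribe(self, topic, subscription):
        self._subscriptions[topic].setdefault(subscription, deque())

    def publish(self, topic, message):
        # Every subscription on the topic gets its own copy.
        for queue in self._subscriptions[topic].values():
            queue.append(message)

    def pull(self, topic, subscription):
        queue = self._subscriptions[topic][subscription]
        if not queue:
            return None
        message = queue.popleft()
        self._next_id += 1
        self._unacked[self._next_id] = (topic, subscription, message)
        return self._next_id, message

    def ack(self, ack_id):
        # Processing succeeded: forget the message.
        self._unacked.pop(ack_id)

    def nack(self, ack_id):
        # Processing failed: put the message back for redelivery.
        topic, subscription, message = self._unacked.pop(ack_id)
        self._subscriptions[topic][subscription].appendleft(message)


broker = ToyBroker()
broker.subscribe("orders", "billing")
broker.subscribe("orders", "shipping")
broker.publish("orders", "order-1")

ack_id, message = broker.pull("orders", "billing")
broker.nack(ack_id)  # failed attempt: message returns to the queue
ack_id, retry = broker.pull("orders", "billing")
broker.ack(ack_id)  # success: "billing" will not see it again
```

The publisher never addresses a receiver directly; it only names a topic, which is what lets apps be added or removed without changing the sender.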
Cloud Dataprep
- data preparation tool to clean, transform and integrate data for analysis
- prepare data for ML and big data analysis
Cloud Datalab
- interactive data analysis and exploration tool to visualize and analyze data in a web browser
- good choice if you want to explore data interactively in notebooks (built on Jupyter) using Python and SQL
Cloud Data Catalog
- metadata management service to organize, discover and manage data assets
- good choice if you need to manage a large number of data assets and make them accessible to users