BigQuery Flashcards
BigQuery description
petabyte scale analytics database service for data warehousing
BigQuery key points (9)
- serverless
- uses standard sql queries
- near real-time interactive analysis of massive data sets
- can access info stored in Cloud Storage, Cloud SQL, Bigtable and Google Drive
- Storage and computing are handled and billed separately
- Automatic data replication
- Can modify data with DLL
- Can query public or commercial data sets.
- High availability
What are 3 BigQuery use cases?
- Real-time Inventory
- Predictive Marketing
- Analytical Events
What Google apps can BigQuery access?
- Cloud Storage
- Cloud SQL
- Cloud Bigtable
- Google Drive
How do you estimate the cost of a BigQuery query via shell?
You run the query with a flag of –dry_run set
bq –location=[LOCATION] query –use_legacy_sql=false –dry_run [SQL QUERY]
What are jobs in BigQuery?
Processes used to load, export copy and query data. Jobs are automatically started when you start one of these processes.
How do you view the status of a BigQuery job? (Shell and console)
Console - click job history from BQ console
Shell - bq –location=[LOCATION] show -j [JOB ID]
How do you export BQ data from the console?
go to BigQuery -> Resources, open the dataset containing the table to be exported and select the table. Export options are on upper right
Where can you export BQ data to?
Cloud Storage or Data Studio (a GCP analysis tool)
How do you import BQ data via console?
go to BiqQuery -> Resources and select a dataset to import into.
Click create table tab.
Select a source,
file format (if source is not empty table)
table type (external or native - if external, data is kept in source location and only metdata about the table is stored in BigQuery
table name
What file formats can you import data from in BigQuery?
CSV JSON Avro Parquet PRC Cloud Datastore Backup
How do you export BigQuery data from the command line?
bq extract –destination_format [FORMAT] –compression [COMPRESSION] –field_delimiter [DELIMITER] –print_header {BOOLEAN] [PROJECT ID]:[DATASET].[TABLE] gs://[BUCKET]/FILENAME
How do you import data into BigQuery from the command line?
bq load –autodetect –source_format=[FORMAT[ [DATASET].[TABLE] [PATH to SOURCE]