Google Cloud ACE Topics Flashcards
3 types of Storage Systems
Cache, Persistent, Object
Example of cache in GC
Cloud Memorystore - managed Redis service
1 benefit of cache
low latency - sub-millisecond access
3 problems of cache
1. volatile - lost when the machine shuts down
2. more expensive than SSD or HDD
3. can get out of sync with the system of truth (persistent storage)
Cache - quick definition
in memory data store
Persistent Storage - quick definition
durable block storage
Where can you use persistent storage?
Can be attached to VMs in Compute Engine and Kubernetes Engine.
Where is persistent storage located?
On the network. They are not attached to the physical servers hosting your VM. They exist independently of VMs.
Can a VM have locally attached persistent storage?
Yes. VM can have local SSD, but it is volatile.
Types of persistent storage
SSD and HDD
Differences between SSD and HDD
HDDs have higher latency but lower cost. Network-attached SSDs are 40/20 times faster (read/write) than HDDs; locally attached SSDs are 200/150 times faster.
Max size of SSD/HDD
64TB
4 facts of persistent storage
1. can create file systems on them
2. data is automatically encrypted
3. size can be increased while mounted to a VM
4. can be mounted in read-only mode on multiple VMs at once
Is persistent storage zonal or regional?
Both. Regional replicates data across different zones, but is more expensive than purely zonal storage.
What is Object Storage good for?
large volumes of data that is shared widely
3 storage data models
1. object
2. relational
3. NoSQL
GC app example of object model storage
Cloud Storage
How are object model objects stored?
Atomically - you cannot read parts of an object. You must copy the object to a server, make changes, and then copy it back to the object storage system. Used when you don't need fine-grained access to data within the object while it is in the object store.
3 GC app examples of Relational model storage
1. Cloud SQL
2. Cloud Spanner
3. BigQuery
3 facts of Relational model storage
1. supports frequent queries and updates to data
2. allows for a consistent view of data
3. supports database transactions (Cloud SQL and Cloud Spanner)
3 GC app examples of NoSQL model storage
1. Cloud Datastore
2. Cloud Firestore
3. Bigtable
Benefits/Limitations of 3 types of Storage systems.
1. cache is fastest but most expensive and volatile
2. persistent storage is used for things that need block storage; SSDs are faster but more expensive
3. object storage is used for large volumes of data kept for long periods of time
5 things to consider when planning storage
1. frequency of read/write
2. consistency
3. transaction support
4. cost
5. latency
Planning storage - frequency of read/write: best for structured data that is frequently accessed
Cloud SQL
Planning storage - frequency of read/write: best for a global database that supports relational read/writes
Cloud Spanner
Planning storage - frequency of read/write: best for writing data at high rates and in large volumes
Bigtable
Planning storage - frequency of read/write: best for writing files and downloading them in their entirety
Cloud Storage
Planning Storage - Consistency: strongest consistency (2)
Cloud SQL
Cloud Spanner
Planning Storage - Consistency: good for unstructured data
Datastore
Planning Storage - Transaction Support: 3 apps that support transactions
Cloud SQL
Cloud Spanner
Datastore
Planning Storage - Latency: fastest
Bigtable
Planning Storage - Latency: globally consistent and scalable
Cloud Spanner
Cloud Functions - Definition
Serverless computing platform designed to run single-purpose pieces of code in response to events in GCP environments (FaaS)
How are Cloud Functions managed? (3 key points)
1. functions execute in a secure, isolated environment
2. since they each run in a separate instance, they don't share memory, so they need to be stateless
3. multiple instances may be running at once
How long can a Cloud Function run?
default timeout is 1 min, but can be configured up to 9 min
What languages do Cloud Functions support?
Node.js 8
Node.js 10
Python 3.7
Go 1.11
Cloud Functions - key points (3)
1. managed independently from other services
2. short-running code
3. fully managed - serverless
How do cloud functions work?
Events have triggers, which execute a function in response to the event.
Examples of Cloud Function events (5)
1. HTTP request
2. Cloud Storage event - adding, deleting, etc. a file
3. Cloud Pub/Sub event - publishing a message
4. Firebase - database trigger
5. Stackdriver Logging
Cloud Function functions - key points (3)
1. run in a separate instance every time they are invoked
2. no way to share data without using an external service
3. the function is passed arguments about the event
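The points above can be sketched as code. A minimal sketch of a background function, assuming the Pub/Sub trigger shape (the function name and message contents are hypothetical, not from the source):

```python
import base64

# Hypothetical background function wired to a Pub/Sub trigger.
# 'event' carries the message payload; 'context' carries event metadata.
def handle_pubsub(event, context):
    # Pub/Sub message bodies arrive base64-encoded in event["data"].
    message = base64.b64decode(event["data"]).decode("utf-8")
    # Stateless: nothing computed here survives to the next invocation.
    return f"Received: {message}"
```

Because each invocation may run in a fresh instance, any state to share across calls has to go to an external service such as Cloud Storage or Memorystore.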
Cloud Functions use case examples (4)
1. webhooks - respond to an HTTP request
2. image processing - validate or transform images
3. mobile back end - react to a storage, authentication or data event
4. IoT - react to Pub/Sub messages from devices
What needs to be filled in when creating a cloud function via the cloud console?
1. name
2. memory allocation - 128MB to 2GB
3. trigger
4. event type - depends on trigger
5. source of function code - editor, zip file, upload, etc.
6. runtime - Node.js, Python or Go
Create cloud function via shell - main command
gcloud functions deploy [NAME]
Cloud function shell parameters (3)
--trigger-resource or --trigger-topic
--trigger-event
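Combining the deploy command with its trigger flags - a hedged sketch with a hypothetical function name, topic, and runtime; it assumes an authenticated gcloud project and will not run without one:

```shell
# Hypothetical names; substitute your own function and topic.
gcloud functions deploy process-message \
  --runtime python37 \
  --trigger-topic my-topic \
  --timeout 540s   # 9 minutes, the configurable maximum
```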
Command to delete a cloud function via shell?
gcloud functions delete [NAME]
BigQuery description
petabyte scale analytics database service for data warehousing
BigQuery key points (9)
1. serverless
2. uses standard SQL queries
3. near real-time interactive analysis of massive data sets
4. can access data stored in Cloud Storage, Cloud SQL, Bigtable and Google Drive
5. storage and computing are handled and billed separately
6. automatic data replication
7. can modify data with DML
8. can query public or commercial data sets
9. high availability
What are 3 BigQuery use cases?
1. real-time inventory
2. predictive marketing
3. analytical events
What Google apps can BigQuery access?
1. Cloud Storage
2. Cloud SQL
3. Cloud Bigtable
4. Google Drive
How do you estimate the cost of a BigQuery query via shell?
Run the query with the --dry_run flag set:
bq --location=[LOCATION] query --use_legacy_sql=false --dry_run [SQL QUERY]
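A concrete form of the dry-run estimate, with a hypothetical project, dataset, and table; it requires the bq CLI and a configured project:

```shell
# --dry_run reports how many bytes the query would process
# without actually running it or incurring query charges.
bq --location=US query --use_legacy_sql=false --dry_run \
  'SELECT name FROM `my-project.my_dataset.my_table` LIMIT 10'
```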
What are jobs in BigQuery?
Processes used to load, export, copy and query data. Jobs are automatically started when you start one of these processes.
How do you view the status of a BigQuery job? (Shell and console)
Console - click Job History in the BigQuery console
Shell - bq --location=[LOCATION] show -j [JOB ID]
How do you export BQ data from the console?
go to BigQuery -> Resources, open the dataset containing the table to be exported and select the table. Export options are on upper right
Where can you export BQ data to?
Cloud Storage or Data Studio (a GCP analysis tool)
How do you import BQ data via console?
Go to BigQuery -> Resources and select a dataset to import into. Click the Create Table tab. Then select:
1. a source
2. file format (if the source is not an empty table)
3. table type (external or native - if external, the data is kept in the source location and only metadata about the table is stored in BigQuery)
4. table name
What file formats can you import data from in BigQuery?
CSV
JSON
Avro
Parquet
ORC
Cloud Datastore Backup
How do you export BigQuery data from the command line?
bq extract --destination_format [FORMAT] --compression [COMPRESSION] --field_delimiter [DELIMITER] --print_header [BOOLEAN] [PROJECT ID]:[DATASET].[TABLE] gs://[BUCKET]/[FILENAME]
How do you import data into BigQuery from the command line?
bq load --autodetect --source_format=[FORMAT] [DATASET].[TABLE] [PATH TO SOURCE]
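The extract and load commands filled in with hypothetical names, as a sketch (not runnable without a project and bucket):

```shell
# Export a table to Cloud Storage as gzip-compressed CSV
# (project, dataset, table, and bucket names are hypothetical).
bq extract --destination_format CSV --compression GZIP \
  --field_delimiter ',' --print_header true \
  my-project:my_dataset.my_table gs://my-bucket/export.csv.gz

# Load a local CSV into a table, letting BigQuery detect the schema.
bq load --autodetect --source_format=CSV \
  my_dataset.my_table ./data.csv
```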
Cloud Dataflow description
Fully managed service for creating data (batch and stream) processing pipelines where data is collected, transformed and then output
What are the key features of Cloud Dataflow? (7)
1. based on Apache Beam
2. processes data on multiple machines in parallel
3. handles streaming data, like Cloud Pub/Sub
4. handles batch or archived data, like BigQuery
5. serverless
6. templates for ease of replication
7. best choice if not using Apache Hadoop or Spark
Where does Cloud Dataflow deliver its output?
BigQuery, Cloud Machine Learning, Cloud Bigtable
3 examples of Cloud Dataflow
1. analytical dashboards
2. forecasting sales trends
3. ETL
Cloud SQL definition
Managed database service that provides MySQL, PostgreSQL and SQL Server databases
Key points of CloudSQL (4)
1. allows users to set up a database without all the database administration tasks
2. high availability - manages replication and allows for automatic failover
3. suited for applications with a consistent data structure (for databases that don't need to scale horizontally)
4. scales vertically (by running on servers with more memory and CPUs)
What databases and versions does CloudSQL support?
1. MySQL 5.6/5.7 - up to 416GB RAM and 30TB data storage
2. PostgreSQL - up to 416GB RAM, 64 CPUs and 30TB storage
3. SQL Server - up to 416GB RAM, 64 CPUs and 30TB storage
How do you connect to Cloud SQL via shell?
gcloud sql connect [INSTANCE NAME] --user=[USERNAME]
How do you backup (on demand) Cloud SQL via shell?
gcloud sql backups create --async --instance [INSTANCE NAME]
How do you schedule automatic backup on Cloud SQL via shell?
gcloud sql instances patch [INSTANCE NAME] --backup-start-time [HH:MM]
Where is backup data stored for Cloud SQL?
In a bucket in Cloud Storage.
How do you export Cloud SQL data via shell?
gcloud sql export [TYPE] [INSTANCE NAME] gs://[BUCKET]/[FILE NAME] --database=[DATABASE NAME]
You need to make sure that the service account can write to the bucket.
How do you import Cloud SQL data via shell?
gcloud sql import [TYPE] [INSTANCE NAME] gs://[BUCKET]/[FILE NAME] --database=[DATABASE NAME]
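The export/import pair with hypothetical instance, bucket, and database names; `sql` is the [TYPE] for SQL dump files (`csv` is the other option). The instance's service account must be able to write to the bucket for the export to succeed:

```shell
# Export a database to a SQL dump in Cloud Storage (hypothetical names).
gcloud sql export sql my-instance gs://my-bucket/backup.sql \
  --database=my_database

# Import the same dump into an instance.
gcloud sql import sql my-instance gs://my-bucket/backup.sql \
  --database=my_database
```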
Cloud Bigtable description
a petabyte-scale fully managed NoSQL database service
Key points of Cloud Bigtable (6)
1. can manage billions of rows and thousands of columns - not all rows need to use all columns
2. low-millisecond latency - can support millions of operations per second
3. based on the NoSQL wide-column data model, not a document database
4. supports the HBase API for Hadoop
5. integrates with open source tools for data processing, graph analysis and time-series analysis
6. runs in clusters and scales horizontally
What are 3 usages for Cloud Bigtable?
Applications with high data volume and high-velocity ingest of data:
1. time series
2. IoT
3. financial applications
How do you import or export data to/from Bigtable?
There are no options via the console or shell. You need to use a Java application or the HBase interface to execute HBase commands.
Cloud Spanner description
Globally distributed relational database
Key points of Cloud Spanner (6)
1. combines benefits of a relational database with a NoSQL database - strong consistency, transactions and horizontal scaling
2. high availability
3. enterprise-grade security with encryption at rest and in transit
4. ANSI 2011 standard SQL
5. much more expensive than other databases
6. regional or multi-regional
When is Cloud Spanner used?
when there are extremely large volumes of relational data or data that needs to be globally distributed while ensuring consistency and transaction integrity across all servers
2 examples of where to use Cloud Spanner
1. global supply chains
2. financial services applications
What is important to know about importing or exporting data to/from Cloud Spanner? (3)
1. the import/export will incur Cloud Dataflow charges
2. there may be additional charges if the region the job is run in does not overlap the region in which the instance resides
3. you cannot import/export via shell
Cloud Datastore description
Highly-scalable NoSQL managed document database
Key points of Cloud Datastore (5)
1. managed service - serverless
2. document database
3. accessed via REST API from Compute Engine, Kubernetes Engine or App Engine
4. automatically partitions data and scales up or down as needed
5. supports transactions, indexes and SQL-like queries (using GQL)
What is Cloud Datastore suited for?
Applications that demand high scalability, structured data and don’t need strong consistency
Key points of a document database (7)
1. does not use the relational model and does not require a fixed structure or schema
2. data is organized into documents
3. documents are made up of key-value pairs called entities
4. entities do not need to have the same set of properties
5. allows for a flexible schema
6. does not support relational operations like joining tables or computing aggregates
7. a kind is analogous to a table name
How do you import/export Cloud Datastore data?
Done via shell only and data is stored in a bucket in Cloud Storage
How do you export Cloud Datastore data via shell, and what files does it create?
gcloud datastore export --namespaces="(default)" gs://[BUCKET]
1. it will create a folder using the date and time of the export
2. the folder will contain a metadata file and a folder containing the exported data
3. the metadata file is used when importing that data
What permission does someone doing a Cloud Datastore export need?
datastore.database.export
What is Cloud Memorystore?
in memory cache service (managed Redis Service)
What are some key points of Cloud Memorystore ? (5)
1. managed Redis service for caching frequently used data
2. sub-millisecond access
3. can be configured for high availability
4. can be used with Compute Engine, App Engine and Kubernetes Engine
5. 1GB to 300GB of memory
What is Cloud Firestore?
Managed NoSQL database service designed for highly scalable web and mobile apps.
What are some key points of Cloud Firestore? (3)
1. uses the document data model
2. designed for storing, synchronizing and querying data across distributed applications like mobile apps
3. supports transactions and provides multi-regional replication
What is Cloud Filestore?
Shared file system for use with Compute Engine and Kubernetes Engine
What are some key points of Cloud Filestore? (5)
1. based on NFS
2. suitable for applications that require operating-system-like file access
3. exists independently of the VMs or applications that access those files
4. can support a high number of IO operations per second
5. variable storage capacity
What is Cloud Armor?
It delivers defense at scale against infrastructure and application DDoS attacks.
What are the key points of Cloud Armor? (5)
1. allow or restrict access based on IP
2. predefined rules to counter cross-site scripting attacks
3. counter SQL injection
4. restrict access based on geolocation of incoming traffic
5. define rules at layer 3 (network) and layer 7 (application)
What is Cloud CDN?
a Content Delivery Network - allows low latency response by caching content on a number of servers around the world.
What is Cloud Interconnect?
a service for connecting existing networks to GCP
What are three key points of Cloud Interconnect?
1. traffic between your on-premises network and your VPC doesn't traverse the public internet
2. two options: dedicated and partnered
3. standard Google VPN services can be used if you don't mind using the public internet
What are the two options for Cloud Interconnect, and how are they different?
1. Dedicated (direct access) - a direct connection is maintained between an on-premises or hosted data center and a Google colocation facility
2. Partnered (peered) - a third-party network provider provides connectivity between the company's data center and Google
What is Cloud SDK?
a command line interface for managing GCP resources.
What client libraries exist for Cloud SDK?
Java, Python, Node.js, Ruby, Go, .NET and PHP
What is Cloud Trace?
a distributed tracing system for collecting latency data from an application
Key points of Cloud Trace?
1. shows where applications are spending their time (bottlenecks)
2. traces are generated when Cloud Trace is called from an application
3. you can create reports that filter trace data according to report criteria
What is Cloud Status?
Provides status information on the services that are part of GCP. The dashboard lists services and uses icons to display their statuses.
What is Cloud AutoML?
allows a developer with no machine learning experience to build machine learning models
What is Cloud Machine Learning Engine?
for building and deploying scalable machine learning systems
What is Cloud Natural Language Processing?
for analyzing human language and extracting information from text
What is Cloud Vision?
an image analysis platform
Billing account key points? (5)
1. stores info on how to pay for resources used
2. associated with one or more projects
3. all projects must have a billing account associated with them
4. can have a similar structure to the resource hierarchy
5. can be exported to BigQuery or a Cloud Storage file (CSV or JSON)
What are the two types of billing accounts?
1. self-service - paid automatically by debit, credit or bank account
2. invoiced - invoices are sent to customers
What are the 4 roles associated with billing accounts and what are their permissions?
1. Billing Account Creator - can create new self-service billing accounts
2. Billing Account Admin - manages billing accounts but cannot create them
3. Billing Account User - allows a user to link projects to a billing account
4. Billing Account Viewer - view billing account cost and transactions
Billing budgets and alerts - key points (5)
1. you can be sent a notice when a certain percentage of your budget has been spent in a month
2. that amount can be a set amount or based on the previous month's amount
3. the three default percentages are 50%, 90% and 100%, but you can add more
4. alerts will be sent via email, but can also be sent to Cloud Pub/Sub
5. since more than one project can be associated with a billing account, the alert amount needs to take into account the amount spent on all projects in the account
Block Storage key points (6)
1. uses fixed-size blocks (4KB and up)
2. available on disks attached to a VM
3. persistent - exists independently of the VM
4. ephemeral - exists only while the VM is running
5. faster than object storage
6. used by file systems and databases
What is Cloud Dataprep?
allows exploration and preparation of data for analysis
How are gcloud commands formatted?
1. start with a group to indicate a resource (e.g. compute)
2. followed by a subgroup to indicate what type of group resource you are working with (e.g. instances)
3. after a subgroup, usually a verb and then parameters
Example: gcloud compute instances create [instance name] --zone us-central1-a
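The group/subgroup/verb anatomy can be shown by assembling a command from its parts - an illustrative echo with a hypothetical instance name:

```shell
# group: the resource family; subgroup: the resource type;
# verb: the action; then parameters.
GROUP="compute"
SUBGROUP="instances"
VERB="create"
PARAMS="my-instance --zone us-central1-a"  # hypothetical instance name
echo "gcloud $GROUP $SUBGROUP $VERB $PARAMS"
```

This prints `gcloud compute instances create my-instance --zone us-central1-a`, the same shape as the example above.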