2_Storage and Databases Flashcards by Julien Heck

Cloud Storage

Bucket is a logical container for objects
Buckets exist within projects
Bucket names exist within a global namespace
Buckets can be:
- Regional
- Dual-regional
- Multi-regional
Storage class:
- Standard: Frequently accessed ("hot") data
- Nearline: Data accessed less than once every 30 days
- Coldline: Data accessed less than once every 90 days
- Archive: Archive storage or compliance data accessed less than once a year
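
The access-frequency guidance above can be sketched as a small helper. This is only an illustration, not an official API: the name `suggest_storage_class` is hypothetical, and the 30/90/365-day thresholds are assumptions taken from each class's minimum storage duration.

```python
def suggest_storage_class(days_between_accesses: int) -> str:
    """Suggest a Cloud Storage class from the typical gap between accesses.

    The thresholds are illustrative assumptions (30, 90 and 365 days),
    not values returned by any Google Cloud API.
    """
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    if days_between_accesses < 30:
        return "STANDARD"
    if days_between_accesses < 90:
        return "NEARLINE"
    if days_between_accesses < 365:
        return "COLDLINE"
    return "ARCHIVE"


if __name__ == "__main__":
    for gap in (1, 45, 120, 400):
        print(gap, suggest_storage_class(gap))
```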

Cloud Storage (cont.)

Costs:
- Operations charges
- Network charges
- Data retrieval charges
Lifecycle management
- Apply a lifecycle configuration to a bucket
- Cloud Storage periodically evaluates the configuration
- Actions from matching rules are applied to objects
Security and access control
- IAM for bulk access to buckets
- ACLs for granular access to buckets and individual objects
- Signed URLs
- Signed policy documents
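
The lifecycle flow on this card (attach a configuration to a bucket, then have its rules matched against objects) can be sketched as follows. The dictionary follows the general shape of a lifecycle configuration, where each rule pairs an action with a condition; the ages are example values. The evaluator `matching_actions` is a hypothetical simplification that handles only the `age` condition. In practice Cloud Storage performs the matching itself.

```python
from datetime import date

# Example lifecycle configuration: each rule pairs an action with a condition.
# The ages (30 and 365 days) are illustrative values, not recommendations.
LIFECYCLE_CONFIG = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30},
        },
        {
            "action": {"type": "Delete"},
            "condition": {"age": 365},
        },
    ]
}


def matching_actions(created: date, today: date, config: dict) -> list:
    """Return the action types whose `age` condition the object satisfies.

    Hypothetical local evaluator; it ignores every condition except `age`.
    """
    age_days = (today - created).days
    return [
        rule["action"]["type"]
        for rule in config["rule"]
        if age_days >= rule["condition"]["age"]
    ]


if __name__ == "__main__":
    print(matching_actions(date(2024, 1, 1), date(2024, 2, 15), LIFECYCLE_CONFIG))
```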

Cloud SQL

Direct lift and shift of traditional MySQL/PostgreSQL workloads with the maintenance stack managed for you.
Standard SLA is 99.95%
What is managed:
- OS installation/management
- Database installation/management
- Backups
- Scaling - disk space
- Availability:
  - Failover
  - Read Replicas
- Monitoring
- Authorize network connections/proxy/use SSL
Limitations
- Read Replicas limited to the same region as the master
- Up to 416 GB RAM and 30 TB storage
- Unsupported features:
  - User defined functions
  - InnoDB memcached plugin
  - Federated engine
  - SUPER privilege

Cloud SQL (cont.)

Importing data into Cloud SQL

Cloud Storage as a staging ground
SQL dump/CSV file format
External replica promotion (only for MySQL)
- Binary log retention

Export/Import process

Export SQL dump/CSV file:
- Export SQL dump file cannot contain triggers, views, stored procedures
Get dump/CSV file into Cloud Storage
Import from Cloud Storage into Cloud SQL instance

Best practices

Use the correct mysqldump flags for the dump file (--flag_name)
- --databases, --hex-blob, --skip-triggers, --set-gtid-purged=OFF, --ignore-table
Compress data to reduce costs
- Cloud SQL can import compressed .gz files
Use InnoDB for Second Generation instances
Specify agreeable maintenance window
- Scheduled maintenance is not considered a failover event by GCP
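
The dump flags listed under best practices can be assembled into a command line like this. It is a minimal sketch, not the official export procedure: the helper name `build_mysqldump_command` and the example database, user, and table names are hypothetical, and only flags named on this card are used (plus `--user` and `-p` for the connection).

```python
import shlex


def build_mysqldump_command(database: str, user: str, ignore_tables=()) -> list:
    """Build a mysqldump argument list using the flags from the card.

    Hypothetical helper: it only assembles the arguments and runs nothing.
    """
    cmd = [
        "mysqldump",
        f"--user={user}",
        "-p",                      # prompt for the password interactively
        "--hex-blob",              # dump binary columns as hexadecimal
        "--skip-triggers",         # leave triggers out of the dump
        "--set-gtid-purged=OFF",   # omit GTID information from the dump
    ]
    # --ignore-table expects the form db_name.tbl_name, once per table.
    cmd += [f"--ignore-table={database}.{table}" for table in ignore_tables]
    # --databases goes last because every following name is read as a database.
    cmd += ["--databases", database]
    return cmd


if __name__ == "__main__":
    command = build_mysqldump_command("shop", "exporter", ignore_tables=["audit_log"])
    print(shlex.join(command))
```

The output of such a command could then be piped through gzip before upload, since the card notes that Cloud SQL can import compressed .gz files.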

Cloud SQL High Availability

The HA configuration, sometimes called a cluster, provides data redundancy. A Cloud SQL instance configured for HA is also called a regional instance and is located in a primary and secondary zone within the configured region. Within a regional instance, the configuration is made up of a primary instance and a standby instance. Through synchronous replication to each zone’s persistent disk, all writes made to the primary instance are also made to the standby instance. In the event of an instance or zone failure, this configuration reduces downtime, and your data continues to be available to client applications.

Note: The standby instance cannot be used for read queries. This differs from the Cloud SQL for MySQL legacy HA configuration.

Cloud Firestore

Fully managed NoSQL database
- Serverless autoscaling NoSQL document store. Integrated with GCP and Firebase.
Realtime DB with mobile SDKs
- Android and iOS client libraries, frameworks for all popular programming languages
Scalability and Consistency
- Horizontal autoscaling and strong consistency, with support for ACID transactions.
Single Firestore database per project
Multi-regional for wide access; single region for lower latency within a single location

Cloud Firestore - Usage

Use Firestore for:
- Applications that need highly available structured data, at scale
- Product catalogs - real-time inventory
- User profiles - mobile apps
- Game save states
- ACID transactions - e.g. transferring funds between accounts
Supports JSON and SQL-like queries but cannot easily ingest CSV files.
Operates on multiple keys
Do not use Firestore for:
- Analytics
  - Use BigQuery/Cloud Spanner
- Extreme scale (10M+ reads/writes per second)
  - Use Bigtable
- No need for ACID transactions, or data that is not highly structured
  - Use Bigtable
- Lift and Shift (existing MySQL)
  - Use Cloud SQL
- Near-zero latency (sub-10 ms)
  - Use in-memory database (Redis)/Memorystore

Cloud Firestore - IAM

Primitive and predefined
Owner, user, viewer, import/export admin, index admin

Cloud Firestore - Data Model

Document Store (think MongoDB)
Documents (JSON data) are grouped by Collections
Documents can be hierarchical
- Documents can contain sub-collections
Each document/entity has one or more properties
Properties have a value assigned
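
The hierarchy described on this card (collections hold documents, documents hold properties and may hold sub-collections) can be modelled in memory as below. This is a sketch with plain dictionaries, not the Firestore client library; the names `DATABASE` and `get_document`, the `fields`/`collections` keys, and the sample data are all hypothetical.

```python
# In-memory stand-in for a document store: collection -> document -> content.
DATABASE = {
    "users": {                                  # collection
        "alice": {                              # document
            "fields": {"name": "Alice", "level": 7},
            "collections": {                    # sub-collections of the document
                "saves": {
                    "slot1": {
                        "fields": {"checkpoint": "cave"},
                        "collections": {},
                    },
                },
            },
        },
    },
}


def get_document(db: dict, path: str) -> dict:
    """Resolve a path such as 'users/alice/saves/slot1' to its properties.

    Document paths alternate collection and document IDs, so they always
    have an even number of segments.
    """
    parts = path.split("/")
    if len(parts) % 2 != 0:
        raise ValueError("a document path needs an even number of segments")
    collections = db
    document = None
    for i in range(0, len(parts), 2):
        collection_id, document_id = parts[i], parts[i + 1]
        document = collections[collection_id][document_id]
        collections = document["collections"]
    return document["fields"]


if __name__ == "__main__":
    print(get_document(DATABASE, "users/alice"))
    print(get_document(DATABASE, "users/alice/saves/slot1"))
```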

Cloud Spanner

Managed SQL-Compliant DB: SQL (ANSI 2011) schemas and queries with ACID transactions
Horizontally scalable: strong consistency across rows and regions, from 1 to 1000s of nodes.
Highly available: automatic global replication, no planned downtime and 99.99% SLA.
Similar architecture to Bigtable
Used for mission critical, relational databases that need strong transactional consistency (ACID compliant)
Higher workloads than Cloud SQL can support
Standard SQL format (ANSI 2011)
Regional or Multi-regional instances
CPU utilization is the recommended metric for scaling
Supports secondary indexes

Cloud Spanner vs Cloud SQL

Cloud SQL = Cloud incarnation of on-premises MySQL database
Spanner = designed from the ground up for the cloud
Spanner is not a ‘drop in’ replacement for MySQL
- Not MySQL/PostgreSQL compatible
- Work required to migrate
- However, once you make the transition, you don’t need to choose between consistency and availability.

Cloud Spanner - IAM

Project, Instance or Database level
roles/spanner.****
Admin: Full access to all spanner resources
Database Admin: Create/edit/delete databases, grant access to databases
Database Reader: Read/execute database/schema
Viewer: View instances and databases
- Cannot modify or read from database

Cloud Spanner - Data Model

Tables are handled differently

Parent-child table relationship
Interleaved data layout

https://cloud.google.com/spanner/docs/concepts
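
The parent-child interleaving on this card can be pictured with a small simulation. A child table's primary key begins with its parent's key, so ordering all rows by key places each child row directly after its parent. The sketch below only imitates that ordering with Python tuples; it is not Spanner code, and the singer and album sample rows and the function name `interleaved_layout` are hypothetical.

```python
# Parent rows keyed by (singer_id,), child rows keyed by (singer_id, album_id).
singers = {(1,): "Marc", (2,): "Catalina"}
albums = {(1, 1): "Total Junk", (1, 2): "Go, Go, Go", (2, 1): "Green"}


def interleaved_layout(parents: dict, children: dict) -> list:
    """Return all rows ordered by primary key.

    Because each child key starts with its parent's key, sorting by key
    puts every child row right after the parent row it belongs to.
    """
    rows = [(key, "parent", value) for key, value in parents.items()]
    rows += [(key, "child", value) for key, value in children.items()]
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for key, kind, value in interleaved_layout(singers, albums):
        indent = "  " if kind == "child" else ""
        print(f"{indent}{key} {value}")
```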

Cloud Memorystore

Fully managed Redis instance
- Provisioning, replication and failover are fully automated
- No need to provision VMs
- Scale instances with minimal impact
- Automatic replication and failover
Basic tier
- Efficient cache that can withstand a cold restart and full data flush
Standard tier
- Adds cross-zone replication and automatic failover

Comparing Storage Options

Storage Transfer Service

Move or back up data to a Cloud Storage bucket either from other cloud storage providers or from your on-premises storage.
Move data from one Cloud Storage bucket to another, so that it is available to different groups of users or applications.
Periodically move data as part of a data processing pipeline or analytical workflow.

Other Google Cloud transfer options include:

Transfer Appliance for moving offline data, large data sets, or data from a source with limited bandwidth
- Used for on-premises transfers, not cloud-to-cloud, and is not used for repeated/scheduled transfers
BigQuery Data Transfer Service to move data from SaaS applications to BigQuery.
Transfer service for on-premises data to move data from your on-premises machines to Cloud Storage

Options for transferring storage

Storage Transfer Service: Moving large amounts of data is seldom as straightforward as issuing a single command. You have to deal with issues such as scheduling periodic data transfers, synchronizing files between source and sink, or moving files selectively based on filters. Storage Transfer Service provides a robust mechanism to accomplish these tasks.

Use when transferring data from another cloud storage provider OR your private data center to Google Cloud for more than 1 TB of data

gsutil: For one-time or manually initiated transfers, you might consider using gsutil, which is an open source command-line tool that is available for Windows, Linux, and Mac. It supports multi-threaded and multi-process transfers, parallel composite uploads, retries, and resumability.

Use when transferring data from your private data center to Google Cloud for less than 1 TB of data

Transfer Appliance: Depending on your network bandwidth, if you want to migrate large volumes of data to the cloud for analysis, you might find it less time consuming to perform the migration offline by using the Transfer Appliance.

Use when transferring data from your private data center to Google Cloud with not enough bandwidth to meet your project deadline (or typically > 50 TB)
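
The rules of thumb on this card can be condensed into a small decision helper. This is a sketch of the card's guidance only: the function name `choose_transfer_option` and its parameters are hypothetical, and recommending Storage Transfer Service for small cloud-to-cloud transfers is an assumption, since the card does not cover that case.

```python
def choose_transfer_option(source: str, size_tb: float,
                           bandwidth_sufficient: bool = True) -> str:
    """Pick a transfer option using the thresholds from the card.

    source: "on_premises" or "other_cloud" (labels invented for this sketch).
    """
    if source not in {"on_premises", "other_cloud"}:
        raise ValueError("source must be 'on_premises' or 'other_cloud'")
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")

    if source == "on_premises":
        # Offline transfer when the network cannot meet the deadline,
        # or when the data set is very large (typically more than 50 TB).
        if not bandwidth_sufficient or size_tb > 50:
            return "Transfer Appliance"
        # Small, manually initiated transfers fit gsutil.
        if size_tb < 1:
            return "gsutil"
        return "Storage Transfer Service"

    # Assumption: cloud-to-cloud transfers use Storage Transfer Service
    # regardless of size; the card only states this for more than 1 TB.
    return "Storage Transfer Service"


if __name__ == "__main__":
    print(choose_transfer_option("on_premises", 0.5))
    print(choose_transfer_option("on_premises", 10))
    print(choose_transfer_option("on_premises", 80))
    print(choose_transfer_option("other_cloud", 5))
```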

(17 cards)