2_Storage and Databases Flashcards

1
Q

Cloud Storage

  • Bucket is a logical container for objects
  • Buckets exist within projects
  • Bucket names exist within a global namespace
  • Buckets can be:
    • Regional
    • Dual-regional
    • Multi-regional
  • Storage class:
    • Standard: Frequently accessed data (more than once every 30 days)
    • Nearline: Infrequently accessed data (about once every 30 days or less often)
    • Coldline: Rarely accessed data (about once every 90 days or less often)
    • Archive: Archive storage or compliance (accessed less than once a year)
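
As a rough illustration of the access-frequency guidance above, the sketch below maps an expected access interval to a storage class. The function name `suggest_storage_class` is hypothetical, and the 30/90/365-day cut-offs are an assumption borrowed from the classes' minimum storage durations; treat it as a rule of thumb rather than an official API.

```python
def suggest_storage_class(days_between_accesses: int) -> str:
    """Suggest a Cloud Storage class from how often data is read.

    The thresholds are assumptions that mirror the minimum storage
    durations (30, 90 and 365 days); this is not a pricing calculation.
    """
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    if days_between_accesses < 30:
        return "STANDARD"
    if days_between_accesses < 90:
        return "NEARLINE"
    if days_between_accesses < 365:
        return "COLDLINE"
    return "ARCHIVE"
```

For example, data read roughly every 45 days would map to `NEARLINE` under these assumed thresholds.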
A
2
Q

Cloud Storage (cont.)

  • Costs:
    • Operations charges
    • Network charges
    • Data retrieval charges
  • Object Lifecycle Management
    • Apply a lifecycle configuration to a bucket
    • GCS periodically checks objects against the configuration
    • Actions of matching rules are applied to objects
  • Security and access control
    • IAM for bulk access to buckets
    • ACLs for granular access to buckets and individual objects
    • Signed URLs
    • Signed policy documents
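
The lifecycle bullets above can be sketched as a small configuration plus a local rule matcher. The JSON shape (a `rule` list pairing an `action` with a `condition`) follows the Cloud Storage lifecycle format; the specific rules and the helper `matching_actions` are hypothetical and only model two condition types.

```python
# Lifecycle configuration: each rule pairs one action with its conditions.
LIFECYCLE_CONFIG = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]},
        },
        {
            "action": {"type": "Delete"},
            "condition": {"age": 365},
        },
    ]
}


def matching_actions(config: dict, age_days: int, storage_class: str) -> list:
    """Return the action types of every rule an object satisfies.

    Simplified local model: only `age` and `matchesStorageClass` are
    evaluated, and all conditions in a rule must hold. The real service
    applies a single action per object, with Delete taking precedence.
    """
    actions = []
    for rule in config["rule"]:
        cond = rule["condition"]
        if age_days < cond.get("age", 0):
            continue
        classes = cond.get("matchesStorageClass")
        if classes is not None and storage_class not in classes:
            continue
        actions.append(rule["action"]["type"])
    return actions
```

Saved as a JSON file, a configuration of this shape could be applied with `gsutil lifecycle set`, where the file name and bucket are placeholders of your choosing.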
A
3
Q

Cloud SQL

  • Direct lift and shift of traditional MySQL/PostgreSQL workloads with the maintenance stack managed for you.
  • Standard SLA is 99.95%
  • What is managed:
    • OS installation/management
    • Database installation/management
    • Backups
    • Scaling - disk space
    • Availability:
      • Failover
      • Read Replicas
    • Monitoring
    • Authorize network connections/proxy/use SSL
  • Limitations
    • Read Replicas limited to the same region as the master
    • Up to 416 GB RAM and 30 TB storage
    • Unsupported features:
      • User defined functions
      • InnoDB memcached plugin
      • Federated engine
      • SUPER privilege
A
4
Q

Cloud SQL (cont.)

Importing data into Cloud SQL

  • Cloud Storage as a staging ground
  • SQL dump/CSV file format
  • External replica promotion (only for MySQL)
    • Binary log retention

Export/Import process

  • Export SQL dump/CSV file:
    • Export SQL dump file cannot contain triggers, views, stored procedures
  • Get dump/CSV file into Cloud Storage
  • Import from Cloud Storage into Cloud SQL instance

Best practices

  • Use correct flags for dump file (--flag_name)
    • --databases, --hex-blob, --skip-triggers, --set-gtid-purged=OFF, --ignore-table
  • Compress data to reduce costs
    • Cloud SQL can import compressed .gz files
  • Use InnoDB for Second Generation instances
  • Specify agreeable maintenance window
    • Scheduled maintenance is not considered a failover event by GCP
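
The dump flags above can be assembled into a `mysqldump` invocation as follows. This is a minimal sketch: the helper name, host, user and table names are hypothetical, and the command is only built, not executed.

```python
import shlex


def build_mysqldump_command(database, host, user, ignore_tables=()):
    """Assemble a mysqldump argument list using the flags listed above."""
    cmd = [
        "mysqldump",
        "--databases", database,
        "-h", host,
        "-u", user,
        "-p",                     # prompt for the password interactively
        "--hex-blob",             # dump binary columns as hexadecimal
        "--skip-triggers",        # the dump file must not contain triggers
        "--set-gtid-purged=OFF",  # leave GTID statements out of the dump
    ]
    for table in ignore_tables:
        # Exclude views and similar objects; the option takes db.table.
        cmd.append(f"--ignore-table={database}.{table}")
    return cmd


# Hypothetical values, printed as a shell-ready string.
print(shlex.join(build_mysqldump_command("shop", "10.0.0.5", "root", ["v_orders"])))
```

The resulting file would then be compressed, copied to a Cloud Storage bucket, and imported into the instance from there, following the export/import process on this card.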
A
5
Q

Cloud SQL High Availability

The HA configuration, sometimes called a cluster, provides data redundancy. A Cloud SQL instance configured for HA is also called a regional instance and is located in a primary and secondary zone within the configured region. Within a regional instance, the configuration is made up of a primary instance and a standby instance. Through synchronous replication to each zone’s persistent disk, all writes made to the primary instance are also made to the standby instance. In the event of an instance or zone failure, this configuration reduces downtime, and your data continues to be available to client applications.

Note: The standby instance cannot be used for read queries. This differs from the Cloud SQL for MySQL legacy HA configuration.
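
To make the synchronous-replication idea concrete, here is a toy in-memory model of a regional instance. It is only an illustration of the behaviour described above, not how Cloud SQL is implemented; the class name and zone names are assumptions.

```python
class RegionalInstance:
    """Toy model of an HA pair: a primary and a standby in two zones."""

    def __init__(self, primary_zone: str, standby_zone: str):
        self.disks = {primary_zone: [], standby_zone: []}
        self.primary_zone = primary_zone
        self.standby_zone = standby_zone

    def write(self, row: str) -> None:
        # Synchronous replication: the write lands on both zones' disks
        # before it counts as committed.
        for disk in self.disks.values():
            disk.append(row)

    def read(self) -> list:
        # Only the primary serves queries; the standby is not readable.
        return list(self.disks[self.primary_zone])

    def fail_over(self) -> None:
        # Promote the standby, which already holds every committed write.
        self.primary_zone, self.standby_zone = (
            self.standby_zone,
            self.primary_zone,
        )
```

After `fail_over()`, reads are served from the former standby's disk and return the same rows, which is why client applications keep seeing their data.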

A
6
Q

Cloud Firestore

  • Fully managed NoSQL database
    • Serverless autoscaling NoSQL document store. Integrated with GCP and Firebase.
  • Realtime DB with mobile SDKs
    • Android and iOS client libraries, frameworks for all popular programming languages
  • Scalability and Consistency
    • Horizontal autoscaling and strong consistency, with support for ACID transactions.
  • Single Firestore database per project
  • Multi-region locations for wide access; single-region locations for lower latency or when data must stay in one location
A
7
Q

Cloud Firestore - Usage

  • Use Firestore for:
    • Applications that need highly available structured data, at scale
    • Product catalogs - real-time inventory
    • User profiles - mobile apps
    • Game save states
    • ACID transactions - e.g. transferring funds between accounts
  • Supports JSON and SQL-like queries but cannot easily ingest CSV files.
  • Operates on multiple keys
  • Do not use Firestore for:
    • Analytics
      • Use BigQuery/Cloud Spanner
    • Extreme scale (10M+ reads/writes per second)
      • Use Bigtable
    • Don’t need ACID transactions/data not highly structured
      • Use Bigtable
    • Lift and Shift (existing MySQL)
      • Use Cloud SQL
    • Near zero latency (sub 10ms)
      • Use in-memory database (Redis)/Memorystore
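
The "use / do not use" guidance above can be condensed into a small decision helper. The function name, parameter names and the order in which the checks run are assumptions made for illustration; the product recommendations themselves come from the card.

```python
def suggest_database(
    analytics: bool = False,
    ops_per_second: int = 0,
    needs_acid: bool = True,
    existing_mysql: bool = False,
    max_latency_ms: float = 100.0,
) -> str:
    """Map workload traits to the product suggested on the card above."""
    if analytics:
        return "BigQuery or Cloud Spanner"
    if existing_mysql:
        return "Cloud SQL"           # lift and shift
    if max_latency_ms < 10:
        return "Memorystore (Redis)" # near-zero latency
    if ops_per_second >= 10_000_000 or not needs_acid:
        return "Bigtable"            # extreme scale or no ACID requirement
    return "Firestore"
```

Real workloads usually weigh several of these traits at once, so the first-match ordering here is a simplification.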
A
8
Q

Cloud Firestore - IAM

  • Primitive and predefined
  • Owner, user, viewer, import/export admin, index admin
A
9
Q

Cloud Firestore - Data Model

  • Document Store (think MongoDB)
  • Documents (JSON data) are grouped by Collections
  • Documents can be hierarchical
    • Documents can contain sub-collections
  • Each document/entity has one or more properties
  • Properties have a value assigned
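
The hierarchy above can be sketched with plain dictionaries. This is an in-memory illustration only, not the Firestore client library; the `fields`/`collections` keys, the sample data and `get_document` are hypothetical.

```python
# Collections map document IDs to documents; a document holds
# properties (fields) and, optionally, named sub-collections.
database = {
    "users": {                                  # collection
        "alice": {                              # document
            "fields": {"name": "Alice", "level": 7},
            "collections": {
                "saves": {                      # sub-collection
                    "slot1": {
                        "fields": {"checkpoint": "cave"},
                        "collections": {},
                    },
                },
            },
        },
    },
}


def get_document(db: dict, path: str) -> dict:
    """Return the fields of the document at a slash-separated path.

    Paths alternate collection ID / document ID, so a document path
    always has an even number of segments.
    """
    segments = path.split("/")
    if len(segments) % 2 != 0:
        raise ValueError("a document path needs an even number of segments")
    collections = db
    doc = None
    for i in range(0, len(segments), 2):
        coll_id, doc_id = segments[i], segments[i + 1]
        doc = collections[coll_id][doc_id]
        collections = doc["collections"]
    return doc["fields"]
```

Here `users/alice/saves/slot1` addresses a document inside a sub-collection of the `alice` document.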
A
10
Q

Cloud Spanner

  • Managed SQL-Compliant DB: SQL (ANSI 2011) schemas and queries with ACID transactions
  • Horizontally scalable: strong consistency across rows and regions, from 1 to 1000s of nodes.
  • Highly available: automatic global replication, no planned downtime and 99.99% SLA.
  • Similar architecture to Bigtable
  • Used for mission critical, relational databases that need strong transactional consistency (ACID compliant)
  • Higher workloads than Cloud SQL can support
  • Standard SQL format (ANSI 2011)
  • Regional or Multi-regional instances
  • CPU utilization is the recommended metric for scaling
  • Supports secondary indexes
A
11
Q

Cloud Spanner vs Cloud SQL

  • Cloud SQL = Cloud incarnation of on-premises MySQL database
  • Spanner = designed from the ground up for the cloud
  • Spanner is not a ‘drop in’ replacement for MySQL
    • Not MySQL/PostgreSQL compatible
    • Work required to migrate
    • However, when making the transition, you don’t need to choose between consistency and availability.
A
12
Q

Cloud Spanner - IAM

  • Project, Instance or Database level
  • roles/spanner.****
  • Admin: Full access to all spanner resources
  • Database Admin: Create/edit/delete databases, grant access to databases
  • Database Reader: Read/execute database/schema
  • Viewer: View instances and databases
    • Cannot modify or read from database
A
13
Q

Cloud Spanner - Data Model

Tables are handled differently

  • Parent-child table relationship
  • Interleave Data layout

https://cloud.google.com/spanner/docs/concepts
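
A minimal sketch of what the parent-child, interleaved layout means in practice: because a child's primary key starts with its parent's key, ordering rows by key stores each parent directly ahead of its children. The table names and sample rows are illustrative assumptions, and the sort is only a stand-in for the physical layout.

```python
# Spanner DDL for a parent and an interleaved child (for reference):
#   CREATE TABLE Singers (SingerId INT64 NOT NULL, Name STRING(MAX))
#     PRIMARY KEY (SingerId);
#   CREATE TABLE Albums (SingerId INT64 NOT NULL, AlbumId INT64 NOT NULL,
#                        Title STRING(MAX))
#     PRIMARY KEY (SingerId, AlbumId),
#     INTERLEAVE IN PARENT Singers ON DELETE CASCADE;

singers = [(1, "Marc"), (2, "Catalina")]
albums = [(2, 1, "Green"), (1, 1, "Total Junk"), (1, 2, "Go, Go, Go")]


def interleaved_layout(parents, children):
    """Order rows the way an interleaved layout keeps them together.

    The child key is prefixed by the parent key, so sorting by key
    places each parent row directly before its own child rows.
    """
    rows = [((pid,), "Singers", name) for pid, name in parents]
    rows += [((pid, cid), "Albums", title) for pid, cid, title in children]
    return sorted(rows, key=lambda row: row[0])


for key, table, value in interleaved_layout(singers, albums):
    print(key, table, value)
```

Keeping related rows adjacent is what lets a query for a singer and their albums avoid touching unrelated parts of the key space.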

A
14
Q

Cloud Memorystore

  • Fully managed Redis instance
    • Provisioning, replication and failover are fully automated
    • No need to provision VMs
    • Scale instances with minimal impact
    • Automatic replication and failover
  • Basic tier
    • Efficient cache that can withstand a cold restart and full data flush
  • Standard tier
    • Adds cross-zone replication and automatic failover
A
15
Q

Comparing Storage Options

A
16
Q

Storage Transfer Service

  • Move or back up data to a Cloud Storage bucket either from other cloud storage providers or from your on-premises storage.
  • Move data from one Cloud Storage bucket to another, so that it is available to different groups of users or applications.
  • Periodically move data as part of a data processing pipeline or analytical workflow.

Other Google Cloud transfer options include:

  • Transfer Appliance for moving offline data, large data sets, or data from a source with limited bandwidth
    • Used for on-premises transfers, not cloud-to-cloud, and is not used for repeated/scheduled transfers
  • BigQuery Data Transfer Service to move data from SaaS applications to BigQuery.
  • Transfer service for on-premises data to move data from your on-premises machines to Cloud Storage
A
17
Q

Options for transferring storage

Storage Transfer Service: Moving large amounts of data is seldom as straightforward as issuing a single command. You have to deal with issues such as scheduling periodic data transfers, synchronizing files between source and sink, or moving files selectively based on filters. Storage Transfer Service provides a robust mechanism to accomplish these tasks.

  • Use when transferring data from another cloud storage provider OR your private data center to Google Cloud for more than 1 TB of data

gsutil: For one-time or manually initiated transfers, you might consider using gsutil, an open source command-line tool available for Windows, Linux, and Mac. It supports multi-threaded and multi-processed transfers, parallel composite uploads, retries, and resumability.

  • Use when transferring data from your private data center to Google Cloud for less than 1 TB of data

Transfer Appliance: Depending on your network bandwidth, if you want to migrate large volumes of data to the cloud for analysis, you might find it less time consuming to perform the migration offline by using the Transfer Appliance.

  • Use when transferring data from your private data center to Google Cloud with not enough bandwidth to meet your project deadline (or typically > 50 TB)
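
The rules of thumb above, together with the bandwidth-versus-deadline reasoning, can be sketched as follows. Function and parameter names are hypothetical, decimal units are assumed (1 TB = 8e12 bits), and the link is assumed to be fully utilized, which real transfers rarely achieve.

```python
def transfer_days(size_tb: float, bandwidth_mbps: float) -> float:
    """Days needed to send size_tb terabytes over a bandwidth_mbps link."""
    seconds = (size_tb * 8e12) / (bandwidth_mbps * 1e6)
    return seconds / 86_400


def suggest_transfer_option(
    size_tb: float,
    source: str,            # "cloud" or "on_premises"
    bandwidth_mbps: float,
    deadline_days: float,
) -> str:
    """Apply the rules of thumb from the card above."""
    on_premises = source == "on_premises"
    if on_premises and transfer_days(size_tb, bandwidth_mbps) > deadline_days:
        # Not enough bandwidth to meet the deadline: ship the data offline.
        return "Transfer Appliance"
    if on_premises and size_tb < 1:
        return "gsutil"
    return "Storage Transfer Service"
```

For instance, 50 TB over a 100 Mbps link takes a little over 46 days under these assumptions, so a 30-day deadline would point to Transfer Appliance.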
A