2_Storage and Databases Flashcards
Cloud Storage
- Bucket is a logical container for objects
- Buckets exist within projects
- Bucket names exist within a global namespace
- Buckets can be:
- Regional
- Dual-regional
- Multi-regional
- Storage class:
- Standard: Frequently accessed data (more than once every 30 days)
- Nearline: Infrequently accessed data (less than once every 30 days)
- Coldline: Rarely accessed data (less than once every 90 days)
- Archive: Archive storage or compliance (accessed less than once a year)
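The storage class choice can be sketched as a small helper. This is a minimal sketch, not an official rule: the function name `suggest_storage_class` is hypothetical, and the thresholds assume the 30, 90 and 365 day minimum storage durations of Nearline, Coldline and Archive.

```python
def suggest_storage_class(days_between_accesses: int) -> str:
    """Suggest a storage class from the expected gap (in days) between reads.

    Assumption: thresholds follow the minimum storage durations of
    Nearline (30 days), Coldline (90 days) and Archive (365 days).
    """
    if days_between_accesses < 30:
        return "STANDARD"
    if days_between_accesses < 90:
        return "NEARLINE"
    if days_between_accesses < 365:
        return "COLDLINE"
    return "ARCHIVE"
```

In practice, retrieval and operations charges also influence the choice, so access frequency alone is only a first approximation.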
Cloud Storage (cont.)
- Costs:
- Operations charges
- Network charges
- Data retrieval charges
- Lifecycle Management
- Apply a lifecycle configuration to a bucket
- GCS periodically checks objects against the configuration
- Actions of matching rules are applied to objects
- Security and access control
- IAM for bulk access to buckets
- ACLs for granular access to buckets and individual objects
- Signed URLs
- Signed policy documents
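To make the lifecycle management bullets above concrete, here is a simplified sketch of a lifecycle configuration and of how rules match objects. The configuration follows the rule/action/condition JSON shape used for bucket lifecycle settings; the `matching_actions` helper is hypothetical and models only the `age` condition, whereas the real service supports more conditions and resolves conflicts between matching actions itself.

```python
# Lifecycle configuration: each rule pairs one action with its conditions.
lifecycle_config = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30},  # days since object creation
        },
        {
            "action": {"type": "Delete"},
            "condition": {"age": 365},
        },
    ]
}


def matching_actions(config: dict, object_age_days: int) -> list:
    """Return the action types of every rule whose age condition is met.

    Simplified model: rules without an "age" condition are ignored.
    """
    return [
        rule["action"]["type"]
        for rule in config["rule"]
        if "age" in rule["condition"]
        and object_age_days >= rule["condition"]["age"]
    ]
```

For example, a 45-day-old object matches only the first rule, while a 400-day-old object matches both.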
Cloud SQL
- Direct lift and shift of traditional MySQL/PostgreSQL workloads with the maintenance stack managed for you.
- Standard SLA is 99.95%
- What is managed:
- OS installation/management
- Database installation/management
- Backups
- Scaling - disk space
- Availability:
- Failover
- Read Replicas
- Monitoring
- Authorize network connections/proxy/use SSL
- Limitations
- Read Replicas limited to the same region as the master
- Up to 416 GB RAM and 30 TB storage
- Unsupported features:
- User defined functions
- InnoDB memcached plugin
- Federated engine
- SUPER privilege
Cloud SQL (cont.)
Importing data into Cloud SQL
- Cloud Storage as a staging ground
- SQL dump/CSV file format
- External replica promotion (only for MySQL)
- Binary log retention
Export/Import process
- Export SQL dump/CSV file:
- The exported SQL dump file cannot contain triggers, views, or stored procedures
- Get dump/CSV file into Cloud Storage
- Import from Cloud Storage into Cloud SQL instance
Best practices
- Use correct flags for dump file (--flag_name)
- --databases, --hex-blob, --skip-triggers, --set-gtid-purged=OFF, --ignore-table
- Compress data to reduce costs
- Cloud SQL can import compressed .gz files
- Use InnoDB for Second Generation instances
- Specify agreeable maintenance window
- Scheduled maintenance is not considered a failover event by GCP
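The dump flags and compression advice above can be sketched as follows. This is an illustrative sketch only: the helper names, the database name and the table names are hypothetical, connection options (host, user, password) are omitted, and the command is built but not executed.

```python
import gzip
from typing import List


def build_mysqldump_command(database: str, ignored_tables: List[str]) -> List[str]:
    """Build a mysqldump argument list using the flags from the notes."""
    command = [
        "mysqldump",
        "--hex-blob",             # dump binary columns as hex
        "--skip-triggers",        # triggers are not allowed in the dump
        "--set-gtid-purged=OFF",  # leave GTID information out of the dump
    ]
    for table in ignored_tables:
        # --ignore-table expects the form database.table
        command.append(f"--ignore-table={database}.{table}")
    command += ["--databases", database]
    return command


def compress_dump(dump_bytes: bytes) -> bytes:
    """Gzip the dump so a smaller .gz file can be staged in Cloud Storage."""
    return gzip.compress(dump_bytes)
```

The resulting list could be passed to `subprocess.run` with the output redirected to a file, which is then compressed and uploaded to a bucket before the import.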
Cloud SQL High Availability
The HA configuration, sometimes called a cluster, provides data redundancy. A Cloud SQL instance configured for HA is also called a regional instance and is located in a primary and secondary zone within the configured region. Within a regional instance, the configuration is made up of a primary instance and a standby instance. Through synchronous replication to each zone’s persistent disk, all writes made to the primary instance are also made to the standby instance. In the event of an instance or zone failure, this configuration reduces downtime, and your data continues to be available to client applications.
Note: The standby instance cannot be used for read queries. This differs from the Cloud SQL for MySQL legacy HA configuration.
Cloud Firestore
- Fully managed NoSQL database
- Serverless autoscaling NoSQL document store. Integrated with GCP and Firebase.
- Realtime DB with mobile SDKs
- Android and iOS client libraries, frameworks for all popular programming languages
- Scalability and Consistency
- Horizontal autoscaling and strong consistency, with support for ACID transactions.
- Single Firestore database per project
- Multi-regional for wide access; single region for lower latency within one location
Cloud Firestore - Usage
- Use Firestore for:
- Applications that need highly available structured data, at scale
- Product catalogs - real-time inventory
- User profiles - mobile apps
- Game save states
- ACID transactions - e.g. transferring funds between accounts
- Supports JSON and SQL-like queries but cannot easily ingest CSV files.
- Operates on multiple keys
- Do not use Firestore for:
- Analytics
- Use BigQuery/Cloud Spanner
- Extreme scale (10M+ read/writes per second)
- Use Bigtable
- Don’t need ACID transactions/data not highly structured
- Use Bigtable
- Lift and Shift (existing MySQL)
- Use Cloud SQL
- Near zero latency (sub 10ms)
- Use in-memory database (Redis)/Memorystore
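The "do not use Firestore for" list above is effectively a lookup from requirement to alternative service. A minimal sketch, with hypothetical requirement keys and a hypothetical `recommend` helper:

```python
# Requirement keys are illustrative labels, not official terminology.
ALTERNATIVES = {
    "analytics": "BigQuery/Cloud Spanner",
    "extreme_scale": "Bigtable",
    "no_acid_or_unstructured": "Bigtable",
    "lift_and_shift_mysql": "Cloud SQL",
    "sub_10ms_latency": "Memorystore (Redis)",
}


def recommend(requirement: str) -> str:
    """Return the suggested service, defaulting to Firestore when the
    requirement is not one of the listed exclusions."""
    return ALTERNATIVES.get(requirement, "Firestore")
```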
Cloud Firestore - IAM
- Primitive and predefined
- Owner, user, viewer, import/export admin, index admin
Cloud Firestore - Data Model
- Document Store (think MongoDB)
- Documents (JSON data) are grouped by Collections
- Documents can be hierarchical
- Documents can contain sub-collections
- Each document/entity has one or more properties
- Properties have a value assigned
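A rough sketch of the data model described above: paths alternate between collection and document segments, so a path with an even number of segments names a document and one with an odd number names a collection. The `path_kind` helper and the sample names (`users`, `alice`, `orders`) are hypothetical.

```python
def path_kind(path: str) -> str:
    """Classify a slash-separated path as a collection or a document."""
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError("path must contain at least one segment")
    # collection/document/collection/document/...
    return "document" if len(segments) % 2 == 0 else "collection"


# A document with properties and one sub-collection, modelled as plain dicts.
user_document = {
    "properties": {"name": "Alice", "level": 3},
    "subcollections": {
        "orders": {"o1": {"properties": {"total": 25}, "subcollections": {}}}
    },
}
```

Here `users/alice` would address the document above and `users/alice/orders/o1` a document inside its sub-collection.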
Cloud Spanner
- Managed SQL-Compliant DB: SQL (ANSI 2011) schemas and queries with ACID transactions
- Horizontally scalable: strong consistency across rows and regions, scaling from 1 to thousands of nodes.
- Highly available: automatic global replication, no planned downtime and 99.99% SLA.
- Similar architecture to Bigtable
- Used for mission critical, relational databases that need strong transactional consistency (ACID compliant)
- Higher workloads than Cloud SQL can support
- Standard SQL format (ANSI 2011)
- Regional or Multi-regional instances
- CPU utilization is the recommended metric for scaling
- Supports secondary indexes
Cloud Spanner vs Cloud SQL
- Cloud SQL = Cloud incarnation of on-premises MySQL database
- Spanner = designed from the ground up for the cloud
- Spanner is not a ‘drop in’ replacement for MySQL
- Not MySQL/PostgreSQL compatible
- Work required to migrate
- However, once you make the transition, you don’t need to choose between consistency and availability.
Cloud Spanner - IAM
- Project, Instance or Database level
- roles/spanner.****
- Admin: Full access to all spanner resources
- Database Admin: Create/edit/delete databases, grant access to databases
- Database Reader: Read/execute database/schema
- Viewer: View instances and databases
- Cannot modify or read from database
Cloud Spanner - Data Model
Tables are handled differently
- Parent-child table relationship
- Interleave Data layout
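The interleaved layout can be illustrated with a toy model. A child table's primary key starts with its parent's key, so ordering all rows by key places each parent row directly before its children. The table names, sample rows and the `interleaved_layout` helper are illustrative assumptions; this models only the ordering idea, not Spanner's actual storage engine.

```python
# Each row: (primary key tuple, table name, payload).
# Child keys (Albums) are prefixed by the parent key (Singers).
rows = [
    ((2,), "Singers", "Catalina"),
    ((1, 2), "Albums", "Second album"),
    ((1,), "Singers", "Marc"),
    ((2, 1), "Albums", "Debut"),
    ((1, 1), "Albums", "First album"),
]


def interleaved_layout(rows):
    """Order rows by key so children sit next to their parent row."""
    return sorted(rows, key=lambda row: row[0])


layout = interleaved_layout(rows)
```

After sorting, the keys come out as (1,), (1, 1), (1, 2), (2,), (2, 1): each singer is followed immediately by that singer's albums, which is why parent-child reads are cheap with interleaving.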
Cloud Memorystore
- Fully managed Redis instance
- Provisioning, replication and failover are fully automated
- No need to provision VMs
- Scale instances with minimal impact
- Automatic replication and failover
- Basic tier
- Efficient cache that can withstand a cold restart and full data flush
- Standard tier
- Adds cross-zone replication and automatic failover
Comparing Storage Options