Terms Flashcards
TCO
Total Cost of Ownership
Used when a business is considering a move to the cloud: weigh the cost of cloud adoption against the cost of on-prem systems.
On-prem: Consider static resources throughout their lifetime and include all costs, such as power, cooling, and maintenance. Then account for any cloud benefits the business would miss by staying on-prem.
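A minimal sketch of the comparison described above, assuming a simple model in which on-prem cost is a one-time hardware purchase plus yearly running costs, and cloud cost is a one-time migration cost plus a flat monthly fee. All names and figures are hypothetical and chosen only to make the arithmetic easy to follow; missed cloud benefits are not modeled.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    """Hypothetical on-prem cost inputs (all figures are illustrative)."""
    hardware: int              # one-time purchase
    power_per_year: int
    cooling_per_year: int
    maintenance_per_year: int


def on_prem_tco(costs: OnPremCosts, years: int) -> int:
    """Hardware plus all recurring costs over the system's lifetime."""
    yearly = (costs.power_per_year
              + costs.cooling_per_year
              + costs.maintenance_per_year)
    return costs.hardware + years * yearly


def cloud_tco(monthly_fee: int, years: int, migration_cost: int = 0) -> int:
    """One-time migration cost plus a flat monthly fee (a simplification)."""
    return migration_cost + monthly_fee * 12 * years


on_prem = on_prem_tco(
    OnPremCosts(hardware=100_000, power_per_year=5_000,
                cooling_per_year=3_000, maintenance_per_year=12_000),
    years=5,
)
cloud = cloud_tco(monthly_fee=2_500, years=5, migration_cost=20_000)

print(on_prem)          # 200000
print(cloud)            # 170000
print(on_prem - cloud)  # 30000
```

Real cloud bills vary with usage, so a flat monthly fee is only a stand-in for a proper estimate.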
CapEx vs. OpEx
Capital expenditures (CapEx) are upfront business expenses put toward fixed assets that are bought once, such as hardware or cooling systems. Small businesses often struggle with CapEx.
Operating expenditures (OpEx) are recurring costs, such as website or domain hosting, registrations, or subscriptions. They are not considered investments since you don’t own the underlying asset.
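To make the upfront-versus-recurring distinction concrete, here is a small sketch that finds the month in which cumulative subscription spending (OpEx) catches up with the cost of buying and running an asset (CapEx plus upkeep). The function name and all figures are hypothetical.

```python
import math
from typing import Optional


def breakeven_month(capex_upfront: int,
                    capex_monthly_upkeep: int,
                    opex_monthly: int) -> Optional[int]:
    """First month in which cumulative OpEx is at least the cumulative
    cost of owning (upfront purchase plus monthly upkeep).

    Returns None when the subscription never catches up.
    """
    monthly_gap = opex_monthly - capex_monthly_upkeep
    if monthly_gap <= 0:
        return None
    return math.ceil(capex_upfront / monthly_gap)


# Buy a server for 12,000 with 100/month upkeep,
# or pay a 600/month subscription instead.
print(breakeven_month(12_000, 100, 600))  # 24
```

Before the break-even month the subscription has cost less in total; after it, owning has. This ignores depreciation, financing, and resale value.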
Data warehouse
Stores structured and semi-structured data but, unlike a transactional database, is built for analysis rather than day-to-day operations. Example: BigQuery.
Data lake
Stores all kinds of data in its original format, regardless of source. Used to explore, process, and analyze raw data.
Types of business data
First-party: proprietary customer data. Collected through transactions or direct web interactions.
Second-party: first-party data from another organization, possibly a partner in a supply chain. The organization using it as second-party data does not own it.
Third-party: datasets collected by an organization that doesn’t directly interact with your organization’s customers or business. Might be government or non-profit data such as demographics or industry benchmarks.
Data value chain steps
- Data Genesis
- Data Collection
- Data Processing
- Data Storage
- Data Analysis
- Data Activation
Google Structured data storage options
- Cloud SQL
- Spanner
- BigQuery
Cloud SQL
Managed relational database service for MySQL, PostgreSQL, and SQL Server. Requires no software installation, supports managed backups, has a firewall, and encrypts data.
Spanner
Relational database (SQL) that is extremely scalable. Has zero downtime for planned maintenance and schema changes. Globally available.
BigQuery
Data warehouse. Scales to petabytes of data and has built-in machine learning and analytics. Works in a multicloud environment. Queried with SQL, so it is a good fit for analytics on relational data.
Google Semi-structured data storage
- Firestore
- Bigtable
Firestore
NoSQL cloud database, accessible from web and mobile apps. Scales automatically and supports offline use.
Bigtable
NoSQL wide-column database for very large amounts of data.
Google unstructured data storage
Cloud Storage
DMS
Database Migration Service. Simplifies migrating databases to Google Cloud.
Datastream
Synchronizes data across databases, storage systems, and applications.
Looker
Google Cloud business intelligence platform designed for analytics, visualization, and data sharing. Supports BigQuery and 60 different SQL databases. Web-based.
Streaming analytics
Processing and analyzing data continuously as it arrives instead of in batches. Typical sources: equipment sensors, clickstreams, social media feeds, stock market data, etc.
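A minimal sketch of the batch-versus-streaming difference, using a running average over hypothetical sensor readings. The batch version needs the whole dataset before it can answer; the streaming version emits an updated result after every reading. The function names and data are illustrative only.

```python
from typing import Iterable, Iterator


def batch_average(readings: list[float]) -> float:
    """Batch: wait for all readings, then compute once."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming: update and emit the average as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


sensor_readings = [20.0, 22.0, 21.0, 25.0]  # hypothetical temperatures

running = list(streaming_average(sensor_readings))
print(running)                         # [20.0, 21.0, 21.0, 22.0]
print(batch_average(sensor_readings))  # 22.0
```

The last streaming value matches the batch result, but the streaming version had a usable answer after the first reading.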
Google Streaming Analytics products
Pub/Sub: Distributed messaging service that can receive messages from various device streams.
Dataflow: Creates pipelines that process both streaming and batch data.
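The publish/subscribe idea behind a messaging service like Pub/Sub can be sketched with a toy in-memory broker. This is not the Google Cloud client library and does not reflect its API; `InMemoryBroker`, the topic name, and the message fields are all hypothetical. It only shows that publishers send to a topic without knowing who is subscribed to it.

```python
from collections import defaultdict
from typing import Any, Callable


class InMemoryBroker:
    """Toy broker: delivers each published message to every subscriber
    of the topic. A real service adds durability, retries, and scaling."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], Any]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], Any]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver the message and return how many handlers received it."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
received = []  # subscriber 1: stores every reading
alerts = []    # subscriber 2: keeps only overheating readings


def alert_if_hot(message: dict) -> None:
    if message["temp_c"] > 80:
        alerts.append(message)


broker.subscribe("sensor-readings", received.append)
broker.subscribe("sensor-readings", alert_if_hot)

broker.publish("sensor-readings", {"device": "pump-1", "temp_c": 72})
broker.publish("sensor-readings", {"device": "pump-2", "temp_c": 91})

print(len(received))        # 2
print(alerts[0]["device"])  # pump-2
```

In a streaming setup, a processing pipeline would play the role of a subscriber, consuming messages as they are published.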