Quizlet #4 Flashcards
Data warehouse
Assembles data from multiple sources, including databases. Built to enable rapid analysis of large, multi-dimensional datasets. Serves as the central hub for all business data.
BigQuery
BigQuery is serverless: resources, such as compute power, are automatically provisioned behind the scenes as needed to run your queries. As a result, businesses do not pay for compute power unless they are actually running a query.
Pub/Sub and Dataflow
Work together to bring unstructured data into the cloud and transform it into semi-structured data
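As a rough illustration of the idea on this card, the sketch below turns raw text lines into records with named fields. It does not use the real Pub/Sub or Dataflow APIs: the in-memory queue stands in for ingestion, the `transform` function stands in for processing, and the log format, `LINE_PATTERN`, `ingest`, and `transform` are all hypothetical names assumed for this example.

```python
import json
import queue
import re

# Hypothetical log format, assumed only for this illustration:
# "<timestamp> <LEVEL> <free-text message>"
LINE_PATTERN = re.compile(r"(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)")


def ingest(raw_lines):
    """Stand-in for the ingestion step: put each raw message on a queue."""
    q = queue.Queue()
    for line in raw_lines:
        q.put(line)
    return q


def transform(q):
    """Stand-in for the processing step: parse raw text into named fields."""
    records = []
    while not q.empty():
        match = LINE_PATTERN.match(q.get())
        if match:  # lines that do not fit the assumed format are skipped
            records.append(match.groupdict())
    return records


raw = ["2024-01-01T00:00:00Z INFO user logged in", "not a log line"]
records = transform(ingest(raw))
print(json.dumps(records))
```

Each surviving record is a dictionary with `timestamp`, `level`, and `message` keys, which is the kind of semi-structured shape a later analysis step could query.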
Data lake
Repository for raw data that tends to serve many purposes. Sometimes holds ‘back-up’ data, which helps businesses build resilience against unexpected harm affecting their data. Also holds historic data that is not relevant to day-to-day business operations.
Pub/Sub
Service for real time ingestion of data
Dataflow
Service for large scale processing of data
Cloud storage benefits
Any amount of data, low latency, accessible from anywhere
Cloud Storage
Offers multi-regional storage, which is ideal for serving content to users worldwide
Regional Storage
Offered by Cloud Storage; ideal when an organization wants to use the data locally. It gives added throughput and performance by storing data in the same region as your compute infrastructure.
Looker
Business intelligence solution that sits on top of any analytics database and makes it simple to describe your data and define business metrics
Artificial intelligence
Term that describes any kind of machine capable of acting autonomously
Machine Learning
Uses standard algorithms or standard models to analyze data in order to derive predictive insights and make repeated decisions at scale
Data cleanliness
Data that would prevent the model from making accurate predictions or understanding data behavior needs to be cleaned. Also referred to as data consistency.
Data completeness
Availability of sufficient data about the world to replace human knowledge
Qualities of good data
Coverage, cleanliness, completeness
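The three qualities can be made concrete with a small sketch. The metrics below are one possible way to measure them, not a standard definition: the sample rows, the field names, the expected region list, and the helper functions `clean`, `coverage`, and `completeness` are all assumptions made up for this example.

```python
def clean(rows):
    """Cleanliness: make formatting consistent (trim text, uppercase region codes)."""
    cleaned = []
    for row in rows:
        fixed = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if isinstance(fixed.get("region"), str):
            fixed["region"] = fixed["region"].upper()
        cleaned.append(fixed)
    return cleaned


def coverage(rows, field, expected_values):
    """Coverage: share of the expected values that appear at least once."""
    seen = {row.get(field) for row in rows}
    return len(seen & set(expected_values)) / len(expected_values)


def completeness(rows, required_fields):
    """Completeness: share of rows with every required field filled in."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) not in (None, "") for field in required_fields)
        for row in rows
    )
    return complete / len(rows)


# Hypothetical sample data with inconsistent formatting and a missing value.
rows = [
    {"customer": " Ada ", "region": "EU", "spend": 120},
    {"customer": "Bob", "region": "eu", "spend": None},
    {"customer": "Cy", "region": "US", "spend": 80},
]

cleaned = clean(rows)
region_coverage = coverage(cleaned, "region", ["EU", "US", "APAC"])
row_completeness = completeness(cleaned, ["customer", "region", "spend"])
print(region_coverage, row_completeness)
```

In this sample, cleaning merges "EU" and "eu" into one region, coverage shows that one expected region has no data at all, and completeness flags the row with a missing spend value.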