Introduction to Data Analytics on Google Cloud Flashcards
Data Sources are connectors that let you do what with your data?
Query the data.
Clean the data.
Ingest and process the data.
Store the data.
The correct answer is: Query the data.
Which product is a serverless data warehouse for storage and analytics?
BigQuery
Cloud Storage
Cloud Spanner
Looker
The correct answer is: BigQuery
BigQuery is a serverless data warehouse provided by Google Cloud that is designed for large-scale data storage and analytics. It allows you to store massive datasets and run SQL queries without having to manage the infrastructure, making it ideal for analytics at scale.
Which Google Cloud product is a relational database used to establish relationships between information in multiple data tables?
Cloud Spanner
BigQuery
Bigtable
Dataproc
The correct answer is: Cloud Spanner
Cloud Spanner is a relational database that enables you to establish relationships between different pieces of information stored across multiple tables, making it ideal for applications that require strong consistency and high scalability. It is designed for both operational and analytical workloads, and it provides features like automatic scaling, global distribution, and high availability.
What are the correct steps in the data analytics lifecycle?
Visualize results and share the data.
Activate, store, and analyze.
Visualize, process, and ingest.
Ingest, process, store, analyze, and activate.
The correct answer is: Ingest, process, store, analyze, and activate.
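The five lifecycle steps can be sketched as a small pipeline. This is only an illustration using the Python standard library, not a Google Cloud workflow: the sample CSV, the `sales` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real storage product.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would arrive from a data source.
RAW_CSV = "region,amount\nwest,10\neast,\nwest,5\neast,7\n"

def ingest(text):
    """Ingest: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def process(rows):
    """Process: drop rows with a missing amount and convert types."""
    return [(r["region"], int(r["amount"])) for r in rows if r["amount"]]

def store(records):
    """Store: load the cleaned records into a table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    return conn

def analyze(conn):
    """Analyze: aggregate the stored data with SQL."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return dict(conn.execute(query).fetchall())

def activate(totals):
    """Activate: turn the analysis into something people can act on."""
    return [f"{region}: {total}" for region, total in totals.items()]

totals = analyze(store(process(ingest(RAW_CSV))))
report = activate(totals)
```

Each function maps to one step, so the order of the calls on the second-to-last line mirrors the order of the lifecycle.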
What type of data is used for machine learning?
Structured data only
Relational data
Structured and unstructured data
Raw data
The correct answer is: Structured and unstructured data.
Machine learning models can be trained on both structured and unstructured data, depending on the application. Here’s how they differ:
Structured data refers to data that is organized in a table or a defined schema, such as relational databases or spreadsheets. This data is often used in traditional machine learning tasks like regression, classification, and time-series analysis.
Unstructured data refers to data that doesn’t have a pre-defined format or organization, like text, images, audio, or video. Machine learning models can be trained on unstructured data using techniques like natural language processing (NLP) for text or convolutional neural networks (CNNs) for images.
While raw data (which is typically unprocessed or uncleaned) can be used for machine learning, it usually needs to be cleaned and processed before being fed into a model.
Relational data (structured data stored in a relational database) is a subset of structured data, but it isn’t the only type of data used in machine learning.
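The difference between the two kinds of data can be seen in a short sketch. The records and the review text are made-up examples, and the word-count function is a deliberately simple stand-in for the NLP techniques mentioned above, not a recommended method.

```python
from collections import Counter

# Structured data: every record follows the same schema (hypothetical fields).
structured_rows = [
    {"age": 34, "plan": "basic", "churned": 0},
    {"age": 51, "plan": "premium", "churned": 1},
]

# Unstructured data: free text with no predefined fields.
review = "Great service, great price"

def bag_of_words(text):
    """Turn free text into word counts, a simple numeric representation."""
    words = text.lower().replace(",", "").split()
    return Counter(words)

features = bag_of_words(review)
```

The structured rows can be used as features almost directly, while the text has to be converted into numbers first.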
Cloud Storage
This is an object storage service for unstructured data like files, images, and backups, not a data warehouse.
Cloud Spanner
This is a distributed relational database service, not a data warehouse.
Looker
Looker is a business intelligence (BI) and data analytics platform, not a data warehouse. It connects to data warehouses like BigQuery for analytics.
BigQuery
BigQuery is designed specifically for serverless, scalable analytics and storage, making it the right choice here.
What is a database?
An organized collection of data stored in tables and accessed electronically from a computer system.
What is a relational database?
A relational database is a type of database that stores data in tables with rows and columns, where each table represents a different entity and the relationships between tables are defined by keys (e.g., primary keys, foreign keys). It uses Structured Query Language (SQL) for managing and querying the data.
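A minimal sketch of primary and foreign keys, using Python's built-in `sqlite3` module as a stand-in for a relational database. The `customers` and `orders` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Each table represents one entity and has its own primary key.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),  -- foreign key
        total REAL NOT NULL
    )
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 25.0), (11, 1, 40.0)])

# The join follows the key relationship to combine the two tables.
rows = conn.execute("""
    SELECT c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
```

Here `orders.customer_id` points at `customers.id`, which is what lets one SQL query relate information held in separate tables.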
What is a non-relational database?
A non-relational database (also known as NoSQL) is a type of database that stores data in formats other than tables (e.g., key-value pairs, documents, graphs, or wide-column stores). These databases are designed for flexible, scalable storage and can handle large amounts of unstructured or semi-structured data. They don’t require a fixed schema or predefined relationships between data.
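The idea of storage without a fixed schema can be sketched with a plain Python dictionary acting as a key-value store of JSON documents. This is a toy illustration, not a real database; the `put` and `get` helpers and the sample documents are hypothetical.

```python
import json

# A plain dict stands in for a key-value/document store (not a real database).
store = {}

def put(key, document):
    """Save a document under a key as serialized JSON."""
    store[key] = json.dumps(document)

def get(key):
    """Load a document by key, or None if the key is absent."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None

# Documents in the same store need not share fields.
put("user:1", {"name": "Ada", "tags": ["admin"]})
put("user:2", {"name": "Lin", "address": {"city": "Oslo"}})

city = get("user:2")["address"]["city"]
```

The two documents have different fields and nesting, which a table with fixed columns would not allow without a schema change.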
Which Google Cloud offerings are for relational databases?
Cloud SQL, Cloud Spanner, AlloyDB for PostgreSQL
Which Google Cloud offerings are for non-relational databases?
Bigtable
What is a data warehouse?
A data warehouse contains structured and organized data, which can be used for advanced querying.