Big Data and Machine Learning on Google Cloud Flashcards

1
Q

Describe Google Cloud infrastructure

A

Google Cloud infrastructure can be described in terms of three layers:
1) networking and security
2) compute and storage
3) big data and machine learning products

2
Q

What GCP compute services are available?

A
  • Compute Engine
  • Google Kubernetes Engine
  • App Engine
  • Cloud Functions
  • Cloud Run
3
Q

What is Compute Engine?

A
  • Compute Engine is an IaaS (infrastructure as a service) offering, which provides virtual compute, storage, and network resources similar to those of a physical data center.
  • You use the virtual compute and storage resources the same way you would manage them locally.
  • Compute Engine provides maximum flexibility for those who prefer to manage server instances themselves.
4
Q

What is Google Kubernetes Engine?

A
  • Google Kubernetes Engine, or GKE, runs containerized applications in a cloud environment, as opposed to on an individual virtual machine, as with Compute Engine.
  • A container represents code packaged up with all its dependencies.
5
Q

What is App Engine?

A
  • App Engine is a fully managed PaaS (platform as a service) offering.
  • PaaS offerings bind code to libraries that provide access to the infrastructure the application needs.
    - This allows more resources to be focused on application logic.
6
Q

What are Cloud Functions?

A
  • Cloud Functions execute code in response to events, like when a new file is uploaded to Cloud Storage.
  • It’s a completely serverless execution environment, which means you don’t need to install any software locally to run the code and you are free from provisioning and managing servers.
  • Cloud Functions is often referred to as functions as a service.
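As an illustration of the event-driven model described above, here is a minimal sketch of a function that reacts to a file upload. It assumes a two-argument `(event, context)` handler shape, with the bucket and object name stored under the `bucket` and `name` keys of the event. The function name `on_file_uploaded` and the sample event are hypothetical, and the handler is called locally here rather than deployed.

```python
def on_file_uploaded(event, context=None):
    """Handle a (simulated) Cloud Storage upload event.

    `event` is assumed to be a dict carrying the bucket and object name.
    `context` would hold event metadata and is unused in this sketch.
    """
    bucket = event["bucket"]
    name = event["name"]
    message = f"New file uploaded: gs://{bucket}/{name}"
    print(message)
    return message


# Local simulation of the trigger. In the cloud, the platform would invoke
# the handler whenever a matching event occurs; no server is managed by you.
sample_event = {"bucket": "example-bucket", "name": "reports/2024.csv"}
result = on_file_uploaded(sample_event)
```

The point of the sketch is that the code only describes what to do with one event; when and where it runs is left to the platform.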
7
Q

What is Cloud Run?

A
  • It is a fully managed compute platform that enables you to run request- or event-driven stateless workloads without having to worry about servers.
  • It abstracts away all infrastructure management so you can focus on writing code.
  • It automatically scales up and down from zero, so you never have to worry about scale configuration.
  • Cloud Run charges you only for the resources you use, so you never pay for overprovisioned resources.
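To make the scale-from-zero idea concrete, the following is a toy model, not Cloud Run's actual autoscaling algorithm. It assumes each instance can serve a fixed number of concurrent requests; the function name `instances_needed` and the `per_instance_concurrency` value of 80 are illustrative assumptions.

```python
import math


def instances_needed(concurrent_requests, per_instance_concurrency=80):
    """Return how many instances a toy autoscaler would run for a given load."""
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be non-negative")
    if per_instance_concurrency <= 0:
        raise ValueError("per_instance_concurrency must be positive")
    # With no traffic the result is 0: nothing runs, so nothing is billed.
    return math.ceil(concurrent_requests / per_instance_concurrency)


for load in (0, 1, 80, 81, 1000):
    print(load, "->", instances_needed(load))
```

In this simplified view, the instance count follows the load in both directions, which is why there is no fixed capacity to overprovision.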
9
Q

What is a TPU?

A

TPUs are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads.

10
Q

How is a TPU different from a CPU or GPU?

A
  • TPUs act as domain-specific hardware, as opposed to general-purpose hardware such as CPUs and GPUs.
    - This allows for higher efficiency by tailoring the architecture to meet the computation needs of a domain, such as matrix multiplication in machine learning.
  • With TPUs, the computing speed increases more than 200 times.
11
Q

What are the major differences between cloud computing and desktop computing?

A

On GCP, compute and storage are decoupled so that each can scale properly.
As a result, processing limitations aren’t attached to storage disks.

12
Q

What fully managed database and storage services are offered?

A
  • Cloud Storage
  • Cloud Bigtable
  • Cloud SQL
  • Cloud Spanner
  • Firestore
  • BigQuery
13
Q

Where is unstructured data best stored?

A

Cloud Storage

14
Q

What are cloud storage’s primary storage classes?

A
  • Standard Storage (considered best for frequently accessed, or "hot", data; also great for data that is stored for only brief periods of time)
  • Nearline Storage (best for storing infrequently accessed data, such as data that is read or modified once per month or less on average)
  • Coldline Storage (also a low-cost option for storing infrequently accessed data, meant for data that is read or modified at most once every 90 days)
  • Archive Storage (the lowest-cost option, used for data archiving, online backup, and disaster recovery; suited to data accessed about once a year or less)
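The access-frequency guidance above can be sketched as a small decision function. This is only one reading of the card: the function name `suggest_storage_class`, the exact day thresholds (30, 90, 365), and the returned labels are assumptions for illustration, not an official selection rule.

```python
def suggest_storage_class(days_between_accesses):
    """Suggest a storage class from the expected number of days between accesses."""
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    if days_between_accesses >= 365:
        return "ARCHIVE"    # accessed about once a year or less
    if days_between_accesses >= 90:
        return "COLDLINE"   # accessed at most once every 90 days
    if days_between_accesses >= 30:
        return "NEARLINE"   # accessed once per month or less
    return "STANDARD"       # frequently accessed ("hot") data


for days in (1, 30, 120, 400):
    print(days, "->", suggest_storage_class(days))
```

In practice, cost factors such as retrieval charges and minimum storage durations would also matter; the sketch captures only the access-frequency rule of thumb from the card.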
15
Q

Describe the road map for storing structured data.

A
16
Q

What is Bigtable?

A

Bigtable provides a scalable NoSQL solution for analytical workloads.
It’s best for real-time, high-throughput applications that require only millisecond latency.

17
Q

What products are offered in the ingestion and process category?

A

Products that are used to digest both real-time and batch data.
- Pub/Sub
- Dataflow
- Dataproc
- Cloud Data Fusion

18
Q

What products can ingest streaming data?

A

Dataflow and Pub/Sub

19
Q

What products are offered in the data storage category?

A
  • Cloud Storage
  • Cloud SQL
  • Cloud Spanner
  • Cloud Bigtable
  • Firestore
20
Q

Which data storage products are for relational databases?

A
  • Cloud SQL
  • Cloud Spanner
Bigtable and Firestore are NoSQL databases.
21
Q

Which data storage products are for NoSQL databases?

A
  • Bigtable
  • Firestore
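The relational versus NoSQL split from the two cards above can be summarized as a simple lookup. The mapping reflects only what the cards state; the names `DATABASE_KIND` and `products_of_kind` are hypothetical helpers for self-quizzing.

```python
# Classification taken from the flashcards above.
DATABASE_KIND = {
    "Cloud SQL": "relational",
    "Cloud Spanner": "relational",
    "Bigtable": "NoSQL",
    "Firestore": "NoSQL",
}


def products_of_kind(kind):
    """Return the product names of the given kind, sorted alphabetically."""
    return sorted(name for name, k in DATABASE_KIND.items() if k == kind)


print("relational:", products_of_kind("relational"))
print("NoSQL:", products_of_kind("NoSQL"))
```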
22
Q

What is BigQuery?

A

BigQuery is a fully managed data warehouse that can be used to analyze data through SQL commands.
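As a rough sketch of what "analyzing data through SQL commands" can look like, the example below builds an aggregate query and, optionally, runs it with the `google-cloud-bigquery` Python client. The helper names `build_top_names_query` and `run_query` are hypothetical, and the public table name is an assumption chosen for illustration. Running the query requires the client package and Google Cloud credentials, so `run_query` is defined but not called here.

```python
def build_top_names_query(limit=5):
    """Build a SQL string that totals a numeric column per name."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total\n"
        "FROM `bigquery-public-data.usa_names.usa_1910_2013`\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql):
    """Run the SQL in BigQuery and return the result rows as a list."""
    # Imported lazily so the rest of the sketch works without the package.
    from google.cloud import bigquery

    client = bigquery.Client()
    return list(client.query(sql).result())


query = build_top_names_query(limit=5)
print(query)
```

With credentials configured, `run_query(query)` would return the result rows; no cluster or server needs to be provisioned first, which is what "fully managed" refers to.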

23
Q

What products are offered in the analytics category?

A
  • BigQuery
  • Data Studio
  • Looker
24
Q

What products are offered in the ML category?

A

ML products include both the ML development platform and the AI solutions.

The primary product of the ML development platform is Vertex AI, which includes the following products and technologies:
  • AutoML
  • Vertex AI Workbench
  • TensorFlow

AI solutions are built on the ML development platform and include state-of-the-art products to meet both horizontal and vertical market needs. These include:
  • Document AI
  • Contact Center AI
  • Retail Product Discovery
  • Healthcare Data Engine