Cloud Big Data Platform Flashcards

1
Q

What does serverless mean?

A

You don’t have to provision compute instances to run your jobs. The services are fully managed, and you pay only for the resources you consume.

2
Q

What is Cloud Dataproc?

A

A fast, easy, managed way to run Hadoop, Spark, Hive, and Pig on Google Cloud Platform. You request a Hadoop cluster, and it is built for you in 90 seconds or less on top of Compute Engine virtual machines whose number and type you control. If you need more or less processing power while the cluster is running, you can scale it up or down.

3
Q

How can you monitor your Hadoop / Dataproc cluster?

A

Stackdriver

4
Q

What is a benefit of running Dataproc over on-premises Hadoop?

A

Running Hadoop jobs on-premises requires a capital hardware investment.

Running these jobs in Cloud Dataproc lets you pay only for the hardware resources used during the life of the cluster you create.

5
Q

What tools can you leverage once your data is in Dataproc?

A

You can use Spark and Spark SQL to do data mining, and you can use MLlib, Apache Spark’s machine learning library, to discover patterns through machine learning.

6
Q

What is Cloud Dataflow?

A

It’s both a unified programming model and a managed service. It lets you develop and execute a wide range of data processing patterns: extract, transform, and load (ETL); batch computation; and continuous computation.
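To make the "unified programming model" idea concrete, here is a minimal plain-Python sketch in which the same extract, transform, and load stages run over a finite batch and over records that arrive one at a time. It does not use the Dataflow SDK; the function names and sample records are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, List


def extract(lines: Iterable[str]) -> Iterator[List[str]]:
    """Extract: split each raw comma-separated line into fields."""
    for line in lines:
        yield line.strip().split(",")


def transform(rows: Iterable[List[str]]) -> Iterator[Dict]:
    """Transform: keep two-field rows and convert the amount to a float."""
    for row in rows:
        if len(row) == 2:
            yield {"user": row[0], "amount": float(row[1])}


def load(records: Iterable[Dict], sink: List[Dict]) -> None:
    """Load: append each record to a sink (a list standing in for a table)."""
    for record in records:
        sink.append(record)


def run_pipeline(source: Iterable[str], sink: List[Dict]) -> None:
    """Chain the three stages; any iterable source works."""
    load(transform(extract(source)), sink)


# Batch computation: the whole input is available up front.
batch_sink: List[Dict] = []
run_pipeline(["ann,10.5", "bob,3", "malformed"], batch_sink)


# Continuous computation: records are produced lazily by a generator.
def event_stream() -> Iterator[str]:
    yield "cat,1.25"
    yield "dan,8"


stream_sink: List[Dict] = []
run_pipeline(event_stream(), stream_sink)
```

The point of the sketch is that `run_pipeline` does not care whether its source is a list or a generator, which is a rough analogy for writing one pipeline that serves both batch and streaming inputs.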

7
Q

Cloud Dataflow is used to build ______.

A

Pipelines

8
Q

BigQuery can stream in data at a rate of _______

A

100,000 rows per second
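As a rough feel for what that rate implies, the following sketch does the arithmetic. It assumes the 100,000 rows per second figure from this card is sustained without interruption; the helper name is hypothetical.

```python
# Rate quoted on this card; treated here as a sustained, constant rate.
STREAMING_ROWS_PER_SECOND = 100_000


def seconds_to_stream(row_count: int,
                      rows_per_second: int = STREAMING_ROWS_PER_SECOND) -> float:
    """Return how many seconds it takes to stream row_count rows."""
    return row_count / rows_per_second


# Rows ingested in one full day at the quoted rate.
rows_per_day = STREAMING_ROWS_PER_SECOND * 60 * 60 * 24

# Time needed to stream one billion rows at the quoted rate.
billion_row_seconds = seconds_to_stream(1_000_000_000)
```

At that rate a day of streaming is 8.64 billion rows, and a billion rows take 10,000 seconds, a little under three hours.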

9
Q

How are you charged for BigQuery?

A

Compute and storage are billed separately. You pay for compute only while queries are running, and you pay for storage on its own.
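The separation can be sketched as two independent terms in a bill. This is only an illustration: the function, the "query unit" measure, and every rate below are hypothetical placeholders, not real BigQuery prices.

```python
def monthly_bill(query_units: float, price_per_query_unit: float,
                 stored_gb: float, price_per_gb_month: float) -> dict:
    """Return compute, storage, and total cost as separate line items."""
    compute_cost = query_units * price_per_query_unit  # zero if no queries ran
    storage_cost = stored_gb * price_per_gb_month      # charged regardless
    return {
        "compute": compute_cost,
        "storage": storage_cost,
        "total": compute_cost + storage_cost,
    }


# Placeholder rates, chosen only to keep the arithmetic easy to follow.
busy_month = monthly_bill(query_units=40, price_per_query_unit=2.0,
                          stored_gb=1000, price_per_gb_month=0.25)
idle_month = monthly_bill(query_units=0, price_per_query_unit=2.0,
                          stored_gb=1000, price_per_gb_month=0.25)
```

In the idle month the compute line drops to zero while the storage line is unchanged, which is the behavior the card describes.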

10
Q

What does Pub/Sub stand for?

A

Publishers and Subscribers.
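The publisher and subscriber roles can be illustrated with a small in-memory sketch. It is not the Cloud Pub/Sub client library; the `InMemoryBroker` class, topic names, and messages are hypothetical, and delivery here is a simple synchronous function call.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Toy message broker: publishers and subscribers only share a topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        """Register a callback to receive every message published to topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver message to each subscriber of topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = InMemoryBroker()
billing_inbox: List[str] = []
audit_inbox: List[str] = []

# Two independent subscribers listen on the same topic.
broker.subscribe("orders", billing_inbox.append)
broker.subscribe("orders", audit_inbox.append)

# The publisher knows only the topic, not who is listening.
delivered = broker.publish("orders", "order-1001 created")
undelivered = broker.publish("refunds", "refund-7 issued")
```

The publisher never references a subscriber directly, which is the decoupling the name describes. The sketch leaves out what a managed service adds on top, such as durable storage, acknowledgements, and scaling.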

11
Q

How many messages can Pub/Sub scale to?

A

one million messages per second and beyond

12
Q

What is Cloud Datalab?

A

Cloud Datalab runs in a Compute Engine virtual machine. To get started, you specify the virtual machine type you want and the GCP region it should run in. When it launches, it presents an interactive Python environment that’s ready to use.

13
Q

What is Cloud Datalab integrated with?

A

It’s integrated with BigQuery, Compute Engine, and Cloud Storage, so accessing your data doesn’t run into authentication hassles.

14
Q

Cloud Machine Learning Platform provides ______ .

A

modern machine learning services with pre-trained models and a platform to generate your own tailored models

15
Q

What two categories do Cloud machine learning use cases generally fall into?

A

Structured data: you can use ML for various kinds of classification and regression tasks, like customer churn analysis, product diagnostics, and forecasting. It can be the heart of a recommendation engine for content personalization, cross-sells, and up-sells.

Unstructured data: you can use ML for image analytics, such as identifying damaged shipments, identifying styles, and flagging content. You can do text analytics too, like call center and blog analysis, language identification, topic classification, and sentiment analysis.

16
Q

Cloud Vision API enables ______.

A

Cloud Vision API enables developers to understand the content of an image. It quickly classifies images into thousands of categories (sailboat, lion, Eiffel Tower), detects individual objects within images, and finds and reads printed words contained within images.

17
Q

The Cloud Speech API enables ______.

A

developers to convert audio to text.