Data Preprocessing Flashcards

1
Q

What are other ways to evaluate an AutoML model?

A

Precision, recall, and the confusion matrix. Use the precision-recall curve to choose the score threshold.
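To make the threshold choice concrete, here is a minimal sketch in plain Python. It does not use any AutoML or Vertex AI API. The function names, the sample labels and scores, and the "highest recall at a minimum precision" selection rule are all assumptions made for illustration.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, scores, threshold):
    """Return (precision, recall) at one score threshold."""
    tp, fp, fn, _ = confusion_counts(y_true, scores, threshold)
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pick_threshold(y_true, scores, thresholds, min_precision):
    """Pick the threshold with the highest recall among those meeting min_precision.

    Returns None if no candidate threshold reaches min_precision.
    """
    candidates = []
    for t in thresholds:
        precision, recall = precision_recall(y_true, scores, t)
        if precision >= min_precision:
            candidates.append((recall, t))
    return max(candidates)[1] if candidates else None


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.10]

for t in [0.2, 0.5, 0.65, 0.8]:
    print(t, precision_recall(y_true, scores, t))
print("chosen threshold:", pick_threshold(y_true, scores, [0.2, 0.5, 0.65, 0.8], 0.9))
```

Raising the threshold trades recall for precision. Which side to favor depends on whether false positives or false negatives cost more in the application.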

2
Q

What is the difference between Colab Enterprise and Vertex AI Workbench?

A

Colab Enterprise: A collaborative, managed notebook environment with security and compliance capabilities of Google Cloud. Choose this if your project’s priorities are collaboration and avoiding infrastructure management.

Vertex AI Workbench: A Jupyter notebook-based environment provided through VM instances supporting the entire data science workflow. Choose this if your project’s priorities are control and customizability.

3
Q

What platforms or features does Vertex AI Workbench support?

A

Importing conda environments, accessing data in Cloud Storage or BigQuery, automated notebook runs, idle shutdown, custom containers, third-party credentials, instance monitoring, and full control over the infrastructure.

4
Q

What is Memorystore?

A

Fully managed Redis and Memcached for sub-millisecond data access.

5
Q

What is Firestore?

A

Highly scalable, widely used document database service for mobile, web, and server development that offers richer, faster queries and availability of up to 99.999%. Stores data as documents.

6
Q

What is Bigtable?

A

Highly performant, fully managed NoSQL database service for large analytical and operational workloads. Stores key-value data in a wide-column layout and supports the HBase API, which eases migration of Hadoop or Spark workloads.

7
Q

What is Cloud SQL?

A

Fully managed MySQL, PostgreSQL, and SQL Server. Simplifies migrations to Cloud SQL from MySQL, PostgreSQL, and Oracle databases with Database Migration Service.

8
Q

What is Spanner?

A

Cloud-native relational database with virtually unlimited scale, global consistency, and up to 99.999% availability. Stores structured data while scaling horizontally like a non-relational database. Use cases include gaming, retail, global financial ledgers, and supply chain or inventory management.

9
Q

What is BigQuery?

A

Serverless, highly scalable, and cost-effective multicloud data warehouse designed for business agility, offering up to 99.99% availability.
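The availability figures on these cards (up to 99.99% here, up to 99.999% for Firestore and Spanner) are easier to compare as allowed downtime. The sketch below does that arithmetic. The helper name and the 365-day year are assumptions for illustration, and actual SLA terms define their own measurement windows.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def max_downtime_minutes_per_year(availability_percent):
    """Convert an availability percentage into allowed downtime minutes per year."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.999):
    print(f"{pct}% -> {max_downtime_minutes_per_year(pct):.2f} minutes per year")
```

Under these assumptions, 99.99% allows roughly 53 minutes of downtime per year, while 99.999% allows roughly 5 minutes.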

10
Q

Why Cloud SQL over BigQuery?

A

Cloud SQL is a storage solution for low-latency transactional operations (write-heavy), while BigQuery is an analytics solution for analyzing large datasets and generating reports (read-heavy).

11
Q

What is Datastream?

A

Captures and replicates data changes from MySQL, PostgreSQL, AlloyDB, SQL Server, and Oracle databases into Google Cloud services. It is serverless, so there are no instances to manage.

12
Q

In Data Preprocessing, when should you use BigQuery?

A

When handling tabular or structured data.

13
Q

In Data Preprocessing, when should you use Dataflow?

A

When handling unstructured data.

14
Q

In Data Preprocessing, when should you use TensorFlow Extended?

A

When you want to use the TensorFlow ecosystem.

15
Q

What would you use Cloud Storage for?

A

For storing images, videos, audio, and other unstructured data in large container formats.

16
Q

How can you use Cloud NAT to create secure notebooks on Vertex AI?

A

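Run the notebook instance without an external IP address and route its outbound traffic through Cloud NAT. Cloud NAT allows outbound connections and the inbound responses to those connections, but does not permit unsolicited inbound requests from the internet.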
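The rule in this answer can be pictured with a toy model. The class below is purely illustrative: it is not the Cloud NAT API, the class and method names are hypothetical, and real NAT gateways track ports, protocols, and timeouts that this sketch ignores.

```python
class NatGatewaySketch:
    """Toy model: inbound traffic is accepted only as a reply to an outbound connection."""

    def __init__(self):
        # Pairs of (internal_host, remote_host) with an open outbound connection.
        self._open_connections = set()

    def open_outbound(self, internal_host, remote_host):
        """An internal host starts a connection to a remote host. Always allowed."""
        self._open_connections.add((internal_host, remote_host))
        return True

    def allows_inbound(self, remote_host, internal_host):
        """Inbound packets pass only if they answer an existing outbound connection."""
        return (internal_host, remote_host) in self._open_connections


gateway = NatGatewaySketch()
# Unsolicited request from the internet: rejected.
print(gateway.allows_inbound("pypi.org", "notebook-vm"))
# The notebook reaches out first, so the response is let back in.
gateway.open_outbound("notebook-vm", "pypi.org")
print(gateway.allows_inbound("pypi.org", "notebook-vm"))
```

This is why a notebook behind Cloud NAT can still download packages while staying unreachable from the public internet.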

17
Q

If you create a Virtual Private Cloud network with Google Cloud, how does your instance get outbound internet access?

A

Use a regional Cloud Router and Cloud NAT gateway.