Big Data Platform Flashcards

1
Q

What is Cloud Dataproc?

A
  • Google’s managed Apache Hadoop and Spark service
  • Billed by the second
  • Runs on Compute Engine clusters in the customer’s project
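
To make "billed by the second" concrete, here is a minimal arithmetic sketch. The rate of 120 cents per hour is a hypothetical number chosen for illustration, not a real Dataproc price, and the helper names are made up for this card. The sketch also ignores any minimum charge or other pricing details.

```python
import math

def per_second_cost_cents(hourly_rate_cents: int, seconds_used: int) -> float:
    """Cost when usage is metered by the second."""
    return hourly_rate_cents * seconds_used / 3600

def per_hour_cost_cents(hourly_rate_cents: int, seconds_used: int) -> int:
    """Cost when every started hour is charged in full (for comparison)."""
    return hourly_rate_cents * math.ceil(seconds_used / 3600)

# Hypothetical rate: 120 cents per cluster-hour; job runs for 10 minutes.
job_seconds = 10 * 60
print(per_second_cost_cents(120, job_seconds))  # 20.0
print(per_hour_cost_cents(120, job_seconds))    # 120
```

With per-second metering, a short job pays only for the time the cluster actually ran.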
2
Q

What is Cloud Dataflow?

A
  • Managed ETL pipelines
  • Automated scaling and provisioning

3
Q

What is BigQuery?

A
  • Analytics data warehouse
  • Allows SQL queries
  • Free monthly quota
  • Datasets are bound to a location (region or multi-region)
  • Billed for storage and for query processing
4
Q

What is Cloud Pub/Sub?

A
  • Messaging bus
  • Good for cases where data arrives at high and unpredictable rates (IoT, for example)
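
The idea of a messaging bus absorbing bursty input can be sketched with Python's standard library. This is only an in-process stand-in to illustrate the decoupling, not the Cloud Pub/Sub client API; the function names and the `None` sentinel are assumptions made for this example.

```python
import queue
import threading

def publish_bursts(bus, bursts):
    """Producer: pushes messages in uneven bursts, as IoT devices might."""
    for burst in bursts:
        for message in burst:
            bus.put(message)
    bus.put(None)  # hypothetical end-of-stream marker for this demo

def consume(bus, received):
    """Consumer: drains the bus at its own pace, independent of the producer."""
    while True:
        message = bus.get()
        if message is None:
            break
        received.append(message)

def run_demo(bursts):
    bus = queue.Queue()  # the in-memory "bus" that buffers messages
    received = []
    producer = threading.Thread(target=publish_bursts, args=(bus, bursts))
    consumer = threading.Thread(target=consume, args=(bus, received))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return received

print(run_demo([["t=1"], [], ["t=2", "t=3", "t=4"]]))
```

The producer never waits for the consumer, so a sudden burst is simply buffered until the consumer catches up.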

5
Q

What is Cloud Datalab?

A
  • Managed “Jupyter playground” service
  • Pay for resources used; the number of notebooks does not matter

6
Q

What is Cloud ML Platform?

A
  • ML platform with use-case-specific managed APIs, such as:
    • Cloud Vision (image analysis)
    • Cloud Speech (audio to text)
    • Cloud Natural Language (reveal structure and meaning of text, extract information)
    • Cloud Translation
    • Cloud Video Intelligence (annotate videos)