3_Pub/Sub Flashcards

1
Q

Tightly-Coupled System

Tightly (directly) coupled systems are more likely to fail.

A
2
Q

Loosely-Coupled System

Loosely coupled systems, with a ‘buffer’ between components, scale better and have better fault tolerance.
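
A minimal sketch of why a buffer helps, using only an in-memory queue. The names `Buffer`, `publish` and `drain` are hypothetical and are not Pub/Sub APIs. The sender keeps writing while the receiver is down, and the receiver catches up later.

```python
from collections import deque


class Buffer:
    """Toy in-memory buffer standing in for a messaging service (hypothetical)."""

    def __init__(self):
        self._queue = deque()

    def publish(self, message):
        # The sender only talks to the buffer, never to the receiver.
        self._queue.append(message)

    def drain(self):
        # The receiver pulls whatever accumulated while it was away.
        while self._queue:
            yield self._queue.popleft()


buffer = Buffer()

# The receiver is "down": the sender still succeeds three times.
for order_id in (1, 2, 3):
    buffer.publish({"order_id": order_id})

# The receiver comes back and processes the backlog in arrival order.
processed = [msg["order_id"] for msg in buffer.drain()]
print(processed)
```

In a tightly coupled design the three sends would have failed outright, because the sender would be calling the receiver directly.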

A
3
Q

What is Cloud Pub/Sub?

  • Global-scale messaging buffer/coupler.
  • Serverless, NoOps (fully managed), global availability, auto-scaling.
  • Decouples senders and receivers.
  • Real-time or batch.
  • 500 million messages per second.
  • 1 TB/s of data.
A
4
Q

Pub/Sub Terminology

Topics, Messages, Publishers, Subscribers, Message Store

Message data is base64-encoded (when sent as JSON over the REST API), and each message must be 10 MB or less.
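
A minimal sketch of preparing a message for a JSON publish request, assuming the body has the shape `{"messages": [{"data": ..., "attributes": ...}]}`. The helper name `build_publish_body` is hypothetical, and the size check is simplified: it treats 10 MB as 10,000,000 bytes and looks only at the payload.

```python
import base64
import json

# Assumption: the 10 MB limit is treated here as 10,000,000 bytes.
MAX_MESSAGE_BYTES = 10_000_000


def build_publish_body(payload: bytes, attributes: dict) -> str:
    """Return a JSON publish request body with base64-encoded data."""
    if len(payload) > MAX_MESSAGE_BYTES:
        raise ValueError("payload exceeds the assumed 10 MB message limit")
    message = {
        # JSON cannot carry raw bytes, so the data field is base64 text.
        "data": base64.b64encode(payload).decode("ascii"),
        "attributes": attributes,
    }
    return json.dumps({"messages": [message]})


body = build_publish_body(b"hello", {"source": "sensor-1"})
print(body)
```

Client libraries generally handle the encoding themselves, so this step matters mainly when calling the REST API directly.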

A
5
Q

Push and Pull

  • Pub/Sub can either push messages to subscribers, or subscribers can pull messages from Pub/Sub (default).
  • Push = lower latency, more real-time.
  • Push subscribers must be Webhook endpoints that accept POST over HTTPS.
  • Pull is ideal for large volumes of messages, and uses batch delivery.
  • Pull is preferred if efficiency and throughput of message processing is required.
    • In push delivery, one message per request is sent.
  • Pulled messages must be explicitly acknowledged; a push endpoint acknowledges a message by returning a success status code.
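
A minimal sketch of what a push webhook does with the body of an incoming POST, assuming the JSON envelope has a `message` object (with `data`, `attributes` and `messageId`) and a `subscription` field. The function name `decode_push_request` and all example values are hypothetical.

```python
import base64
import json


def decode_push_request(body: bytes) -> tuple:
    """Extract (message_id, payload, attributes) from a push request body."""
    envelope = json.loads(body)
    message = envelope["message"]
    # The data field arrives base64-encoded; it may be absent if only
    # attributes were published.
    payload = base64.b64decode(message.get("data", ""))
    return message["messageId"], payload, message.get("attributes", {})


# Example body shaped like a push delivery; all values are made up.
example = json.dumps({
    "message": {
        "data": base64.b64encode(b"temperature=21").decode("ascii"),
        "attributes": {"source": "sensor-1"},
        "messageId": "123",
    },
    "subscription": "projects/my-project/subscriptions/my-sub",
}).encode("utf-8")

message_id, payload, attributes = decode_push_request(example)
print(message_id, payload, attributes)
```

In a real endpoint this function would sit inside an HTTPS request handler, and the handler's success response is what tells Pub/Sub not to redeliver the message.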
A
6
Q

IAM

  • IAM allows for controlling access at project, topic or subscription level
    • Admin/Owner: project, snapshot, subscription, topic level
    • Editor: project, snapshot, subscription, topic level
    • Viewer: project, snapshot, subscription, topic level
    • Publisher: topic level
    • Subscriber: snapshot, subscription, topic level
  • Using service accounts is a best practice.
  • Grant per-topic or per-subscription permissions
  • Grant limited access to publish or consume messages.
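
A minimal sketch of per-topic and per-subscription grants, modelled as plain IAM-style policy dictionaries (`{"bindings": [{"role": ..., "members": [...]}]}`). The helper `add_binding` and the service account names are hypothetical; applying a policy to a real topic or subscription would go through the IAM API or a client library, which is not shown.

```python
def add_binding(policy: dict, role: str, member: str) -> dict:
    """Add member to role in an IAM-style policy dict, without duplicates."""
    bindings = policy.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return policy
    bindings.append({"role": role, "members": [member]})
    return policy


# Hypothetical service accounts: one may only publish, the other only consume.
topic_policy = add_binding(
    {},
    "roles/pubsub.publisher",
    "serviceAccount:ingest@my-project.iam.gserviceaccount.com",
)
subscription_policy = add_binding(
    {},
    "roles/pubsub.subscriber",
    "serviceAccount:worker@my-project.iam.gserviceaccount.com",
)
print(topic_policy)
print(subscription_policy)
```

Keeping the publisher role on the topic and the subscriber role on the subscription gives each service account only the access it needs.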

https://cloud.google.com/pubsub/docs/access-control

A
7
Q

At Least Once Delivery

  • Each message is delivered at least once for every subscription.
  • Unacknowledged messages are deleted after the message retention duration (configurable from 10 minutes to 7 days; the default is 7 days).
  • Messages published before a subscription is created will not be delivered to that subscription.
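
Because a message can be delivered more than once, subscribers usually need to be idempotent. A minimal sketch, assuming each delivery carries a stable message ID; the class `IdempotentHandler` is hypothetical, and the in-memory set stands in for what would normally be a durable store.

```python
class IdempotentHandler:
    """Process each message ID at most once (toy, in-memory version)."""

    def __init__(self):
        self._seen_ids = set()
        self.results = []

    def handle(self, message_id: str, payload: str) -> bool:
        """Process the payload once; return True if it was a new message."""
        if message_id in self._seen_ids:
            return False  # Duplicate delivery: safe to acknowledge, no work done.
        self._seen_ids.add(message_id)
        self.results.append(payload.upper())
        return True


handler = IdempotentHandler()
deliveries = [("1", "a"), ("2", "b"), ("1", "a")]  # message "1" is redelivered
outcomes = [handler.handle(mid, data) for mid, data in deliveries]
print(outcomes, handler.results)
```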
A
8
Q

Out of Order Messaging

  • Messages may arrive from multiple sources out of order.
  • Pub/Sub does not guarantee message ordering by default.
  • Dataflow is where out of order messages are processed/resolved.
  • It is possible to add message attributes to help with ordering, e.g. timestamps.
  • Consider alternatives for transactional ordering.
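
A minimal sketch of restoring order downstream with a timestamp attribute. The attribute name `event_time` and the message layout are assumptions for illustration; a real pipeline (for example in Dataflow) would also need windowing to decide how long to wait for late messages.

```python
from datetime import datetime


def order_by_event_time(messages: list) -> list:
    """Return messages sorted by the 'event_time' attribute (ISO 8601)."""
    return sorted(
        messages,
        key=lambda m: datetime.fromisoformat(m["attributes"]["event_time"]),
    )


# Messages as they happened to arrive, not as they were produced.
received = [
    {"data": "b", "attributes": {"event_time": "2024-01-01T10:00:05+00:00"}},
    {"data": "a", "attributes": {"event_time": "2024-01-01T10:00:01+00:00"}},
    {"data": "c", "attributes": {"event_time": "2024-01-01T10:00:09+00:00"}},
]
ordered = [m["data"] for m in order_by_event_time(received)]
print(ordered)
```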
A
9
Q

Subscription Lifecycle

  • Subscriptions expire after 31 days of inactivity by default.
  • New subscriptions with the same name have no relationship to the previous subscription.
  • A snapshot of the subscription is the easiest way to safeguard against a bad application deployment, because it provides point-in-time recovery. If the previous version of the application needs to be redeployed, the subscription can be rolled back to the snapshot, and all messages received after that point will be reprocessed.
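
A toy model of the snapshot-and-rollback idea, assuming a snapshot records which messages were acknowledged at that moment. `ToySubscription` and its methods are hypothetical and only imitate the behaviour in memory; they are not the Pub/Sub client API.

```python
class ToySubscription:
    """In-memory stand-in for a subscription (hypothetical, not a client API)."""

    def __init__(self):
        self.messages = []   # every message delivered to this subscription
        self.acked = set()   # indexes of acknowledged messages

    def publish(self, data):
        self.messages.append(data)

    def pull_and_ack(self):
        # Return all unacknowledged messages and mark them acknowledged.
        pending = [i for i in range(len(self.messages)) if i not in self.acked]
        self.acked.update(pending)
        return [self.messages[i] for i in pending]

    def snapshot(self):
        # A snapshot captures the acknowledgment state at this moment.
        return frozenset(self.acked)

    def seek(self, snap):
        # Restoring that state makes later messages unacknowledged again.
        self.acked = set(snap)


sub = ToySubscription()
sub.publish("m1")
first = sub.pull_and_ack()    # the old application version handles m1
snap = sub.snapshot()         # taken just before the new deployment
sub.publish("m2")
sub.publish("m3")
bad_run = sub.pull_and_ack()  # the new version mishandles m2 and m3
sub.seek(snap)                # roll back after redeploying the old version
replayed = sub.pull_and_ack()
print(first, bad_run, replayed)
```

After the rollback, `m2` and `m3` are delivered again while `m1`, which was acknowledged before the snapshot, is not.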
A
10
Q

Common Applications

A
11
Q

Connecting Kafka to GCP

Does Pub/Sub replace Kafka?

  • Not always
  • Hybrid workloads:
    • Interact with existing tools and frameworks
    • Don’t need the global/scaling capabilities of Pub/Sub
  • Can use both: Kafka on-premises and Pub/Sub on GCP in the same data pipeline

How do we connect Kafka to GCP?

Overview on Connectors:

  • Open-source plugins that connect Kafka to GCP
  • Kafka Connect: one optional “connector service”
  • Connectors exist to connect Kafka directly to Pub/Sub, Dataflow and BigQuery (among others)

Additional Terms

  • Source connector: an upstream connector that streams data from an external system into Kafka
  • Sink connector: a downstream connector that streams data from Kafka to an external system
A