Kafka Extended API Flashcards
What are Source Connectors used for?
To get data from common data sources.
Source Connectors are responsible for ingesting data into Kafka.
What are Sink Connectors used for?
To publish data to common data stores.
Sink Connectors send data from Kafka to external systems.
What is a task in Kafka Connect?
A task is the unit of work that actually copies data. Each task is linked to a connector configuration, and one connector's job may be split across several tasks.
What is a Kafka Connect Worker?
A worker is a single Java process that runs connectors and their tasks.
What is Standalone Mode in Kafka Connect?
A single worker process runs all connectors and tasks. It is easy to set up for development and testing, but it lacks fault tolerance and scalability.
What is Distributed Mode in Kafka Connect?
Multiple workers run connectors and tasks. It is easy to scale and fault tolerant: if a worker fails, its tasks are rebalanced onto the remaining workers.
Define a stream in Kafka.
A sequence of immutable data records that is fully ordered, can be replayed, and is fault tolerant.
What is a stream processor?
A node in the processor topology that transforms incoming streams, record by record, and may create a new stream from it.
What is a topology in Kafka?
A graph of processors chained together by streams.
What is a Source Processor?
A processor with no upstream processors in the topology; it takes its data directly from a Kafka topic.
What is a Sink Processor?
A processor with no downstream processors in the topology; it sends stream data directly to a Kafka topic.
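The topology cards above can be pictured with a small plain-Python model. This is only a sketch of the idea, not the Kafka Streams API: topics are simulated as lists, and every name in it (source_processor, uppercase_processor, sink_processor) is hypothetical.

```python
# Plain-Python model of a processor topology; no Kafka client is involved.
# All function names below are hypothetical and exist only for illustration.

def source_processor(topic_records):
    """Source processor: reads records directly from a (simulated) topic."""
    for record in topic_records:
        yield record


def uppercase_processor(stream):
    """Stream processor: transforms the incoming stream record by record."""
    for key, value in stream:
        yield key, value.upper()


def sink_processor(stream, output_topic):
    """Sink processor: writes each record to a (simulated) output topic."""
    for record in stream:
        output_topic.append(record)


input_topic = [("k1", "hello"), ("k2", "kafka")]
output_topic = []

# The topology is the chain: source -> stream processor -> sink.
sink_processor(uppercase_processor(source_processor(input_topic)), output_topic)
print(output_topic)  # [('k1', 'HELLO'), ('k2', 'KAFKA')]
```

Each function is a node in the graph, and the generators passed between them play the role of the streams that connect the nodes.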
What characterizes KStreams?
A KStream is all inserts, similar to a log: every record is appended as a new event, and the stream represents an infinite, unbounded flow of data.
What characterizes KTables?
A KTable treats a record with a non-null value as an upsert and a record with a null value as a delete, similar to a database table.
When should you use KStreams?
When reading from a topic that is not log compacted, or when each new record carries partial information (an event) rather than the full current state for its key.
When should you use KTables?
When reading from a log-compacted topic, or when you need a structure like a database table.
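The difference between the two views can be sketched in plain Python by interpreting the same records both ways. This is an illustrative model under simple assumptions, not the Kafka Streams API; the helper names as_stream and as_table and the sample data are hypothetical.

```python
# The same (key, value) records, interpreted as a stream and as a table.
# as_stream and as_table are hypothetical helpers, not Kafka Streams calls.

records = [
    ("alice", "blue"),
    ("bob", "green"),
    ("alice", "red"),   # same key again
    ("bob", None),      # null value
]


def as_stream(records):
    """KStream-like view: every record is an insert, nothing is overwritten."""
    return list(records)


def as_table(records):
    """KTable-like view: non-null values upsert by key, null values delete."""
    table = {}
    for key, value in records:
        if value is None:
            table.pop(key, None)  # delete on null value
        else:
            table[key] = value    # insert or update
    return table


stream_view = as_stream(records)
table_view = as_table(records)

print(len(stream_view))  # 4
print(table_view)        # {'alice': 'red'}
```

The stream view keeps all four records, while the table view keeps only the latest value per key and drops "bob" once its null-valued record arrives.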
Define stateless transformation.
A transformation whose result depends only on the record being processed, for example a map or a filter.
Define stateful transformation.
A transformation whose result also depends on external information, the state, for example a count or an aggregation that must remember previously seen records.
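A minimal sketch of the two kinds of transformation, again in plain Python rather than the Kafka Streams API. The names to_upper and RunningCount are hypothetical, and the in-memory dictionary only stands in for a real state store.

```python
from collections import defaultdict


def to_upper(value):
    """Stateless: the result depends only on the record being processed."""
    return value.upper()


class RunningCount:
    """Stateful: the result depends on how many times the key was seen before."""

    def __init__(self):
        # In-memory stand-in for a state store (an assumption for illustration).
        self.counts = defaultdict(int)

    def process(self, key):
        self.counts[key] += 1
        return self.counts[key]


keys = ["a", "b", "a", "a"]

stateless_results = [to_upper(k) for k in keys]
counter = RunningCount()
stateful_results = [counter.process(k) for k in keys]

print(stateless_results)  # ['A', 'B', 'A', 'A']
print(stateful_results)   # [1, 1, 2, 3]
```

The same input "a" always maps to "A" in the stateless case, but yields 1, 2, and 3 in the stateful case because the counter remembers earlier records.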