Test 1 Flashcards
In Avro, adding a field to a record without a default is a __ schema evolution
Forward
Explanation
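Clients with the old schema will be able to read records written with the new schema, because the added field is simply ignored. The change is not backward compatible: a client with the new schema cannot read old records, since the new field has no default to fall back on.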
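A minimal sketch of that resolution rule, assuming a toy model in which a schema is just a dict of field names mapped to defaults. The `resolve` helper and the `REQUIRED` sentinel are hypothetical names for illustration and are not part of any Avro library.

```python
REQUIRED = object()  # hypothetical marker: the field has no default


def resolve(record, reader_fields):
    """Project a written record onto the reader's schema (toy model)."""
    out = {}
    for name, default in reader_fields.items():
        if name in record:
            out[name] = record[name]       # field present in the written data
        elif default is not REQUIRED:
            out[name] = default            # fall back to the reader's default
        else:
            raise ValueError(f"missing field with no default: {name}")
    return out                             # extra writer fields are dropped


old_schema = {"id": REQUIRED}
new_schema = {"id": REQUIRED, "email": REQUIRED}  # "email" added, no default

# Forward: an old reader handles data written with the new schema.
forward = resolve({"id": 1, "email": "a@example.com"}, old_schema)

# Backward: a new reader fails on data written with the old schema.
try:
    resolve({"id": 2}, new_schema)
    backward_ok = True
except ValueError:
    backward_ok = False
```

Here `forward` keeps only the `id` field, while `backward_ok` ends up `False`, which is why the change counts as forward compatible only.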
A consumer wants to read messages from a specific partition of a topic. How can this be achieved?
Call assign() passing a collection of TopicPartitions as the argument
assign() can be used for manual assignment of partitions to a consumer, in which case subscribe() must not be used. assign() takes a collection of TopicPartition objects as an argument (https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#assign-java.util.Collection-)
We would like to be in an at-most once consuming scenario. Which offset commit strategy would you recommend?
Commit the offsets in Kafka, before processing the data
Explanation
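Here, we must commit the offsets right after receiving a batch from a call to .poll(), before processing it. If processing then fails, the committed messages are not read again, so each message is processed at most once.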
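The effect of committing first can be sketched with a toy consumer loop. This is a simulation under simplifying assumptions (one message per poll, an in-memory list standing in for the partition) and `consume_at_most_once` is a hypothetical helper, not a Kafka client call.

```python
def consume_at_most_once(log, committed, crash_at=None):
    """Commit each offset before processing it; return (processed, committed)."""
    processed = []
    offset = committed
    while offset < len(log):
        committed = offset + 1           # commit first
        if offset == crash_at:
            break                        # simulated crash before processing
        processed.append(log[offset])    # process the message
        offset += 1
    return processed, committed


log = ["a", "b", "c", "d"]

# First run crashes while handling offset 2 ("c").
first_run, committed = consume_at_most_once(log, 0, crash_at=2)

# After a restart the consumer resumes from the committed offset.
second_run, committed = consume_at_most_once(log, committed)
```

Message "c" is never processed because its offset was already committed when the crash happened, and no message is processed twice.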
The exactly once guarantee in the Kafka Streams is for which flow of data?
Kafka => Kafka
Explanation
Kafka Streams can only guarantee exactly once processing if you have a Kafka to Kafka topology.
To import data from external databases, I should use
Kafka Connect Source
Explanation
Kafka Connect Sink is used to export data from Kafka to external databases and Kafka Connect Source is used to import from external databases into Kafka.
You want to sink data from a Kafka topic to S3 using Kafka Connect. There are 10 brokers in the cluster, the topic has 2 partitions with replication factor of 3. How many tasks will you configure for the S3 connector?
2
Explanation
You cannot have more sink tasks (= consumers) than the number of partitions, so 2.
What is the risk of increasing max.in.flight.requests.per.connection while also enabling retries in a producer?
Message order is not preserved
Explanation
Some messages may require multiple retries. If more than one request is in flight, a retried request can be written after a later request that succeeded, so messages may be received out of order. An exception to this rule is the producer setting enable.idempotence=true, which handles the reordering case on its own. See: https://issues.apache.org/jira/browse/KAFKA-5494
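A toy model of the reordering, assuming that every request in the in-flight window is sent at once and that a failed batch is retried in the next window. `send_all` is a hypothetical function written for this card and does not reflect the real producer internals.

```python
def send_all(batches, fail_once, max_in_flight):
    """Return the order in which batches land in the partition log."""
    log = []
    pending = list(batches)
    already_failed = set()
    while pending:
        window = pending[:max_in_flight]     # requests in flight together
        pending = pending[max_in_flight:]
        retries = []
        for batch in window:
            if batch in fail_once and batch not in already_failed:
                already_failed.add(batch)    # first attempt fails
                retries.append(batch)
            else:
                log.append(batch)            # broker appends the batch
        pending = retries + pending          # retries go out next
    return log


reordered = send_all([1, 2, 3], fail_once={1}, max_in_flight=2)
in_order = send_all([1, 2, 3], fail_once={1}, max_in_flight=1)
```

With two requests in flight, batch 2 is appended before the retry of batch 1. With a single request in flight, the retry completes before batch 2 is sent.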
What client protocol is supported for the schema registry? (select two)
HTTP
HTTPS
Explanation
Clients can interact with the schema registry using the HTTP or HTTPS interface.
Where are the dynamic configurations for a topic stored?
In ZooKeeper
Explanation
Dynamic topic configurations are maintained in ZooKeeper. This applies to ZooKeeper-based clusters.
Which of the following errors are retriable from a producer perspective? (select two)
NOT_LEADER_FOR_PARTITION
NOT_ENOUGH_REPLICAS
Explanation
Both of these are retriable errors; the other options are non-retriable errors.
See the full list of errors and their “retriable” status here: https://kafka.apache.org/protocol#protocol_error_codes
There are two consumers C1 and C2 belonging to the same group G subscribed to topics T1 and T2. Each of the topics has 3 partitions. How will the partitions be assigned to consumers with PartitionAssignor being RoundRobinAssignor?
C1 will be assigned partitions 0 and 2 from T1 and partition 1 from T2. C2 will have partition 1 from T1 and partitions 0 and 2 from T2.
Explanation
RoundRobinAssignor lays out all partitions of all subscribed topics in order and deals them to the consumers one at a time. The correct option is the only one where the two consumers get an equal number of partitions (three each) across the two three-partition topics. An interesting article to read is: https://medium.com/@anyili0928/what-i-have-learned-from-kafka-partition-assignment-strategy-799fdf15d3ab
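The assignment above can be reproduced with a short sketch. It assumes every consumer subscribes to the same topics and that consumers and partitions are simply sorted by name; `round_robin_assign` is a hypothetical helper, not the assignor's actual code.

```python
from itertools import cycle


def round_robin_assign(consumers, topics):
    """topics maps topic name -> partition count.

    Returns a dict of consumer -> list of (topic, partition) tuples.
    """
    # Lay out every partition of every topic in sorted order.
    partitions = [(t, p) for t in sorted(topics) for p in range(topics[t])]
    assignment = {c: [] for c in consumers}
    # Deal the partitions to the consumers one at a time.
    for consumer, tp in zip(cycle(sorted(consumers)), partitions):
        assignment[consumer].append(tp)
    return assignment


assignment = round_robin_assign(["C1", "C2"], {"T1": 3, "T2": 3})
```

`assignment["C1"]` holds T1 partitions 0 and 2 plus T2 partition 1, and `assignment["C2"]` holds T1 partition 1 plus T2 partitions 0 and 2.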
Is KSQL ANSI SQL compliant?
No
Explanation
KSQL is not ANSI SQL compliant. For now, there are no defined standards for streaming SQL languages.
A bank uses a Kafka cluster for credit card payments. What should be the value of the property unclean.leader.election.enable?
False
Explanation
Setting unclean.leader.election.enable to true means we allow out-of-sync replicas to become leaders. We will lose messages when this occurs, effectively losing credit card payments and making our customers very angry.
You want to perform table lookups against a KTable every time a new record is received from the KStream. What is the output of a KStream-KTable join?
KStream
Explanation
Here each KStream record is enriched with the matching KTable value, so the KStream is being processed to create another KStream.
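A rough sketch of the lookup semantics, assuming an inner join and a table that does not change while the stream is read (a real KTable is updated continuously). `stream_table_join` and the sample data are hypothetical and are not the Kafka Streams API.

```python
def stream_table_join(stream, table, joiner):
    """Yield one output record per stream record whose key is in the table."""
    for key, value in stream:
        if key in table:                       # inner join: drop unmatched keys
            yield key, joiner(value, table[key])


users = {"u1": "Alice", "u2": "Bob"}                         # the "KTable"
clicks = [("u1", "/home"), ("u3", "/cart"), ("u2", "/pay")]  # the "KStream"

joined = list(
    stream_table_join(clicks, users, lambda page, name: f"{name} viewed {page}")
)
```

The result is again a sequence of keyed records, one per matching click, which is the stream-in, stream-out shape the answer describes.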
What isn’t a feature of the Confluent schema registry?
Store Avro data
Explanation
The schema registry stores schemas, not the Avro data itself. Data is stored on brokers.
If I want to send binary data through the REST proxy to topic “test_binary”, it needs to be base64 encoded. A consumer connecting directly into the Kafka topic “test_binary” will receive
Binary data
Explanation
On the producer side, after receiving base64 data, the REST Proxy will convert it into bytes and then send that byte payload to Kafka. Therefore consumers reading directly from Kafka will receive binary data.
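The round trip can be sketched with the standard library alone. The payload bytes are made up for illustration, and the two steps stand in for the client and the REST Proxy respectively; no actual HTTP request is made.

```python
import base64

payload = bytes([0x00, 0xFF, 0x10, 0x80])            # arbitrary binary data

# Client side: binary data must be base64 text to travel in a JSON request.
encoded = base64.b64encode(payload).decode("ascii")

# Proxy side (conceptually): decode back to raw bytes before producing.
decoded = base64.b64decode(encoded)
```

`decoded` is byte-for-byte identical to `payload`, which is what a consumer reading the topic directly would see.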
A topic receives all the orders for the products that are available on a commerce site. Two applications want to process all the messages independently - order fulfilment and monitoring. The topic has 4 partitions, how would you organise the consumers for optimal performance and resource usage?
Create 2 consumer groups for 2 applications with 4 consumers each
Explanation
Two consumer groups, one for each application, so that all messages are delivered to both applications. 4 consumers in each, as there are 4 partitions in the topic and you cannot have more active consumers per group than the number of partitions (any extra consumers would be inactive and waste resources).
If I want to have an extremely high confidence that leaders and replicas have my data, I should use
acks=all, replication factor=3, min.insync.replicas=2
Explanation
acks=all means the leader will wait for all in-sync replicas to acknowledge the record. Also the min in-sync replica setting specifies the minimum number of replicas that need to be in-sync for the partition to remain available for writes.
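How the three settings interact can be sketched as a small decision function. It is a simplified model of the broker-side check and `write_accepted` is a hypothetical name; the real broker rejects such writes with a not-enough-replicas error.

```python
def write_accepted(acks, in_sync_replicas, min_insync_replicas):
    """Return True if a produce request would be accepted (toy model)."""
    if acks == "all":
        # min.insync.replicas is only enforced for acks=all.
        return in_sync_replicas >= min_insync_replicas
    # acks=0 or acks=1 only need a live leader.
    return in_sync_replicas >= 1


# Replication factor 3, min.insync.replicas=2, acks=all:
all_replicas_up = write_accepted("all", in_sync_replicas=3, min_insync_replicas=2)
one_broker_down = write_accepted("all", in_sync_replicas=2, min_insync_replicas=2)
two_brokers_down = write_accepted("all", in_sync_replicas=1, min_insync_replicas=2)
```

With this combination the partition tolerates the loss of one broker and still accepts writes, while every acknowledged record exists on at least two replicas.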
Which is an optional field in an Avro record?
doc
Explanation
doc is an optional attribute that holds a description of the record
Which of the following settings increases the chance of batching for a Kafka Producer?
Increase linger.ms
Explanation
linger.ms makes the producer wait up to the given number of milliseconds before sending, so more messages can accumulate, hence increasing the chance of creating batches
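A toy model of the effect, assuming a batch opens when its first record arrives and is sent `linger_ms` later. It ignores batch.size and everything else the real producer considers, and `batch_by_linger` is a hypothetical helper.

```python
def batch_by_linger(arrival_times_ms, linger_ms):
    """Group record arrival times (in ms, ascending) into batches."""
    batches = []
    deadline = None
    for t in arrival_times_ms:
        if deadline is None or t > deadline:
            batches.append([t])          # open a new batch
            deadline = t + linger_ms     # it is sent at this time
        else:
            batches[-1].append(t)        # record joins the waiting batch
    return batches


arrivals = [0, 2, 4, 11, 12, 30]
no_linger = batch_by_linger(arrivals, linger_ms=0)
with_linger = batch_by_linger(arrivals, linger_ms=5)
```

With no linger every record goes out on its own, while a 5 ms linger groups the same six records into three requests.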