Kafka Producer Flashcards
What is ProducerRecord?
A ProducerRecord is the combination of metadata (target topic, optional key and partition) and payload that the producer sends to Kafka.
What does ProducerRecord contain?
A ProducerRecord contains mandatory and optional fields:
1) Topic (mandatory)
2) Key (optional)
3) Partition number (optional)
4) Payload (mandatory)
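What is the use of the broker-list parameter that is provided to the producer?
The broker list (bootstrap.servers) is the producer's initial contact point. The producer connects to one of the listed brokers, fetches metadata about the whole cluster, and can then connect to whichever brokers it needs.
How many brokers is the producer connected to in the cluster?
The producer can connect to any broker in the cluster, not only those in the broker list. In practice it opens connections to the brokers that lead the partitions it writes to.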
What is the flow of sending message in Kafka?
1) Create ProducerRecord
2) Serialize the key and value to byte arrays
3) Partitioner -> the target partition is identified
4) The record is added to a batch of records bound for the same topic and partition.
5) A separate thread is responsible for sending those batches to the appropriate Kafka brokers.
6) On success, RecordMetadata is returned.
7) On failure, the producer either retries or returns an exception.
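Steps 2 to 5 can be sketched as a toy in-memory model. This is only an illustration, not Kafka's implementation: the class ToySendPipeline and its methods are hypothetical, and Arrays.hashCode stands in for whatever hash the real partitioner applies to the key bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the producer's client-side pipeline; not Kafka's implementation.
final class ToySendPipeline {
    private final int numPartitions;
    // One pending batch per "topic-partition" key.
    private final Map<String, List<byte[]>> batches = new HashMap<>();

    ToySendPipeline(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Step 2: serialize a String to bytes.
    static byte[] serialize(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Step 3: choose a partition from the key bytes.
    // Assumption: Arrays.hashCode is a stand-in for the real partitioner's hash.
    int partitionFor(byte[] keyBytes) {
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    // Step 4: append the serialized value to the batch for its topic and partition.
    String append(String topic, String key, String value) {
        int partition = partitionFor(serialize(key));
        String topicPartition = topic + "-" + partition;
        batches.computeIfAbsent(topicPartition, tp -> new ArrayList<>()).add(serialize(value));
        return topicPartition;
    }

    // Step 5: a sender thread would drain each batch and ship it to the partition leader.
    List<byte[]> drain(String topicPartition) {
        List<byte[]> batch = batches.remove(topicPartition);
        return batch == null ? new ArrayList<>() : batch;
    }
}
```

The point to take away is that records with the same key land in the same partition, and therefore in the same batch, which is what lets the sender ship them to one broker in a single request.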
What does RecordMetadata contain?
Record metadata is returned when a message is successfully published to Kafka. It contains topic, partition and the offset of the record within the partition.
When the leader of a partition goes down, whose responsibility is it to retry sending to the newly elected leader?
It is the producer's responsibility: it refreshes its cluster metadata and retries the send against the newly elected leader for that partition.
Let's say there are 3 replicas configured and one of the brokers is the leader for that partition. How is data kept in sync with the replicas? Is it a push model or a pull model?
It is a pull model. The leader does not push data to the replicas; the follower replicas send fetch requests to the leader and pull the data.
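The pull model can be illustrated with a minimal sketch. The classes below are hypothetical and ignore everything a real broker does (networking, in-sync replica tracking, high watermarks); they only show that the follower initiates each transfer by asking for records after its own log end.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of pull-based replication; not the broker's actual fetch protocol.
final class ToyReplication {
    static final class Leader {
        final List<String> log = new ArrayList<>();

        void append(String record) {
            log.add(record);
        }

        // Serve a follower's fetch: every record from fromOffset onward.
        List<String> fetch(int fromOffset) {
            return new ArrayList<>(log.subList(fromOffset, log.size()));
        }
    }

    static final class Follower {
        final List<String> log = new ArrayList<>();

        // The follower drives replication: it asks for records after its own log end.
        void pullFrom(Leader leader) {
            log.addAll(leader.fetch(log.size()));
        }
    }
}
```

Because the leader never calls the follower, a slow or offline follower simply falls behind and catches up on its next fetch.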
Which are the important producer configurations?
Mandatory:
1) bootstrap.servers - The brokers the producer contacts first to find out information about the cluster
2) key.serializer - The class used to serialize the record key
3) value.serializer - The class used to serialize the record value. A custom class as the value needs a matching custom serializer.
Provide a Java template to create a new producer with properties
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
Properties props = new Properties();
props.put("bootstrap.servers", "192.168.56.101:9101");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
Producer<String, String> producer = new KafkaProducer<>(props);
What categories of errors can occur in Kafka?
1) Retriable errors - transient errors that may succeed on a later attempt. Examples are "no leader", which is resolved once a new leader is elected, and "connection error", which is resolved by re-establishing the connection. KafkaProducer can be configured to retry on these errors automatically.
2) Non-retriable (permanent) errors - errors that are not transient and will not be resolved by retrying, such as "message size too large". In this case KafkaProducer does not retry and returns the exception immediately.
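The difference between the two categories can be sketched with a small retry loop. Everything here is hypothetical: RetriableError, PermanentError and sendWithRetries are illustrative names, not Kafka classes, and the real producer performs its retries internally.

```java
import java.util.function.Supplier;

// Toy retry loop; the exception classes here are hypothetical stand-ins.
final class ToyRetry {
    static class RetriableError extends RuntimeException {
        RetriableError(String message) {
            super(message);
        }
    }

    static class PermanentError extends RuntimeException {
        PermanentError(String message) {
            super(message);
        }
    }

    // Run the attempt, retrying only transient failures, up to maxRetries extra tries.
    static <T> T sendWithRetries(Supplier<T> attempt, int maxRetries) {
        int failures = 0;
        while (true) {
            try {
                return attempt.get();
            } catch (RetriableError e) {
                if (++failures > maxRetries) {
                    throw e; // retries exhausted, give up
                }
            }
            // PermanentError is not caught, so it propagates on the first attempt.
        }
    }
}
```

A transient failure such as "no leader" is attempted again until it succeeds or the retry budget runs out, while a permanent failure such as "message size too large" surfaces after exactly one attempt.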
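Provide a Java template to send a message to Kafka
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
Producer<String, String> producer = new KafkaProducer<>(props); // props built as in the producer template
ProducerRecord<String, String> record = new ProducerRecord<>("topic", "key", "value");
producer.send(record);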
What is the acks property in the producer?
The acks property controls how many replicas must acknowledge a write before the producer considers the send successful.
0 - The producer does not wait for any acknowledgement. Fire and forget: there is no guarantee that the message even reached the broker.
1 - The producer waits until the partition leader confirms that it wrote the record. The message can still be lost if the leader fails before the followers have replicated it.
all (or -1) - The producer waits until all in-sync replicas have acknowledged the record. This is the safest setting but also the slowest.
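As a minimal sketch of how the setting is applied, the helper below builds producer properties with a chosen acks value. The class AcksConfig and the method withAcks are hypothetical; the property names and the accepted values "0", "1", "all" and "-1" are the producer's own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper that builds producer properties with a chosen acks level.
final class AcksConfig {
    // The values the producer accepts for acks; "-1" is an alias for "all".
    private static final List<String> VALID_ACKS = Arrays.asList("0", "1", "all", "-1");

    static Properties withAcks(String bootstrapServers, String acks) {
        if (!VALID_ACKS.contains(acks)) {
            throw new IllegalArgumentException("acks must be one of " + VALID_ACKS);
        }
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", acks); // "0" = no wait, "1" = leader only, "all" = all in-sync replicas
        return props;
    }
}
```

The returned Properties can be passed to the KafkaProducer constructor in the same way as in the producer template card.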
Which exceptions can we get before the message is sent to the broker?
SerializationException (serializing the key or value failed), BufferExhaustedException or TimeoutException (the producer's buffer is full), and InterruptException (the sending thread was interrupted) are some of them.
What are the three ways to send a message to Kafka?
1) Fire and forget - Send the message without checking whether it arrived. Most messages will arrive, because Kafka is highly available and the producer retries automatically, but some may be lost.
2) Synchronous send - send() returns a Future; we call get() on it to wait and find out whether the send succeeded.
3) Asynchronous send - We pass a callback to send(), and it is invoked when the response from the broker is received.
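The three styles can be compared side by side with a sketch that needs no broker. FakeProducer is a hypothetical in-memory stand-in: its two send methods only mirror the shape of KafkaProducer.send (one returning a Future, one also taking a callback), it hands back a bare offset instead of RecordMetadata, and it runs the callback inline, whereas the real producer invokes callbacks later from its I/O thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

// Hypothetical in-memory stand-in; only the shape of send() mirrors KafkaProducer.
final class FakeProducer {
    private long nextOffset = 0;

    // Returns a future holding the offset assigned to the record.
    Future<Long> send(String value) {
        return send(value, (offset, error) -> { });
    }

    // Same, but also reports the outcome to the callback as (offset, error).
    Future<Long> send(String value, BiConsumer<Long, Exception> callback) {
        long offset = nextOffset++;
        callback.accept(offset, null); // the fake always succeeds
        return CompletableFuture.completedFuture(offset);
    }

    // Uses all three styles once and returns the offsets that were observed.
    static List<Long> demo() {
        FakeProducer producer = new FakeProducer();
        List<Long> seen = new ArrayList<>();

        // 1) Fire and forget: the returned future is ignored.
        producer.send("fire-and-forget");

        // 2) Synchronous: block on the future to learn the outcome.
        try {
            seen.add(producer.send("sync").get());
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("send failed", e);
        }

        // 3) Asynchronous: the callback receives the outcome when it is available.
        producer.send("async", (offset, error) -> {
            if (error == null) {
                seen.add(offset);
            }
        });
        return seen;
    }
}
```

Only the synchronous and asynchronous styles ever learn the offset; the fire-and-forget call discards the only handle it had on the result.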