Chapter 1: Meet Kafka Flashcards

1
Q

Batches can contain messages from multiple partitions. T/F

A

F

2
Q

Batches can contain messages from multiple topics. T/F

A

F
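
As a rough illustration of the two cards above, a producer-side accumulator can be modeled as one queue of pending messages per (topic, partition) pair, so a batch never mixes topics or partitions. This is a toy model under that assumption, not Kafka's client code; `Accumulator`, `append`, and `drain` are hypothetical names, and batch size is counted in messages rather than bytes for simplicity.

```python
from collections import defaultdict


class Accumulator:
    """Toy model: pending messages are grouped per (topic, partition)."""

    def __init__(self, batch_size):
        self.batch_size = batch_size        # max messages per batch (assumed unit)
        self.pending = defaultdict(list)    # (topic, partition) -> messages

    def append(self, topic, partition, message):
        """Queue a message; return a full batch if one is ready, else None."""
        key = (topic, partition)
        self.pending[key].append(message)
        if len(self.pending[key]) >= self.batch_size:
            return key, self.pending.pop(key)
        return None

    def drain(self):
        """Flush every partial batch, still one batch per (topic, partition)."""
        batches = dict(self.pending)
        self.pending.clear()
        return batches


acc = Accumulator(batch_size=2)
acc.append("orders", 0, "a")
acc.append("clicks", 0, "b")
full = acc.append("orders", 0, "c")   # second message for ("orders", 0) fills a batch
rest = acc.drain()                    # the lone "clicks" message is flushed separately
```

Here `full` holds only messages for topic "orders", partition 0, and the "clicks" message ends up in its own batch.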

3
Q

What is the benefit of a larger batch size?

A

Increased throughput

4
Q

What is the benefit of a smaller batch size?

A

Decreased latency

5
Q

How does Kafka reduce the bytes in a batch before sending it across the network?

A

Compression
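
To see why compressing a whole batch pays off, the sketch below compresses 100 similar messages together and compares that with compressing each message on its own. It uses Python's `zlib` as a stand-in codec (Kafka itself offers gzip, snappy, lz4, and zstd), and the message contents are invented for illustration.

```python
import json
import zlib

# Hypothetical batch: 100 small JSON messages with repeated field names.
messages = [
    json.dumps({"user_id": i, "event": "page_view", "page": "/home"}).encode()
    for i in range(100)
]

batch = b"\n".join(messages)
compressed = zlib.compress(batch)

# Compressing each message alone cannot exploit repetition across messages.
individually = sum(len(zlib.compress(m)) for m in messages)

print(len(batch), len(compressed), individually)
```

The batch compresses far better than the sum of individually compressed messages, which is one reason larger batches tend to improve throughput.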

6
Q

What are the 3 most common schema types?

A

JSON, XML, Avro

7
Q

How do Avro messages achieve a smaller size than JSON or XML messages?

A

By keeping the schema separate from the message payload, so field names and structure are not repeated in every message
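
The size effect of separating schema from payload can be sketched with Python's `struct` module. This is not Avro's wire format, only a stand-in showing the idea: the `schema` list below is a hypothetical schema that reader and writer share out of band, and the payload carries values only.

```python
import json
import struct

record = {"user_id": 12345, "score": 98.5, "active": True}

# Self-describing encoding: field names travel inside every message.
json_payload = json.dumps(record).encode()

# Schema kept apart from the payload (stand-in for an Avro schema; this is
# NOT Avro's wire format). It is shared once, not sent with each message.
schema = [("user_id", "q"), ("score", "d"), ("active", "?")]
fmt = "<" + "".join(code for _, code in schema)

# Payload contains only the values, in schema order.
binary_payload = struct.pack(fmt, *(record[name] for name, _ in schema))

# The reader needs the same schema to turn the bytes back into a record.
names = [name for name, _ in schema]
decoded = dict(zip(names, struct.unpack(fmt, binary_payload)))

print(len(json_payload), len(binary_payload))
```

The binary payload is a fraction of the JSON size, at the cost of being unreadable without the schema.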

8
Q

Are messages guaranteed to be ordered within a topic?

A

No

9
Q

Are messages guaranteed to be ordered within a partition?

A

Yes

10
Q

How do partitions increase scalability and redundancy?

A

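Each partition can be hosted on a different server, so a topic scales horizontally across servers; replicating partitions across servers adds redundancy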

11
Q

A producer is about to produce a message which has no key, and the producer is not using a custom partitioner. How does the producer decide which partition to use?

A

With the default partitioner, keyless messages are balanced evenly across all partitions of the topic

12
Q

Describe two different methods of ensuring that two messages will be written to the same partition.

A

Use a custom partitioner or give both messages the same key
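
The "same key" approach works because the partition is derived from a hash of the key. A minimal sketch, assuming a hash-modulo scheme: `choose_partition` is a hypothetical function, and `zlib.crc32` stands in for the murmur2 hash that Kafka's Java client uses in its default partitioner.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition: same key and partition count -> same partition."""
    # zlib.crc32 is a stand-in hash for illustration; Kafka's Java client
    # uses a murmur2 hash in its default partitioner.
    return zlib.crc32(key) % num_partitions


p1 = choose_partition(b"customer-42", 6)
p2 = choose_partition(b"customer-42", 6)   # same key, same partition count
```

Note that the mapping depends on the partition count, so adding partitions to a topic can send an existing key to a different partition.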

13
Q

If I were to produce two messages to a topic, how could I ensure that those messages were consumed in the same order they were produced?

A

Put the messages in the same partition

14
Q

What is the data type of an offset?

A

An integer (a 64-bit value that increases monotonically within a partition)

15
Q

When a consumer restarts, how does it decide which message it should start reading?

A

It reads the last offset committed for each partition it owns and resumes from there

16
Q

What are the two places that the offset could be stored?

A

ZooKeeper (older consumer clients) or Kafka itself (the internal __consumer_offsets topic)
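
How a stored offset lets a restarted consumer pick up where it left off can be sketched with a list as the partition log and a dictionary as the offset store. Everything here is a toy stand-in: `poll`, `committed`, and the group name are hypothetical, and starting at offset 0 when nothing has been committed is an assumed policy.

```python
partition_log = ["m0", "m1", "m2", "m3", "m4"]   # list index plays the role of the offset
committed = {}                                   # (group, partition) -> next offset to read


def poll(group, partition, max_messages):
    """Read from the committed position, then commit the new position."""
    start = committed.get((group, partition), 0)   # no commit yet: start at 0 (assumed)
    messages = partition_log[start:start + max_messages]
    committed[(group, partition)] = start + len(messages)
    return messages


first = poll("billing", 0, 3)    # consumer reads three messages, then "crashes"
second = poll("billing", 0, 3)   # restarted consumer resumes from the stored offset

print(first, second)
```

Because the position lives in the offset store rather than in the consumer process, the restart neither rereads `m0` to `m2` nor skips `m3`.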

17
Q

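Within a consumer group, what is the cardinality between consumers and partitions?

A

One-to-many: a consumer can own several partitions, but each partition is owned by only one consumer in the group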

18
Q

How could one increase the throughput of a consumer group?

A

Adding more consumers

19
Q

Consumer A owns partition B. How does Kafka ensure that partition B will continue to be processed in the event that consumer A dies?

A

Kafka re-assigns partition B to another consumer from the group of consumer A
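
The reassignment can be pictured as recomputing the partition-to-consumer mapping over the surviving group members. The sketch below uses a simple round-robin rule; `assign` is a hypothetical function, and Kafka's real assignors and group coordination protocol are considerably more involved.

```python
def assign(partitions, consumers):
    """Round-robin assignment: every partition gets exactly one owner in the group."""
    owners = sorted(consumers)
    assignment = {c: [] for c in owners}
    for i, partition in enumerate(sorted(partitions)):
        assignment[owners[i % len(owners)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3]
before = assign(partitions, {"consumer-a", "consumer-b"})
after = assign(partitions, {"consumer-b"})   # consumer-a has died

print(before)
print(after)
```

After the rebalance, the partitions that consumer-a owned are picked up by consumer-b, so no partition is left unprocessed.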

20
Q

True or false? A single broker can handle millions of partitions.

A

False

21
Q

True or false? A single broker can handle thousands of partitions.

A

True

22
Q

True or false? A single broker can handle millions of messages per second.

A

True

23
Q

Within a cluster, how many brokers are responsible for assigning partitions to brokers?

A

One

24
Q

What is the name of a broker that is responsible for assigning partitions to brokers?

A

Controller

25
Q

Fill in the blank: all producers and consumers of a partition can be connected to a single broker, called the ___

A

Leader

26
Q

How does Kafka ensure redundancy of messages in a partition?

A

By replicating the partition in multiple brokers

27
Q

Describe two simple ways that I could use to make a Kafka topic store messages for 1 month.

A

Set time-based retention to 1 month, either broker-wide (log.retention.ms, the default for all topics) or on the topic itself (retention.ms)

28
Q

How could I limit the size of data (in bytes) stored in a Kafka topic to 1 GB, without affecting other topics?

A

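Set size-based retention on that topic (retention.bytes). The limit applies per partition, so the topic total is roughly the limit multiplied by the number of partitions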
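
For the two retention cards above, the values involved can be worked out as follows. `retention.ms` and `retention.bytes` are Kafka topic-level settings; the dictionary is only a convenient way to show the numbers, and treating "1 month" as 30 days and "1 GB" as 1024^3 bytes are assumptions.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000

# Hypothetical topic overrides; how they are applied (CLI, admin client) is omitted.
topic_config = {
    "retention.ms": 30 * MS_PER_DAY,   # time-based retention, "1 month" taken as 30 days
    "retention.bytes": 1024 ** 3,      # size-based retention, 1 GiB per partition
}

print(topic_config)
```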

29
Q

How could I limit a Kafka topic so that only the most recent message for each key is stored?

A

Change the topic to be log compacted (cleanup.policy=compact)
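
The effect of log compaction can be sketched as "keep the latest value per key". The function below is a toy model with a hypothetical name, `compact`, and invented sample data; real compaction runs in the background on older log segments, so superseded messages may remain visible for a while.

```python
def compact(log):
    """Keep only the latest value for each key, preserving offset order."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)   # later offsets overwrite earlier ones
    return sorted((offset, key, value) for key, (offset, value) in latest.items())


log = [
    ("user-1", "a@example.com"),
    ("user-2", "b@example.com"),
    ("user-1", "c@example.com"),   # supersedes the first user-1 message
]
compacted = compact(log)

print(compacted)
```

Surviving messages keep their original offsets; only the older message for "user-1" is dropped.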

30
Q

Name 3 ways that one could increase the throughput of a topic.

A

It depends on the bottleneck. Options include:
- Increase batch size
- Increase number of consumers
- Increase number of brokers
- Increase number of partitions
- Increase computing power of servers
- Make consumer processing code more efficient
- Decrease message size

31
Q

A topic is experiencing high latency between the producer and the consumer. What could be done to reduce this latency?

A

Decrease batch size, decrease message size, or address any bottlenecks in the system (e.g., not enough brokers)

32
Q

How could I enforce the processing order of two messages?

A

Put the messages in the same topic and partition

33
Q

A topic was being processed quickly but a spike in message frequency has increased the processing time of messages. The brokers have plenty of computing power to spare so they are not the bottleneck. How could I increase the speed at which messages are processed without altering the code which processes messages?

A

Increase the number of consumers and/or number of partitions
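
Why the answer says "and/or number of partitions" can be shown with a little arithmetic: within one group each partition has a single owner, so consumers beyond the partition count have nothing to read. `group_parallelism` is a hypothetical helper for illustration.

```python
def group_parallelism(num_partitions, num_consumers):
    """Return (busy, idle) consumer counts for a single consumer group."""
    busy = min(num_partitions, num_consumers)
    return busy, num_consumers - busy


print(group_parallelism(4, 2))   # (2, 0)
print(group_parallelism(4, 4))   # (4, 0)
print(group_parallelism(4, 6))   # (4, 2)
```

With 4 partitions, going from 4 to 6 consumers adds no capacity; the topic needs more partitions before the extra consumers can help.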