Chapter 2: Installing Kafka Flashcards

1
Q

By default, Kafka will automatically create a topic when a producer writes to it, a consumer reads from it, or any client requests metadata for it. T/F?

A

True

2
Q

A company is going to manage its topic creation using Pulumi. What Kafka configuration needs to be set so that producers, consumers, and clients cannot accidentally create topics without using Pulumi?

A

auto.create.topics.enable = false

3
Q

A majority of topics in a cluster will need a retention time of 2 hours. There are two ways of configuring Kafka to achieve this. What are they?

A

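Set the retention time on each topic individually (a per-topic override such as retention.ms), or set the broker's default retention time (for example log.retention.hours or log.retention.ms), which applies to every topic without its own override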
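
As a rough sketch of the arithmetic behind either option: retention settings given in milliseconds need the 2 hours converted first. The helper name hours_to_ms is hypothetical; log.retention.ms is the broker-level default and retention.ms is the per-topic override.

```python
def hours_to_ms(hours: float) -> int:
    """Convert a retention period in hours to milliseconds."""
    return int(hours * 60 * 60 * 1000)


two_hours_ms = hours_to_ms(2)

# Option 1: broker-wide default, set in the broker configuration.
broker_default = f"log.retention.ms={two_hours_ms}"

# Option 2: per-topic override, applied to each topic that needs it.
topic_override = f"retention.ms={two_hours_ms}"

print(broker_default)
print(topic_override)
```

Since most topics need the same value, the broker default is usually less work; the per-topic override is then reserved for the exceptions.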

4
Q

A Kafka cluster has automatic topic creation enabled. How can the Kafka cluster be configured so that every topic will have at least 10 partitions?

A

Set the broker's num.partitions parameter to 10; automatically created topics use this value as their partition count

5
Q

A Kafka cluster is using the default configuration parameters. A topic is created automatically. How many partitions will this topic have?

A

One

6
Q

It is possible to increase the number of partitions in an existing Kafka topic. T/F?

A

True

7
Q

It is possible to decrease the number of partitions in an existing Kafka topic. T/F?

A

False

8
Q

A Kafka cluster has the num.partitions parameter set to 10. How could I create a topic with only 5 partitions, without changing the num.partitions parameter?

A

Create the topic manually and specify 5 partitions explicitly

9
Q

How could I ensure that message load is evenly distributed among brokers?

A

Set num.partitions to the number of brokers, or to a multiple of it, so that partitions can be spread evenly across the brokers

10
Q

How does the number of partitions affect throughput?

A

Throughput is maximized when the number of partitions is >= the number of brokers

11
Q

How many consumers in a single consumer group can read from a particular partition at once?

A

Zero or one

12
Q

Messages are being produced in the “foo” topic at a rate of 30 messages/second. A consumer of the “foo” topic can consume 10 messages per second. There are 5 of these consumers. How many partitions should the “foo” topic have in order to maximize throughput?

A

At least 3
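
The reasoning can be sketched as a small calculation: each partition is read by at most one consumer in the group, so the topic needs enough partitions for the consumers to keep up with the produce rate. The function name min_partitions is hypothetical.

```python
import math


def min_partitions(produce_rate: float, consumer_rate: float) -> int:
    """Smallest partition count that lets consumers keep up with producers.

    Rates are in messages per second. Each partition is assumed to be
    read by exactly one consumer running at consumer_rate.
    """
    return math.ceil(produce_rate / consumer_rate)


needed = min_partitions(produce_rate=30, consumer_rate=10)
print(needed)
```

With exactly 3 partitions, only 3 of the 5 consumers would be active; the other 2 would sit idle until the partition count is raised.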

13
Q

You are planning a new topic. The messages in this topic will be partitioned by key. You predict that you will need at least 10 partitions for maximum throughput in the short term, but future changes could require as many as 100 partitions for maximum throughput. How should you configure the topic? Why?

A

Set the number of partitions to 100. Increasing the number of partitions later on would be challenging, because changing the partition count changes which partition each key maps to.
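
To see why growing a keyed topic later is awkward, here is a minimal sketch that assigns keys to partitions as hash(key) modulo the partition count. It uses zlib.crc32 purely as a stand-in hash for illustration, which is an assumption and not the hash Kafka's own partitioner uses; the function names and sample keys are also hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stand-in hash (crc32, illustration only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def count_moved(keys, old_count: int, new_count: int) -> int:
    """Count keys whose partition changes when the partition count changes."""
    return sum(
        1
        for key in keys
        if partition_for(key, old_count) != partition_for(key, new_count)
    )


keys = [f"user-{i}" for i in range(1000)]
moved = count_moved(keys, old_count=10, new_count=100)
print(f"{moved} of {len(keys)} keys map to a different partition")
```

Any key that moves will have its older messages in one partition and its newer messages in another, which breaks per-key ordering for consumers.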

14
Q

What are the negative consequences of using too many partitions?

A

Increased broker memory consumption. Increased time for leader elections.

15
Q

What is the runtime software required to run ZooKeeper and Kafka?

A

Java

16
Q

What are the ideal server counts in a ZooKeeper ensemble?

A

3, 5, or 7

17
Q

Why should ZooKeeper ensembles use an odd server count?

A

Because a majority of members must be working in order to respond to requests
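
The majority rule can be worked through numerically. This sketch, with hypothetical function names, computes the quorum size and the number of failures an ensemble of each size can tolerate; it shows that an even-sized ensemble tolerates no more failures than the odd size just below it.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


for size in range(1, 8):
    print(
        f"{size} servers: quorum {quorum_size(size)}, "
        f"tolerates {tolerated_failures(size)} failure(s)"
    )
```

One and two servers both tolerate zero failures, three and four both tolerate one, and five and six both tolerate two, so the extra even-numbered server adds cost without adding resilience.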

18
Q

Why should a ZooKeeper ensemble not use a single server?

A

Because if that server went down, then the ensemble could not respond to requests

19
Q

Why should a ZooKeeper ensemble not use 2 servers?

A

Because if either server went down, then the ensemble would not respond to requests

20
Q

What are the risks of running a three-node ZooKeeper ensemble?

A

While one node is down for maintenance, only two nodes are available. If one of those two then fails unexpectedly, the ensemble loses its majority and cannot respond to requests

21
Q

What is the problem with running more than 7 nodes in a ZooKeeper ensemble?

A

The consensus protocol does not scale well to more than 7 nodes

22
Q

In a ZooKeeper configuration file, what does the initLimit setting do?

A

It sets the maximum time allowed for a follower to establish a connection with a leader

23
Q

A ZooKeeper configuration file contains the following:

tickTime=2000
initLimit=20

How many seconds will the follower be allotted to establish a connection with the leader?

A

40 seconds
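
The calculation is initLimit (a count of ticks) multiplied by tickTime (milliseconds per tick). A minimal sketch, using a hypothetical parse_properties helper to read the two settings from the card:

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


config_text = """
tickTime=2000
initLimit=20
"""

props = parse_properties(config_text)

# initLimit is measured in ticks; tickTime is the tick length in milliseconds.
init_timeout_s = int(props["initLimit"]) * int(props["tickTime"]) / 1000
print(init_timeout_s)
```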

24
Q

In a ZooKeeper configuration file, what does the syncLimit setting do?

A

It sets how far out of sync followers may be with the leader, measured in ticks (units of tickTime)

25
Q

How does a ZooKeeper server know its ID?

A

It reads the myid file in the data directory

26
Q

A ZooKeeper configuration file contains the following:

server.1=zoo1.example.com:2888:3888

What does that do?

A

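It declares that the ZooKeeper server with ID 1 has hostname zoo1.example.com, uses port 2888 for communication between servers in the ensemble (the peer port), and uses port 3888 for leader election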
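
The structure of the line, server.ID=hostname:peerPort:electionPort, can be made explicit by splitting it into fields. This is only an illustrative sketch; ServerEntry and parse_server_line are hypothetical names, not part of ZooKeeper.

```python
from typing import NamedTuple


class ServerEntry(NamedTuple):
    server_id: int      # the N in "server.N", matched against the myid file
    hostname: str
    peer_port: int      # communication between servers in the ensemble
    election_port: int  # leader election


def parse_server_line(line: str) -> ServerEntry:
    """Split a 'server.N=host:peerPort:electionPort' line into its fields."""
    key, value = line.strip().split("=", 1)
    server_id = int(key.split(".", 1)[1])
    hostname, peer_port, election_port = value.split(":")
    return ServerEntry(server_id, hostname, int(peer_port), int(election_port))


entry = parse_server_line("server.1=zoo1.example.com:2888:3888")
print(entry)
```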