Chapter 2: Installing Kafka Flashcards
By default, Kafka will automatically create a topic when a producer, consumer, or client attempts to interact with it. T/F?
True
A company is going to manage their topic creation using Pulumi. What Kafka configuration needs to be set so that producers, consumers, and clients cannot accidentally create topics without using Pulumi?
auto.create.topics.enable = false
A majority of topics in a cluster will need a retention time of 2 hours. There are two ways of configuring Kafka to achieve this. What are they?
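Set the retention time on each topic individually (retention.ms), or set the broker-wide default retention time (log.retention.hours, log.retention.minutes, or log.retention.ms)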
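Worked example (a sketch, not from the book): both approaches can be expressed in milliseconds, so it helps to convert the 2-hour requirement once. The helper name hours_to_ms is made up for illustration; retention.ms is the topic-level setting and log.retention.ms is the broker-level default.

```python
# Sketch: convert a retention period in hours to the millisecond value
# accepted by retention.ms (per topic) and log.retention.ms (broker default).
# hours_to_ms is a hypothetical helper, not part of any Kafka tooling.

def hours_to_ms(hours: float) -> int:
    """Return the number of milliseconds in the given number of hours."""
    return int(hours * 60 * 60 * 1000)


retention_ms = hours_to_ms(2)
print(f"retention.ms={retention_ms}")      # per-topic override
print(f"log.retention.ms={retention_ms}")  # broker-wide default
```

With a 2-hour requirement, both lines carry the value 7200000.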
A Kafka cluster has automatic topic creation enabled. How can the Kafka cluster be configured so that every topic will have at least 10 partitions?
Set the num.partitions parameter
A Kafka cluster is using the default configuration parameters. A topic is created automatically. How many partitions will this topic have?
One
It is possible to increase the number of partitions in an existing Kafka topic. T/F?
True
It is possible to decrease the number of partitions in an existing Kafka topic. T/F?
False
A Kafka cluster has the num.partitions parameter set to 10. How could I create a topic with only 5 partitions, without changing the num.partitions parameter?
Use manual topic creation
How could I ensure that message load is evenly distributed among brokers?
Set num.partitions to the number of brokers, or to a multiple of the number of brokers
How does the number of partitions affect throughput?
Throughput is maximized when the number of partitions is >= the number of brokers
How many consumers in a single consumer group can read from a particular partition at once?
Zero or one
Messages are being produced in the “foo” topic at a rate of 30 messages/second. A consumer of the “foo” topic can consume 10 messages per second. There are 5 of these consumers. How many partitions should the “foo” topic have in order to maximize throughput?
At least 3
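Worked example (a sketch, not from the book): the answer comes from dividing the production rate by what one consumer can handle and rounding up. The function names below are made up for illustration, and the idle-consumer count assumes all five consumers belong to the same consumer group.

```python
import math

# Sketch: minimum partition count so that consumers can keep up with producers.
# Both function names are hypothetical helpers for this flashcard.

def min_partitions(produce_rate: float, per_consumer_rate: float) -> int:
    """Smallest partition count whose consumers can match the produce rate."""
    return math.ceil(produce_rate / per_consumer_rate)


def idle_consumers(consumers: int, partitions: int) -> int:
    """Consumers in one group left without a partition to read from."""
    return max(0, consumers - partitions)


needed = min_partitions(30, 10)
print(f"partitions needed: {needed}")
print(f"idle consumers with {needed} partitions: {idle_consumers(5, needed)}")
```

With exactly 3 partitions, 2 of the 5 consumers would sit idle, which is why the answer is "at least 3" rather than "exactly 3".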
You are planning out a new topic. The messages in this topic will be partitioned by key. You predict that you will need at least 10 partitions for maximum throughput in the short term, but future changes could require as many as 100 partitions for maximum throughput. How should you configure the topic? Why?
Set the number of partitions to 100. Because messages are partitioned by key, adding partitions later would change which partition each key maps to, so resizing the topic would be challenging.
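Worked example (a sketch, not from the book): a keyed message goes to the partition given by a hash of its key modulo the partition count, so changing the count remaps keys. The code below uses CRC32 as a stand-in hash purely for illustration; Kafka's default partitioner uses murmur2. The key names are invented.

```python
import zlib

# Sketch: how many keys land on a different partition when a topic grows
# from 10 to 100 partitions. CRC32 is a stand-in hash for illustration only;
# Kafka's default partitioner uses murmur2.

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition using hash(key) mod partition count."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"customer-{i}" for i in range(1000)]  # hypothetical message keys
moved = sum(
    1 for k in keys if partition_for(k, 10) != partition_for(k, 100)
)
print(f"{moved} of {len(keys)} keys map to a different partition after resizing")
```

Any key that moves breaks the guarantee that all messages for that key sit in one partition in order, which is why sizing for future load up front is the safer choice.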
What are the negative consequences of using too many partitions?
Increased broker memory consumption. Increased time for leader elections.
What is the runtime software required to run ZooKeeper and Kafka?
Java
What are the ideal server counts in a ZooKeeper ensemble?
3, 5, or 7
Why should ZooKeeper ensembles use an odd server count?
Because a majority of members must be working in order to respond to requests
Why should a ZooKeeper ensemble not use a single server?
Because if that server went down, then the ensemble could not respond to requests
Why should a ZooKeeper ensemble not use 2 servers?
Because if either server went down, then the ensemble would not respond to requests
What are the risks of running a three-node ZooKeeper ensemble?
While one node is down for maintenance, only two nodes remain, which is the bare majority. If either of them then suffers an unexpected outage, the ensemble cannot respond to requests
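Worked example (a sketch, not from the book): the answers about 1, 2, and 3 servers all follow from the majority rule. The function names are made up for illustration.

```python
# Sketch: majority (quorum) arithmetic for an ensemble of n servers.
# quorum_size and tolerated_failures are hypothetical helper names.

def quorum_size(n: int) -> int:
    """Smallest number of servers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many servers can be down while a majority is still up."""
    return n - quorum_size(n)


for n in range(1, 8):
    print(f"servers={n} quorum={quorum_size(n)} can_lose={tolerated_failures(n)}")
```

Note that 2 servers tolerate no failures (same as 1), and 4 tolerate only one (same as 3), so an even count adds a server without adding fault tolerance.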
What is the problem with running more than 7 nodes in a ZooKeeper ensemble?
The consensus protocol does not scale well to more than 7 nodes
In a ZooKeeper configuration file, what does the initLimit setting do?
It sets the maximum time allowed for a follower to establish a connection with a leader
A ZooKeeper configuration file contains the following:
tickTime=2000
initLimit=20
How many seconds will the follower be allotted to establish a connection with the leader?
40 seconds
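Worked example (a sketch, not from the book): initLimit is counted in ticks, and tickTime gives the length of one tick in milliseconds, so the time limit is their product. The function name is made up for illustration.

```python
# Sketch: convert a tick-based ZooKeeper limit into seconds.
# limit_seconds is a hypothetical helper name.

def limit_seconds(tick_time_ms: int, limit_ticks: int) -> float:
    """Time limit in seconds for a setting expressed in ticks."""
    return tick_time_ms * limit_ticks / 1000


print(limit_seconds(tick_time_ms=2000, limit_ticks=20))  # tickTime=2000, initLimit=20
```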
In a ZooKeeper configuration file, what does the syncLimit setting do?
It sets the maximum amount by which followers can be out of sync with the leader
How does a ZooKeeper server know its ID?
It reads the myid file in the data directory
A ZooKeeper configuration file contains the following:
server.1=zoo1.example.com:2888:3888
What does that do?
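It declares that the ZooKeeper server with ID 1 has hostname zoo1.example.com, uses port 2888 for communication between ensemble members (the peer port), and uses port 3888 for leader election (the leader port)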
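Worked example (a sketch, not from the book): the entry has the shape server.X=hostname:peerPort:leaderPort, which can be pulled apart with plain string splitting. ServerEntry and parse_server_line are made-up names, and the parser handles only this simple three-field form.

```python
from typing import NamedTuple

# Sketch: split a "server.X=hostname:peerPort:leaderPort" entry into fields.
# ServerEntry and parse_server_line are hypothetical names; only the simple
# three-field form from the flashcard is handled.

class ServerEntry(NamedTuple):
    server_id: int    # X in server.X, matched against the myid file
    hostname: str
    peer_port: int    # traffic between ensemble members
    leader_port: int  # leader election


def parse_server_line(line: str) -> ServerEntry:
    key, value = line.strip().split("=", 1)
    server_id = int(key.split(".", 1)[1])
    hostname, peer_port, leader_port = value.split(":")
    return ServerEntry(server_id, hostname, int(peer_port), int(leader_port))


entry = parse_server_line("server.1=zoo1.example.com:2888:3888")
print(entry)
```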