all Flashcards

1
Q

Which security protocols does Kafka support?

A
  • PLAINTEXT
  • SSL
  • SASL_PLAINTEXT
  • SASL_SSL
2
Q

broker.id

A
  • Broker config
  • General broker parameter
  • integer
3
Q

listeners

A
  • Broker config
  • General broker parameter
  • comma-separated list of URIs
  • each URI has the form:
    <protocol>://<hostname>:<port> e.g. SSL://localhost:9091
4
Q

What happens if a broker’s listener port is lower than 1024?

A

Kafka must be started as root
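
A minimal sketch of the two listener cards above: it splits a `listeners` value into its URIs and flags any listener on a port below 1024, which would require starting Kafka as root. The function names `parse_listeners` and `needs_root` are hypothetical helpers for illustration, not part of Kafka.

```python
def parse_listeners(value):
    """Split a comma-separated listeners string into (name, host, port) tuples."""
    listeners = []
    for uri in value.split(","):
        # Each URI has the form <protocol>://<hostname>:<port>
        name, _, address = uri.strip().partition("://")
        host, _, port = address.rpartition(":")
        listeners.append((name, host, int(port)))
    return listeners


def needs_root(listeners):
    """Return the names of listeners bound to a privileged port (< 1024)."""
    return [name for name, _host, port in listeners if port < 1024]
```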

5
Q

listener.security.protocol.map

A
  • General broker parameter
  • configured if a listener name is not a common security protocol
6
Q

zookeeper.connect

A
  • Broker config
  • General broker parameter
  • comma-separated list of hostname:port entries, optionally followed by /path
    path is an optional chroot path
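
A small sketch of how such a connection string breaks down, assuming the comma-separated form used in the Kafka documentation (for example `zk1:2181,zk2:2181/kafka`). The helper name `parse_zookeeper_connect` and the host names are hypothetical.

```python
def parse_zookeeper_connect(value):
    """Return ([(host, port), ...], chroot_path_or_None)."""
    # Everything from the first "/" onward is the optional chroot path.
    hosts_part, slash, chroot = value.partition("/")
    hosts = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().rpartition(":")
        hosts.append((host, int(port)))
    return hosts, (slash + chroot) or None
```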
7
Q

log.dirs

A
  • Broker config
  • General broker parameter
  • the directories where log segments are stored
  • all log segments of one partition are stored within the same path
  • the broker places a new partition in the “least used” path, i.e. the one currently holding the fewest partitions
  • defaults to log.dir (singular) if missing
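
The “least used” placement can be sketched as below, assuming it means the directory with the fewest partitions rather than the most free disk space. `pick_log_dir` and the example paths are hypothetical.

```python
def pick_log_dir(partition_counts):
    """Return the log directory that currently holds the fewest partitions.

    partition_counts maps each configured log.dirs path to its partition count.
    """
    return min(partition_counts, key=partition_counts.get)
```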
8
Q

num.recovery.threads.per.data.dir

A
  • Broker config
  • General broker parameter
  • number of threads per log directory
  • threads are used to:
    • open log segment files
    • close log segment files
    • check and truncate log segment files after failure
  • the threads only run at startup and shutdown, so it is safe to increase their number
9
Q

auto.create.topics.enable

A
  • Broker config
  • General broker parameter
  • the broker will automatically create a topic when:
    • a producer starts writing to it
    • a consumer starts reading from it
    • any client requests metadata for it
10
Q

auto.leader.rebalance.enable

A
  • Broker config
  • General broker parameter
  • enables background thread checking distribution of partitions
  • seeks to avoid having topic leadership concentrated in one or few brokers
11
Q

leader.imbalance.check.interval.seconds

A
  • Broker config
  • General broker parameter
  • how often, in seconds, the broker checks for partition leader imbalances
12
Q

leader.imbalance.per.broker.percentage

A
  • Broker config
  • General broker parameter
  • if a broker’s leadership imbalance exceeds this percentage, a leader rebalance is initiated
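
A rough sketch of the check, under the assumption that a broker’s imbalance is the share of partitions for which it is the preferred leader but not the current leader. The data layout and the function names are hypothetical.

```python
def imbalance_percentage(assignments, broker_id):
    """assignments: list of (preferred_leader, current_leader) pairs, one per partition."""
    preferred = [pair for pair in assignments if pair[0] == broker_id]
    if not preferred:
        return 0.0
    not_leading = sum(1 for _preferred, current in preferred if current != broker_id)
    return 100.0 * not_leading / len(preferred)


def should_rebalance(assignments, broker_id, threshold_percentage):
    """True when the imbalance exceeds leader.imbalance.per.broker.percentage."""
    return imbalance_percentage(assignments, broker_id) > threshold_percentage
```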
13
Q

delete.topic.enable

A
  • General broker parameter
  • dangerous: when enabled, topics can be deleted, so set it to false if topic deletion must be prevented
14
Q

num.partitions

A
  • Broker config
  • topic default
  • defaults to 1
  • primarily used when auto topic creation is enabled
  • the number of partitions of a topic can be increased but never decreased
15
Q

default.replication.factor

A
  • Broker config
  • topic default
  • if auto-topic creation enabled, this value sets the replication factor
  • should be at least 1 over the min.insync.replicas (RF+)
  • even better is RF++ to allow maintenance and prevent outages
16
Q

log.retention.ms

A
  • Broker config
  • topic default
  • takes precedence over log.retention.minutes and log.retention.hours
  • how long Kafka will retain messages
  • retention is performed by examining the last modified time of each log segment file on disk, which is normally the time the log segment was closed
  • this retention is on topic level
  • if log.retention.bytes has also been configured, messages may be removed when either criterion is met
17
Q

log.retention.minutes

A
  • Broker config
  • topic default
  • takes precedence over log.retention.hours
18
Q

log.retention.hours

A
  • Broker config
  • topic default
  • see log.retention.ms
19
Q

log.retention.bytes

A
  • Broker config
  • topic default
  • applied per partition (bytes per partition, so adding partitions increases the total retention size of the topic)
  • if both this and log.retention.ms are set, messages may be removed when either criterion is met
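
The “either criterion” rule from the retention cards can be sketched as follows. This is a simplified model, not Kafka’s implementation: it assumes one partition whose closed segments are listed oldest first, and `expired_segments` and its parameters are hypothetical names.

```python
def expired_segments(segments, now_ms, retention_ms=None, retention_bytes=None):
    """Return indexes of closed segments eligible for deletion.

    segments: list of (last_modified_ms, size_bytes), oldest first.
    """
    expired = set()
    # Time criterion: the segment was last modified longer ago than retention_ms.
    if retention_ms is not None:
        for index, (modified_ms, _size) in enumerate(segments):
            if now_ms - modified_ms > retention_ms:
                expired.add(index)
    # Size criterion: drop the oldest segments until the partition fits the limit.
    if retention_bytes is not None:
        total = sum(size for _modified, size in segments)
        for index, (_modified, size) in enumerate(segments):
            if total <= retention_bytes:
                break
            expired.add(index)
            total -= size
    return sorted(expired)
```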
20
Q

log.segment.bytes

A
  • Broker config
  • topic default
  • defaults to 1GB
  • once a segment reaches the size specified in log.segment.bytes, it is closed and can be considered for expiration
21
Q

log.roll.ms

A
  • Broker config
  • topic default
  • the amount of time after which a log segment should be closed
  • not mutually exclusive with log.segment.bytes
  • for low-volume partitions, time-based rolling can close many log segments at the same time, which can affect disk performance
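
Because the size and time limits are not mutually exclusive, a segment is closed when either one is reached. A minimal sketch, with `should_roll` and its parameter names as hypothetical stand-ins for the two configs:

```python
def should_roll(segment_size_bytes, segment_age_ms, segment_bytes=1024 ** 3, roll_ms=None):
    """True when the active segment should be closed.

    segment_bytes mirrors log.segment.bytes (1 GB default per the card above);
    roll_ms mirrors log.roll.ms and is ignored when unset.
    """
    if segment_size_bytes >= segment_bytes:
        return True
    return roll_ms is not None and segment_age_ms >= roll_ms
```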
22
Q

min.insync.replicas

A
  • Broker config
  • topic default
  • defaults to 1
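  • the minimum number of in-sync replicas that must acknowledge a write for it to succeed (enforced when the producer uses acks=all)
  • setting it to 2 ensures at least 2 replicas are caught up and in sync before a write is acknowledged to the producer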
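
A sketch of the acceptance rule, assuming it is only enforced for producers that request `acks=all`. `accept_write` is a hypothetical helper, not a Kafka API.

```python
def accept_write(in_sync_replicas, min_insync_replicas, acks="all"):
    """True when the broker would accept the produce request.

    in_sync_replicas: current number of in-sync replicas for the partition.
    """
    if acks != "all":
        # Assumption: the minimum is not checked for weaker acknowledgement modes.
        return True
    return in_sync_replicas >= min_insync_replicas
```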
23
Q

message.max.bytes

A
  • Broker config
  • topic default
  • defaults to 1MB
  • messages larger than this value are not accepted and the producer receives an error
  • this value is the max size of a compressed message
  • must be coordinated with the configs:
    • fetch.message.max.bytes
    • replica.fetch.max.bytes
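
The coordination requirement can be checked mechanically: a fetch limit smaller than message.max.bytes could leave a consumer or replica unable to fetch the largest accepted message. The helper below is a hypothetical sketch of that comparison.

```python
def undersized_fetch_configs(message_max_bytes, fetch_configs):
    """Return the names of fetch limits that are smaller than message.max.bytes."""
    return [name for name, value in fetch_configs.items() if value < message_max_bytes]
```

For example, with `message.max.bytes` at 2 MB and both fetch settings still at 1 MB, both names would be reported.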
24
Q

Major factors for performance bottlenecks

A
  1. disk throughput
  2. disk capacity
  3. memory
  4. CPU
  5. networking
25
Q

Faster disk writes =

A

lower produce latency

26
Q

What part of memory is more important for Kafka

A

The page cache. The JVM heap needs comparatively little: a 5 GB heap is enough for a broker handling 150k messages per second at a data rate of 200 megabits per second.

27
Q

Why is there a networking imbalance

A

Outbound traffic is higher than inbound traffic (many consumers for one producer). NICs of at least 10 Gb are recommended.

28
Q

Does Kafka need extremely performant CPU

A

No. Kafka uses the CPU mainly to decompress message batches so it can validate their checksums, and then to recompress the batches.

29
Q

Kafka per broker size recommendations

A
  • fewer than 14K partition replicas per broker
  • fewer than 1M replicas per cluster
30
Q

Broker configuration requirements

A
  • all brokers must have the same `zookeeper.connect`
  • each broker must have a unique `broker.id`
31
Q

OS Tuning - Virtual Memory

A
  • vm.swappiness = 1 (i.e. do not swap unless there is an out-of-memory condition)
  • vm.dirty_background_ratio = 5 (default is 10; percentage of total system memory that may be dirty before background flushing starts)
  • vm.dirty_ratio = 60 to 80 (default is 20; percentage of total system memory that may be dirty before a synchronous flush to disk is forced)
  • vm.max_map_count = 400k to 600k (maximum number of memory map areas a process may have; a broker with many log segments needs a high value)
  • vm.overcommit_memory = 0 (it’s the default)
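
A small sketch that compares current kernel values against the ranges listed above. The kernel’s name for the overcommit setting is `vm.overcommit_memory`; the `out_of_range` helper and the dictionary layout are hypothetical.

```python
# (low, high) bounds taken from the recommendations on this card.
RECOMMENDED_VM = {
    "vm.swappiness": (1, 1),
    "vm.dirty_background_ratio": (5, 5),
    "vm.dirty_ratio": (60, 80),
    "vm.max_map_count": (400_000, 600_000),
    "vm.overcommit_memory": (0, 0),
}


def out_of_range(current):
    """Return the setting names whose current value is missing or outside its range."""
    problems = []
    for name, (low, high) in RECOMMENDED_VM.items():
        value = current.get(name)
        if value is None or not low <= value <= high:
            problems.append(name)
    return problems
```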
32
Q

vm.swappiness

A
  • OS virtual memory setting
  • set vm.swappiness = 1 (i.e. do not swap unless there is an out-of-memory condition)
33
Q

vm.dirty_background_ratio

A
  • OS virtual memory setting
  • set vm.dirty_background_ratio = 5 (default is 10; percentage of total system memory allowed in dirty pages before the background process starts flushing them to disk)
34
Q

vm.dirty_ratio

A
  • OS virtual memory setting
  • set vm.dirty_ratio = 60 to 80 (default is 20; percentage of total system memory that may be dirty before a synchronous flush to disk is forced)
35
Q

vm.max_map_count

A
  • OS virtual memory setting
  • vm.max_map_count = 400k to 600k (maximum number of memory map areas a process may have; a broker with many log segments needs a high value)
36
Q

vm.overcommit_memory

A
  • OS virtual memory setting
  • vm.overcommit_memory = 0 (it’s the default). Setting it to 0 means the kernel itself determines how much free memory is available to an application
37
Q

OS tuning - Disk

A
  • XFS filesystem is better than Ext4
  • set the `noatime` mount option (i.e. no access-time writes; disabling access-time writes is safe)
  • set the `largeio` mount option, which improves efficiency for larger disk writes
38
Q

OS tuning - networking

A
  • increase socket buffer sizes
    1. net.core.wmem_default = 131072 (128KiB)
    2. net.core.rmem_default = 131072 (128KiB)
    3. net.core.wmem_max = 2097152 (2MiB)
    4. net.core.rmem_max = 2097152 (2MiB)
  • increase TCP socket buffer sizes
    1. net.ipv4.tcp_wmem=<min> <default> <max>
    2. net.ipv4.tcp_rmem=<min> <default> <max>
    e.g. 4096 65536 2048000 (4KiB, 64KiB, about 2MiB)
  • net.ipv4.tcp_window_scaling=1 allows more efficient data transfers
  • net.ipv4.tcp_max_syn_backlog above 1024 allows more simultaneous connections
  • net.core.netdev_max_backlog above 1000 helps with bursts of network traffic
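
As an illustration, the buffer settings above can be rendered as `key = value` lines of the kind used in a sysctl configuration file. The kernel spells these keys with underscores (`net.core.wmem_default` and so on); `render_sysctl` is a hypothetical helper.

```python
NETWORK_SETTINGS = {
    "net.core.wmem_default": 131072,
    "net.core.rmem_default": 131072,
    "net.core.wmem_max": 2097152,
    "net.core.rmem_max": 2097152,
    # <min> <default> <max>, using the example values from the card
    "net.ipv4.tcp_wmem": (4096, 65536, 2048000),
    "net.ipv4.tcp_rmem": (4096, 65536, 2048000),
    "net.ipv4.tcp_window_scaling": 1,
}


def render_sysctl(settings):
    """Render settings as 'key = value' lines; tuples become space-separated values."""
    lines = []
    for name, value in settings.items():
        if isinstance(value, tuple):
            value = " ".join(str(part) for part in value)
        lines.append(f"{name} = {value}")
    return "\n".join(lines)
```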
39
Q

Kafka producer - mandatory properties

A
  1. bootstrap.servers
  2. key.serializer
  3. value.serializer
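
A minimal sketch of a producer configuration holding the three mandatory properties, with a check that none is missing. The broker addresses are made up for illustration, and `missing_properties` is a hypothetical helper; the serializer class names are those of Kafka’s Java client.

```python
REQUIRED_PRODUCER_PROPERTIES = ("bootstrap.servers", "key.serializer", "value.serializer")

# Example values only; broker1 and broker2 are placeholder host names.
producer_config = {
    "bootstrap.servers": "broker1:9092,broker2:9092",
    "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
    "value.serializer": "org.apache.kafka.common.serialization.StringSerializer",
}


def missing_properties(config):
    """Return the mandatory producer properties that are absent or empty."""
    return [name for name in REQUIRED_PRODUCER_PROPERTIES if not config.get(name)]
```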
40
Q

Kafka producer - primary send methods

A
  1. fire-and-forget
  2. synchronous send
  3. asynchronous send
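
The difference between the three methods can be sketched without a broker by using Python futures as a stand-in for the producer. Everything here is a simulation: `fake_send` is a hypothetical placeholder for the network call, and a real client library would supply its own send method and future type.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)


def fake_send(record):
    """Hypothetical stand-in for a network send; returns a made-up offset."""
    return len(record)


def fire_and_forget(record):
    # Submit and ignore the outcome: failures go unnoticed.
    executor.submit(fake_send, record)


def send_sync(record):
    # Block until the send finishes (or fails) before continuing.
    return executor.submit(fake_send, record).result(timeout=5)


def send_async(record, on_done):
    # Return immediately; the callback receives the result once the send completes.
    future = executor.submit(fake_send, record)
    future.add_done_callback(lambda done: on_done(done.result()))
    return future
```

Synchronous sends trade throughput for certainty, while asynchronous sends keep the caller moving and handle the result in the callback.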