Managed Streaming for Apache Kafka Flashcards
What is the default maximum message size in AWS Kafka (MSK)?
1 MB
Can message size be increased in AWS Kafka?
Yes. You can configure Apache Kafka to send and receive larger messages, for example up to 10 MB.
Kinesis, by contrast, has a hard limit of 1 MB per message.
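As a rough illustration of the card above, raising the message size usually means several settings have to agree. The sketch below is only a sketch: the setting names are standard Apache Kafka configuration keys, the 10 MB value mirrors the card's example, and the helper name `large_message_settings` is hypothetical.

```python
# Sketch: settings that typically need to agree when raising Kafka's
# maximum message size. The keys are standard Apache Kafka configuration
# names; the 10 MB default mirrors the example in the card above.

TEN_MB = 10 * 1024 * 1024  # 10,485,760 bytes


def large_message_settings(max_bytes: int = TEN_MB) -> dict:
    """Return broker, producer, and consumer settings for one size limit."""
    return {
        "broker": {
            # Largest record batch the broker will accept.
            "message.max.bytes": max_bytes,
            # Followers must be able to replicate batches of that size.
            "replica.fetch.max.bytes": max_bytes,
        },
        "producer": {
            # Largest request the producer will send.
            "max.request.size": max_bytes,
        },
        "consumer": {
            # Largest amount of data fetched per partition.
            "max.partition.fetch.bytes": max_bytes,
        },
    }


if __name__ == "__main__":
    for role, settings in large_message_settings().items():
        for key, value in settings.items():
            print(f"{role}: {key}={value}")
```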
What is Kafka in AWS?
Amazon MSK (Managed Streaming for Apache Kafka), an alternative to Kinesis
* Fully managed Apache Kafka on AWS
* Allows you to create, update, and delete clusters
* MSK creates & manages Kafka broker nodes & ZooKeeper nodes for you
* Deploy the MSK cluster in your VPC, multi AZ (up to 3 for HA)
* Automatic recovery from common Apache Kafka failures
* Data is stored on EBS volumes
* You can build producers and consumers of data
MSK Configuration?
To set up a private cluster:
* Choose the number of Availability Zones (two or three is recommended).
* Select the VPC and subnets.
* Choose the broker instance type (for example, m5.large) and the number of brokers per AZ. You can add more brokers over time.
* This results in one ZooKeeper node per AZ and one or two Kafka brokers per AZ. For example, with three AZs and two brokers per AZ, you have three ZooKeeper nodes and six Kafka brokers.
* Choose the EBS volume size, which can range from 1 GB to 16 TB. Since you size the storage yourself, you can retain data for as long as your requirements dictate, which is more flexible than Kinesis Data Streams.
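The node count in the card above is simple arithmetic. A minimal sketch, assuming the card's simplified model of one ZooKeeper node per AZ (the function name `cluster_nodes` is hypothetical):

```python
# Sketch of the sizing arithmetic from the card above, assuming one
# ZooKeeper node per AZ and a fixed number of Kafka brokers per AZ.

def cluster_nodes(num_azs: int, brokers_per_az: int) -> dict:
    """Return the ZooKeeper and broker counts for a cluster layout."""
    if num_azs not in (2, 3):
        raise ValueError("the card describes clusters spanning 2 or 3 AZs")
    if brokers_per_az < 1:
        raise ValueError("at least one broker per AZ is required")
    return {
        "zookeeper_nodes": num_azs,
        "kafka_brokers": num_azs * brokers_per_az,
    }


if __name__ == "__main__":
    # Three AZs with two brokers each: 3 ZooKeeper nodes, 6 brokers.
    print(cluster_nodes(3, 2))
```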
Exam Question
Kafka Security?
Security is crucial in Apache Kafka, and you may be asked about it on the exam.
- In-flight encryption between brokers uses TLS. It is enabled by default but can be disabled to improve performance.
- In-flight encryption between clients and brokers also uses TLS. It is optional: enabled by default, but it can be disabled for performance reasons.
- At-rest encryption for the EBS volumes is provided through KMS.
- Network security is enforced with security groups that authorize your Kafka clients.
- Authentication and authorization are critical aspects of Kafka security.
- There are three mechanisms available for authentication and authorization: MutualTLS, SASL/SCRAM, and IAM Access Control.
- MutualTLS uses TLS certificates for both encryption and authentication, and Kafka ACLs are used for authorization at the topic level.
- SASL/SCRAM uses username/password authentication, and Kafka ACLs are used for authorization.
- IAM Access Control allows for both authentication and authorization using IAM policies.
- Kafka ACLs for MutualTLS and SASL/SCRAM must be defined from within the Kafka cluster and cannot be managed using IAM policies.
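The three options above can be summarized as a small lookup table. This is a sketch rather than an official mapping: the `client_properties` values are the Kafka client settings commonly paired with each mechanism and should be treated as assumptions to check against the MSK documentation, and the names `MECHANISMS` and `uses_kafka_acls` are hypothetical.

```python
# Sketch: who authenticates and who authorizes for each MSK option.
# The client_properties entries are assumptions (typical Kafka client
# settings for each mechanism), not values taken from the cards.

MECHANISMS = {
    "mutual_tls": {
        "authentication": "TLS client certificates",
        "authorization": "Kafka ACLs",
        "client_properties": {"security.protocol": "SSL"},
    },
    "sasl_scram": {
        "authentication": "username/password",
        "authorization": "Kafka ACLs",
        "client_properties": {
            "security.protocol": "SASL_SSL",
            "sasl.mechanism": "SCRAM-SHA-512",
        },
    },
    "iam": {
        "authentication": "IAM",
        "authorization": "IAM policies",
        "client_properties": {
            "security.protocol": "SASL_SSL",
            "sasl.mechanism": "AWS_MSK_IAM",
        },
    },
}


def uses_kafka_acls(mechanism: str) -> bool:
    """True when authorization is defined inside the Kafka cluster."""
    return MECHANISMS[mechanism]["authorization"] == "Kafka ACLs"


if __name__ == "__main__":
    for name, info in MECHANISMS.items():
        print(f"{name}: authN={info['authentication']}, authZ={info['authorization']}")
```

Only the IAM option keeps both authentication and authorization in IAM policies; the other two rely on ACLs managed inside the cluster.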
MSK - Monitoring?
CloudWatch Metrics
* Basic monitoring (cluster and broker metrics)
* Enhanced monitoring (++enhanced broker metrics)
* Topic level monitoring (++enhanced topic level metrics)
Prometheus (Open Source Monitoring)
* Opens a port on the broker to export cluster, broker and topic level metrics
* Set up the JMX Exporter (metrics) or Node Exporter (CPU and disk metrics)
Broker Log Delivery
* Delivery to CloudWatch Logs
* Delivery to Amazon S3
* Delivery to Kinesis Data Firehose
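For the Prometheus option above, a scraper needs one target per broker and exporter. A minimal sketch, assuming the ports MSK open monitoring is documented to use (11001 for the JMX Exporter, 11002 for the Node Exporter); the broker hostnames and the function name `scrape_targets` are hypothetical.

```python
# Sketch: build Prometheus scrape targets for each broker.
# Port numbers are assumptions based on MSK open monitoring defaults;
# the hostnames in the example are placeholders, not real brokers.

JMX_EXPORTER_PORT = 11001   # Kafka (JMX) metrics
NODE_EXPORTER_PORT = 11002  # CPU and disk metrics


def scrape_targets(broker_hosts: list) -> dict:
    """Return host:port targets grouped by exporter type."""
    return {
        "jmx": [f"{host}:{JMX_EXPORTER_PORT}" for host in broker_hosts],
        "node": [f"{host}:{NODE_EXPORTER_PORT}" for host in broker_hosts],
    }


if __name__ == "__main__":
    hosts = ["broker-1.example.internal", "broker-2.example.internal"]
    for exporter, targets in scrape_targets(hosts).items():
        print(exporter, targets)
```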
What is the default protocol for in-flight encryption between Kafka brokers?
TLS
Can TLS encryption between Kafka clients and brokers be disabled?
Yes
What is the mechanism for encryption at rest for EBS volumes in Kafka?
KMS
How can network security be enforced for Kafka clients?
By attaching security groups
What are the critical aspects of Kafka security?
Authentication and authorization
What are the three available mechanisms for authentication and authorization in Kafka?
MutualTLS, SASL/SCRAM, IAM Access Control
What is MutualTLS?
TLS certificates used for encryption and authentication
What is used for authorization in MutualTLS?
Kafka ACLs at the topic level
What is SASL/SCRAM?
Username/password authentication mechanism