MSK Flashcards
What is Amazon MSK?
A fully managed service for Apache Kafka on AWS. It allows you to create, update, and delete clusters without managing the underlying infrastructure.
What are the key benefits of using Amazon MSK?
- Ease of management: MSK handles the creation and management of Kafka brokers, ZooKeeper nodes, and other infrastructure components.
- Scalability and availability: You can easily scale your MSK clusters up or down and deploy them across multiple Availability Zones for high availability.
- Security: MSK offers various security features, including encryption at rest and in transit, network security, and authentication and authorization mechanisms.
- Cost-effectiveness: You pay only for the resources you use, and there are no upfront costs.
How does Amazon MSK ensure high availability?
Amazon MSK deploys your cluster in multiple Availability Zones (up to 3) to ensure high availability. It also provides automatic recovery from common Apache Kafka failures.
What storage options are available for Amazon MSK?
Data in Amazon MSK is stored on EBS volumes, providing persistent storage for your Kafka data.
What is the default message size limit in Amazon MSK?
The default message size limit in Amazon MSK is 1 MB. However, you can configure it to support larger messages (e.g., 10 MB) through custom configurations.
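A minimal sketch of the custom configuration mentioned above, assuming the 10 MB target from the example. `message.max.bytes` and `replica.fetch.max.bytes` are standard Kafka broker properties; the helper name `large_message_properties` is hypothetical and only renders the text you would place in an MSK custom configuration.

```python
def large_message_properties(max_message_bytes: int) -> str:
    """Render Kafka broker properties that raise the message size limit."""
    if max_message_bytes <= 0:
        raise ValueError("max_message_bytes must be positive")
    properties = {
        # Largest record batch the broker will accept.
        "message.max.bytes": max_message_bytes,
        # Commonly raised together so replica fetches are sized for larger batches.
        "replica.fetch.max.bytes": max_message_bytes,
    }
    return "\n".join(f"{key}={value}" for key, value in properties.items())


TEN_MB = 10 * 1024 * 1024  # the 10 MB example from the card above
print(large_message_properties(TEN_MB))
```

Producers and consumers have their own size limits (for example `max.request.size` on the producer and `max.partition.fetch.bytes` on the consumer), so those usually need raising as well.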
How do you configure an Amazon MSK cluster?
When configuring an Amazon MSK cluster, you can choose the number of Availability Zones (3 recommended), VPC and subnets, broker instance type, number of brokers per AZ, and the size of your EBS volumes (1 GB - 16 TB).
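The choices above map roughly onto a boto3 `create_cluster` request. This is a sketch under assumptions: one subnet per Availability Zone, and placeholder values for the cluster name, Kafka version, subnet IDs, and security group ID. `build_cluster_request` and `create_cluster` are hypothetical helper names; only the request keys come from the MSK API.

```python
def build_cluster_request(name, kafka_version, subnet_ids, security_group_ids,
                          instance_type, brokers_per_az, volume_size_gib):
    """Build keyword arguments for the MSK CreateCluster call."""
    if brokers_per_az < 1:
        raise ValueError("brokers_per_az must be at least 1")
    if not 1 <= volume_size_gib <= 16384:  # 1 GB to 16 TB, as stated above
        raise ValueError("volume_size_gib must be between 1 and 16384")
    return {
        "ClusterName": name,
        "KafkaVersion": kafka_version,
        # Assumes one subnet per AZ, so total brokers = brokers per AZ * AZ count.
        "NumberOfBrokerNodes": brokers_per_az * len(subnet_ids),
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            "ClientSubnets": list(subnet_ids),
            "SecurityGroups": list(security_group_ids),
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_size_gib}},
        },
    }


def create_cluster(request, region):
    """Send the request to AWS. Needs boto3 and credentials; not called here."""
    import boto3  # imported lazily so the builder above runs without boto3
    return boto3.client("kafka", region_name=region).create_cluster(**request)


# Placeholder identifiers for illustration only.
request = build_cluster_request(
    "demo-cluster", "3.6.0",
    ["subnet-aaa", "subnet-bbb", "subnet-ccc"], ["sg-111"],
    "kafka.m5.large", brokers_per_az=1, volume_size_gib=100,
)
print(request["NumberOfBrokerNodes"])
```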
What security features does Amazon MSK offer?
Amazon MSK provides several security features, including:
- Encryption: Optional in-flight encryption using TLS between brokers and between clients and brokers. At-rest encryption for your EBS volumes using KMS.
- Network security: Authorize specific security groups for your Apache Kafka clients.
- Authentication and authorization: Define who can read/write to which topics using Mutual TLS (AuthN) + Kafka ACLs (AuthZ), SASL/SCRAM (AuthN) + Kafka ACLs (AuthZ), or IAM Access Control (AuthN + AuthZ).
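As a rough illustration of how the three authentication options differ on the client side, the sketch below renders the Kafka client properties each one typically needs. The IAM class names assume the `aws-msk-iam-auth` library is on the client classpath. Keystore paths for Mutual TLS and the username/password JAAS entry for SASL/SCRAM are deliberately omitted, and `client_properties` is a hypothetical helper name.

```python
# Broker ports MSK uses for each mechanism when connecting from within AWS.
BROKER_PORTS = {"mtls": 9094, "scram": 9096, "iam": 9098}

CLIENT_SETTINGS = {
    # Mutual TLS: the client also needs keystore settings (omitted here).
    "mtls": {"security.protocol": "SSL"},
    # SASL/SCRAM: the client also needs a JAAS entry with credentials (omitted).
    "scram": {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "SCRAM-SHA-512",
    },
    # IAM Access Control: classes come from the aws-msk-iam-auth library.
    "iam": {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "AWS_MSK_IAM",
        "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
        "sasl.client.callback.handler.class":
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
    },
}


def client_properties(auth_mode: str) -> str:
    """Render client properties for one of: 'mtls', 'scram', 'iam'."""
    try:
        settings = CLIENT_SETTINGS[auth_mode]
    except KeyError:
        raise ValueError(f"unknown auth mode: {auth_mode!r}") from None
    return "\n".join(f"{key}={value}" for key, value in settings.items())


print(client_properties("iam"))
```

With Mutual TLS and SASL/SCRAM, authorization is still handled separately through Kafka ACLs; with IAM Access Control, the IAM policy covers both.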
How can you monitor your Amazon MSK clusters?
Amazon MSK offers several monitoring options:
- CloudWatch Metrics: Provides basic monitoring (cluster and broker metrics), enhanced monitoring (enhanced broker metrics), and topic-level monitoring (enhanced topic-level metrics).
- Prometheus (Open-Source Monitoring): Opens a port on the broker to export cluster, broker, and topic-level metrics. You can set up the JMX Exporter (metrics) or Node Exporter (CPU and disk metrics).
- Broker Log Delivery: Delivers broker logs to CloudWatch Logs, Amazon S3, or Kinesis Data Firehose.
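The monitoring options above correspond to three settings on the MSK cluster API: `EnhancedMonitoring`, `OpenMonitoring`, and `LoggingInfo`. The sketch below only assembles those settings as a dictionary; `build_monitoring_settings` is a hypothetical helper and the log group name is a placeholder.

```python
MONITORING_LEVELS = (
    "DEFAULT",                  # basic cluster and broker metrics
    "PER_BROKER",               # enhanced broker metrics
    "PER_TOPIC_PER_BROKER",     # enhanced topic-level metrics
    "PER_TOPIC_PER_PARTITION",  # topic metrics broken down by partition
)


def build_monitoring_settings(level, prometheus=False, log_group=None):
    """Assemble monitoring-related arguments for an MSK cluster request."""
    if level not in MONITORING_LEVELS:
        raise ValueError(f"level must be one of {MONITORING_LEVELS}")
    settings = {
        "EnhancedMonitoring": level,
        "OpenMonitoring": {
            "Prometheus": {
                # JMX Exporter exposes Kafka metrics, Node Exporter host metrics.
                "JmxExporter": {"EnabledInBroker": prometheus},
                "NodeExporter": {"EnabledInBroker": prometheus},
            }
        },
    }
    if log_group is not None:
        # Broker log delivery to CloudWatch Logs; other destinations are omitted.
        settings["LoggingInfo"] = {
            "BrokerLogs": {
                "CloudWatchLogs": {"Enabled": True, "LogGroup": log_group}
            }
        }
    return settings


settings = build_monitoring_settings(
    "PER_TOPIC_PER_BROKER", prometheus=True, log_group="/msk/demo-cluster"
)
print(sorted(settings))
```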
What is Amazon MSK Connect?
Amazon MSK Connect is a managed service that allows you to run Kafka Connect workers on AWS. It offers auto-scaling capabilities for workers and supports any Kafka Connect connector as a plugin (e.g., Amazon S3, Amazon Redshift, Amazon OpenSearch, Debezium).
What is Amazon MSK Serverless?
Amazon MSK Serverless is a capability that allows you to run Apache Kafka on MSK without managing the capacity. MSK automatically provisions resources and scales compute and storage based on your needs. You only need to define your topics and partitions.
How does Amazon MSK Serverless handle security?
Amazon MSK Serverless uses IAM Access Control for all clusters, providing a simple and integrated way to manage access to your Kafka resources.
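A short sketch of what a serverless cluster request might look like with boto3's `create_cluster_v2`, assuming placeholder subnet and security group IDs. Note that there is no broker count, instance type, or volume size to choose, and IAM is the only client authentication enabled. `build_serverless_request` and `create_serverless_cluster` are hypothetical helper names.

```python
def build_serverless_request(name, subnet_ids, security_group_ids):
    """Build keyword arguments for the MSK CreateClusterV2 call (serverless)."""
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    return {
        "ClusterName": name,
        "Serverless": {
            "VpcConfigs": [
                {
                    "SubnetIds": list(subnet_ids),
                    "SecurityGroupIds": list(security_group_ids),
                }
            ],
            # MSK Serverless relies on IAM Access Control for clients.
            "ClientAuthentication": {"Sasl": {"Iam": {"Enabled": True}}},
        },
    }


def create_serverless_cluster(request, region):
    """Send the request to AWS. Needs boto3 and credentials; not called here."""
    import boto3  # imported lazily so the builder above runs without boto3
    return boto3.client("kafka", region_name=region).create_cluster_v2(**request)


# Placeholder identifiers for illustration only.
request = build_serverless_request(
    "demo-serverless", ["subnet-aaa", "subnet-bbb"], ["sg-111"]
)
print(request["Serverless"]["ClientAuthentication"])
```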
What are the key differences between Kinesis Data Streams and Amazon MSK?
- Kinesis Data Streams: 1 MB message size limit, data streams with shards, shard splitting & merging, TLS in-flight encryption, KMS at-rest encryption, IAM policies for AuthN/AuthZ.
- Amazon MSK: 1 MB default message size (configurable to a higher limit), Kafka topics with partitions, partitions can be added to a topic but not removed, PLAINTEXT or TLS in-flight encryption, KMS at-rest encryption, Mutual TLS (AuthN) + Kafka ACLs (AuthZ), SASL/SCRAM (AuthN) + Kafka ACLs (AuthZ), IAM Access Control (AuthN + AuthZ).
What are the pricing models for Amazon MSK and Amazon MSK Serverless?
- Amazon MSK: You pay for the broker instance hours and the EBS storage used.
- Amazon MSK Serverless: You pay an hourly rate per cluster, an hourly rate per partition, a per-GB monthly rate for storage, and a per-GB rate for data transferred in and out.
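To make the two pricing models easier to compare, here is a small arithmetic sketch. Every rate is a parameter, and the numbers in the usage example are made-up placeholders, not AWS prices. The function names are hypothetical.

```python
def provisioned_monthly_cost(brokers, hours, broker_hourly_rate,
                             storage_gb, storage_gb_month_rate):
    """Amazon MSK: broker instance hours plus EBS storage."""
    broker_cost = brokers * hours * broker_hourly_rate
    storage_cost = storage_gb * storage_gb_month_rate
    return broker_cost + storage_cost


def serverless_monthly_cost(hours, cluster_hourly_rate,
                            partitions, partition_hourly_rate,
                            storage_gb, storage_gb_month_rate,
                            data_gb, data_gb_rate):
    """MSK Serverless: cluster hours, partition hours, storage, data transfer."""
    cluster_cost = hours * cluster_hourly_rate
    partition_cost = partitions * hours * partition_hourly_rate
    storage_cost = storage_gb * storage_gb_month_rate
    data_cost = data_gb * data_gb_rate
    return cluster_cost + partition_cost + storage_cost + data_cost


# Placeholder rates chosen only to show the arithmetic.
HOURS_IN_MONTH = 730
print(provisioned_monthly_cost(3, HOURS_IN_MONTH, 0.20, 1000, 0.10))
print(serverless_monthly_cost(HOURS_IN_MONTH, 0.75, 50, 0.0015,
                              1000, 0.10, 2000, 0.05))
```

Plugging in the current rates for your region shows where the crossover lies: provisioned cost grows with broker count, serverless cost with partitions and data volume.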
When should you choose Amazon MSK over Kinesis Data Streams?
Choose Amazon MSK when you need:
- Full compatibility with Apache Kafka.
- More control over your Kafka configurations.
- Support for larger message sizes.
- A wider range of authentication and authorization mechanisms.
When should you choose Kinesis Data Streams over Amazon MSK?
Choose Kinesis Data Streams when you need:
- A simpler, fully managed service with fewer configuration options.
- Integration with other AWS services like Kinesis Data Analytics and Kinesis Data Firehose.
- A cost-effective option for smaller streaming workloads, since you pay per shard (or for throughput in on-demand mode) rather than for always-on brokers.