Collection Flashcards
1
Q
KDS
A
- Retention 1-365 days
- Record = Partition Key + Data Blob (up to 1MB)
- Provisioned
- IN : 1MB per shard per sec
- OUT : 2MB per shard per sec
- On-demand
- Default capacity : 4MB per sec or 4000 records per sec
- Scales automatically based on peak throughput observed during the last 30 days
- replicates to 3 AZ
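
The provisioned limits above turn into a simple sizing rule. A minimal sketch, assuming the write limit also caps at 1000 records per shard per sec (not on this card); `shards_needed` is a hypothetical helper name, not an AWS API:

```python
import math

# Per-shard limits in provisioned mode.
IN_MB_PER_SHARD = 1.0        # from the card
OUT_MB_PER_SHARD = 2.0       # from the card
IN_RECORDS_PER_SHARD = 1000  # assumption: not stated on the card


def shards_needed(in_mb_s: float, in_records_s: float, out_mb_s: float) -> int:
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        1,
        math.ceil(in_mb_s / IN_MB_PER_SHARD),
        math.ceil(in_records_s / IN_RECORDS_PER_SHARD),
        math.ceil(out_mb_s / OUT_MB_PER_SHARD),
    )


print(shards_needed(in_mb_s=3.5, in_records_s=2500, out_mb_s=9.0))  # 5
```

Here the read side is the bottleneck: 9MB per sec of reads needs 5 shards even though writes fit in 4.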
2
Q
Kinesis Producer SDK
A
- Use Cases : Support multiple programming languages
- PutRecord vs PutRecords
- PutRecords uses batching and increases throughput
- ProvisionedThroughputExceededException
- Solution : Retry with backoff, increase # shards, and choose a well-distributed partition key
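
Retries with backoff can be sketched in a few lines. This is an illustration only: `ThroughputExceeded` is a hypothetical stand-in for the SDK's throttling error, and `send` is any callable that writes one record. The delay doubles each attempt and is randomized (jitter) so that throttled producers do not all retry at the same moment.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def put_with_backoff(send, record, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call send(record), retrying throttled calls with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send(record)
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait a random time up to base_delay * 2^attempt before retrying.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Backoff only smooths short bursts. If throttling is constant, more shards or a better partition key is the real fix.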
3
Q
Kinesis Producer Library
A
- Use Cases : High performance and long-running producers
- Synchronous and Asynchronous API
- Batching –> 1MB/s or 1000 records /s
- Compression must be implemented by users
- KPL records must be decoded with KCL or special helper library
- RecordMaxBufferedTime : 100ms by default (how long records may wait to be batched)
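
The batching trade-off (wait a little, send more per call) can be modelled with a small buffer. A sketch under stated assumptions: `RecordBuffer` is a hypothetical class, not the KPL API, and it only checks the timer when `add` or `maybe_flush` is called, whereas a real library would flush from a background thread.

```python
import time


class RecordBuffer:
    """Collects records and flushes them as one batch (hypothetical, not the KPL API)."""

    def __init__(self, flush, max_buffered_time=0.1, max_records=1000, clock=time.monotonic):
        self.flush = flush                          # callable that receives a list of records
        self.max_buffered_time = max_buffered_time  # seconds, mirrors RecordMaxBufferedTime
        self.max_records = max_records
        self.clock = clock
        self.records = []
        self.first_at = None                        # arrival time of the oldest buffered record

    def add(self, record):
        if not self.records:
            self.first_at = self.clock()
        self.records.append(record)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush when the batch is full or the oldest record has waited long enough."""
        if not self.records:
            return
        waited = self.clock() - self.first_at
        if len(self.records) >= self.max_records or waited >= self.max_buffered_time:
            batch, self.records = self.records, []
            self.flush(batch)
```

A larger `max_buffered_time` gives fuller batches at the cost of extra latency per record.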
4
Q
Kinesis Agent
A
- Use Cases : Monitor log files and send them to KDS
- On top of KPL
- Features
- write from multiple directories to multiple kinesis streams
- preprocess data before sending
- Able to handle file rotation, checkpointing and retry
- Emit metrics to CloudWatch for monitoring
5
Q
Kinesis Consumer SDK
A
- 2MB per shard per second, shared across all consumers
- GetRecords returns up to 10MB or up to 10,000 records per call
- Max 5 GetRecords API calls per shard per second
- 200ms latency
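
The 200ms figure follows from the 5 calls per second limit. A small sketch of the arithmetic, assuming consumers split a shard's call budget and read throughput evenly (an idealization); `shared_consumer_limits` is a hypothetical helper:

```python
GET_RECORDS_CALLS_PER_SHARD_S = 5   # from the card
READ_MB_PER_SHARD_S = 2.0           # from the card


def shared_consumer_limits(consumers: int) -> tuple[float, float]:
    """Return (minimum poll interval in ms, MB/s share) per consumer on one shard."""
    if consumers < 1:
        raise ValueError("need at least one consumer")
    min_interval_ms = 1000 * consumers / GET_RECORDS_CALLS_PER_SHARD_S
    mb_share = READ_MB_PER_SHARD_S / consumers
    return min_interval_ms, mb_share


for n in (1, 2, 5):
    print(n, shared_consumer_limits(n))
# 1 (200.0, 2.0)
# 2 (400.0, 1.0)
# 5 (1000.0, 0.4)
```

With 5 consumers polling the same shard, each one can poll only about once per second.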
6
Q
Kinesis Client Library
A
- Read records from Kinesis produced by KPL
- Share multiple shards among multiple consumers in one group
- Checkpointing feature to resume progress
- Leverage DynamoDB for checkpointing
- Make sure to provision enough WCU / RCU
- Or use on-demand capacity for DynamoDB; an under-provisioned table slows down KCL
- ExpiredIteratorException
- Solution : increase WCU
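
Checkpointing and resuming can be illustrated without any AWS dependency. A minimal sketch, assuming a plain dict stands in for the DynamoDB table and integer sequence numbers stand in for real ones; `process_shard` is a hypothetical function, not the KCL API:

```python
def process_shard(records, checkpoints, shard_id, handle, checkpoint_every=2):
    """Process (sequence_number, data) pairs, skipping anything already checkpointed."""
    last_done = checkpoints.get(shard_id, -1)
    since_checkpoint = 0
    for seq, data in records:
        if seq <= last_done:
            continue  # already processed before the restart
        handle(data)
        last_done = seq
        since_checkpoint += 1
        if since_checkpoint >= checkpoint_every:
            checkpoints[shard_id] = last_done  # one table write per checkpoint
            since_checkpoint = 0
    checkpoints[shard_id] = last_done  # final checkpoint at the end of the batch
```

If a worker dies mid-batch, the next worker restarts from the last stored sequence number. Checkpointing more often replays fewer records after a crash but costs more table writes, which is where WCU comes in.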
7
Q
Kinesis Connector Library
A
- S3
- DynamoDB
- Redshift
- Elasticsearch
8
Q
Kinesis and Lambda
A
- Lambda can source records from KDS
- Lambda consumer has a library to de-aggregate records from the KPL
9
Q
Kinesis Enhanced Fan Out
A
- 2MB /consumer /sec /shard
- Kinesis pushes data to consumers over HTTP/2
- 70 ms latency
- Limit of 20 registered consumers using enhanced fan-out per data stream (older material cites a default of 5)
- Use SubscribeToShard API
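
The difference from shared polling is easiest to see as arithmetic. A sketch assuming shared consumers split 2MB per shard per sec evenly (an idealization); `read_capacity_mb_s` is a hypothetical helper:

```python
def read_capacity_mb_s(shards: int, consumers: int, enhanced_fan_out: bool) -> dict:
    """Per-consumer and total read throughput for a stream, in MB/s."""
    per_shard = 2.0  # MB/s, from the card
    stream_rate = per_shard * shards
    if enhanced_fan_out:
        # Each consumer gets its own 2 MB/s per shard.
        return {"per_consumer": stream_rate, "total": stream_rate * consumers}
    # Shared mode: all consumers split a single 2 MB/s per shard.
    return {"per_consumer": stream_rate / consumers, "total": stream_rate}


print(read_capacity_mb_s(shards=4, consumers=5, enhanced_fan_out=False))
print(read_capacity_mb_s(shards=4, consumers=5, enhanced_fan_out=True))
```

With 4 shards and 5 consumers, shared mode leaves each consumer 1.6MB per sec, while enhanced fan-out gives each one the full 8MB per sec.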
10
Q
Auto Scaling
A
- API call to change the number of shards is UpdateShardCount
- We can implement AutoScaling with AWS Lambda
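
The decision logic inside such a Lambda could look like the sketch below. Assumptions, none of them from this card: shards are sized at 1MB per sec of writes, 25% headroom is added, and a single UpdateShardCount call may at most double or halve the current count. `target_shard_count` is a hypothetical helper; the actual API call is left out.

```python
import math


def target_shard_count(current: int, incoming_mb_s: float, headroom: float = 0.25) -> int:
    """Shard count a scaling function could pass to UpdateShardCount."""
    # Size for observed ingress plus headroom, at 1 MB/s of writes per shard.
    wanted = max(1, math.ceil(incoming_mb_s * (1 + headroom)))
    # Assumption: one call may at most double or halve the current count.
    upper = current * 2
    lower = math.ceil(current / 2)
    return min(max(wanted, lower), upper)


print(target_shard_count(current=4, incoming_mb_s=6.0))   # 8
print(target_shard_count(current=4, incoming_mb_s=20.0))  # 8 (clamped, scale again later)
print(target_shard_count(current=10, incoming_mb_s=1.0))  # 5 (clamped)
```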
11
Q
KDS Security
A
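- Encryption in flight : HTTPS (SSL/TLS) endpoints
- Encryption at rest : KMS
- VPC endpoints for private access from within a VPC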
- KCL –> grant read and write access to DynamoDB table
12
Q
Kinesis Data Firehose
A
- Fully managed
- Near real time (60 sec latency)
- Auto scaling
- Spark / KCL cannot read from KDF
- Destinations : S3, Splunk, Redshift, Elasticsearch
- Record Size 1MB
- Replicates records to 3 AZ
- Retention 24 hours
13
Q
KDF Buffer
A
- Buffer interval : e.g. 2 mins
- Buffer size : e.g. 32MB
- Flushes as soon as either limit is reached
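
Which limit triggers a flush depends on the incoming rate. A sketch of the arithmetic, assuming the buffer is flushed when the first of the two limits is reached and using the values from this card as defaults; `flush_trigger` is a hypothetical helper:

```python
def flush_trigger(rate_mb_s: float, buffer_mb: float = 32.0, interval_s: float = 120.0):
    """Return which buffer limit is hit first and after how many seconds."""
    if rate_mb_s <= 0:
        return "interval", interval_s  # idle stream: only the timer can elapse
    seconds_to_fill = buffer_mb / rate_mb_s
    if seconds_to_fill < interval_s:
        return "size", seconds_to_fill
    return "interval", interval_s


print(flush_trigger(1.0))  # ('size', 32.0)
print(flush_trigger(0.1))  # ('interval', 120.0)
```

High-throughput streams flush on size, low-throughput streams flush on time.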
14
Q
SQS Standard
A
- Fully managed
- 1-14 days retention
- 10ms latency
- Max 256KB per msg (body + metadata)
- Horizontal scaling in terms of number of consumers
- Max 120,000 in-flight messages being processed by consumers
15
Q
SQS Producing Messages
A
- Supports delayed delivery
- Send response returns
- msg id
- md5 hash of the body
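
The returned hash lets the producer check that the body arrived intact. A minimal sketch, assuming the hash is the hex MD5 digest of the UTF-8 body; `body_matches_md5` is a hypothetical helper, not an SDK function:

```python
import hashlib


def body_matches_md5(body: str, returned_md5: str) -> bool:
    """Compare a locally computed MD5 of the body with the one returned by the send call."""
    local = hashlib.md5(body.encode("utf-8")).hexdigest()
    return local == returned_md5.lower()


print(body_matches_md5("hello", "5d41402abc4b2a76b9719d911017c592"))  # True
```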
16
Q
SQS Consuming Messages
A
- Poll up to 10 msg at a time
- Process the message within the visibility timeout
- Delete the msg using its receipt handle
- max 120,000 in-flight msg being processed by consumers
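
The receive, process, delete cycle can be modelled in memory. This is a toy model, not the SQS API: `ToyQueue`, its method names, and the explicit `now` argument are all hypothetical, chosen so the visibility timeout is easy to follow.

```python
import itertools


class ToyQueue:
    """In-memory model of receive / visibility timeout / delete (not the SQS API)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self.messages = {}   # msg_id -> [body, time at which it becomes visible]
        self.receipts = {}   # receipt handle -> msg_id
        self._ids = itertools.count(1)

    def send(self, body, now=0.0):
        msg_id = f"m{next(self._ids)}"
        self.messages[msg_id] = [body, now]
        return msg_id

    def receive(self, now, max_messages=10):
        """Return up to max_messages visible messages and hide them for the timeout."""
        batch = []
        for msg_id, entry in self.messages.items():
            if len(batch) == max_messages:
                break
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                handle = f"r{next(self._ids)}"
                self.receipts[handle] = msg_id
                batch.append((handle, entry[0]))
        return batch

    def delete(self, handle):
        """Remove a message for good, identified by a handle from a receive."""
        self.messages.pop(self.receipts.pop(handle), None)
```

A message that is received but not deleted becomes visible again once the timeout expires, so another consumer can pick it up.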
17
Q
SQS FIFO Queue
A
- Name of queue must end in .fifo
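- Lower throughput than standard (by default 3,000 msg per sec with batching and 300 per sec without; high throughput mode supports 30,000 and 3,000)
- Messages are processed in order by consumers
- Messages are delivered exactly once (duplicates are removed)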
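
Exactly-once relies on dropping duplicates at send time. A sketch of the idea, assuming each message carries a deduplication ID and that duplicates are rejected within a 5-minute window (neither detail is on this card); `DedupWindow` is a hypothetical class:

```python
class DedupWindow:
    """Drops messages whose deduplication ID was already accepted within the window."""

    def __init__(self, window_s=300.0):  # assumption: 5-minute window
        self.window_s = window_s
        self.seen = {}  # dedup_id -> time it was first accepted

    def accept(self, dedup_id, now):
        """Return True if the message should be enqueued, False if it is a duplicate."""
        first = self.seen.get(dedup_id)
        if first is not None and now - first < self.window_s:
            return False
        self.seen[dedup_id] = now
        return True


w = DedupWindow()
print(w.accept("order-1", now=0))    # True
print(w.accept("order-1", now=10))   # False, duplicate inside the window
print(w.accept("order-1", now=300))  # True, window has passed
```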