Documentation, FAQs Flashcards
X-RAY
Which are the languages supported for Services such as EC2?
You can use X-Ray with applications written in Java, Node.js, .NET, Go, Ruby, and Python that are deployed on these services.
X-RAY
Which are the supported services?
AWS X-Ray works with
- Amazon EC2
- Amazon EC2 Container Service (Amazon ECS)
- AWS Lambda
- Amazon SQS
- Amazon SNS
- AWS Elastic Beanstalk
X-Ray
Is there a Region limit for the service?
No, with X-Ray, you can trace requests made to applications that span multiple AWS accounts, AWS Regions, and Availability Zones.
X-Ray
What needs to be done to enable X-Ray on Elastic Beanstalk?
You only have to integrate the X-Ray SDK with your application since the X-Ray agent is pre-installed on Elastic Beanstalk.
X-Ray
What is a Service map?
A visual representation of the data flow between services. It gives a high-level overview and also lets you drill down into issues.
X-Ray
What is a Filter Expression?
A way to narrow traces down to specific cases, for example traces that took more than 5 seconds or where a 5xx error was returned.
X-Ray
What is the use of the X-Ray daemon?
Instead of the SDK sending trace data directly to X-Ray, the daemon buffers segments in a queue and uploads them in batches.
The daemon is available for Linux, Windows, and macOS, and is included on AWS Elastic Beanstalk and AWS Lambda platforms.
X-Ray
What is a segment?
An X-Ray segment encapsulates all the data points for a single component.
A Segment is the result of a Request, it includes:
- The host – hostname, alias or IP address
- The request – method, client address, path, user agent
- The response – status, content
- The work done – start and end times, subsegments
- Issues that occur – errors, faults and exceptions, including automatic capture of exception stacks.
Segment documents can be up to 64 kB in size.
Segments include system-defined and user-defined data in the form of annotations and are composed of one or more sub-segments that represent remote calls made from the service.
X-Ray
What is a subsegment?
Subsegments provide more granular timing information and details about downstream calls
This lets you see all of your downstream dependencies, even if they don’t support tracing, or are external
Can contain additional details about a call to an AWS service, an external HTTP API, or an SQL database.
You can even define arbitrary subsegments to instrument specific functions or lines of code in your application.
X-Ray
What are Traces?
A trace ID tracks the path of a request through your application. A trace collects all the segments generated by a single request. That request is typically an HTTP GET or POST request that travels through a load balancer, hits your application code, and generates downstream calls to other AWS services or external web APIs. The first supported service that the HTTP request interacts with adds a trace ID header to the request, and propagates it downstream to track the latency, disposition, and other request data.
X-Ray
What is the TraceId?
Sent as the X-Amzn-Trace-Id header. It contains the root trace ID, the sampling decision, and optionally the parent segment ID.
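A minimal sketch of how the header fields described above could be pulled apart. The helper name `parse_trace_header` is hypothetical, and the sample value only illustrates the `Root` / `Parent` / `Sampled` layout; it is not taken from a real request.

```python
def parse_trace_header(value: str) -> dict:
    """Split an X-Amzn-Trace-Id header value into its key=value fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # skip malformed fragments that have no '='
            fields[key] = val
    return fields


# Illustrative header value: root trace ID, parent segment ID, sampling decision.
header = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
parsed = parse_trace_header(header)
print(parsed["Root"], parsed.get("Parent"), parsed.get("Sampled") == "1")
```

The parent field is read with `.get()` because it is not always present.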
X-Ray
What are annotations?
Annotations are simple key-value pairs that are indexed for use with filter expressions.
Use annotations to record information on segments or subsegments that you want indexed for search.
X-Ray
What is the main difference between Annotations and Metadata?
Metadata is not indexed, therefore not used for searching with filter expressions
X-Ray
How long does it take for trace data to be available in X-Ray?
Generally available for retrieval and filtering within 30 seconds of it being received by the service.
X-Ray
How far back can I query the trace data? How long does X-Ray store trace data for?
X-Ray stores trace data for the last 30 days.
This enables you to query trace data going back 30 days.
X-Ray
Are there partial traces?
In some situations (connectivity issues, delay in receiving segments, and so on) it is possible that trace information provided by the X-Ray APIs will be partial. In those situations, X-Ray tags traces as incomplete or partial.
KMS
How to encrypt/decrypt locally?
The AWS Encryption SDK supports AWS KMS as a root key provider for developers who need to encrypt/decrypt data locally within their applications.
KMS
What needs to be done after a key is automatically rotated?
Nothing, the service automatically keeps older versions of the root key available to decrypt previously encrypted data
KMS
What is an Asymmetric Key?
With symmetric keys, the same key is used for encryption and decryption. An asymmetric key consists of a key pair: a public key and a private key.
The public key is sent to the user, while the private key does not leave the HSM.
Asymmetric keys cannot be used with the Custom Key Store option.
KMS
How to handle a key that could be compromised?
Temporarily disable keys so they cannot be used by anyone
Re-enable disabled keys if cleared
KMS
How is envelope encryption done?
KMS creates a data key.
The data key is used to encrypt the data.
The plaintext data key is encrypted with the master key.
The encrypted data key is stored alongside the encrypted data.
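The steps above can be sketched locally as follows. This is an illustration of the flow only: `toy_cipher` is a deliberately simple, insecure stand-in for a real cipher such as AES, and `master_key` is a hypothetical local value standing in for the KMS key, which in reality never leaves KMS.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream. NOT secure; stand-in for AES."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Hypothetical stand-in for the master key held by KMS.
master_key = secrets.token_bytes(32)


def envelope_encrypt(plaintext: bytes) -> dict:
    data_key = secrets.token_bytes(32)               # 1. create a data key
    ciphertext = toy_cipher(data_key, plaintext)     # 2. encrypt the data with it
    encrypted_key = toy_cipher(master_key, data_key) # 3. encrypt the data key with the master key
    # 4. store the encrypted data key alongside the encrypted data
    return {"ciphertext": ciphertext, "encrypted_key": encrypted_key}


def envelope_decrypt(envelope: dict) -> bytes:
    data_key = toy_cipher(master_key, envelope["encrypted_key"])  # unwrap the data key
    return toy_cipher(data_key, envelope["ciphertext"])           # decrypt the data


envelope = envelope_encrypt(b"hello envelope")
print(envelope_decrypt(envelope))
```

Only the small data key is ever processed with the master key; the payload itself is encrypted locally.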
KMS
What is a customer managed KMS key?
A key created and stored in KMS, it differs from the AWS managed key, which is created by AWS and used for specific services
KMS
What type of keys can I import?
256-bit symmetric keys.
KMS
What’s the difference between a key I import and a key I generate in AWS KMS?
Keys generated by AWS KMS do not have an expiration time and cannot be deleted immediately; there is a mandatory 7 to 30 day wait period. Imported key material, by contrast, can be given an expiration time and can be deleted on demand without a waiting period. All customer managed KMS keys, irrespective of whether the key material was imported, can be manually disabled or scheduled for deletion.
KMS
What keys can be rotated automatically?
KMS generated keys can be rotated once a year.
Automatic key rotation is not supported for imported keys, asymmetric keys, or keys generated in an AWS CloudHSM cluster using the AWS KMS custom key store feature.
KMS
What is the API call to get the public key of an asymmetric key?
GetPublicKey
KMS
How can data keys and data key pairs be exported out of the HSMs in plain text?
Symmetric data keys: “GenerateDataKey” API or the “GenerateDataKeyWithoutPlaintext” API.
Asymmetric data key pairs: “GenerateDataKeyPair” API or the “GenerateDataKeyPairWithoutPlaintext” API.
Step functions
What is a state machine / state?
State Machine: complete workflow
State: a single step in the workflow
Step functions
What is a Task?
Tasks perform work, either by coordinating another AWS service or an application that you can host basically anywhere
Step functions
What is a Pass state?
A Pass state passes its input as output to the next state.
Step functions
What are Parallel States?
Begin multiple branches of execution at the same time, such as running multiple Lambda functions at once.
Step functions
What is a Choice State?
Choice states add branching logic to your state machine, and make decisions based on their input.
Step functions
What is a state transition?
When you execute your state machine, each move from one state to the next is called a state transition.
Step Functions
In what language is a step function written?
Amazon States Language (JSON based)
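A small sketch of what such a JSON definition can look like, built here as a Python dictionary and serialized. The state names and the `$.amount` input field are hypothetical; the structural fields (`StartAt`, `States`, `Type`, `Choices`, `Default`, `Next`) are standard Amazon States Language fields.

```python
import json

# Hypothetical workflow: route large amounts through a review step.
state_machine = {
    "Comment": "Illustrative example only",
    "StartAt": "CheckAmount",
    "States": {
        "CheckAmount": {
            "Type": "Choice",  # branching logic based on the input
            "Choices": [
                {"Variable": "$.amount", "NumericGreaterThan": 100, "Next": "Review"}
            ],
            "Default": "Approve",
        },
        "Review": {"Type": "Pass", "Next": "Approve"},  # passes input to output
        "Approve": {"Type": "Succeed"},
    },
}

definition = json.dumps(state_machine, indent=2)
print(definition)
```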
Step Functions
What are the main differences between standard & express workflows?
Maximum duration: 1 year (Standard) / 5 minutes (Express)
Execution semantics: exactly once (Standard) / at least once (Express)
Step Functions
How can the steps communicate with each other?
Apps can interact and update the stream via the Step Functions API.
Step Functions
What are the three types of steps?
sequential, branching or parallel steps.
X-Ray
What is necessary to run X-Ray on EC2 / on-premises?
The host must run the X-Ray daemon.
Use an IAM instance role on EC2, or other AWS credentials on an on-premises server.
X-Ray
What is necessary to run X-Ray on Lambda?
Make sure the X-Ray integration is ticked in the Lambda configuration (Lambda will run the daemon).
IAM role is the Lambda role.
X-Ray
What is necessary to run X-Ray on Elastic Beanstalk?
Set configuration in the Elastic Beanstalk console.
Or use the Beanstalk extension (.ebextensions/xray-daemon.config)
X-Ray
What is necessary to run X-Ray on ECS/EKS/Fargate?
Create a Docker image that runs the daemon or use the official X-Ray Docker image.
Ensure port mappings and network settings are correct and IAM task roles are defined.
KMS
What are the permissions needed to use a key?
"kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*", "kms:GenerateDataKey*", "kms:DescribeKey"
KMS
What does aws kms re-encrypt do?
It decrypts the ciphertext inside KMS and then re-encrypts it under a different key; the plaintext is never exposed outside KMS.
Useful when switching to a different encryption key than before.
KMS
Can you export a CMK?
A CMK can never be exported from KMS (CloudHSM allows this).
Dynamo DB
How can you change the primary key of an item?
The primary key of an existing item cannot be updated in place. Use a transaction to delete the item and recreate it with the new key.
Dynamo DB
What is the scope of a query?
In a query you can only retrieve data by specifying the partition key.
Additionally, you can add a sort key condition to further refine the result (if the partition key alone produces too many results).
Results can then be narrowed further with a filter (e.g. color = blue).
Dynamo DB
What is the default order of a query result?
Ordered by the sort key in ascending order.
This can be reversed by setting ScanIndexForward to false.
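As a sketch, the request parameters for such a query might be assembled like this. The table name `Orders` and the attributes `customer_id`, `order_date` and `color` are hypothetical; the parameter names are those of the DynamoDB Query API, and the resulting dictionary would be passed to a Query call (for example boto3's `client.query(**params)`), which is not executed here.

```python
def build_query(customer_id: str, month_prefix: str, color: str, newest_first: bool = True) -> dict:
    """Build Query parameters: partition key, sort key condition, then a filter."""
    return {
        "TableName": "Orders",  # hypothetical table
        # The partition key is mandatory; the sort key condition is optional.
        "KeyConditionExpression": "customer_id = :cid AND begins_with(order_date, :month)",
        # The filter is applied after the items have been read.
        "FilterExpression": "color = :color",
        "ExpressionAttributeValues": {
            ":cid": {"S": customer_id},
            ":month": {"S": month_prefix},
            ":color": {"S": color},
        },
        # False returns results in descending sort key order.
        "ScanIndexForward": not newest_first,
    }


params = build_query("C-1001", "2024-05", "blue")
print(params["ScanIndexForward"])
```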
DynamoDB
What is the purpose of ScanIndexForward in scan results?
Nothing, ScanIndexForward is only used in queries
Dynamo DB
What is a parallel scan?
A scan reads data sequentially in 1 MB increments, one partition at a time.
A parallel scan splits the table into segments that are scanned concurrently for a quicker result.
This should not be done if the DB is currently under load.
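A minimal sketch of how the work could be divided: each worker issues a Scan with its own `Segment` number and a shared `TotalSegments` value (both are Scan API parameters). The table name and the helper `parallel_scan_requests` are hypothetical, and no request is actually sent.

```python
def parallel_scan_requests(table_name: str, total_segments: int) -> list:
    """Return one set of Scan parameters per segment (segments are zero-based)."""
    return [
        {"TableName": table_name, "Segment": segment, "TotalSegments": total_segments}
        for segment in range(total_segments)
    ]


requests = parallel_scan_requests("Orders", 4)
for request in requests:
    print(request)  # each dict would be handed to a separate worker
```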
Dynamo DB
If there’s always lots of traffic, but you regularly need to scan the table, what can be done?
Create a second table and make the writes to both of them
Dynamo DB
What kind of caching strategy does DAX use?
Write through caching
DAX writes to the cache and the (normal) table at the same time.
Dynamo DB
How is an API call made with a DAX cache?
Client is pointed to the Cache:
a) gets a Cache hit, item will be returned
b) gets a miss, DAX performs GetItem (eventually consistent) against the table, stores the value in the cache and returns the result
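The hit / miss flow above, plus write-through on writes, can be modelled with a few lines. This is a conceptual in-memory sketch, not the DAX client: the class `CachedTable` is hypothetical and a plain dictionary stands in for the DynamoDB table.

```python
class CachedTable:
    """In-memory model of a read-through / write-through cache in front of a table."""

    def __init__(self, table: dict):
        self.table = table   # stand-in for the DynamoDB table
        self.cache = {}
        self.misses = 0

    def get_item(self, key):
        if key in self.cache:            # a) cache hit: return the cached item
            return self.cache[key]
        self.misses += 1                 # b) cache miss: read from the table ...
        item = self.table.get(key)
        if item is not None:
            self.cache[key] = item       # ... store the value in the cache ...
        return item                      # ... and return the result

    def put_item(self, key, item):
        self.table[key] = item           # write-through: the table and the cache
        self.cache[key] = item           # are updated by the same call


dax = CachedTable({"user#1": {"name": "Ada"}})
dax.get_item("user#1")
dax.get_item("user#1")
print(dax.misses)
```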
Dynamo DB
How long is a stream saved?
24h
DynamoDB
Which policies does it support?
DynamoDB supports identity-based policies:
You can use a special IAM condition to restrict user access to only their own records.
DynamoDB also supports resource-based policies on tables, indexes, and streams (support was added in 2024).
DynamoDB
What is the ARN Format for an index?
arn:aws:dynamodb:[region]:[account]:table/[tablename]/index/[indexname]
Dynamo DB
When does DynamoDB create new partitions?
DynamoDB allocates additional partitions to a table in the following situations:
- If you increase the table’s provisioned throughput settings beyond what the existing partitions can support.
- If an existing partition fills to capacity and more storage space is required.
DynamoDB
How is a partition key defined?
1) Value of the Partition key is input to an internal hash function which determines the partition or physical location on which the data is stored.
2) If you are using the Partition key as your Primary key, then no two items can have the same partition key.
Best practices for partition keys:
- Use high-cardinality attributes – e.g. e-mailid, employee_no, customerid, sessionid, orderid, and so on.
- Use composite attributes – e.g. customerid+productid+countrycode as the partition key and order_date as the sort key.
- Cache popular items – use DynamoDB Accelerator (DAX) for caching reads.
- Add random numbers or digits from a predetermined range for write-heavy use cases – e.g. add a random suffix to an invoice number such as INV00023-04593
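The random-suffix practice from the last bullet could look roughly like this. The shard count of 10 and both helper names are assumptions made for illustration; readers must then query every suffix to collect all items for one logical key.

```python
import random

SHARD_COUNT = 10  # hypothetical size of the predetermined suffix range


def sharded_key(base_key: str, rng=random) -> str:
    """Append a random suffix so writes for one logical key spread over partitions."""
    return f"{base_key}-{rng.randrange(SHARD_COUNT):02d}"


def all_shard_keys(base_key: str) -> list:
    """Every key a reader has to query to see all items of the logical key."""
    return [f"{base_key}-{n:02d}" for n in range(SHARD_COUNT)]


print(sharded_key("INV00023"))
print(all_shard_keys("INV00023")[:3])
```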
DynamoDB
What are the disadvantages of strongly consistent reads?
- A strongly consistent read might not be available if there is a network delay or outage. In this case, DynamoDB may return a server error (HTTP 500).
- Strongly consistent reads may have higher latency than eventually consistent reads.
- Strongly consistent reads are not supported on global secondary indexes.
- Strongly consistent reads use more throughput capacity than eventually consistent reads.
DynamoDB
What are the costs of Transactions?
There is no additional cost to enable transactions for DynamoDB tables.
DynamoDB performs two underlying reads or writes of every item in the transaction: one to prepare the transaction and one to commit the transaction.
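A rough capacity calculator, assuming the standard metering rules: reads are counted in 4 KB blocks (0.5 RCU eventually consistent, 1 RCU strongly consistent, 2 RCU transactional) and writes in 1 KB blocks (1 WCU standard, 2 WCU transactional). The function names are hypothetical.

```python
import math


def read_capacity_units(item_size_bytes: int, mode: str = "eventual") -> float:
    """RCUs consumed by reading one item; reads are metered in 4 KB blocks."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    factor = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[mode]
    return blocks * factor


def write_capacity_units(item_size_bytes: int, transactional: bool = False) -> int:
    """WCUs consumed by writing one item; writes are metered in 1 KB blocks."""
    blocks = max(1, math.ceil(item_size_bytes / 1024))
    return blocks * (2 if transactional else 1)


for mode in ("eventual", "strong", "transactional"):
    print(mode, read_capacity_units(6000, mode))
print("write", write_capacity_units(1500), write_capacity_units(1500, transactional=True))
```

The doubling for transactional operations reflects the prepare and commit steps.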
DynamoDB
What can be done to optimize a scan?
- use ProjectionExpression to only return needed attributes
- use FilterExpression to filter out unwanted items (done after everything is retrieved)
- use parallel scans to get results faster (strains RCU consumption)
- Limit page size, if provisioned throughput is reached
- use eventually consistent reads (if possible)
DynamoDB
What data is affected when setting a TTL?
TTL is applied per item (row): you define a TTL attribute and store each item's expiry date / time there as a Unix epoch timestamp in seconds.
Deleted items are also deleted from the LSI / GSI.
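A short sketch of computing such an expiry value. The attribute name `expires_at`, the item key and the 7-day lifetime are hypothetical; the TTL attribute holds a Unix epoch time in seconds.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch(created: datetime, days: int) -> int:
    """Return the expiry time as Unix epoch seconds, `days` after `created`."""
    return int((created + timedelta(days=days)).timestamp())


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
item = {"pk": "session#42", "expires_at": ttl_epoch(created, 7)}
print(item)
```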
DynamoDB
Define BatchWriteItem & BatchGetItem
BatchWriteItem: can put or delete up to 25 items in one call (max 16 MB per call / 400 KB per item).
BatchGetItem: retrieves up to 100 items, up to 16 MB in total per call. Items are retrieved in parallel to minimize latency.
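Because of these per-call limits, a larger workload has to be split into batches. A minimal sketch with a hypothetical `chunk` helper, using 25 for writes and 100 for reads:

```python
def chunk(items: list, size: int) -> list:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


items = [{"pk": f"item#{n}"} for n in range(60)]
write_batches = chunk(items, 25)   # one BatchWriteItem call per batch
read_batches = chunk(items, 100)   # one BatchGetItem call per batch
print([len(batch) for batch in write_batches], [len(batch) for batch in read_batches])
```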
DynamoDB
What is Optimistic Locking?
Optimistic locking is a strategy to ensure that the client-side item that you are updating (or deleting) is the same as the item in Amazon DynamoDB.
Protects database writes from being overwritten by the writes of others, and vice versa.
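One common way to implement this is a version number that every write must match. The following in-memory sketch models that idea; `VersionedTable` and `VersionConflict` are hypothetical names, and in DynamoDB the check would be expressed as a condition on the write rather than in client code.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the client read."""


class VersionedTable:
    def __init__(self):
        self.items = {}

    def put(self, key, attributes: dict, expected_version: int) -> None:
        current = self.items.get(key)
        current_version = current["version"] if current else 0
        if current_version != expected_version:   # someone else wrote in between
            raise VersionConflict(f"expected {expected_version}, found {current_version}")
        # The write succeeds and bumps the version for the next writer.
        self.items[key] = {**attributes, "version": expected_version + 1}


table = VersionedTable()
table.put("item-1", {"price": 10}, expected_version=0)      # first writer succeeds
try:
    table.put("item-1", {"price": 12}, expected_version=0)  # stale version is rejected
except VersionConflict as error:
    print("conflict:", error)
```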
Lambda
How to directly call a version / alias?
arn:aws:lambda:REGION:ID:function:[FUNCTIONNAME]:[VERSION]
arn:aws:lambda:REGION:ID:function:[FUNCTIONNAME]:[ALIAS]
Lambda
What is the default Concurrent Execution limit and what happens if it is hit?
Default is 1000 (per Region)
Request will return 429: TooManyRequestsException
Lambda
How to invoke Lambda asynchronously?
--invocation-type Event
Lambda
What are the Status Codes returned by synch. / asynch. invocations?
Synchronous: 200, 503 etc.
Asynchronous: 202
Lambda
What are the services that Lambda can read events from?
Amazon Kinesis
Amazon DynamoDB
Amazon Simple Queue Service
Lambda
What is “Event Source Mapping”?
In order to react to events in a service (such as messages in an SQS queue), Lambda needs three configurations:
- permissions
- event structure settings
- polling behavior
Lambda
Which “items” are mutable / immutable?
$LATEST is mutable (changeable)
Versions are immutable
Aliases are mutable.
Lambda
What are aliases?
- Mutable pointers to function versions
- can be used for environments etc. without callers needing to know which version they point to
- Aliases enable blue / green deployment by assigning weights to Lambda versions (doesn’t work for $LATEST; the alias must point to published versions).
Lambda
What is the throttling behavior for (a)sync requests?
Throttle behavior:
For synchronous invocations returns throttle error 429.
For asynchronous invocations retries automatically (twice) then goes to a Dead Letter Queue (DLQ).
Lambda
What is reserved concurrency?
A guaranteed number of concurrent executions for a function.
The sum of reserved concurrency across all functions cannot exceed the maximum concurrency minus 100.
It can also be used as a limit, for example to ensure that the database can handle the traffic.
Lambda
What can be done to ensure that the lambda can handle sudden traffic spikes?
Set Provisioned Concurrency to the amount of traffic that can be expected (additional costs apply).
Lambda
Which role should be attached if the function will be placed in a VPC?
AWSLambdaVPCAccessExecutionRole
Lambda
What needs to be done in order to enable a web socket load balancing with an ALB and Lambda?
WebSockets are not supported for Lambda functions behind a load balancer, but they are supported by API Gateway.
Lambda
What are the limits of lambda?
Memory allocation 128 MB – 10,240 MB in 1 MB increments (previously capped at 3,008 MB in 64 MB increments).
Maximum execution time is 15 minutes (900 seconds).
Size of environment variables maximum 4KB.
Disk capacity in the “function container” (/tmp) is 512 MB by default (configurable up to 10,240 MB).
Invocation payload:
- Synchronous 6 MB.
- Asynchronous 256 KB
Lambda function deployment size is 50 MB (zipped), 250 MB unzipped.
API Gateway
Can you create HTTP Endpoints (without encryption)?
All of the APIs created with Amazon API Gateway expose HTTPS endpoints only (does not support unencrypted endpoints).
API Gateway
What are the differences between Edge-Optimized, Regional & Private Endpoints?
Regional Endpoint
- for clients in the same region.
Edge-Optimized Endpoint
- is best for geographically distributed clients. API requests are routed to the nearest CloudFront Point of Presence (POP).
- Edge-optimized APIs capitalize the names of HTTP headers (for example, Cookie).
- CloudFront sorts HTTP cookies in natural order by cookie name before forwarding the request to your origin
Private Endpoint
- A private API endpoint is an API endpoint that can only be accessed from your Amazon Virtual Private Cloud (VPC) using an interface VPC endpoint.
API Gateway
What is the configuration chain for requests?
Method Request -> Integration Request -> Integration Response -> Method Response
API Gateway
What are the use cases for stage variables?
- Configure HTTP endpoints your stages talk to (dev, test, prod etc.).
- Pass configuration parameters to AWS Lambda through mapping templates.
- You can create a stage variable to indicate the corresponding Lambda alias.
API Gateway
What can be done with Mapping templates?
Uses Velocity Template Language (VTL).
Mapping templates can be used to modify request / responses.
- Rename parameters.
- Modify body content.
- Add headers.
- Map JSON to XML for sending to backend or back to client.
- Filter output results (remove unnecessary data).
API Gateway
What are the throttling limits?
If you go over the steady-state rate of 10,000 requests per second or the burst limit of 5,000 requests, you will receive a 429 Too Many Requests error response.
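API Gateway applies these limits with a token bucket algorithm. The sketch below is a simplified model of that algorithm, not the service's implementation: the class `TokenBucket` is hypothetical, small numbers replace 10,000 / 5,000, and time is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Refill `rate` tokens per second up to `burst`; each request costs one token."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would answer with 429 Too Many Requests


bucket = TokenBucket(rate=2, burst=3)
burst_results = [bucket.allow(0.0) for _ in range(4)]  # four requests at the same instant
later_result = bucket.allow(0.5)                       # half a second later
print(burst_results, later_result)
```

The bucket size plays the role of the burst limit and the refill rate plays the role of the steady-state rate.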
Kinesis
What is the difference between KCL and the Kinesis Data Streams API?
The KCL is different from the Kinesis Data Streams API that is available in the AWS SDKs.
The Kinesis Data Streams API helps you manage many aspects of Kinesis Data Streams (including creating streams, resharding, and putting and getting records).
The KCL provides a layer of abstraction specifically for processing data in a consumer role.
Kinesis
What should be considered for the number of KCL workers and shards?
You never need multiple instances to handle the processing of one shard.
However, one worker can process multiple shards.
Example:
4 shards = max 4 KCL instances.
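To make the ratio concrete, here is a toy assignment of shards to workers. Round-robin distribution and the helper name `assign_shards` are assumptions for illustration; the real KCL coordinates workers through leases, but the outcome is the same in that surplus workers stay idle.

```python
def assign_shards(shards: list, workers: list) -> dict:
    """Distribute shards over workers round-robin; each shard gets exactly one worker."""
    assignment = {worker: [] for worker in workers}
    for index, shard in enumerate(shards):
        assignment[workers[index % len(workers)]].append(shard)
    return assignment


shards = ["shard-0", "shard-1", "shard-2", "shard-3"]
two_workers = assign_shards(shards, ["w1", "w2"])
six_workers = assign_shards(shards, ["w1", "w2", "w3", "w4", "w5", "w6"])
idle = [worker for worker, owned in six_workers.items() if not owned]
print(two_workers)
print("idle workers:", idle)
```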
Kinesis
On which platforms does the KCL run?
KCL can run on EC2, Elastic Beanstalk, and on-premises servers.
Kinesis
What are the main differences between SQS and Kinesis?
SQS:
- Data is deleted after being consumed.
- No need to provision throughput.
- No ordering guarantee (except with FIFO queues).
Kinesis:
- Possible to replay data.
- Ordering at the shard level.
- Must provision throughput.