misc Flashcards
How does Kinesis determine the shard for a newly inserted record?
Using the MD5 hash of a user-specified partition key
From the Kinesis Developer Guide (http://docs.aws.amazon.com/kinesis/latest/dev/key-concepts.html):
“A partition key is used to group data by shard within a stream. Amazon Kinesis segregates the data records belonging to a stream into multiple shards, using the partition key associated with each data record to determine which shard a given data record belongs to.
Partition keys are Unicode strings with a maximum length limit of 256 bytes. An MD5 hash function is used to map partition keys to 128-bit integer values and to map associated data records to shards. A partition key is specified by the applications putting the data into a stream.”
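As a sketch of that mapping (assuming, for illustration, that the stream's shards split the 128-bit hash key space evenly; real shards each own a documented hash key range):

```python
import hashlib

def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard index the way Kinesis does:
    MD5-hash the key to a 128-bit integer, then find the shard whose
    hash-key range contains it (here: an even split of the key space)."""
    hash_int = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    range_size = 2 ** 128 // num_shards
    return min(hash_int // range_size, num_shards - 1)

# Records with the same partition key always land on the same shard.
assert shard_for_key("user-42", 4) == shard_for_key("user-42", 4)
```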
In the context of IAM identity federation, what is an Identity Broker?
A custom application that authenticates against an identity store and provides access to AWS
From the IAM User Guide:
“””
To enable your organization’s users to access the AWS Management Console, you can create a custom “identity broker” that performs the following steps:
1. Verify that the user is authenticated by your local identity system.
2. Call the AWS Security Token Service (AWS STS) AssumeRole (recommended) or GetFederationToken APIs to obtain temporary
security credentials for the user. The credentials are associated with permissions that control what the user can do.
3. Call an AWS federation endpoint and supply the temporary security credentials to get a sign-in token.
4. Construct a URL for the console that includes the token.
5. Give the URL to the user or invoke the URL on the user’s behalf.
“””
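A minimal sketch of steps 2–4 in Python, using only the standard library. The federation endpoint and its getSigninToken/login actions are documented AWS behavior, but the issuer and destination values here are placeholders; obtaining the temporary credentials themselves (step 2, via AssumeRole or GetFederationToken) is left as a comment.

```python
import json
import urllib.parse
import urllib.request

FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"

def get_signin_token(access_key, secret_key, session_token):
    """Step 3: exchange temporary credentials (obtained in step 2 from
    AssumeRole or GetFederationToken) for a sign-in token at the
    federation endpoint. Makes a network call."""
    session = json.dumps({
        "sessionId": access_key,
        "sessionKey": secret_key,
        "sessionToken": session_token,
    })
    url = FEDERATION_ENDPOINT + "?" + urllib.parse.urlencode(
        {"Action": "getSigninToken", "Session": session})
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())["SigninToken"]

def build_console_url(signin_token, issuer="https://example.com"):
    """Step 4: construct the console URL that embeds the token.
    The issuer and destination here are placeholder values."""
    return FEDERATION_ENDPOINT + "?" + urllib.parse.urlencode({
        "Action": "login",
        "Issuer": issuer,
        "Destination": "https://console.aws.amazon.com/",
        "SigninToken": signin_token,
    })
```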
This blog post goes into a lot more detail on how to do this with GetFederationToken, including some nice pictures and examples. (https://aws.amazon.com/blogs/aws/aws-identity-and-access-management-now-with-identity-federation/)
Does increasing an RDS instance’s storage cause downtime?
No
From the RDS User Guide (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PIOPS.StorageTypes.html):
“Data storage in Amazon RDS is specified by selecting a storage type and providing a storage size (GB) when you create or modify a DB instance. You can change the type of storage your instance uses by modifying the DB instance, but changing the type of storage in some cases might result in a short outage for the instance. Changing from Magnetic to either General Purpose (SSD) or Provisioned
IOPS (SSD) results in an outage. Also, changing from General Purpose (SSD) or Provisioned IOPS (SSD) to Magnetic results in an outage. The outage time is typically 60–120 seconds. For more information about Amazon RDS storage types, see Amazon RDS
Storage Types.
Increasing the allocated storage does not result in an outage. Note that you cannot reduce the amount of storage once it has been allocated. The only way to reduce the amount of storage allocated to a DB instance is to dump the data out of the DB instance, create a new DB instance with less storage space, and then load the data into the new DB instance.”
What is the largest possible size for a VPC?
- /14
- /16
- /24
/16 (65,536 addresses)
From the VPC documentation (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#VPC_Sizing):
“VPC Sizing
“You can assign a single CIDR block to a VPC. The allowed block size is between a /28 netmask and a /16 netmask. In other words, the VPC can contain from 16 to 65,536 IP addresses. You can’t change the size of a VPC after you create it. If your VPC is too small to meet your needs, create a new, larger VPC, and then migrate your instances to the new VPC. To do this, create AMIs from your running instances, and then launch replacement instances in your new, larger VPC. You can then terminate your old instances, and delete your smaller VPC. For more information, see Deleting Your VPC.”
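The netmask-to-address-count arithmetic is just 2^(32 - prefix length); a quick sanity check:

```python
def vpc_addresses(prefix_length: int) -> int:
    """Number of IP addresses in a CIDR block of the given prefix length."""
    if not 16 <= prefix_length <= 28:
        raise ValueError("VPC CIDR blocks must be between /28 and /16")
    return 2 ** (32 - prefix_length)

assert vpc_addresses(16) == 65_536  # largest allowed VPC
assert vpc_addresses(28) == 16      # smallest allowed VPC
```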
What are the default permissions for a VPC’s default NACL?
- Deny all
- Allow all inbound and outbound
Allow all inbound and outbound
From the VPC documentation (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_ACLs.html#default-network-acl):
“Default Network ACL
To help you understand what ACL rules look like, here’s what the default network ACL looks like in its initial state. It is configured to allow all traffic to flow in and out of each subnet. Each network ACL includes a rule whose rule number is an asterisk. This rule ensures that if a packet doesn’t match any of the other rules, it’s denied. You can’t modify or remove this rule.”
also:
“Your VPC automatically comes with a modifiable default network ACL; by default, it allows all inbound and outbound traffic.”
Route 53
How quickly do DNS changes propagate globally?
- Within 60 seconds
- Within 120 seconds
- Within 300 seconds
Within 60 seconds
From the Route 53 FAQ (https://aws.amazon.com/route53/faqs/):
“Q. How quickly will changes I make to my DNS settings on Amazon Route 53 propagate globally?
Amazon Route 53 is designed to propagate updates you make to your DNS records to its world-wide network of authoritative DNS servers within 60 seconds under normal conditions. A change is successfully propagated world-wide when the API call returns an INSYNC status listing. Note that caching DNS resolvers are outside the control of the Amazon Route 53 service and will cache your resource record sets according to their time to live (TTL). The INSYNC or PENDING status of a change refers only to the state of Route 53’s authoritative DNS servers.”
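In practice you would poll the change's status (e.g., via the Route 53 GetChange API) until it reports INSYNC. A generic polling sketch, with the API call abstracted behind a callable so the logic stands alone:

```python
import time

def wait_for_insync(get_status, poll_seconds=5, timeout_seconds=120):
    """Poll a change's status until Route 53 reports INSYNC.
    `get_status` stands in for an API call such as GetChange, which
    returns PENDING until the change has propagated world-wide."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if get_status() == "INSYNC":
            return True
        time.sleep(poll_seconds)
    return False

# Simulated statuses: two PENDING responses, then INSYNC.
statuses = iter(["PENDING", "PENDING", "INSYNC"])
assert wait_for_insync(lambda: next(statuses), poll_seconds=0)
```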
Is all data in Glacier encrypted by default?
Yes
From the Glacier FAQ (https://aws.amazon.com/glacier/faqs/):
“Yes, all data in the service will be encrypted on the server side. Amazon Glacier handles key management and key protection for you. Amazon Glacier uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256). 256-bit is the largest key size defined for AES. Customers wishing to manage their own keys can encrypt data prior to uploading it.”
What are the initial rules for a VPC’s default security group?
Allow all outbound traffic
Allow all inbound traffic from other instances in the security group
From the EC2 Security Group documentation (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html):
“””
Default Security Groups
Your VPC automatically comes with a default security group. Each EC2 instance that you launch in your VPC is automatically associated with the default security group if you don’t specify a different security group when you launch the instance.
The following table describes the default rules for a default security group.
You can change the rules for the default security group.
You can’t delete a default security group.
“””
What can be modified on an existing Reserved Instance?
Availability Zone
Switching between EC2-VPC and EC2-Classic
Changing the instance type within the same instance family
From the Reserved Instance documentation (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ri-modifying.html):
“When your computing needs change, you can modify your Reserved Instances and continue to benefit from your capacity reservation.
Modification does not change the remaining term of your Reserved Instances; their end dates remain the same. There is no fee, and you do not receive any new bills or invoices. Modification is separate from purchasing and does not affect how you use, purchase, or sell Reserved Instances. You can modify your whole reservation, or just a subset, in one or more of the following ways:
* Switch Availability Zones within the same region
* Change between EC2-VPC and EC2-Classic
* Change the instance type within the same instance family”
How long does it take to retrieve a tape from a Virtual Tape Shelf into a Virtual Tape Library?
- 12 hours
- 24 hours
24 hours
From the Storage Gateway documentation (http://docs.aws.amazon.com/storagegateway/latest/userguide/storage-gateway-vtl-concepts.html):
“Retrieving tapes – Tapes archived to the VTS cannot be read directly. To read an archived tape, you must first retrieve it to your gateway-VTL either by using the AWS Storage Gateway console or by using the AWS Storage Gateway API. A retrieved tape will be available in your VTL in about 24 hours.”
How long does it take for Route 53 to execute a DNS failover?
- Under one minute
- Under two minutes
- Under five minutes
Under two minutes
From this re:Invent presentation (https://www.youtube.com/watch?v=f9y-T7mQVxs):
The top bar represents the best-case time to respond to a failure “manually”: personally reacting to a CloudWatch alarm and reconfiguring Route 53 and other components.
The second bar represents how long it takes for Route 53 itself to execute a DNS failover using the native feature.
Does DynamoDB background maintenance consume burst capacity?
Yes
From the DynamoDB Developer Guide
(http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.Bursting):
“DynamoDB provides some flexibility in the per-partition throughput provisioning: When you are not fully utilizing a partition’s throughput, DynamoDB reserves a portion of your unused capacity for later “bursts” of throughput usage. DynamoDB currently
reserves up to 5 minutes (300 seconds) of unused read and write capacity. During an occasional burst of read or write activity, this reserved throughput can be consumed very quickly — even faster than the per-second provisioned throughput capacity that you’ve defined for your table. However, do not design your application so that it depends on burst capacity being available at all times:
DynamoDB can and does use burst capacity for background maintenance and other tasks without prior notice.”
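A toy token-bucket model of this behavior (the 300-second reserve cap comes from the quote above; everything else is simplified for illustration):

```python
def simulate_burst(provisioned_per_sec, usage_per_sec, reserve_seconds=300):
    """Toy model of DynamoDB burst capacity: unused per-second capacity
    accumulates into a reserve capped at `reserve_seconds` worth of
    provisioned throughput; bursts draw the reserve down. Returns the
    number of seconds in which requests would have been throttled."""
    reserve, cap = 0.0, provisioned_per_sec * reserve_seconds
    throttled = 0
    for used in usage_per_sec:
        available = provisioned_per_sec + reserve
        if used > available:
            throttled += 1
            reserve = 0.0
        else:
            reserve = min(cap, available - used)
    return throttled

# 100 idle seconds bank enough reserve for a short burst above 10 units/s.
assert simulate_burst(10, [0] * 100 + [200, 200]) == 0
```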
What is the total amount of data that can be stored in a single Gateway-Stored gateway appliance (in multiple volumes)?
- 262 TB
- 192 TB
192 TB
Each individual volume can be up to 16 TB.
From the Storage Gateway FAQ (https://aws.amazon.com/storagegateway/faqs/):
“Q. How much volume data can I manage per gateway?
…
Each Gateway-Stored gateway can support up to 12 volumes for a maximum of 192 TB of data (12 volumes, each 16 TB in size).”
To which entities can I assign an IAM policy?
- Role
- Group
- User
All of these entities can be assigned a policy.
You can also assign user policies indirectly via group membership.
What is the provisioned read capacity of a Kinesis shard?
- 5 TPS up to 2MB/s
- 10 TPS up to 10MB/s
5 TPS up to 2MB/s
From the Kinesis Developer Guide (http://docs.aws.amazon.com/kinesis/latest/dev/key-concepts.html):
“A shard is a uniquely identified group of data records in an Amazon Kinesis stream. A stream is composed of multiple shards, each of which provides a fixed unit of capacity. Each shard can support up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second and up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys). The data capacity of your stream is a function of the number of shards that you specify for the stream.
The total capacity of the stream is the sum of the capacities of its shards.”
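Those per-shard limits imply a simple sizing rule: take the maximum of the shard counts that each dimension (write throughput, read throughput, record rate) requires. A sketch:

```python
import math

def shards_needed(write_mb_per_sec, read_mb_per_sec, records_per_sec):
    """Minimum shard count for a stream, from the per-shard limits:
    1 MB/s in, 2 MB/s out, and 1,000 records/s in."""
    return max(
        math.ceil(write_mb_per_sec / 1),   # write bandwidth limit
        math.ceil(read_mb_per_sec / 2),    # read bandwidth limit
        math.ceil(records_per_sec / 1000), # record-rate limit
        1,                                 # a stream needs at least one shard
    )

# 3 MB/s writes need 3 shards; 4 MB/s reads need 2; 2,500 rec/s need 3.
assert shards_needed(3, 4, 2500) == 3
```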
When will ElastiCache automatically upgrade a Memcached cluster?
- whenever a new version is released
- to address security vulnerabilities
to address security vulnerabilities
From the ElastiCache FAQ (https://aws.amazon.com/elasticache/faqs/):
“Q: Can I control if and when the engine version powering Amazon ElastiCache Cluster is upgraded to new supported versions?
Amazon ElastiCache allows you to control if and when the Memcached protocol-compliant software powering your Cache Cluster is upgraded to new versions supported by Amazon ElastiCache. This provides you with the flexibility to maintain compatibility with specific Memcached versions, test new versions with your application before deploying in production, and perform version upgrades on your own terms and timelines. Version upgrades involve some compatibility risk, thus they will not occur automatically and must be initiated by you. This approach to cache software patching puts you in the driver’s seat of version upgrades, but still offloads the work of patch application to Amazon ElastiCache. You can learn more about version management by reading the FAQs that follow. Alternatively, you can refer to the Amazon ElastiCache User Guide. While Cache Engine Version Management functionality is intended to give you
as much control as possible over how patching occurs, we may patch your Cache Cluster on your behalf if we determine there is any security vulnerability in the system or cache software.”
What is an IAM External ID and how is it used?
An identifier that an AWS managed service provides when assuming a role in its customers’ accounts
From the IAM documentation (http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html):
“At times, you need to give a third party access to your AWS resources (delegate access). One important aspect of this scenario is the External ID, an optional piece of information that you can use in an IAM role trust policy to designate who can assume the role.
…
In abstract terms, the external ID allows the user that is assuming the role to assert the circumstances in which they are operating. It also provides a way for the account owner to permit the role to be assumed only under specific circumstances. The primary function of the external ID is to address and prevent the “confused deputy” problem.”
What is the default retention period for RDS backups?
- 7 days
- 14 days
- 1 day
1 day
From the RDS User Guide (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html):
“Amazon RDS can automatically back up all of your DB instances. You can set the backup retention period when you create a DB instance. If you don’t set the backup retention period, Amazon RDS uses a default retention period of one day.”
What data is in an IAM Request Context?
- Calling principal
- Environment data (IP address, user agent, etc.)
- Resource data (e.g., DynamoDB table name)
From the IAM documentation (http://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_evaluation-logic.html):
“The Request Context
When AWS authorizes a request, information about the request is assembled from several sources:
- Principal (the requester), which is determined based on the secret access key. This might represent the root user, an IAM user, a federated user (via STS), or an assumed role, and includes the aggregate permissions that are associated with that principal.
- Environment data, such as the IP address, user agent, SSL enabled, the time of day, etc. This information is determined from the request.
- Resource data, which pertains to information that is part of the resource being requested. This can include information such as a DynamoDB table name, a tag on an Amazon EC2 instance, etc.
This information is gathered into a request context, which is a collection of information that’s derived from the request. During evaluation, AWS uses values from the request context to determine whether to allow or deny the request. For example, does the action in the request context match an action in the Action element? If not, the request is denied. Similarly, does the resource in the request context match one of the resources in the Resource element? If not, the request is denied.”
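A toy illustration of the action/resource matching described above (real IAM evaluation also handles explicit denies, NotAction, conditions, and much more):

```python
import fnmatch

def is_allowed(request, statements):
    """Toy evaluation: deny unless some Allow statement matches both the
    action and the resource from the request context. Wildcards in the
    policy are matched with shell-style globbing."""
    def matches(patterns, value):
        return any(fnmatch.fnmatchcase(value, p) for p in patterns)
    return any(
        s["Effect"] == "Allow"
        and matches(s["Action"], request["action"])
        and matches(s["Resource"], request["resource"])
        for s in statements
    )

policy = [{"Effect": "Allow",
           "Action": ["dynamodb:Get*"],
           "Resource": ["arn:aws:dynamodb:*:*:table/Orders"]}]
assert is_allowed({"action": "dynamodb:GetItem",
                   "resource": "arn:aws:dynamodb:us-east-1:123:table/Orders"},
                  policy)
```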
Which STS APIs can be called by users that do not have AWS root or IAM credentials?
AssumeRoleWithSAML
AssumeRoleWithWebIdentity
What conditions trigger an automated failover of a multi-AZ RDS instance?
An Availability Zone outage
Failure of the primary DB instance
Change of the DB instance’s server type
Patching the DB instance’s operating system
Manual failover initiated using “Reboot with Failover”
From the RDS User Guide (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.MultiAZ.html):
“””
Amazon RDS handles failovers automatically so you can resume database operations as quickly as possible without administrative intervention. The primary DB instance switches over automatically to the standby replica if any of the following conditions occur:
An Availability Zone outage
The primary DB instance fails
The DB instance’s [storage] type is changed
The operating system of the DB instance is undergoing software patching
A manual failover of the DB instance was initiated using Reboot with failover
“””
What is the smallest possible size for a VPC?
- /20
- /24
- /28
/28 (16 addresses)
From the VPC documentation (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html#VPC_Sizing):
“VPC Sizing
“You can assign a single CIDR block to a VPC. The allowed block size is between a /28 netmask and a /16 netmask. In other words, the VPC can contain from 16 to 65,536 IP addresses. You can’t change the size of a VPC after you create it. If your VPC is too small to meet your needs, create a new, larger VPC, and then migrate your instances to the new VPC. To do this, create AMIs from your running instances, and then launch replacement instances in your new, larger VPC. You can then terminate your old instances, and delete your smaller VPC. For more information, see Deleting Your VPC.”
Is it possible to reduce the storage of an RDS instance?
No
From the RDS User Guide (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PIOPS.StorageTypes.html):
“Data storage in Amazon RDS is specified by selecting a storage type and providing a storage size (GB) when you create or modify a DB instance. You can change the type of storage your instance uses by modifying the DB instance, but changing the type of storage in some cases might result in a short outage for the instance. Changing from Magnetic to either General Purpose (SSD) or Provisioned IOPS (SSD) results in an outage. Also, changing from General Purpose (SSD) or Provisioned IOPS (SSD) to Magnetic results in an outage. The outage time is typically 60–120 seconds. For more information about Amazon RDS storage types, see Amazon RDS
Storage Types.
Increasing the allocated storage does not result in an outage. Note that you cannot reduce the amount of storage once it has been allocated. The only way to reduce the amount of storage allocated to a DB instance is to dump the data out of the DB
instance, create a new DB instance with less storage space, and then load the data into the new DB instance.”
Is it possible that clients might see a message on an SQS queue even after the message has been deleted?
Yes
From the SQS FAQ (https://aws.amazon.com/sqs/faqs/):
“Q: Can a deleted message be received again?
Yes, under rare circumstances you might receive a previously deleted message again. This can occur in the rare situation in which a DeleteMessage operation doesn’t delete all copies of a message because one of the servers in the distributed Amazon SQS system isn’t available at the time of the deletion. That message copy can then be delivered again. You should design your application so that
no errors or inconsistencies occur if you receive a deleted message again.”
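A sketch of the resulting design rule: make the consumer idempotent, for example by tracking already-handled message IDs (here an in-memory set stands in for a durable store, and plain dicts stand in for ReceiveMessage results):

```python
def process_messages(messages, handler, seen=None):
    """At-least-once delivery means a deleted message can reappear, so
    consumers should be idempotent. Messages whose IDs were already
    handled are skipped; returns the number actually processed."""
    seen = set() if seen is None else seen
    handled = 0
    for msg in messages:
        if msg["MessageId"] in seen:
            continue  # duplicate redelivery: safe to ignore
        handler(msg)
        seen.add(msg["MessageId"])
        handled += 1
    return handled

# The same message delivered twice is processed only once.
batch = [{"MessageId": "m-1", "Body": "hello"},
         {"MessageId": "m-1", "Body": "hello"}]
assert process_messages(batch, handler=lambda m: None) == 1
```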
What are the valid Auto Scaling custom termination policies supported by AWS?
- OldestInstance
- NewestInstance
- OldestLaunchConfiguration
- ClosestToNextInstanceHour
- Default
From the Auto Scaling documentation (http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingBehavior.InstanceTermination.html):
“Auto Scaling currently supports the following custom termination policies:
* OldestInstance. Auto Scaling terminates the oldest instance in the group. This option is useful when you’re upgrading the instances in the Auto Scaling group to a new EC2 instance type, and want to eventually replace older instances with newer ones.
* NewestInstance. Auto Scaling terminates the newest instance in the group. This policy is useful when you’re testing a new launch configuration but don’t want to keep it in production.
* OldestLaunchConfiguration. Auto Scaling terminates instances that have the oldest launch configuration. This policy is useful when you’re updating a group and phasing out the instances from a previous configuration.
* ClosestToNextInstanceHour. Auto Scaling terminates instances that are closest to the next billing hour. This policy helps you maximize the use of your instances and manage costs.
* Default. Auto Scaling uses its default termination policy. This policy is useful when you have more than one scaling policy associated with the group.”
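As a toy illustration of what ClosestToNextInstanceHour optimizes for (this is not the actual Auto Scaling algorithm, just the idea: terminate the instance with the fewest seconds left in its current billing hour):

```python
def closest_to_next_instance_hour(instances, now):
    """Toy version of the ClosestToNextInstanceHour policy: pick the
    instance whose partial billing hour is most nearly complete, i.e.
    the one with the fewest seconds remaining until its next full hour.
    Times are plain epoch seconds for simplicity."""
    def seconds_to_next_hour(launch_time):
        elapsed = (now - launch_time) % 3600
        return (3600 - elapsed) % 3600
    return min(instances, key=lambda i: seconds_to_next_hour(i["launch"]))

instances = [{"id": "i-a", "launch": 0},      # 100 s into its hour at now=100
             {"id": "i-b", "launch": -3450}]  # 50 s left in its hour at now=100
assert closest_to_next_instance_hour(instances, now=100)["id"] == "i-b"
```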
What are some DynamoDB best practices that can improve performance?
- Concatenate query attributes into a single LSI (e.g., if you need to query on status and date, create a single range key with status + date)
- Split tables by access frequency (by projecting those specific attributes into a GSI) to reduce query IOPS
- Ensure that keys are evenly distributed across partitions
- Shard writes of extremely hot tables by spreading the items across a fixed number of shards and appending a random shard identifier (e.g., an integer from 1 to 10) to an item’s hash key for each write; aggregate reads across multiple shards
- Move less frequently accessed items into a separate table with lower provisioned I/O
- Cache read-heavy items
Lots of detail in this presentation (https://www.youtube.com/watch?v=KmHGrONoif4)
For which workloads would you choose Redshift over RDS?
Analytics and reporting
Workloads with very large data sets
Workloads where analytics can’t interfere with OLTP
From the Redshift FAQ (https://aws.amazon.com/redshift/faqs/):
“Q: When would I use Amazon Redshift vs. Amazon RDS?
Both Amazon Redshift and Amazon RDS enable you to run traditional relational databases such as MySQL, Oracle and SQL Server in the cloud while offloading database administration. Customers use Amazon RDS databases both for online-transaction processing (OLTP) and for reporting and analysis. Amazon Redshift harnesses the scale and resources of multiple nodes and uses a variety of
optimizations to provide order of magnitude improvements over traditional databases for analytic and reporting workloads against very
large data sets. Amazon Redshift provides an excellent scale-out option as your data and query complexity grows or if you want to prevent your reporting and analytic processing from interfering with the performance of your OLTP workload.”