Study Guide Flashcards
Why do customers move to AWS?
Customers move to AWS to increase agility.
- Accelerate time to market – By spending less time acquiring and managing infrastructure, you can focus on developing features that deliver value to your customers.
- Increase innovation – You can speed up your digital transformation by using AWS, which provides tools to more easily access the latest technologies and best practices. For example, you can use AWS to develop automations, adopt containerization, and use machine learning.
- Scale seamlessly – You can provision additional resources to support new features and scale existing resources up or down to match demand.
Customers also move to AWS to reduce complexity and risk.
- Optimize costs – You can reduce costs by paying for only what you use. Instead of paying for on-premises hardware, which you might not use at full capacity, you can pay for compute resources only while you’re using them.
- Minimize security vulnerabilities – Moving to AWS puts your applications and data behind the advanced physical security of the AWS data centers. With AWS, you have many tools to manage access to your resources.
- Reduce management complexity – Using AWS services can reduce the need to maintain physical data centers, perform hardware maintenance, and manage physical infrastructure.
What are key test concerns for Lambda?
Lambda is the lightest-weight compute option, but it has limits: a function can't run longer than 15 minutes or use more than 10 GB of memory. If Lambda is in the answer, check the question for any time or memory limitations.
A group of one or more data centers is called _________?
an Availability Zone.
An Availability Zone is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region.
What 4 factors do you use to determine the right Region for your services, applications, and data?
Governance and legal requirements – Consider any legal requirements based on data governance, sovereignty, or privacy laws.
Latency – Close proximity to customers means better performance.
Service availability – Not all AWS services are available in all Regions.
Cost – Different Regions have different costs. Research the pricing for the services that you plan to use and compare costs to make the best decision for your workloads.
When should you consider using Local Zones?
You can use AWS Local Zones for highly demanding applications that require single-digit millisecond latency to end users. Examples include:
Media and entertainment content creation – Includes live production, video editing, and graphics-intensive virtual workstations for artists in geographic proximity
Real-time multiplayer gaming – Includes real-time multiplayer game sessions, to maintain a reliable gameplay experience
Machine learning hosting and training – For high-performance, low latency inferencing
Augmented reality (AR) and virtual reality (VR) – Includes immersive entertainment, data driven insights, and engaging virtual training experiences
NOTE: exam – if low latency is the driver, then a Local Zone might be the best option; Local Zones let you deploy subnets close to your end users.
What are edge locations used for?
Edge locations are in major cities around the world. They receive requests and cache copies of your content for faster delivery.
To deliver content to end users with lower latency, you use a global network of edge locations that support AWS services. CloudFront delivers customer content through a worldwide network of point of presence (PoP) locations, which consists of edge locations and Regional edge cache servers.
Regional edge caches, used by default with CloudFront, are used when you have content that is not accessed frequently enough to remain in an edge location. Regional edge caches absorb this content and provide an alternative to the need to retrieve that content from the origin server.
exam – edge locations are associated with caching, while Local Zones offer some compute, storage, database, and other services; both improve performance, for example by caching content closer to users.
One common use for edge locations is to ___________________.
serve content closer to your customers
exam – any question mentioning caching implies edge locations; watch for keywords on the exam!!! Use Local Zones for low-latency (single-digit millisecond) access.
The ______________ helps cloud architects build secure, high-performing, resilient, and efficient application infrastructures.
AWS Well-Architected Framework
With the AWS Well-Architected Tool, you can gather data and get recommendations to:
* Minimize system failures and operational costs.
* Dive deep into business and infrastructure processes.
* Provide best practice guidance.
* Deliver on the cloud computing value proposition.
What are the 6 well architected framework pillars?
- Security – Use AWS security best practices to build policies and processes to protect data and assets. Allow auditing and traceability. Monitor, alert, and audit actions and changes to your environment in real time.
- Cost optimization – Achieve cost efficiency while considering fluctuating resource needs.
- Reliability – Meet well-defined operational thresholds for applications. This includes support to recover from failures, handling increased demand, and mitigating disruption.
- Performance efficiency – Deliver efficient performance for a set of resources like instances, storage, databases, space, and time.
- Operational excellence – Run and monitor systems that deliver business value. Continually improve supporting processes and procedures.
- Sustainability – Understand and minimize your environmental impact when running cloud workloads.
As a best practice, what should you require for your root user?
- multi-factor authentication (MFA)
- set up an administrative IAM user for everyday tasks instead of signing in as the root user
___________ is a web service that helps you securely control access to AWS resources.
And what is it used for?
AWS Identity and Access Management (IAM)
Use IAM to control who is authenticated (signed in) and authorized (has permissions)
exam – IAM users sign in with a username and password (plus the account ID or alias), not an email address; email sign-in is for the root user.
A ___________ is an entity that can request an action or operation on an AWS resource
principal
Exam: users & principals don’t have any privileges by default; also, best to grant permissions to groups and assign users to groups; IAM roles for short lived needs
Exam - if you see “temporary permissions” then it’s a ROLE
Exam- set up users for “long term” needs
With IAM, Each user has their own ___________.
credentials
NOTE: by default, no access until granted
Programmatic access gives your IAM user the credentials to make API calls in the AWS CLI or AWS SDKs. AWS provides an SDK for programming languages such as Java, Python, and .NET.
When programmatic access is granted to your IAM user, it creates _______________________ ?
a unique key pair that comprises an access key ID and a secret access key. Use your key pair to configure the AWS CLI, or make API calls through an AWS SDK.
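A minimal sketch of using such a key pair with an SDK (Python/boto3 here; the key values below are AWS's documentation placeholders, not real credentials):

    import boto3

    # Build a session from an access key ID and secret access key
    # (in practice, prefer the shared credentials file, environment
    # variables, or an IAM role over hardcoding keys)
    session = boto3.Session(
        aws_access_key_id="AKIAIOSFODNN7EXAMPLE",
        aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    )
    s3 = session.client("s3")
    print([b["Name"] for b in s3.list_buckets()["Buckets"]])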
An IAM _____________ is a collection of IAM users.
An IAM user group
NOTE: minimizes admin load; permissions are cumulative: a user can be a member of more than one user group. For example, Richard is a member of the Analysts group and the Billing group, so Richard gets the permissions from both IAM user groups.
IAM _________ deliver temporary AWS credentials.
roles; Use roles to delegate access to users, applications, or services that don’t normally have access to your AWS resources.
Exam - Roles are temporary; when a user assumes a role, they only have the permissions that are granted to the role and do not follow their group’s inherited permissions.
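To see how temporary role credentials work in practice, here's a hedged boto3 sketch (the role ARN is hypothetical):

    import boto3

    sts = boto3.client("sts")
    # Assuming a role returns short-lived credentials (AccessKeyId,
    # SecretAccessKey, SessionToken) that expire automatically
    resp = sts.assume_role(
        RoleArn="arn:aws:iam::123456789012:role/ExampleRole",
        RoleSessionName="study-session",
    )
    creds = resp["Credentials"]
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )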
What is used to give roles access to resources?
IAM Policy assignments
_______ are attached to an identity or resource to define its permissions. AWS evaluates these when a principal, such as a user, makes a request.
policies
What are the 4 security policy types?
Policy types
* Identity-based policies – Attach managed and inline policies to IAM identities. These identities include users, groups to which users belong, and roles.
* Resource-based policies – Attach inline policies to resources. The most common examples of resource-based policies are Amazon S3 bucket policies and IAM role trust policies.
* AWS Organizations service control policies (SCPs) – Use Organizations SCPs to define the maximum permissions for account members of an organization or organizational unit (OU).
* IAM permissions boundaries – AWS supports permissions boundaries for IAM entities (users or roles). Use IAM permissions boundaries to set the maximum permissions that an IAM entity can receive
Exam – know the differences: resource-based policies attach to AWS resources; permissions boundaries are guardrails; SCPs don't grant permissions. To grant access, use IAM identity-based and resource-based policies. To set maximum permissions, use IAM permissions boundaries and AWS Organizations service control policies (SCPs).
_____________ policies are JSON permissions policy documents that control:
* Which actions an IAM identity (users, groups of users, and roles) can perform
* On which resources they can perform these actions
* Under what conditions they can perform these actions
Identity-based
Exam: know permissions boundaries; know that roles provide short-lived credentials; for granting permissions you have two options: identity-based and resource-based policies.
When granting permissions:
- Identity-based policies are assigned to users, groups, and roles.
- Resource-based policies are assigned to resources.
NOTE:
* Resource-based policies are checked when someone tries to access the resource.
Given the following Identity-based policy example, what access would you have?
you can attach the example policy statement to your IAM user
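The policy JSON itself isn't reproduced in these notes; a plausible reconstruction from the description below, written as a boto3 inline-policy attachment (the user and policy names are hypothetical), might look like this:

    import json
    import boto3

    # Allow stop/start of EC2 instances only when the instance's Owner
    # tag equals the caller's IAM user name (note the EAR parts:
    # Effect, Action, Resource)
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "StringEquals": {"ec2:ResourceTag/Owner": "${aws:username}"}
            },
        }],
    }
    boto3.client("iam").put_user_policy(
        UserName="richard",
        PolicyName="StartStopOwnedInstances",
        PolicyDocument=json.dumps(policy),
    )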
Exam – remember "EAR" for policies: Effect, Action, Resource. Know that the resource can be a bucket or ???; know how to recognize the EAR in JSON for the exam, but you don't need to write JSON.
Then, that user is allowed to stop and start EC2 instances in your account if the condition is met. Here, the EC2 instances that your IAM user can control must have a tag with key Owner and value equal to the IAM user name.
In the Resource element, the policy lists an Amazon Resource Name (ARN) with a wildcard (asterisk) character. Wildcards are used to apply a policy element to more than one resource or action. This policy applies for resources in any account number and Region with any resource ID. It can be reused in multiple accounts without having to rewrite the policy with your AWS account ID.
How are IAM policies evaluated?
AWS evaluates all policies that are applicable to the request context. The following list summarizes the AWS evaluation logic for policies within a single account:
* By default, all requests are implicitly denied, with the exception of the AWS account root user, which has full access. This behavior is called an implicit deny.
* An explicit allow in an identity-based policy or resource-based policy overrides this default. There are additional security controls that can override an explicit allow with an implicit deny, such as permissions boundaries and SCPs.
* An explicit deny in any policy overrides any allows
___________ is a strategy that is focused on creating multiple layers of security.
Defense in depth
Apply a defense-in-depth approach with multiple security controls to all layers.
A ______________ is an advanced feature for using a managed policy to set the maximum permissions that an identity-based policy can grant to an IAM entity and act as a filter.
permissions boundary
NOTE: a permission must be explicitly allowed by both the boundary and the identity-based policy; anything not allowed by both is implicitly denied.
AWS supports permissions boundaries for which IAM entities?
users or roles.
What are several reasons that you might want to create a multi-account structure in your organization?
NOTE: reduces management overhead and simplifies billing
- To group resources for categorization and discovery
- To improve your security posture with a logical boundary
- To limit potential impact in case of unauthorized access
- To simplify management of user access to different environments
What are benefits of using AWS Organizations?
AWS Organizations provides these key features:
* Centralized management of all your AWS accounts
* Consolidated billing for all member accounts
Consolidated billing aggregates usage across accounts, which can help you qualify for volume discounts.
Create a hierarchy by grouping accounts into organizational units (OUs). Apply service control policies (SCPs) to control maximum permissions in every account under an organization unit (OU).
NOTE: anywhere on the exam that an option involves manual work, it's usually incorrect; always favor automation.
_______ is a type of organization policy that you can use to manage permissions in your organization. In essence, a guardrail for your org.
SCP (service control policy)
Attaching an SCP to an Organizations entity (root, OU, or account) defines a guardrail. SCPs set limits on the actions that the IAM users and roles in the affected accounts can perform. To grant permissions, you must attach identity-based or resource-based policies to IAM users, or to the resources in your organization’s accounts. When an IAM user or role belongs to an account that is a member of an organization, the SCPs limit the user’s or role’s effective permissions.
An SCP doesn't grant permissions, but it does limit them.
Keyword: guardrail
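A hedged sketch of creating and attaching such a guardrail via boto3 (the policy content and OU ID are hypothetical):

    import json
    import boto3

    org = boto3.client("organizations")
    # Guardrail: deny leaving the organization for every account under an OU
    scp = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "organizations:LeaveOrganization",
            "Resource": "*",
        }],
    }
    policy = org.create_policy(
        Name="deny-leave-org",
        Description="Example guardrail",
        Type="SERVICE_CONTROL_POLICY",
        Content=json.dumps(scp),
    )
    org.attach_policy(
        PolicyId=policy["Policy"]["PolicySummary"]["Id"],
        TargetId="ou-root-exampleid",  # hypothetical OU ID
    )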
Are all IPv6 addresses public only?
yes
What is the smallest supported CIDR? & why?
/28. AWS reserves five IP addresses in every subnet, so smaller blocks would leave too few usable addresses.
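A quick arithmetic check with Python's standard library:

    import ipaddress

    subnet = ipaddress.ip_network("10.0.0.0/28")  # smallest CIDR AWS supports
    print(subnet.num_addresses)       # 16 total addresses
    print(subnet.num_addresses - 5)   # 11 usable once AWS reserves 5 per subnet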
Can a VPC be in more than one region? Can a subnet be in more than one AZ?
No
No
What are the 5 reserved IPs that you can’t use from every subnet that you create?
The first four IP addresses and the last IP address in each subnet CIDR block are not available and cannot be assigned to an instance. For example, in a subnet with CIDR block 10.0.0.0/24, the following five IP addresses are reserved:
* 10.0.0.0: Network address.
* 10.0.0.1: Reserved by AWS for the VPC router.
* 10.0.0.2: Reserved by AWS. The IP address of the DNS server is always the base of the VPC network range plus 2.
* 10.0.0.3: Reserved by AWS for future use.
* 10.0.0.255: Network broadcast address. AWS does not support broadcast in a VPC; therefore, we reserve this address.
NOTE: Consider larger subnets over smaller ones (/24 and larger). You are less likely to waste or run out of IPs if you distribute your workload into larger subnets.
What are the 3 required items for each public subnet?
A public subnet requires the following:
- Internet gateway: The internet gateway allows communication between resources in your VPC and the internet.
- Route table: A route table contains a set of rules (routes) that are used to determine where network traffic is directed. It can direct traffic to the internet gateway.
- Public IP addresses
Note: you associate a route table with a subnet; a route table can be associated with multiple subnets, but a subnet can be associated with only one route table.
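A hedged boto3 sketch wiring the three public-subnet pieces together (the VPC and subnet IDs are hypothetical):

    import boto3

    ec2 = boto3.client("ec2")
    vpc_id = "vpc-0123456789abcdef0"
    subnet_id = "subnet-0123456789abcdef0"

    # 1. Internet gateway, attached to the VPC
    igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
    ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

    # 2. Route table with a default route to the internet gateway
    rt_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
    ec2.create_route(RouteTableId=rt_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)
    ec2.associate_route_table(RouteTableId=rt_id, SubnetId=subnet_id)

    # 3. Public IP addresses, auto-assigned to instances at launch
    ec2.modify_subnet_attribute(SubnetId=subnet_id, MapPublicIpOnLaunch={"Value": True})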
T/F: IGW can only be associated with one VPC at a time
T
When you create a VPC, it automatically has a ______________.
main route table
Every route table belongs to a VPC and includes a local route covering the VPC CIDR; this local route can't be deleted or modified, and it provides connectivity between all resources within the VPC.
A subnet can be associated with only one route table at a time, but you can associate multiple subnets with the same route table. Use custom route tables for each subnet to permit granular routing for destinations.
Each AWS account comes with a default Amazon VPC
______________ is a static, public IPv4 address that is designed for dynamic cloud computing.
Elastic IP address
You are limited to five Elastic IP addresses per Region by default. To help conserve them, you can use a NAT device. We encourage you to use an Elastic IP address primarily for the ability to remap the address to another instance in the case of instance failure.
You can associate a/an _________ with any instance or network interface for any VPC in your account.
Elastic IP address
An Elastic IP address is a static, public IPv4 address designed for dynamic cloud computing.
You can associate an Elastic IP address with an instance by updating the network interface attached to the instance. The advantage of associating the Elastic IP address with the network interface instead of directly with the instance is that you can move all the attributes of the network interface from one instance to another in a single step.
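A short boto3 sketch of that pattern (the network interface ID is hypothetical):

    import boto3

    ec2 = boto3.client("ec2")
    # Allocate an Elastic IP, then bind it to a network interface rather
    # than directly to an instance, so the ENI (and its address) can move
    # between instances in a single step
    alloc = ec2.allocate_address(Domain="vpc")
    ec2.associate_address(
        AllocationId=alloc["AllocationId"],
        NetworkInterfaceId="eni-0123456789abcdef0",
    )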
With a/an ____________, you can mask the failure of an instance by rapidly remapping the address to another instance in your VPC.
Elastic IP address
You can move an Elastic IP address from one instance to another. The instance can be in the same VPC or another VPC. An Elastic IP address is accessed through the internet gateway of a VPC. If you set up a VPN connection between your VPC and your network, the VPN traffic traverses a virtual private gateway, not an internet gateway. Therefore, it cannot access the Elastic IP address.
________________ is a logical networking component in a VPC that represents a virtual network card
elastic network interface
When moved to a new instance, the network interface maintains its public and Elastic IP address, private IP and Elastic IP address, and MAC address. The attributes of a network interface follow it.
When you move a network interface from one instance to another, network traffic is redirected to the new instance. Each instance in a VPC has a default network interface (the primary network interface).
What port does a bastion host interface on?
A bastion host accepts SSH connections on port 22.
______________ communicate between instances in your VPC and the internet. They are horizontally scaled, redundant, and highly available by default and, provide a target in your subnet route tables for internet-routable traffic.
NAT gateways
NAT gateways now come in two connectivity types: public and private. Use a private NAT gateway to reach other VPCs or on-premises networks; it cannot reach the internet.
You can use _____________ for a one-way connection between private subnet instances and the internet or other AWS services. This type of connection prevents external traffic from connecting with your private instances.
a NAT gateway
exam: to eliminate a single point of failure (SPOF), deploy a NAT gateway in each Availability Zone.
Regarding VPCs and HA, what’s a key consideration?
Deploying a VPC across multiple Availability Zones creates an architecture that achieves high availability
___________ receives inbound traffic and routes it to the application servers in the private subnets of both Availability Zones.
Elastic Load Balancing
Load balancers can be used for both internal and internet-facing workloads.
___________ is an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets. Every VPC automatically comes with a default one that allows all inbound and outbound IPv4 traffic.
network ACL
You can create a custom network ACL and associate it with a subnet. By default, custom network ACLs deny all inbound and outbound traffic until you add rules.
exam - ports of interest are 80, 443, & 22 (SSH)
Network ACLs are firewalls at the subnet level; custom ones deny all traffic by default. A subnet can have only one network ACL, but a network ACL can be associated with multiple subnets. They are stateless, so you need a matching return rule to allow response traffic.
_______________ acts as a virtual firewall for your instance to control inbound and outbound traffic.
A security group
The default group allows inbound communication from other members of the same group and outbound communication to any destination. Traffic can be restricted by any IP protocol, by service port, and by source or destination IP address (individual IP address or CIDR block).
exam – security groups sit at the boundary of the instance (instance-level firewalls); rules have no numerical priority, and there are no explicit deny rules (allow rules only).
_______________ act at the network interface level, not the subnet level, and they support Allow rules only.
Security groups
______________ contains a numbered list of rules, which are evaluated in order, starting with the lowest numbered rule. If a rule matches traffic, the rule is applied even if any higher-numbered rule contradicts it.
A network ACL
Each network ACL has a rule whose number is an asterisk. This rule denies a packet that doesn’t match any of the numbered rules.
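For example, a hedged boto3 sketch adding a numbered allow rule (the ACL ID is hypothetical); a contradictory rule numbered above 100 would lose to this one:

    import boto3

    ec2 = boto3.client("ec2")
    # Rule 100: allow inbound HTTPS from anywhere; rules are evaluated
    # lowest number first, and the first match wins
    ec2.create_network_acl_entry(
        NetworkAclId="acl-0123456789abcdef0",
        RuleNumber=100,
        Protocol="6",          # TCP
        RuleAction="allow",
        Egress=False,          # inbound rule
        CidrBlock="0.0.0.0/0",
        PortRange={"From": 443, "To": 443},
    )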
What are 2 key properties of security groups?
Security groups in default VPCs allow all outbound traffic.
Custom security groups have no inbound rules and allow all outbound traffic.
AWS customers typically use ______________ as their primary method of network packet filtering.
security groups
They are more versatile than network ACLs because they perform stateful packet filtering and can use rules that reference other security groups. However, network ACLs can be effective as a secondary control for denying a specific subset of traffic or providing high-level guardrails for a subnet.
By implementing both network ACLs and security groups as a defense-in-depth means of controlling traffic, a mistake in the configuration of one of these controls will not expose the host to unwanted traffic.
Why do you use security group chaining?
to provide defense in depth: only allow the minimum traffic required to pass each boundary
The inbound and outbound rules are set up so that traffic can only flow from the top tier to the bottom tier and back up again. The security groups act as firewalls to prevent a security breach in one tier from automatically granting access to every resource in the subnet.
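A minimal boto3 sketch of chaining (both group IDs are hypothetical): the app tier's inbound rule references the web tier's security group instead of a CIDR, so only web-tier instances can reach it:

    import boto3

    ec2 = boto3.client("ec2")
    APP_SG = "sg-0aaa111122223333a"  # app tier (hypothetical)
    WEB_SG = "sg-0bbb111122223333b"  # web tier (hypothetical)

    # Allow app-tier traffic only from members of the web tier's SG
    ec2.authorize_security_group_ingress(
        GroupId=APP_SG,
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 8080,
            "ToPort": 8080,
            "UserIdGroupPairs": [{"GroupId": WEB_SG}],
        }],
    )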
________________ acts as a firewall for associated EC2 instances, controlling both inbound and outbound traffic at the instance level. ____________ act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level.
A security group
Network ACLs
A security group acts as a firewall for associated EC2 instances, controlling both inbound and outbound traffic at the instance level. Network ACLs act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level.
Both can have different default configurations depending on how they are created.
Security groups
* Security groups in default VPCs allow all traffic.
* New security groups have no inbound rules and allow outbound traffic.
Network ACLs
* Network ACLs in default VPCs allow all inbound and outbound IPv4 traffic.
* Custom network ACLs deny all inbound and outbound traffic, until you add rules.
SSD-backed volumes are optimized for transactional workloads that involve frequent read/write operations with small I/O size, where the dominant performance attribute is IOPS. Which EBS volume type is used for these workloads?
Provisioned IOPS SSD volumes (io1/io2)
Specific use cases for io2 Block Express include:
* Sub-millisecond latency
* Sustained IOPS performance
* More than 64,000 IOPS or 1,000 MiB/s of throughput
Exam – the exact numbers aren't needed for this exam (they will be for the SysOps exam); just know that io volumes are for high-I/O needs.
Exam: anywhere you see throughput plus low cost, think st1 (Throughput Optimized HDD).
______________ provides temporary block-level storage for your instance. This storage is located on disks that are physically attached to the host computer
instance store
An instance store is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content. It is also good for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
exam – instance store is attached to the host and very fast, but not persistent; use it where fast processing matters and the workload can tolerate loss and retry. The storage is reclaimed when the instance is stopped or terminated, so the data is lost.
What are 3 EC2 purchase options?
- On-demand
- Savings plans
- Spot instances
The most flexible option is also the most expensive, so On-Demand costs the most; watch for keywords such as spiky or temporary.
With ____________, you can run code without provisioning or managing servers. The service runs your code on a high-availability compute infrastructure and performs all administration of the compute resources.
Lambda
These resources include:
* Server and OS maintenance
* Capacity provisioning and automatic scaling
* Code monitoring and logging
serverless (Lambda) is always a great answer for “saving money” questions
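For orientation, a minimal Python Lambda handler looks like this (the function and event field names are illustrative):

    # AWS invokes this entry point; there are no servers to manage.
    # Keep the earlier limits in mind: 15-minute timeout, 10 GB memory.
    def lambda_handler(event, context):
        name = event.get("name", "world")
        return {"statusCode": 200, "body": f"Hello, {name}!"}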
What are the three types of cloud storage? Each storage option has a unique combination of performance, durability, cost, and interface
object, file, and block
EXAM: S3 with CloudFront is always a great answer
Block storage – Enterprise applications like databases or enterprise resource planning (ERP) systems often require dedicated, low-latency storage for each host. This storage is similar to direct-attached storage (DAS) or a storage area network (SAN). Block-based cloud storage solutions like Amazon Elastic Block Store (Amazon EBS) are provisioned with each virtual server and offer the ultra-low latency required for high-performance workloads.
File storage – Many applications must access shared files and require a file system. This type of storage is often supported with a Network Attached Storage (NAS) server. File storage solutions like Amazon Elastic File System (Amazon EFS) are ideal for use cases such as large content repositories, development environments, media stores, or user home directories.
Object storage – Applications developed in the cloud need the vast scalability and metadata of object storage. Object storage solutions like Amazon Simple Storage Service (Amazon S3) are ideal for building modern applications. Amazon S3 provides scale and flexibility. You can use it to import existing data stores for analytics, backup, or archive.
- Amazon EBS for block storage
- Amazon EFS and Amazon FSx for file storage
- Amazon S3 and Amazon S3 Glacier for object storage
What type of storage would be used for WORM to provide legal/hold needs?
Amazon S3 with S3 Object Lock
NOTE: S3 Glacier Vault Lock can also enforce WORM for archives
What storage is typically used with Linux systems to provide NFS file sharing?
EFS
What storage is typically used with Windows systems to provide SMB file sharing?
Amazon FSx for Windows File Server
What storage provides SMB, NFS, and iSCSI for Windows, Linux, & MacOS?
Amazon FSx for NetApp ONTAP
What storage is used with high performance computing?
Amazon FSx for Lustre
What is Amazon S3?
Amazon S3 is object-level storage. An object includes file data, metadata, and a unique identifier. Object storage does not use a traditional file and folder structure.
Great for static content such as a website providing documents, videos, etc.
exam – bucket names must be globally unique; buckets "live" in a Region and aren't replicated outside it by AWS automatically, but the customer can configure Cross-Region Replication; S3 is the cheapest storage, which is why logs are stored there.
Name 5 use cases for S3 object storage.
- Backup and restore – You can use Amazon S3 to store and retrieve any amount of data, at any time. You can use Amazon S3 as the durable store for your application data and file-level backup and restore processes. Amazon S3 is designed for 99.999999999 percent durability, or 11 9’s of durability.
- Data lakes for analytics – Run big data analytics, artificial intelligence (AI), machine learning (ML), and high-performance computing (HPC) applications to unlock data insights.
- Media storage and streaming – You can use Amazon S3 with Amazon CloudFront’s edge locations to host videos for on-demand viewing in a secure and scalable way. Video on demand (VOD) streaming means that your video content is stored on a server, and viewers can watch it at any time. You’ll learn more about Amazon CloudFront later in this course.
- Static website – You can use Amazon S3 to host a static website. On a static website, individual webpages include static content. They might also contain client-side scripts. Amazon S3’s object storage makes it easier to manage data access, replications, and data protection for static files.
- Archiving and compliance – Replace your tape with low-cost cloud backup workflows, while maintaining corporate, contractual, and regulatory compliance requirements.
_________ are resource-based policies for your S3 buckets.
Bucket policies
Access control for your data is based on policies, such as IAM policies, S3 bucket policies, and AWS Organizations service control policies (SCPs).
“EAR” in the JSON –
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket",
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}
exam – be able to recognize “Principal” in that it is different from identity that lives with the account
NOTE: a principal is a specific type of entity that can take actions in AWS, while an identity is the unique identifier associated with that principal. The principal is defined in IAM policies to grant or deny access, and the identity is used for authentication and authorization purposes.
___________ are used to encrypt your data at rest.
Cryptographic keys
Amazon S3 offers three AWS-managed options, plus a customer-provided option, for encrypting your objects:
* Server-side encryption (SSE) with Amazon S3-managed keys (SSE-S3) – When you use SSE-S3, each object is encrypted with a unique key. As an additional safeguard, it encrypts the key itself with a primary key that it regularly rotates. Amazon S3 server-side encryption uses 256-bit Advanced Encryption Standard (AES-256) to encrypt your data.
* Server-side encryption with AWS KMS keys stored in AWS Key Management Service (AWS KMS) (SSE-KMS) – SSE-KMS is similar to SSE-S3, but with some additional benefits and charges. There are separate permissions for the use of a KMS key, which provides added protection against unauthorized access to your objects in Amazon S3. SSE-KMS also provides you an audit trail that shows when your KMS key was used, and by whom.
* Dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) – Using DSSE-KMS applies two individual layers of object-level encryption instead of one layer. Each layer of encryption uses a separate cryptographic implementation library with individual data encryption keys.
* Server-side Encryption with Customer-Provided Keys (SSE-C) – With SSE-C, you manage the encryption keys and Amazon S3 manages the encryption as it writes to disks. Also, Amazon S3 manages decryption when you access your objects.
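A hedged boto3 sketch requesting server-side encryption at write time (the bucket, key, and KMS alias are hypothetical):

    import boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="doc-example-bucket",
        Key="reports/q1.pdf",
        Body=b"...",
        ServerSideEncryption="aws:kms",   # use "AES256" for SSE-S3 instead
        SSEKMSKeyId="alias/my-app-key",   # optional; defaults to the AWS managed key
    )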
What are the 6 S3 storage classes?
- S3 Standard for general-purpose storage of frequently accessed data.
- S3 Standard-Infrequent Access (S3 Standard-IA) for long-lived, but less frequently accessed data.
- S3 One Zone-Infrequent Access (S3 One Zone-IA) for long-lived, less frequently accessed data that can be stored in a single Availability Zone.
- S3 Glacier Instant Retrieval for archive data that is rarely accessed but requires a restore in milliseconds.
- S3 Glacier Flexible Retrieval for the most flexible retrieval options that balance cost with access times ranging from minutes to hours. Your retrieval options permit you to access all the archives you need, when you need them, for one low storage price (good for unpredictable needs). This storage class comes with multiple retrieval options:
–Expedited retrievals (restore in 1–5 minutes).
–Standard retrievals (restore in 3–5 hours).
–Bulk retrievals (restore in 5–12 hours). Bulk retrievals are available at no additional charge.
- S3 Glacier Deep Archive for long-term cold storage archive and digital preservation. Your objects can be restored in 12 hours or less.
S3 has a storage class associated with it. All storage classes offer high durability (99.999999999 percent durability)
exam – very important for the exam! Know the 6 classes, their use cases, and the retrieval timing; moving down the list, storage gets cheaper and retrieval takes longer.
__________________ is the only storage class that delivers automatic storage cost savings when data access patterns change
Amazon S3 Intelligent-Tiering
When you assign an object to S3 Intelligent-Tiering, it is placed in the Frequent Access tier which has the same storage cost as S3 Standard. Objects not accessed for 30 days are then moved to the Infrequent Access tier where the storage cost is the same as S3 Standard-IA. After 90 days of no access, an object is moved to the Archive Instant Access tier, which has the same cost as S3 Glacier Instant Retrieval.
S3 Intelligent-Tiering is the ideal storage class for data with unknown, changing, or unpredictable access patterns, independent of object size or retention period. You can use S3 Intelligent-Tiering as the default storage class for virtually any workload, especially data lakes, data analytics, new applications, and user-generated content.
NOTE: know this, and understand that Lifecycle policies are different from Intelligent-Tiering: I-T moves objects in both directions based on observed access frequency, whereas Lifecycle rules move objects one way based on age.
What are the Amazon S3 Glacier storage class benefits?
1. Cost-effective storage – lowest cost for specific data access patterns
2. Flexible data retrieval – three storage classes with variable access options
3. Secure and compliant – encryption at rest, AWS CloudTrail integration, and retrieval policies
4. Scalable and durable – meets needs from gigabytes to exabytes with 11 9s of durability
What S3 option is used for WORM?
Use S3 Object Lock for data retention or protection.
What does Amazon S3 Versioning provide?
Buckets that use versioning can help you recover objects from accidental deletion or overwrite:
* If you delete an object, instead of removing it permanently, Amazon S3 inserts a delete marker, which becomes the current object version.
* If you overwrite an object, it results in a new object version in the bucket. When S3 Versioning is turned on, you can restore the previous version of the object to correct the mistake.
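A hedged boto3 sketch turning versioning on (the bucket name is hypothetical):

    import boto3

    s3 = boto3.client("s3")
    # With versioning on, overwrites create new versions and deletes
    # insert a delete marker instead of removing the object
    s3.put_bucket_versioning(
        Bucket="doc-example-bucket",
        VersioningConfiguration={"Status": "Enabled"},
    )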
What are Lifecycle policies used for?
Use S3 Lifecycle policies to transition objects to another storage class. S3 Lifecycle rules take action based on object age.
With S3 Lifecycle policies, you can delete or move objects based on age. You should automate the lifecycle of your data that is stored in Amazon S3. Using S3 Lifecycle policies, you can have data cycled at regular intervals between different Amazon S3 storage types.
In this way, you reduce your overall cost because you are paying less for data as it becomes less important with time. In addition to being able to set lifecycle rules per object, you can also set lifecycle rules per bucket.
Amazon S3 supports a waterfall model for transitioning between storage classes. Lifecycle configuration automatically changes data storage tiers.
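A hedged boto3 sketch of such a rule (the bucket name, prefix, and ages are hypothetical):

    import boto3

    s3 = boto3.client("s3")
    # Age-based waterfall: Standard -> Standard-IA at 30 days,
    # Glacier Flexible Retrieval at 90 days, deleted after a year
    s3.put_bucket_lifecycle_configuration(
        Bucket="doc-example-bucket",
        LifecycleConfiguration={"Rules": [{
            "ID": "age-out-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }]},
    )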
What is S3 multipart and when is it automatically used?
With a multipart upload, you can consistently upload large objects in manageable parts. This process involves three steps:
* Initiating the upload
* Uploading the object parts
* Completing the multipart upload
When the multipart upload request is completed, Amazon S3 will recreate the full object from the individual pieces.
exam – AWS recommends multipart upload for objects of 100 MB and larger; the AWS CLI and SDKs switch to it automatically above a configurable size threshold.
Improve the upload process of larger objects with the following features (see the sketch after this list):
* Improved throughput – You can upload parts in parallel to improve throughput.
* Quick recovery from any network issues – Smaller part sizes minimize the impact of restarting a failed upload due to a network error.
* Pausing and resuming object uploads – You can upload object parts over time. When you have initiated a multipart upload, there is no expiration. You must explicitly complete or cancel the multipart upload.
* Beginning an upload before you know the final object size – You can upload an object as you are creating it.
* Uploading large objects – Using the multipart upload API, you can upload large objects, up to 5 TB.
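As noted above, the SDKs handle the part management for you; a hedged boto3 sketch with an explicit threshold (the file and bucket names are hypothetical):

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")
    # Above multipart_threshold the SDK splits the object into parts,
    # uploads them in parallel, then completes the multipart upload
    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,  # 100 MB
        multipart_chunksize=16 * 1024 * 1024,   # 16 MB parts
    )
    s3.upload_file("backup.tar", "doc-example-bucket", "backup.tar", Config=config)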
When are Amazon S3 Event Notifications typically used?
Event driven architectures
With Amazon S3 Event Notifications, you can receive notifications when certain object events happen in your bucket. Event-driven models like this one mean that you no longer need to build or maintain server-based polling infrastructure to check for object changes. You also don’t pay for idle time of that infrastructure when there are no changes to process.
Amazon S3 can send event notification messages to the following destinations:
* Amazon Simple Notification Service (Amazon SNS) topics
* Amazon Simple Queue Service (Amazon SQS) queues
* AWS Lambda functions
You specify the Amazon Resource Name (ARN) value of these destinations in the notification configuration.
In the example, you have a JPEG image uploaded to the images bucket that your website uses. Your website needs to be able to show smaller thumbnail preview images of each uploaded file. When the image object is added to the S3 bucket, an event notification is sent to invoke a series of AWS Lambda functions. The output of your Lambda functions is a smaller version of the original JPEG image and puts the object in your thumbnails bucket. S3 Event Notifications manage the activity in the bucket for you and automate the creation of your thumbnail.
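A hedged boto3 sketch of wiring that up (the bucket name and Lambda ARN are hypothetical; the Lambda function also needs a resource policy allowing S3 to invoke it):

    import boto3

    s3 = boto3.client("s3")
    # Invoke a thumbnail-generating Lambda whenever a .jpg is created
    s3.put_bucket_notification_configuration(
        Bucket="images-bucket",
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [{
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": [
                    {"Name": "suffix", "Value": ".jpg"},
                ]}},
            }],
        },
    )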
What are some factors to consider with costs of S3 storage?
- Storage – Per-gigabyte cost to hold your objects. You pay for storing objects in your S3 buckets. The rate that you’re charged depends on your objects’ size, how long you stored the objects during the month, and the storage class. You incur per-request ingest charges when using PUT, COPY, or lifecycle rules to move data into any S3 storage class.
- Requests and retrievals – The number of API calls: PUT and GET requests. You pay for requests that are made against your S3 buckets and objects. S3 request costs are based on the request type, and are charged on the quantity of requests. When you use the Amazon S3 console to browse your storage, you incur charges for GET, LIST, and other requests that are made to facilitate browsing.
- Data transfer – Usually no transfer fee for data-in from the internet and, depending on the requester location and medium of data transfer, different charges for data-out.
- Management and analytics – You pay for the storage management features and analytics that are activated on your account’s buckets. These features are not discussed in detail in this course.
S3 Replication and S3 Versioning can have a big impact on your AWS bill. These services both create multiple copies of your objects, and you pay for each PUT request in addition to the storage tier charge. S3 Cross-Region Replication also requires data transfer between AWS Regions.
For high throughput changes to files of varying sizes, a file system will be superior to an object store system. ___________ and _________ are ideal for this use case.
Amazon Elastic File System (Amazon EFS) and Amazon FSx
___________ provides a scalable, elastic file system for Linux-based workloads for use with AWS Cloud services and on-premises resources.
Amazon EFS
exam – Linux only; it's serverless as well; fast; scales automatically; pay for only what you use.
Name 2 benefits of using EFS
- Amazon EFS uses burst throughput mode to scale throughput based on your storage use.
- Amazon EFS automatically grows and shrinks file storage without provisioning.
__________ for Windows File Server provides fully managed Microsoft Windows file servers that are backed by a native Windows file system. Built on Windows Server, it delivers a wide range of administrative features such as data deduplication, end-user file restore, and Microsoft Active Directory.
Amazon FSx
exam – Windows: FSx; fully managed and built on Windows Server, but it can be used by Windows, Linux, and macOS clients.
What file system provides high performance and is generally used with HPC?
Amazon FSx for Lustre
exam – HPC & machine learning, big data analytics
What 2 AWS data migration tools are used with hybrid architectures?
AWS Storage Gateway: Sync files with SMB, NFS, and iSCSI protocols from on-premises to AWS. AWS Storage Gateway connects an on-premises software appliance with cloud-based storage.
AWS DataSync: Sync files from on-premises file storage to Amazon EFS, Amazon FSx, and Amazon S3.
What storage migration tools provide offline migration support for large volumes of data?
AWS Snow Family: Move terabytes to petabytes of data to AWS by using appliances that are designed for secure, physical transport.
AWS Snow Family is a group of edge computing, data migration, or edge storage devices that are designed for secure, physical transport.
SFTP is typically used with what data migration service?
AWS Transfer Family permits the transfer of files into and out of Amazon S3 or Amazon Elastic File System (EFS)
What are the 4 Storage Gateway types and uses?
- Amazon S3 File Gateway presents a file interface that you can use to store files as objects in Amazon S3. You use the industry-standard NFS and SMB file protocols. Access your files through NFS and SMB from your data center or Amazon EC2, or access those files as objects directly in Amazon S3.
exam – associate with S3, archiving, data lakes, etc.
- Amazon FSx File Gateway provides fast, low-latency, on-premises access to fully managed, highly reliable, and scalable file shares in Amazon FSx for Windows File Server. It uses the industry-standard SMB protocol. You can store and access file data in Amazon FSx with Microsoft Windows features, including full New Technology File System (NTFS) support, shadow copies, and ACLs.
- Tape Gateway presents an iSCSI-based virtual tape library (VTL) of virtual tape drives and a virtual media changer to your on-premises backup application.
- Volume Gateway presents block storage volumes of your applications by using the iSCSI protocol. You can asynchronously back up data that is written to these volumes as point-in-time snapshots of your volumes. Then, you can store it in the cloud as Amazon EBS snapshots.
exam – associate with EBS & your “boot” device
What are the 4 storage gateway modes?
Amazon S3 File Gateway, Amazon FSx File Gateway (use for your home directories), Tape Gateway, or Volume Gateway (iSCSI).
What is a typical use for AWS DataSync?
Reduce on-premises storage infrastructure by shifting SMB-based data stores and content repositories from file servers and NAS arrays to Amazon S3 and Amazon EFS for analytics.
What is AWS Snowcone?
AWS Snowcone
Snowcone is a small, rugged, edge computing and data storage product.
What is AWS Snowball Edge?
Snowball Edge is an edge computing and data transfer device that the AWS Snowball service provides.
exam – has compute with it
Snowball Edge is a petabyte-scale data transport option that doesn’t require you to write code or purchase hardware to transfer data.
What database service provides key value NoSQL?
DynamoDB
What are 2 relational DB services and which one provides PostgreSQL?
RDS & Aurora (provides PostgreSQL)
What database service is used for MemCached & Redis?
ElastiCache
What database service is used for data warehouses?
Redshift
What database services are used for speed and agility, providing key-value pairs or document storage with dynamic schemas?
DynamoDB & ElastiCache
What databases run on physical servers while providing fixed schemas?
RDS & Aurora
Relational databases such as Oracle, IBM DB2, SQL Server, MySQL, and PostgreSQL
- Amazon Aurora is a proprietary, fully managed relational database engine that is MySQL- and PostgreSQL-compatible. In terms of performance, Aurora provides up to five times the throughput of standard MySQL and up to three times the throughput of standard PostgreSQL on RDS. It also offers an encrypted storage option for better data security, and it has a serverless option.
- Amazon RDS is a hosted database service that supports a variety of relational database engines, including MySQL, PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle.
NOTE:
In RDS, failover to a read replica is done manually, which could lead to data loss. You can use the Multi-AZ (standby instance) feature for automatic failover to prevent downtime and data loss. In Aurora, failover to a read replica is done automatically to prevent data loss, and failover time is faster.
What relational database service would you use for a serverless option that is fully managed?
Aurora
What relational database service would you use when you want to avoid managing the underlying resources, as you would have to if you installed the database yourself on an Amazon Elastic Compute Cloud (Amazon EC2) instance?
RDS
EXAM: serverless option is Aurora; RDS is managed
Amazon RDS is a web service that helps you to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity, while managing time-consuming database administration tasks. By using Amazon RDS, you can focus on your applications and business. Amazon RDS provides you with six familiar database engines to choose from, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and Microsoft SQL Server. Therefore, most of the code, applications, and tools that you already use with your existing databases can be used with Amazon RDS.
Amazon RDS automatically patches the database software and backs up your database. It stores the backups for a user-defined retention period and provides point-in-time recovery. You benefit from the flexibility of scaling the compute resources or storage capacity associated with your relational DB instance with a single API call.
What sits between your application and your relational database to efficiently manage connections to the database and improve scalability of the application?
Amazon RDS Proxy
also know that RDS Proxy provides connection pooling, improved scalability, and improved/faster failover resiliency
What do you set up when you want to improve the resiliency of RDS?
Amazon RDS Multi-AZ deployments provide enhanced availability and durability for database (DB) instances, which makes them a natural fit for production database workloads. When you provision a Multi-AZ DB instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone.
You can modify your environment from Single-AZ to Multi-AZ at any time. Each Availability Zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable.
NOTE: RDS is not HA by default; a Multi-AZ DB cluster provides readable standby instances, while a single-standby Multi-AZ deployment does not.
What is a strategy to improve RDS performance?
Read replicas – for offloading reads; improves performance; takes load off primary
With Amazon RDS, you can create read replicas of your database. Amazon RDS automatically keeps them in sync with the primary DB instance. Read replicas are available in Amazon RDS for MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server, as well as Amazon Aurora.
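A hedged boto3 sketch (the instance identifiers are hypothetical):

    import boto3

    rds = boto3.client("rds")
    # Offload read traffic by adding a replica of an existing instance;
    # RDS keeps it in sync with the primary automatically
    rds.create_db_instance_read_replica(
        DBInstanceIdentifier="mydb-replica-1",
        SourceDBInstanceIdentifier="mydb",
    )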