CCP Flashcards
Define Availability Zone
Part of the AWS Global Infrastructure: one or more discrete data centers with redundant power, networking, and connectivity, used to deploy infrastructure.
Types of Cloud Computing
Infrastructure, platform and software as a service.
What is Infrastructure as a service
- Provide building blocks for cloud IT
- Provides networking, computers, data storage space
- Highest level of flexibility
- Easy parallel with traditional on-premises IT
What is Platform as a Service
- Removes the need for your organization to manage the underlying infrastructure
- Focus on the deployment and management of your applications
What is Software as a Service
- Completed product that is run and managed by the service provider
What are the five characteristics of cloud computing?
- On-demand self service:
- Users can provision resources and use them without human interaction from the service provider
- Broad network access:
- Resources available over the network, and can be accessed by diverse client platforms
- Multi-tenancy and resource pooling:
- Multiple customers can share the same infrastructure and applications with security and privacy
- Multiple customers are serviced from the same physical resources
- Rapid elasticity and scalability:
- Automatically and quickly acquire and dispose of resources when needed
- Quickly and easily scale based on demand
- Measured service:
- Usage is measured, and users pay only for what they have used
What are the 3 Pricing Fundamentals of AWS Cloud?
- Compute:
- Pay for compute time
- Storage:
- Pay for data stored in the Cloud
- Data transfer OUT of the Cloud:
- Data transfer IN is free
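The three fundamentals above can be sketched as a small bill estimate. The rates below are hypothetical placeholders chosen for easy arithmetic, not real AWS prices; the only rule taken from the card is that data transfer IN costs nothing.

```python
# Hypothetical rates for illustration only; real AWS prices vary by service and Region.
COMPUTE_PER_HOUR = 0.10      # USD per instance-hour (assumed)
STORAGE_PER_GB_MONTH = 0.02  # USD per GB-month stored (assumed)
TRANSFER_OUT_PER_GB = 0.09   # USD per GB sent out of the cloud (assumed)
TRANSFER_IN_PER_GB = 0.0     # data transfer IN is free


def monthly_bill(compute_hours, stored_gb, gb_out, gb_in=0.0):
    """Sum the three pricing fundamentals: compute, storage, data transfer out."""
    compute = compute_hours * COMPUTE_PER_HOUR
    storage = stored_gb * STORAGE_PER_GB_MONTH
    transfer = gb_out * TRANSFER_OUT_PER_GB + gb_in * TRANSFER_IN_PER_GB
    return round(compute + storage + transfer, 2)


print(monthly_bill(compute_hours=100, stored_gb=50, gb_out=10, gb_in=500))
```

Changing gb_in leaves the total unchanged, which is the point of the last bullet.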
What are the 4 points of consideration when choosing an AWS Region?
- Compliance with data governance and legal requirements.
- Proximity to customers (latency)
- Available services and features within a Region
- Pricing.
Define Cloud Computing
On-demand availability of computer system resources, especially data storage (cloud storage), and computing power, without direct active management by the user.
Define IAM Roles
An IAM entity that defines a set of permissions for making AWS service requests, intended to be used by AWS services.
What is an IAM credential report?
IAM Credentials report lists all your account’s users and the status of their various credentials. The other IAM Security Tool is IAM Access Advisor. It shows the service permissions granted to a user and when those services were last accessed.
What are IAM Policies?
An IAM policy is an entity that, when attached to an identity or resource, defines their permissions.
JSON documents that define the permissions of users and groups.
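As a rough illustration of what such a JSON document looks like, the sketch below builds one as a Python dictionary and serializes it. The bucket name example-bucket is hypothetical, and the two S3 actions are just one possible choice of permissions.

```python
import json

# Illustrative identity-based policy; the bucket name "example-bucket" is hypothetical.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",                            # allow (rather than deny) the actions below
            "Action": ["s3:GetObject", "s3:ListBucket"],  # what may be done
            "Resource": [                                 # which resources it applies to
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Attached to a user or group, a document like this would grant read access to that one bucket and nothing else.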
What are EC2 Capabilities?
- Renting virtual machines (EC2)
- Storing data on virtual drives (EBS)
- Distributing load across machines (ELB)
- Scaling the services using an auto-scaling group (ASG)
What is EC2 On Demand?
- Pay for what you use:
- Linux or Windows - billing per second, after the first minute
- All other operating systems - billing per hour
- Has the highest cost but no upfront payment
- No long-term commitment
- Recommended for short-term and uninterrupted workloads, where you can’t predict how the application will behave
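The "per second, after the first minute" rule can be read as a 60-second minimum charge. Here is a minimal sketch under that reading; the hourly rate passed in is a hypothetical value, not a published price.

```python
import math

MINIMUM_SECONDS = 60  # per-second billing applies after the first minute


def billable_seconds(runtime_seconds):
    """Seconds billed for one Linux or Windows On-Demand run."""
    if runtime_seconds <= 0:
        return 0
    return max(MINIMUM_SECONDS, math.ceil(runtime_seconds))


def on_demand_cost(runtime_seconds, hourly_rate):
    """Cost of one run, given a (hypothetical) hourly rate in USD."""
    return billable_seconds(runtime_seconds) * hourly_rate / 3600


print(billable_seconds(20), billable_seconds(95))
```

A 20-second run is billed as a full minute, while a 95-second run is billed for exactly 95 seconds.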
What is EC2 Reserved?
- Up to 72% discount compared to On-demand
- You reserve specific instance attributes (Instance Type, Region, Tenancy, OS)
- Reservation Period – 1 year (+discount) or 3 years (+++discount)
- Payment Options – No Upfront (+), Partial Upfront (++), All Upfront (+++)
- Reserved Instance’s Scope – Regional or Zonal (reserve capacity in an AZ)
- Recommended for steady-state usage applications (think database)
- You can buy and sell in the Reserved Instance Marketplace
- Convertible Reserved Instance
- Can change the EC2 instance type, instance family, OS, scope and tenancy
- Up to 66% discount
What is EC2 Savings Plan?
- Get a discount based on long-term usage (up to 72% - same as RIs)
- Commit to a certain type of usage ($10/hour for 1 or 3 years)
- Usage beyond EC2 Savings Plans is billed at the On-Demand price
- Locked to a specific instance family & AWS region (e.g., M5 in us-east-1)
- Flexible across:
- Instance Size (e.g., m5.xlarge, m5.2xlarge)
- OS (e.g., Linux, Windows)
- Tenancy (Host, Dedicated, Default)
- Requires a one- or three-year hourly spend commitment
What is EC2 Spot?
- Can get a discount of up to 90% compared to On-demand
- Instances that you can “lose” at any point of time if your max price is less than the current spot price
- The MOST cost-efficient instances in AWS
- Useful for workloads that are resilient to failure
- Batch jobs
- Data analysis
- Image processing
- Any distributed workloads
- Workloads with a flexible start and end time
- Not suitable for critical jobs or databases
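The "lose it when your max price is below the current spot price" rule can be sketched as a simple comparison. This models only the price rule stated on the card; the price history is made-up data, and the function names are illustrative.

```python
def spot_instance_keeps_running(max_price, current_spot_price):
    """The instance survives while the max price is not below the spot price."""
    return max_price >= current_spot_price


def hours_before_interruption(max_price, spot_price_history):
    """Count consecutive hours survived, given one spot price per hour."""
    hours = 0
    for price in spot_price_history:
        if not spot_instance_keeps_running(max_price, price):
            break  # the instance is reclaimed at this point
        hours += 1
    return hours


# Made-up hourly spot prices in USD.
print(hours_before_interruption(0.05, [0.03, 0.04, 0.06, 0.02]))
```

The interruption in the third hour is why the card recommends Spot only for workloads that can be stopped and resumed.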
What are EC2 Dedicated Hosts?
- A physical server with EC2 instance capacity fully dedicated to your use
- Allows you to address compliance requirements and use your existing server-bound software licenses (per-socket, per-core, per-VM software licenses)
- Purchasing Options:
- On-demand – pay per second for active Dedicated Host
- Reserved - 1 or 3 years (No Upfront, Partial Upfront, All Upfront)
- The most expensive option
- Useful for software that has a complicated licensing model (BYOL – Bring Your Own License)
- Or for companies that have strong regulatory or compliance needs
NOT FOR DISTRIBUTION © Stephane Maarek www.datacumulus.com
What are EC2 Dedicated Instances?
- Instances run on hardware that’s dedicated to you
- May share hardware with other instances in same account
- No control over instance placement (can move hardware after Stop / Start)
What are EC2 Security Groups?
A security tool you can use to control traffic in and out of EC2 instances.
- Security groups are acting as a “firewall” on EC2 instances
- They regulate:
- Access to Ports
- Authorised IP ranges – IPv4 and IPv6
- Control of inbound network (from other to the instance)
- Control of outbound network (from the instance to other)
- Can be attached to multiple instances
- Locked down to a region / VPC combination
- Lives “outside” the EC2 instance – if traffic is blocked, the EC2 instance won’t see it
- It’s good to maintain one separate security group for SSH access
- If your application is not accessible (time out), then it’s a security group issue
- If your application gives a “connection refused” error, then it’s an application error or it’s not launched
- All inbound traffic is blocked by default
- All outbound traffic is authorised by default
- 22 = SSH (Secure Shell) - log into a Linux instance
- 21 = FTP (File Transfer Protocol) – upload files into a file share
- 22 = SFTP (Secure File Transfer Protocol) – upload files using SSH
- 80 = HTTP – access unsecured websites
- 443 = HTTPS – access secured websites
- 3389 = RDP (Remote Desktop Protocol) – log into a Windows instance
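To make the default-deny behaviour for inbound traffic concrete, here is a toy model of rule matching. It is a sketch of the idea only, not how AWS implements security groups; the rule list and IP addresses are hypothetical (taken from documentation-reserved ranges).

```python
import ipaddress

# Hypothetical inbound rules: (port, allowed CIDR range).
INBOUND_RULES = [
    (22, "203.0.113.0/24"),  # SSH from one office range
    (443, "0.0.0.0/0"),      # HTTPS from anywhere
]


def inbound_allowed(port, source_ip, rules=INBOUND_RULES):
    """Return True if any rule allows this port and source; otherwise block."""
    address = ipaddress.ip_address(source_ip)
    for rule_port, cidr in rules:
        if port == rule_port and address in ipaddress.ip_network(cidr):
            return True
    return False  # all inbound traffic is blocked by default


print(inbound_allowed(443, "198.51.100.7"))  # HTTPS is open to everyone
print(inbound_allowed(22, "198.51.100.7"))   # SSH only from the office range
```

A request that matches no rule is dropped before it reaches the instance, which is why a blocked port shows up as a timeout rather than "connection refused".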
What is EC2 Compute Optimized?
Compute Optimized EC2 instances are great for compute-intensive workloads requiring high performance processors, such as batch processing, media transcoding, high performance web servers, high performance computing, scientific modeling & machine learning, and dedicated gaming servers.
What is EFS?
Amazon EFS is a fully managed service that makes it easy to set up, scale, and cost-optimize file storage in the Amazon Cloud.
EFS is ideal for storing dynamic files, such as code, configuration, logs, and databases, that require frequent updates or complex operations.
What is EC2 Image Builder?
EC2 Image Builder is an automated pipeline for the creation, maintenance, validation, sharing, and deployment of Linux or Windows images for use on AWS and on-premises.
What is EFS?
Elastic File System.
* Managed NFS (network file system) that can be mounted on 100s of EC2 instances
* EFS works with Linux EC2 instances in multi-AZ
* Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
What is EBS
Elastic Block Store.
* An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run
* It allows your instances to persist data, even after their termination
* They can only be mounted to one instance at a time (at the CCP level)
* They are bound to a specific availability zone
* Analogy: Think of them as a “network USB stick”
* Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
What is FSx?
Amazon FSx makes it easy and cost effective to launch and run popular third-party file systems that are fully managed by AWS. It comes in two offerings: FSx for Windows File Server (used for business applications), and FSx for Lustre (used for high-performance computing).
What is AMI?
An Amazon Machine Image (AMI) is a supported and maintained image provided by AWS that provides the information required to launch an instance. You must specify an AMI when you launch an instance. You can launch multiple instances from a single AMI when you require multiple instances with the same configuration. You can use different AMIs to launch instances when you require instances with different configurations.
An AMI includes the following:
One or more Amazon Elastic Block Store (Amazon EBS) snapshots, or, for instance-store-backed AMIs, a template for the root volume of the instance (for example, an operating system, an application server, and applications).
Launch permissions that control which AWS accounts can use the AMI to launch instances.
A block device mapping that specifies the volumes to attach to the instance when it’s launched.
What is AMI? Tutorial Definition.
- AMI = Amazon Machine Image
- AMI are a customization of an EC2 instance
- You add your own software, configuration, operating system, monitoring…
- Faster boot / configuration time because all your software is pre-packaged
- AMI are built for a specific region (and can be copied across regions)
- You can launch EC2 instances from:
- A Public AMI: AWS provided
- Your own AMI: you make and maintain them yourself
- An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
EBS Volumes can be attached to how many instances?
EBS Volumes can be attached to only one EC2 Instance, but EC2 Instances can have multiple EBS Volumes attached to them.
What is EC2 Instance Store?
An instance store provides temporary block-level storage for your instance. This storage is located on disks that are physically attached to the host computer. Instance store is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content. It can also be used to store temporary data that you replicate across a fleet of instances, such as a load-balanced pool of web servers.
EC2 Instance Store has a better I/O performance, but data is lost if: the EC2 instance is stopped or terminated, or when the underlying disk drive fails.
What is An EBS Snapshot?
EBS Snapshots are used to backup data on your EBS Volumes at a point in time.
Define High Availability?
High Availability means applications running at least in two AZs to survive a data center loss.
What is a network load balancer?
A Network Load Balancer can handle millions of requests per second with low-latency. It operates at Layer 4, and is best-suited for load-balancing TCP, UDP, and TLS traffic with ultra high-performance.
What is Vertical Scaling?
Vertical scaling means increasing the size of the instance. Changing from a t3a.medium to a t3a.2xlarge is an example of size increase.
What is an ASG?
An Auto Scaling Group (ASG) can automatically and quickly scale-in and scale-out to match the changing load on your applications and websites.
What are the 4 kinds of Load Balancers?
- Application Load Balancer (HTTP / HTTPS only) – Layer 7
- Network Load Balancer (ultra-high performance, allows for TCP) – Layer 4
- Gateway Load Balancer – Layer 3
- Classic Load Balancer (retired in 2023) – Layer 4 & 7
The goal of an Auto Scaling Group (ASG) is to:
- Scale out (add EC2 instances) to match an increased load
- Scale in (remove EC2 instances) to match a decreased load
- Ensure we have a minimum and a maximum number of machines running
- Automatically register new instances to a load balancer
- Replace unhealthy instances
- Cost Savings: only run at an optimal capacity (principle of the cloud)
Auto Scaling Groups – Scaling Strategies
- Manual Scaling: Update the size of an ASG manually
- Dynamic Scaling: Respond to changing demand
- Simple / Step Scaling
- When a CloudWatch alarm is triggered (example CPU > 70%), then add 2 units
- When a CloudWatch alarm is triggered (example CPU < 30%), then remove 1
- Target Tracking Scaling
- Example: I want the average ASG CPU to stay at around 40%
- Scheduled Scaling
- Anticipate a scaling based on known usage patterns
- Example: increase the min. capacity to 10 at 5 pm on Fridays
- Predictive Scaling
- Uses Machine Learning to predict future traffic ahead of time
- Automatically provisions the right number of EC2 instances in advance
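The step scaling example above (add 2 above 70% CPU, remove 1 below 30%) can be sketched as a single function that also respects the group's minimum and maximum size. The default limits of 1 and 10 are assumptions for illustration.

```python
def step_scale(current, cpu_percent, minimum=1, maximum=10):
    """Return the new instance count after one step-scaling evaluation."""
    if cpu_percent > 70:
        desired = current + 2  # scale out: alarm "CPU > 70%" fired
    elif cpu_percent < 30:
        desired = current - 1  # scale in: alarm "CPU < 30%" fired
    else:
        desired = current      # no alarm, keep the current size
    # Never go below the minimum or above the maximum group size.
    return max(minimum, min(maximum, desired))


print(step_scale(4, 85), step_scale(4, 20), step_scale(9, 90))
```

The last call shows the clamp: a group of 9 under heavy load grows to the maximum of 10 rather than 11.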
What does an Elastic Load Balancer (ELB) do?
- Distribute traffic across backend EC2 instances, can be Multi-AZ
- Supports health checks
- 4 types: Classic (old), Application (HTTP – L7), Network (TCP – L4), Gateway (L3)
- Spreads load across multiple downstream instances
- Handles failure of downstream instances
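A toy round-robin balancer shows how spreading load and skipping failed instances fit together. This is a sketch of the concept only; real ELB routing algorithms differ, and the instance IDs are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate over instances, skipping any that failed a health check."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._ring = cycle(self.instances)

    def mark_unhealthy(self, instance):
        self.healthy.discard(instance)

    def mark_healthy(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def route(self):
        """Return the next healthy instance in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy instances")
        # At most one full lap is needed to find a healthy instance.
        for _ in range(len(self.instances)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate


balancer = RoundRobinBalancer(["i-a", "i-b", "i-c"])
balancer.mark_unhealthy("i-b")
print([balancer.route() for _ in range(4)])
```

Traffic keeps flowing to the remaining instances while i-b is out of rotation, and resumes to it once it is marked healthy again.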
What is AWS Storage Gateway?
AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage.
What is the AWS Snow Family?
- Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
What is Snowball Edge
- Physical data transport solution: move TBs or PBs of data in or out of AWS
- Alternative to moving data over the network (and paying network fees)
- Pay per data transfer job
- Provide block storage and Amazon S3-compatible object storage
- Snowball Edge Storage Optimized
- 80 TB of HDD capacity for block volume and S3 compatible object storage
- Snowball Edge Compute Optimized
- 42 TB of HDD or 28 TB NVMe capacity for block volume and S3 compatible object storage
- Use cases: large data cloud migrations, DC decommission, disaster recovery
What is Snowcone/Snowcone SSD
AWS Snowcone & Snowcone SSD
* Small, portable computing, anywhere, rugged & secure, withstands harsh environments
* Light (4.5 pounds, 2.1 kg)
* Device used for edge computing, storage, and data transfer
* Snowcone – 8 TB of HDD Storage
* Snowcone SSD – 14 TB of SSD Storage
* Use Snowcone where Snowball does not fit (space-constrained environment)
* Must provide your own battery / cables
* Can be sent back to AWS offline, or connect it to internet and use AWS DataSync to send data
AWS Snowmobile
- Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TBs)
- Each Snowmobile has 100 PB of capacity (use multiple in parallel)
- High security: temperature controlled, GPS, 24/7 video surveillance
- Better than Snowball if you transfer more than 10 PB
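The unit arithmetic on this card can be checked with a short sketch. The constants come straight from the bullets above; the function names are illustrative.

```python
import math

TB_PER_PB = 1_000
PB_PER_EB = 1_000
SNOWMOBILE_CAPACITY_PB = 100  # capacity of one Snowmobile
SNOWBALL_THRESHOLD_PB = 10    # above this, the card prefers Snowmobile


def snowmobiles_needed(data_pb):
    """Number of Snowmobiles to use in parallel for a dataset given in PB."""
    return math.ceil(data_pb / SNOWMOBILE_CAPACITY_PB)


def prefer_snowmobile(data_pb):
    """Apply the card's rule of thumb: Snowmobile beyond 10 PB."""
    return data_pb > SNOWBALL_THRESHOLD_PB


print(snowmobiles_needed(1 * PB_PER_EB))  # one exabyte of data
print(prefer_snowmobile(5))
```

One exabyte therefore needs ten Snowmobiles in parallel, while a 5 PB transfer stays in Snowball territory.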
What are Access Keys used for?
Access Keys are used to sign programmatic requests to the AWS CLI or AWS API.
Where are objects stored in Amazon S3?
Buckets
What is Storage Gateway?
S3 hybrid solution to extend on-premises storage to S3
What do you use a S3 bucket policy for?
- Grant public access to the bucket
- Force objects to be encrypted at upload
- Grant access to another account (Cross Account)
What are Lifecycle Rules?
Lifecycle Rules can be used to define when S3 objects should be transitioned to another storage class or when objects should be deleted after some time.
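As a rough sketch of how such a rule plays out over an object's lifetime, the function below maps an object's age to the storage class it would be in. The day thresholds and the choice of target classes are hypothetical, and the rule format is a simplified stand-in, not the actual S3 lifecycle configuration syntax.

```python
# Hypothetical rule: the day thresholds are illustrative, not a recommendation.
LIFECYCLE_RULE = {
    "transitions": [  # (minimum age in days, target storage class)
        (30, "STANDARD_IA"),
        (90, "GLACIER"),
    ],
    "expiration_days": 365,  # objects are deleted at this age
}


def storage_class_for_age(age_days, rule=LIFECYCLE_RULE):
    """Return the class an object of this age would be in, or None if deleted."""
    if age_days >= rule["expiration_days"]:
        return None
    current = "STANDARD"
    for min_age, target in sorted(rule["transitions"]):
        if age_days >= min_age:
            current = target  # the latest transition reached wins
    return current


for age in (10, 45, 200, 400):
    print(age, storage_class_for_age(age))
```

The object moves to cheaper classes as it ages and is removed once it passes the expiration age.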
What is Snowball Edge Storage Optimized?
Snowball Edge Storage Optimized devices are well suited for large-scale data migrations and recurring transfer workflows, as well as local computing with higher capacity needs.
What is Snowball Edge Compute Optimized?
Smaller storage capacity than Storage Optimized, but more compute:
- 104 vCPUs, 416 GiB of RAM
- Optional GPU (useful for video processing or machine learning)
- 28 TB NVMe or 42 TB HDD usable storage
For comparison, Storage Optimized offers:
- Up to 40 vCPUs, 80 GiB of RAM, 80 TB storage
- Object storage clustering available
What are the S3 Storage Classes?
- Amazon S3 Standard - General Purpose
- Amazon S3 Standard-Infrequent Access (IA)
- Amazon S3 One Zone-Infrequent Access
- Amazon S3 Glacier Instant Retrieval
- Amazon S3 Glacier Flexible Retrieval
- Amazon S3 Glacier Deep Archive
- Amazon S3 Intelligent Tiering
- Can move between classes manually or using S3 Lifecycle configurations
What is S3 Standard – General Purpose?
- 99.99% Availability
- Used for frequently accessed data
- Low latency and high throughput
- Sustain 2 concurrent facility failures
- Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
What is S3 Storage Classes – Infrequent Access?
- For data that is less frequently accessed, but requires rapid access when needed
- Lower cost than S3 Standard
- Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
- 99.9% Availability
- Suitable for less frequently accessed data that needs rapid access when requested, with high durability and tolerance of an Availability Zone failure
- Use cases: Disaster Recovery, backups
- Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
- High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
- 99.5% Availability
- Use Cases: Storing secondary backup copies of on-premises data, or data you can recreate
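To get a feel for what the availability percentages on these cards mean, they can be converted into a maximum amount of unavailable time per year. This is plain arithmetic on the quoted figures, assuming a 365-day year; it says nothing about how AWS actually measures availability.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def max_downtime_minutes(availability_percent):
    """Minutes per year a service may be unavailable at this availability."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for name, percent in [
    ("S3 Standard", 99.99),
    ("S3 Standard-IA", 99.9),
    ("S3 One Zone-IA", 99.5),
]:
    print(f"{name}: about {max_downtime_minutes(percent):.0f} minutes per year")
```

Each step down in availability allows roughly five to ten times more unavailable time, which is part of what the lower price reflects.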
What is Managed Blockchain?
Amazon Managed Blockchain is a fully managed service that makes it easy to create and manage scalable blockchain networks using the popular open source frameworks Hyperledger Fabric and Ethereum.
What is Redshift?
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud.
- Redshift is based on PostgreSQL, but it’s not used for OLTP
- It’s OLAP – online analytical processing (analytics and data warehousing)
- Load data once every hour, not every second
- 10x better performance than other data warehouses, scale to PBs of data
- Columnar storage of data (instead of row based)
- Massively Parallel Query Execution (MPP), highly available
- Pay as you go based on the instances provisioned
- Has a SQL interface for performing the queries
- BI tools such as Amazon QuickSight or Tableau integrate with it
What is Amazon Athena?
- Serverless query service to analyze data stored in Amazon S3
- Uses standard SQL language to query the files
- Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
- Pricing: $5.00 per TB of data scanned
- Use compressed or columnar data for cost-savings (less scan)
- Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc…
- Exam Tip: analyze data in S3 using serverless SQL, use Athena
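The pay-per-scan model explains why columnar data saves money. The sketch below uses the $5.00 per TB figure from the card; the columnar estimate assumes equally sized columns, which is a simplification made only for illustration.

```python
PRICE_PER_TB_SCANNED = 5.00  # USD, figure quoted on the card


def query_cost(tb_scanned):
    """Cost of one query that scans the given number of TB."""
    return tb_scanned * PRICE_PER_TB_SCANNED


def columnar_query_cost(table_tb, columns_read, total_columns):
    """Rough estimate assuming equally sized columns (a simplification):
    only the columns the query reads are scanned."""
    return query_cost(table_tb * columns_read / total_columns)


print(query_cost(2))                   # row format: the whole 2 TB is scanned
print(columnar_query_cost(2, 3, 30))   # columnar: 3 of 30 columns are scanned
```

Reading a tenth of the columns scans about a tenth of the data, so the same query costs about a tenth as much.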
What is AWS Glue?
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.
What is Amazon Aurora?
Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud, that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. It is a proprietary technology from AWS.
What is AWS Database Migration?
AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.
What is Amazon EMR?
- EMR stands for “Elastic MapReduce”
- EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
- The clusters can be made of hundreds of EC2 instances
- Also supports Apache Spark, HBase, Presto, Flink…
- EMR takes care of all the provisioning and configuration
- Auto-scaling and integrated with Spot instances
- Use cases: data processing, machine learning, web indexing, big data…
What is ElastiCache?
Amazon ElastiCache is a web service that makes it easy to deploy and run Memcached or Redis protocol-compliant server nodes in the cloud. ElastiCache caches are in-memory databases with high performance, low latency. They help reduce load off databases for read intensive workloads.
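How a cache takes read load off a database can be sketched with the cache-aside pattern. Here a plain dictionary stands in for the in-memory store and a function stands in for the database query; both are hypothetical placeholders, not ElastiCache API calls.

```python
class CacheAside:
    """Check the cache first; fall back to the database only on a miss."""

    def __init__(self, load_from_database):
        self._load = load_from_database  # slow lookup, stands in for a database query
        self._cache = {}                 # stands in for an in-memory store
        self.database_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]      # cache hit: no database work
        self.database_reads += 1         # cache miss: read from the database
        value = self._load(key)
        self._cache[key] = value         # keep it for the next reader
        return value


store = CacheAside(lambda key: key.upper())  # placeholder "database"
store.get("user-1")
store.get("user-1")
print(store.database_reads)
```

Repeated reads of the same key reach the database only once, which is the effect that matters for read-intensive workloads.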
What is a Glue Data Catalog?
A central repository to store structural and operational metadata for data assets in AWS Glue.
What is RDS?
Amazon Relational Database Service (Amazon RDS) is a managed SQL database service that makes it easy to set up, operate, and scale a relational database in the cloud. It is suited for OLTP workloads.
What is DynamoDB?
Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale. DynamoDB offers built-in security, continuous backups, automated multi-Region replication, in-memory caching, and data import and export tools.
- Fully Managed Highly available with replication across 3 AZ
- NoSQL database - not a relational database
- Scales to massive workloads, distributed “serverless” database
- Millions of requests per second, trillions of rows, 100s of TB of storage
- Fast and consistent in performance
- Single-digit millisecond latency – low latency retrieval
- Integrated with IAM for security, authorization and administration
- Low cost and auto scaling capabilities
- Standard & Infrequent Access (IA) Table Class
What is QLDB?
Amazon QLDB is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. Amazon QLDB tracks each and every application data change and maintains a complete and verifiable history of changes over time.
What is Neptune?
Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. It can be used for knowledge graphs, fraud detection, recommendations engines, social networking, etc.
What is QuickSight?
Amazon QuickSight is a fast, cloud-powered business intelligence (BI) service that makes it easy for you to deliver insights to everyone in your organization. You can create and publish interactive dashboards.