Containers & ECS Flashcards
Virtualization & Containerization
-Introduction to Containers-
Virtualization Problems
Virtualization is the process of running multiple operating systems on the same physical hardware.
-Heavy Usage & Duplication
The architecture is:
3. (APP #1/RUNTIME/GUEST OS) (APP #2/RUNTIME/GUEST OS)
2. AWS Hypervisor (NITRO)
1. AWS EC2 Host
Containerization
The architecture is:
4. (APP #1/RUNTIME) (APP #2/RUNTIME) (APP #3/RUNTIME) (APP #4/RUNTIME) = Consume very little memory and disk, with their own libraries and dependencies.
3. Container Engine = Provides an isolated environment within which an application can run. (Isolated from ALL the other processes, but it can use the host O.S for a lot of things (Networking & File IO)) (Docker)
2. Host O.S
1. Host Hardware
Image Anatomy
An EC2 Instance is a running copy of its EBS volumes, its virtual disks. The instance’s boot volume is booted, and using this, you end up with a running copy of an OS in a virtualized environment.
A container is a running copy of what’s known as a “Docker Image”; images are made up of multiple independent layers. So Docker Images are stacks of these layers, not a single monolithic disk image. Docker Images are created initially using a Dockerfile.
-Dockerfile - used to build Docker Images. Each line is processed one by one, and each creates a new file system layer inside the Docker Image it creates.
-Images are created from a base image or from scratch
- First Layer = Instructs Docker to create our Docker Image using a base image as the basis (for example, a basic CentOS 7 distribution)
- Second Layer = Performs software updates and installs a web server (Apache)
- Third Layer = Adds a script
-Images contain read-only layers; changes are layered onto the image using a differential architecture
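The three example layers above (base image, updates plus Apache, an added script) might come from a Dockerfile along these lines; a sketch only, and the base image tag, package commands, and script path are illustrative:

```dockerfile
# Layer 1: base image - a basic CentOS 7 distribution
FROM centos:7

# Layer 2: software updates, plus a web server (Apache)
RUN yum -y update && yum -y install httpd

# Layer 3: add a script (path is illustrative)
COPY start.sh /usr/local/bin/start.sh

# Ports are "EXPOSED" to the host and beyond
EXPOSE 80
CMD ["httpd", "-DFOREGROUND"]
```

Each instruction that changes the filesystem produces one new read-only layer in the resulting image.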
Container Anatomy
A Docker Container is just a running copy of a Docker Image, with one crucial difference:
-It has an additional read-write file system layer (this is what allows containers to run)
-Anything which happens in the container (log files being generated, an application generating or reading data) is stored in the read-write layer of the container
-Each layer is differential, and so it stores only the changes made versus the layers below
We could use the same Docker Image to create another Docker Container
-This one is almost identical; the only difference is the R/W layer, which is different in each container. (keeps things isolated)
-Disk usage when you have lots of containers is minimized because of this layered architecture; the base layers (the O.S) are generally made available by O.S vendors via something called a “Container Registry”. (Docker Hub)
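As a rough mental model, the layered, differential architecture can be sketched in a few lines of Python: reads fall through to the topmost layer that has the file, image layers are shared read-only, and only each container’s thin read-write layer differs. All names and paths here are illustrative.

```python
# Toy model of Docker's layered filesystem: shared read-only image
# layers, plus one thin read-write layer per container on top.

def read_file(layers, path):
    """A read checks layers top-down; the topmost layer wins (differential)."""
    for layer in reversed(layers):
        if path in layer:
            return layer[path]
    raise FileNotFoundError(path)

# Shared read-only image layers (base OS, web server, added script)
image_layers = [
    {"/etc/os-release": "CentOS 7"},       # layer 1: base image
    {"/usr/sbin/httpd": "apache binary"},  # layer 2: web server
    {"/start.sh": "launch script"},        # layer 3: added script
]

# Two containers from the same image: identical except for the R/W layer
container_a = image_layers + [{}]  # its own empty read-write layer
container_b = image_layers + [{}]

# Writes (logs, app data) land only in that container's R/W layer
container_a[-1]["/var/log/app.log"] = "requests from container A"

print(read_file(container_a, "/usr/sbin/httpd"))   # shared image layer
print(read_file(container_a, "/var/log/app.log"))  # only in A's R/W layer
```

Container B sees the same image layers but has no `/var/log/app.log`, which is why many containers from one image cost so little extra disk.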
-Docker Hub = It’s a registry or a hub of container images
As a developer or architect, you use a Dockerfile to create a container image, and then you upload that image to a private repository or a public one, such as the Docker Hub. From there, these container images can be deployed to Docker Hosts, which are just servers running a container engine, in this case, Docker.
Docker Hosts can run many containers based on one or more images, and a single image can be used to generate containers on many different Docker Hosts.
Container Key Concepts
-DockerFiles are used to build images
-Docker Images are these multi-layer file system images, which are used to run containers
-Portable - self-contained, always run as expected
-Lightweight - Parent OS used, fs layers are shared
-Containers only run the application & environment they need (only what they need)
-Provides much of the isolation VM’s do
-Ports are “EXPOSED” to the host and beyond
-Applications stacks can be multi-container
Elastic Container Service (ECS) Concepts
ECS is a product which allows you to use containers running on infrastructure which AWS fully or partially manage.
ECS uses clusters, which run in one of two modes: EC2 mode, which uses EC2 instances as container hosts, or Fargate mode, which is a serverless way of running Docker containers, where AWS manage the container host part and leave you to architect your environment using containers.
-ECS is a service that accepts containers and some instructions that you provide and it orchestrates where and how to run containers.
-It’s a managed container based compute service
-ECS lets you create a cluster - clusters are where your containers run from. You provide ECS with a container image, and it runs in the form of a container in the cluster, based on how you want it to run.
-AWS Elastic Container Registry - ECR
-To tell ECS about your container images, you create what’s known as a “Container Definition” - This tells ECS where your container image is, it tells ECS which port your container uses.
-Task Definition - A task in ECS represents a self-contained application. A task could have one container defined inside it, or many. It represents the application as a whole, and so it stores whatever container definitions are used to make up that one single application.
Task definitions store the resources used by the task (CPU and Memory), the networking mode that the task uses, and the compatibility, i.e. whether the task will work in EC2 mode or Fargate.
-The Task Definition stores the Task Role - an IAM role that a task can assume; it gains temporary credentials which can be used within the task to interact with AWS resources. Task Roles are the best practice way of giving containers within ECS permissions to access AWS products and services. ***
-When you create a Task Definition, you actually create a Container Definition along with it
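Put together, a Task Definition with its Container Definition embedded might look something like this sketch; the family name, image URI, and role ARN are all illustrative:

```json
{
  "family": "my-web-app",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "taskRoleArn": "arn:aws:iam::123456789012:role/my-task-role",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-web-app:latest",
      "portMappings": [{ "containerPort": 80 }]
    }
  ]
}
```

Note how the container definition (image & port) nests inside the task definition, which holds the resources, networking mode, compatibility, and Task Role.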
-A task in ECS doesn’t scale on its own, and it isn’t HA
-An ECS Service is configured via a “Service Definition” - it defines how we want a task to scale and how many copies we’d like to run. It can add capacity and resilience, because we can have multiple independent copies of our task running, and you can deploy a Load Balancer in front of a service so the incoming load is distributed across all of the tasks inside the service.
So for tasks that you’re running inside ECS that are long running and business critical, you would generally use a service to provide that level of scalability and HA. A service lets you configure replacing failed tasks, scaling, and how to distribute load across multiple copies of the same task.
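A Service Definition along those lines might be sketched as follows; the service name, task definition family, and target group ARN are illustrative:

```json
{
  "serviceName": "my-web-service",
  "taskDefinition": "my-web-app",
  "desiredCount": 3,
  "launchType": "FARGATE",
  "loadBalancers": [
    {
      "targetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123",
      "containerName": "web",
      "containerPort": 80
    }
  ]
}
```

`desiredCount` is what gives you the multiple independent copies; the load balancer entry is what spreads incoming load across them.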
-It’s tasks or services, that you deploy into an ECS Cluster (applies to both modes)
ECS Concepts
-Container Definition - Image & Ports
-Task Definition - Security (TaskRole) & Container(s), Resources
-Task Role - IAM Role which the TASK assumes
-Service - How many copies (scaling), HA, Restarts
ECS Cluster Types - EC2 Mode
The cluster mode defines a number of things, but chiefly how much of the admin overhead of running a set of container hosts you manage versus how much AWS manage.
Using EC2 mode, we start with the ECS management components, these handle high-level tasks like scheduling, orchestration, cluster management, and the placement engine which handles where to run containers. (which container host)
-Scheduling and Orchestration
-Cluster Manager
-Placement Engine
These exist in both modes, with EC2 mode, an ECS cluster is created within a VPC inside your AWS account. Because an EC2 mode cluster runs within a VPC, it benefits from the multiple AZs, which are available within this VPC.
With EC2 mode, EC2 instances are used to run containers, and when you create the cluster, you specify an initial size, which controls the number of container instances. (handled by an ASG)
When these are provisioned, you will be paying for them, regardless of what containers you have running on them, so you are responsible for these EC2 instances.
-ECS provisions these container hosts, but you manage them, through the ECS tooling
-It’s not serverless, you have to worry about capacity and HA
-In EC2 mode, ECS will handle certain elements (ECR > TASK-SERVICE > DEPLOY)
ECS will handle the number of tasks that are deployed, if you utilize services and service definitions. But at the cluster level, you need to be aware of, and manage, the capacity of the cluster, because the container instances are not delivered as a managed service.
-If you want to use containers in your infrastructure but manage the container hosts, capacity and availability yourself, then EC2 mode is for you.
ECS Cluster Types - Fargate Mode
-You don’t have to manage EC2 instances for use as container hosts
-Fargate is serverless, which means you have no servers to manage, and because of this, you aren’t paying for EC2 instances regardless of whether you’re using them or not.
It uses the same surrounding technologies: Fargate still handles scheduling and orchestration, cluster management and placement, and you still use registries for the container images, as well as task and service definitions to define tasks and services.
What differs is how containers are actually hosted.
-AWS maintain a shared Fargate infrastructure platform - This shared platform is offered to all users of Fargate, but it’s isolated from others. (just like EC2)
-With Fargate, you use the same task and service definitions, and these define the image to use, the ports and how much resources you need, these are allocated to the shared “Fargate Platform”.
-You still have your VPC; a Fargate deployment still uses a cluster, and a cluster uses a VPC, which operates in AZs.
Where it starts to differ is for ECS tasks: they run on the shared infrastructure, but from a networking perspective, each task is injected into your VPC and given an ENI. (This ENI has an IP address within the VPC)
-If the VPC is configured to use public subnets, which automatically allocate an IPv4 address, then tasks and services can be given public IPv4 addresses.
-With Fargate, you can deploy exactly how you want into either a new VPC or a custom VPC that you have designed and implemented in AWS.
-Since tasks and services run from the shared infrastructure platform, you only pay for the containers that you’re using, based on the resources that they consume. (Container resources)
-You don’t need to manage hosts, provision hosts, or think about capacity and availability.
EC2 natively vs ECS (EC2) vs Fargate
-If you use containers > ECS
-Large workload - price conscious > EC2 Mode
-Large workload - overhead conscious > Fargate
-Small / Burst workloads - Fargate
-Batch / Periodic workloads - Fargate
Elastic Container Registry (ECR)
-Managed Container Image registry service
-Like Docker Hub .. but for AWS
-Contains images which can be used within Docker or other container services, such as ECS or EKS
-Each AWS account has a public and private registry
-Each registry can have many repositories (like GitHub)
-Each repository can contain many images
-Images can have several tags (tags need to be unique)
-Public = public R/O (read-only access)… R/W requires permissions
-Private = permissions required for any R/O or R/W
ECR - Benefits
-Integrated with IAM - Permissions
ECR offers security scanning on images and this comes in two different flavors:
-Image scanning: BASIC and ENHANCED (ENHANCED uses the Inspector product)
This can scan for issues with both the O.S and any software packages within your containers (works on a layer-by-layer basis)
-Offers near real-time Metrics => CW (auth,push,pull)
-Logs ALL API actions into CloudTrail
-Can generate Events into EventBridge
-Offers Replication of container images… both Cross-Region and Cross-Account
Kubernetes 101
Kubernetes is an open-source container orchestration system; you use it to automate the deployment, scaling, and management of containerized applications. Kubernetes lets you run containers in a reliable and scalable way, making efficient use of resources, and lets you expose your containerized applications to the outside world or your business.
It’s like Docker, only with robots to automate it and super intelligence for all of the thinking.
-You can use it with On-premises and many public cloud platforms.
Cluster Structure
-A cluster in Kubernetes is a highly available cluster of compute resources, which are organized to work as one unit.
-The cluster starts with the “Cluster Control Plane”, which is the part that manages the cluster. It performs scheduling, application management, scaling, deployment, and much more.
-Compute within a Kubernetes cluster is provided via nodes; these are VMs or physical servers which function as workers in the cluster. These are the things which actually run your containerized applications.
-Running on each of the nodes is software, and at minimum this is “containerd” or another container runtime, which is the software used to handle your container operations.
-Kubelet - an agent to interact with the cluster control plane. One runs on each of the nodes and communicates with the Cluster Control Plane using the Kubernetes API.
-Kubernetes API - used for communication between Control Plane and Kubelet agent.
Cluster Detail
Run inside Nodes:
-The cluster will likely have many more nodes
-Pods - the smallest unit of computing in Kubernetes. Containers in a pod share storage and networking; “one-container-one-pod” is v. common. PODS ARE NONPERMANENT (TEMPORARY & NOT HA)
-The Pods handle the containers within them
-kube-proxy is a network proxy. Running on each node, it coordinates networking with the Control Plane. It helps implement “services” and configures rules allowing communications with Pods from inside or outside of the cluster.
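A minimal one-container-one-pod manifest might look like this sketch; the pod name, label, and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
  labels:
    app: web            # label used later by services to find this pod
spec:
  containers:
    - name: web         # one-container-one-pod is very common
      image: nginx:1.25 # illustrative image
      ports:
        - containerPort: 80
```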
Run inside Control Plane:
-Kubernetes API known as “kube-apiserver” - the front end for the Control Plane. It’s what nodes and other cluster elements interact with. Can be horizontally scaled for HA and performance.
-etcd - provides a highly-available key value store used within the cluster. It’s used as the main backing store for the cluster. (Simple database which acts as the main backing store for data for the cluster)
-kube-scheduler - identifies any Pods within the cluster with no assigned node and assigns a node based on resource requirements, deadlines, affinity/anti-affinity, data locality, and any constraints. (makes sure the nodes get utilized effectively)
-cloud-controller-manager - provides cloud-specific control logic. Allows you to link Kubernetes with a cloud provider’s APIs (AWS/Azure/GCP) (OPTIONAL)
-kube-controller-manager - runs cluster controller processes:
-Node Controller - Monitoring and responding to node outages
-Job Controller - Runs Pods in order to execute jobs
-Endpoint Controller - populates endpoints (links Services to PODS)
-Service Account & Token Controllers - Account/API Tokens creation
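The kube-scheduler’s core job (assign unscheduled pods to nodes that can fit them) can be illustrated with a deliberately simplified sketch; the real scheduler also weighs affinity/anti-affinity, data locality, and constraints, and all names and numbers here are made up:

```python
# Toy version of kube-scheduler's loop: place each unassigned pod on
# the node with the most free CPU that can still fit it.

def schedule(pods, nodes):
    assignments = {}
    free = dict(nodes)  # free CPU per node, starting at capacity
    for pod, needed in pods.items():
        # Filter: nodes with enough free CPU; score: pick the freest
        candidates = [n for n in free if free[n] >= needed]
        if not candidates:
            continue  # pod stays Pending, as in a real cluster
        best = max(candidates, key=lambda n: free[n])
        free[best] -= needed
        assignments[pod] = best
    return assignments

nodes = {"node-a": 4, "node-b": 2}      # CPU capacity per node
pods = {"web": 2, "db": 2, "batch": 3}  # CPU requested per pod

print(schedule(pods, nodes))  # "batch" stays Pending: nothing can fit it
```

Real scheduling is a filter-then-score pipeline like this, just with many more filters and scoring functions.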
Kubernetes Summary
-Cluster - A deployment of Kubernetes, management, orchestration
Within a Cluster:
-Node - Provide compute resources; pods are placed on nodes to run
-Pod - 1+ Containers; smallest admin unit in Kubernetes; often 1-container = 1-pod
-A pod is NOT PERMANENT, the cluster can and does replace them as required.
-Services - Provide abstraction from pods, service running on 1 or more pods (applications)
-Job - ad-hoc, creates one or more pods until completion
-Ingress - Exposes a way into a service (Ingress=>Routing=>Service=>1+ Pods)
-Ingress Controller - used to provide ingress (e.g. AWS LB Controller uses ALB/NLB)
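The Ingress=>Routing=>Service=>Pods chain might be sketched with two manifests; the names and label are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web          # links the Service to pods labelled app: web
  ports:
    - port: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service   # Ingress routes into the Service
                port:
                  number: 80
```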
-Any storage in Kubernetes is Ephemeral by default, provided locally by a node. If a Pod moves between Nodes, then that storage is lost.
-You can configure Persistent Storage (PV) - a volume whose lifecycle lives beyond any 1 pod using it (this is how you provision long-running storage to your containerized applications)
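Provisioning that long-running storage typically starts with a PersistentVolumeClaim that a pod then mounts; a sketch, with the claim name and size illustrative:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce   # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi    # illustrative size
```

The claim binds to a PersistentVolume whose lifecycle outlives any one pod, so the data survives pod replacement.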