KCNA Flashcards

1
Q

Idea of Cloud Native Architecture

A

Optimize software for cost efficiency and reliability

2
Q

Definition of Cloud Native

A

Build and run scalable applications in modern and dynamic environments

3
Q

Reason for Cloud Native

A

Move away from the monolithic approach

4
Q

Characteristics of Cloud Native Applications

A
  • High level of automation
  • Self healing
  • Scalable
  • (Cost-) Efficient
  • Easy to maintain
  • Secure by default
5
Q

Autoscaling

A
  • vertical scaling
    • add more CPU & RAM to the VM so it can handle more load
  • horizontal scaling
    • add more servers / racks to the underlying infrastructure so load balancing can kick in
6
Q

Serverless

A
  • servers are of course still required
  • developers don’t have to deal with things like networks, VMs, operating systems, etc.
  • Function as a Service (FaaS) by cloud vendors
7
Q

Open Standards

A
  • OCI Spec: image, runtime and distribution specifications on how to build, run and distribute containers
  • Container Network Interface (CNI): Networking for Containers
  • Container Runtime Interface (CRI): Runtimes in Container Orchestration Systems
  • Container Storage Interface (CSI): Storage in Container Orchestration Systems
  • Service Mesh Interface (SMI)
8
Q

Use of Containers

A
  • run applications more efficiently
  • manage the dependencies an application needs more easily
9
Q

Container basics

A
  • originates from the “chroot” command
  • nowadays, namespaces and cgroups are used
  • share kernel of host machine
  • are only isolated processes
10
Q

4 Cs (outer to inner)

A
  • Cloud
    • Cluster
      • Container
        • Code
11
Q

Security with Containers

A
  • don’t rely on the isolation properties for security
  • containers share kernel with host → risk
  • containers can have kernel capabilities which increase the risk
  • execution of processes with too many privileges such as root or admin
  • use of public images is also a risk
12
Q

Container Orchestration Fundamentals

A
  • schedule multiple containers to servers in an efficient way
  • allocate resources to containers
  • Manage availability of containers
  • Scale containers if load increases
  • provide networking to connect containers
  • provision storage for persistent container data
13
Q

Networking within Containers

A
  • each microservice implements an interface that can be called to serve requests
  • network namespaces give each container its own unique IP address and its own port allocations (the same port can be used in multiple containers)
  • container ports can be mapped to host ports for accessibility
14
Q

Service Discovery

A

DNS:
- modern DNS servers that provide an API to register new services
Key-Value Store:
- a database to store information about services, e.g. etcd, Consul or Apache ZooKeeper

15
Q

Service Mesh

A
  • runs a proxy next to every container in your architecture
  • this proxy can modify and/or filter network traffic between server and client
  • nginx, haproxy and envoy are commonly used proxy technologies for this
16
Q

Storage

A
  • containers are ephemeral, read-only, consist of layers
  • since a lot of applications need to write files, a read-write layer is put on top of the container image
  • to persist data, you need to write it to a disk
  • volumes can be used for this
17
Q

Kubernetes Architecture

A
  • operated as a cluster spanning multiple servers that work on different tasks and distribute the load
  • high horizontal scalability
  • consists of Control plane and Worker nodes
18
Q

K8s Control plane

A

kube-apiserver:
- centerpiece of k8s. All the other components interact with it

etcd:
- the database which holds the state of the cluster

kube-scheduler:
- chooses the worker node that could fit a workload that should be scheduled based on properties like CPU and RAM

kube-controller-manager:
- contains non-terminating control loops that manage the state of the cluster, e.g. a loop that checks whether the desired number of replicas of your application is available at all times

cloud-controller-manager (optional):
- can be used to interact with the API of cloud providers, to create external resources like load balancers, storage or security groups

19
Q

K8s Worker node

A

container runtime:
- responsible for running containers on the worker node, e.g. Docker or containerd

kubelet:
- small agent that runs on every worker node. Talks to the kube-apiserver and the container runtime to handle starting containers

kube-proxy:
- a network proxy that handles inside and outside communication of the cluster. Tries to rely on the networking capabilities of the underlying operating system

20
Q

K8s API

A
  • responsible for communication with the cluster
  • Authentication
  • Authorization
  • Admission Control
21
Q

Containers in K8s

A
  • pods are the smallest compute unit

containerd:
- lightweight and performant implementation to run containers
- currently the most popular container runtime
- used by all major cloud providers for Kubernetes-as-a-Service products

CRI-O
- created by Red Hat, similar to podman and buildah

Docker
- long-time standard, but never really made for container orchestration
- Docker support (via dockershim) was deprecated and removed in Kubernetes 1.24

22
Q

Networking in K8s

A
  1. Container-to-Container communications
    • This can be solved by the Pod concept
  2. Pod-to-Pod communications
    • This can be solved with an overlay network
  3. Pod-to-Service communications
    • Implemented by the kube-proxy and the packet filter on the node
  4. External-to-Service communications
    • Implemented by the kube-proxy and the packet filter on the node
  • every Pod gets its own IP address
  • CoreDNS is used for name resolution and service discovery
23
Q

K8s scheduling

A
  • process of choosing the right worker node to run a containerized workload on
  • the kube-scheduler makes the scheduling decision, but is not responsible for actually starting the workload
  • scheduling starts when a new Pod is created
  • users have to give information about the application’s requirements (e.g. resource requests, see the sketch below)
  • if multiple nodes fit equally, K8s will schedule on the node with the least amount of pods
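
A minimal sketch of a Pod spec with resource requests (image name and values are placeholders); the kube-scheduler uses the requests to pick a node with enough free capacity:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
    - name: app
      image: nginx:1.25          # placeholder image
      resources:
        requests:                # considered by the kube-scheduler when choosing a node
          cpu: "250m"
          memory: "128Mi"
        limits:                  # upper bound enforced at runtime
          cpu: "500m"
          memory: "256Mi"
```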
24
Q

K8s objects

A
  • apiVersion: Each object can be versioned, so the data structure of the object can change between versions
  • kind: the kind of object that should be created
  • metadata: data that identifies the object, like its name, namespace and labels
  • spec: the specification of the object
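
A minimal Pod manifest sketch (image is a placeholder) showing the four fields described above:

```yaml
apiVersion: v1        # version of the API schema for this object
kind: Pod             # the kind of object that should be created
metadata:
  name: nginx         # identifies the object
  labels:
    app: nginx
spec:                 # the specification of the object
  containers:
    - name: nginx
      image: nginx:1.25   # placeholder image
```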
25
Q

Workload Objects in K8s

A

ReplicaSet
- controller Object that makes sure a desired number of pods are running at any given time
- used to scale applications and improve their availability
- they do this by starting multiple copies of a pod definition

Deployment
- most feature-rich object in K8s
- can be used to describe complete application lifecycle
- perfect to run stateless applications in K8s

StatefulSet
- can be used to run stateful applications like databases in K8s
- tries to retain the IP addresses of Pods and gives them a stable name, persistent storage and more graceful handling of scaling

DaemonSet
- ensures that a copy of a Pod runs on all (or some) nodes of your cluster
- perfect to run infrastructure-related workload like monitoring or logging

Job
- creates one or more pods that execute a specific task and terminate afterwards
- perfect to run one-shot scripts like database migrations or administrative tasks

CronJob
- adds a time-based configuration to Jobs, allowing them to run periodically, e.g. every hour
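
A minimal Deployment manifest sketch (names and image are placeholders) that runs three replicas of a stateless application, as described above:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # desired number of Pod copies, kept up by a ReplicaSet
  selector:
    matchLabels:
      app: web
  template:                   # Pod template used to create the copies
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # placeholder image
```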

26
Q

Networking Objects in K8s

A

Services
- can be used to expose a set of pods as a network service
- there are 4 Service Types:

ClusterIP
- most common service type
- is a virtual IP inside K8s that can be used as a single endpoint for a set of pods
- can be used as a round-robin load balancer

NodePort
- extends the ClusterIP by adding simple routing rules
- opens a port between 30000-32767 on every node in the cluster and maps it to the ClusterIP
- allows routing external traffic to the cluster

LoadBalancer
- extends the NodePort by deploying an external LoadBalancer instance
- will only work if your environment has an API to configure a LoadBalancer instance (e.g. on Hetzner, AWS…)

ExternalName
- special service type that has no routing
- uses k8s internal DNS server to create a DNS alias
- useful if you want to reach external resources from your Kubernetes cluster

Ingress Object
- exposes HTTP and HTTPS routes from outside the cluster to a service inside the cluster
- configures routing rules a user can set and implement with an ingress controller
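
A minimal NodePort Service sketch (names and ports are placeholders) that exposes the Pods of a Deployment labeled app: web:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort               # ClusterIP is the default when type is omitted
  selector:
    app: web                   # traffic is routed to Pods with this label
  ports:
    - port: 80                 # port of the ClusterIP
      targetPort: 80           # port of the container
      nodePort: 30080          # opened on every node (30000-32767)
```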

27
Q

Volumes & Storage Objects in K8s

A
  • k8s made volumes part of a pod
  • volumes allow sharing data between multiple containers in the same pod
  • prevents data loss when a Pod crashes and is restarted on the same node
  • only data stored to a volume will be saved

PersistentVolumes (PV):
- abstract description for a slice of storage
- object configuration holds type of volume, volume size, access mode and information on how to mount it

PersistentVolumeClaims (PVC):
- request for storage by a user
- if the cluster has multiple volumes, a user can create a PVC which will reserve a PV according to their needs
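
A minimal PersistentVolumeClaim sketch (size and StorageClass name are placeholder assumptions) requesting a slice of storage:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce            # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi             # requested volume size
  storageClassName: standard   # assumption: a StorageClass named "standard" exists
```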

28
Q

Configuration Objects in K8s

A
  • applications often have config files or need connection strings to other services
  • in K8s the configuration is decoupled from the Pods with a ConfigMap
  • ConfigMaps can be used to store whole configuration files or variables as key-value pairs
  • you can mount a ConfigMap as a volume in a Pod or map variables from the ConfigMap to environment variables of a Pod (see the sketch below)
  • Secrets are used for sensitive data
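
A minimal sketch of a ConfigMap and a Pod that maps its keys to environment variables (names, keys and image are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"            # simple key-value pair
  app.properties: |            # or a whole configuration file
    greeting=hello
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx:1.25        # placeholder image
      envFrom:
        - configMapRef:
            name: app-config   # keys become environment variables
```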
29
Q

Autoscaling Objects in K8s

A

Horizontal Pod Autoscaler (HPA)
- most used autoscaler in K8s
- watches Deployments or ReplicaSets and increases the number of replicas if a threshold is reached

Cluster Autoscaler
- can add new worker nodes to the cluster if the demand increases
- works great in tandem with the HPA

Vertical Pod Autoscaler
- relatively new
- allows Pods to increase resource requests and limits dynamically
- is limited by the node capacity
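
A minimal HorizontalPodAutoscaler sketch (target name and thresholds are placeholders) that watches a Deployment and scales on CPU utilization:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:              # the Deployment being watched
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # threshold that triggers scaling
```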

30
Q

Application Delivery Fundamentals

A
  • every application’s lifecycle starts with the code that is written
  • the best way to manage source code is a version control system (Git)
  • the next step is building the application (this includes building Docker images)
  • the last step is delivering the application to the platform it should run on
31
Q

CI/CD

A

Continuous Integration
- first part of the process
- describes the continuous building and testing of the written code

Continuous Delivery
- second part of the process
- automates deployment of the pre-built software
- software is often deployed to Development or Staging environments before it gets released

Pipelines are used to automate this whole workflow:
- build code
- run tests
- deploy to servers
- perform security and compliance checks
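
A sketch of such a pipeline, here assuming GitHub Actions as the CI system; the registry URL and the `make test` target are placeholders:

```yaml
name: ci
on: push                                 # run the pipeline on every commit
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4        # fetch the source code
      - name: Run tests
        run: make test                   # assumption: the repo provides a test target
      - name: Build container image
        run: docker build -t registry.example.com/app:${{ github.sha }} .
      - name: Push image
        run: docker push registry.example.com/app:${{ github.sha }}
```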

32
Q

GitOps

A
  • Infrastructure as Code
  • merge requests to manage infrastructure changes

Push-based:
- pipeline runs tools that make changes in the platform
- changes can be triggered by a commit or merge request

Pull-based:
- an agent watches the git repository for changes and compares them to the actual running state
- applies changes to the infrastructure when changes are detected
- K8s is well suited for GitOps because of its API
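
A sketch of a pull-based setup, assuming Argo CD as the agent; repository URL, paths and namespaces are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/app-manifests.git   # watched Git repository
    targetRevision: main
    path: manifests
  destination:
    server: https://kubernetes.default.svc                   # the cluster the agent runs in
    namespace: web
  syncPolicy:
    automated: {}              # apply changes automatically when the repository changes
```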

33
Q

Observability

A
  • often used synonymously with monitoring
  • is the system stable?
  • is the system sensitive to change?
  • do certain metrics exceed their limits?
  • why does a request to the system fail?
  • are there any bottlenecks?
34
Q

Telemetry

A
  • distance (tele) measuring (metry)
  • each and every application should have built-in tools that generate information data
  • this data should be collected and transferred to a centralized system

Logs
- messages emitted from an application when errors, warnings or debug information should be represented

Metrics
- quantitative measurements taken over time
- f.e. a number of requests or an error rate

Traces
- track progression of a request while it’s passing through the system
- provide information about when a request was processed by which service and how long it took

35
Q

Logging

A
  • many frameworks come with logging tools built-in

Linux programs provide three I/O streams:
- standard input (stdin): Input to a program e.g. via keyboard
- standard output (stdout): The output of a program
- standard error (stderr): Errors of a program

Node-level logging
- most efficient way to collect logs
- admin configures a log shipping tool that collects logs and ships them to a central system

Logging via sidecar container
- application has a sidecar container that collects and ships logs

Application-level logging
- the application itself pushes the logs directly to the central store
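
A sketch of the sidecar pattern, assuming busybox as a stand-in for both the application and the log shipping tool:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  volumes:
    - name: logs
      emptyDir: {}                       # ephemeral volume shared by both containers
  containers:
    - name: app                          # writes a log file to the shared volume
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /var/log
    - name: log-shipper                  # stand-in for a real log shipping tool
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /var/log/app.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log
```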

36
Q

Prometheus

A
  • open source monitoring system

supports four core metric types:
- Counter: value that increases, like request or error count
- Gauge: values that increase or decrease, like memory size
- Histogram: sample of observations, like request duration or response size
- Summary: similar to histogram, but also provides total count of observations

To expose those metrics, applications can expose an HTTP endpoint under /metrics.
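
A minimal Prometheus scrape configuration sketch (prometheus.yml); job name and target are placeholders:

```yaml
scrape_configs:
  - job_name: "my-app"
    metrics_path: /metrics                     # default path, shown for clarity
    static_configs:
      - targets: ["my-app.example.com:8080"]   # placeholder host:port of the application
```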

37
Q

Tracing

A
  • used to understand how a request is processed in a microservice architecture
  • consists of multiple units of work
  • each application can contribute a span to a trace which can include start and finish time, name, tags or a log message
38
Q

Cost Management

A
  • cloud providers don’t offer their services pro bono
  • analyze what is really needed and automate the scheduling of the resources you need

Identify wasted and unused resources
- with good monitoring it’s easy to find unused resources
- autoscaling helps to shut down instances that are not needed

Right-Sizing
- when starting out, it can be a good idea to choose servers with a lot more power than actually needed
- good monitoring gives good indications over time of how much is actually needed
- it’s an ongoing process to always adapt to the load you actually have

Reserved Instances
- On-demand pricing models are great if you really need resources on-demand
- if not, you’re paying a lot for “on-demand” service
- you can reserve resources and pay for them upfront if you can estimate the resources you will need

Spot Instances
- use spot instances for heavy batch jobs or load spikes over a short amount of time
- you get unused resources that have been over-provisioned by the cloud vendor for very low prices
- spot instances can be terminated at any time, since they are not reserved for you but for someone who paid the full price