Final Exam Flashcards
This is it
What is Kubernetes?
An open-source container orchestrator inspired by Google’s Borg
What is Kubernetes’ job?
To manage container clusters
Can Kubernetes support multiple infrastructures?
Yes
Can Kubernetes run multiple containers at once?
Yes
What are the four basic objects in Kubernetes?
Pod
Volume
Service
Namespace
What is the main role of a Pod in Kubernetes?
Basic deployment unit
What is the main role of a Volume in Kubernetes?
Persistent storage
What is the main role of a Service in Kubernetes?
A group of pods that work together, exposed behind a stable network endpoint
What is the main role of a Namespace in Kubernetes?
Logical slices of the Kubernetes cluster
Does a pod contain one or many (different) containers?
A pod can have one or many containers
What are some interesting features of a pod?
Co-scheduling
Localhost
Persistent Storage
What are the three multi-container models in Pod?
Sidecar, Ambassador, and Adapter
What is pod co-scheduling in Kubernetes?
Containers in a pod must be scheduled together on the same node
What is the sidecar’s role?
To be a helper
What is the ambassador’s role?
To be a proxy
What is the adapter’s role?
To be a common output interface
What is etcd’s responsibility in Kubernetes?
To store all cluster state as key/value pairs; it is the primary target of backups
Why is etcd important for failure recovery?
The cluster can be restored with etcd
What are controllers in Kubernetes?
Things that create and manage the four basic objects
What is autoscaling in Kubernetes?
Increasing and decreasing the resources of clusters and pods to match load
What is Prometheus in Kubernetes?
A third party framework that monitors Kubernetes
What are the three directions of autoscaling?
Vertical, Horizontal, and Multidimensional
What does the Kubernetes controller ReplicaSet do?
Maintains a stable number of pod replicas
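A minimal sketch of how a controller could keep the pod count stable. It is not the real ReplicaSet controller: the function name reconcile and the pod names are hypothetical, and the "cluster" is just a Python list.

```python
def reconcile(desired: int, running: list[str]) -> list[str]:
    """Return the pod list after one reconcile pass (pod names are hypothetical)."""
    pods = list(running)
    candidate = 0
    # Too few pods: create replacements until the desired count is reached.
    while len(pods) < desired:
        name = f"pod-{candidate}"
        candidate += 1
        if name not in pods:
            pods.append(name)
    # Too many pods: delete the surplus.
    del pods[desired:]
    return pods


print(reconcile(3, ["pod-0"]))                    # one pod survived a failure
print(reconcile(1, ["pod-0", "pod-1", "pod-2"]))  # scaled down to one replica
```

The point of the sketch is the loop shape: compare the desired count with the observed count, then create or delete pods to close the gap.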
What does the Kubernetes controller Deployment do?
Supports rolling back to previous versions
Supports rolling updates of the application running in the pods
What does the Kubernetes controller DaemonSet do?
Runs a daemon pod on every node (e.g., for monitoring and logging)
What does the Kubernetes controller Job do?
Runs pods to completion
What is the work order of the Horizontal Pod Autoscaler?
HPA: 1: Read metrics -> 2: Threshold is reached -> 3: Change # of replicas -> 4: Scale pods in and out
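A rough sketch of step 3 (changing the number of replicas). The ratio below is the one given in the Kubernetes HPA documentation; the function name, the default bounds, and the omission of the tolerance band and stabilization window are simplifying assumptions.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Replica count an HPA-style controller would aim for (simplified)."""
    # Documented core ratio: ceil(currentReplicas * currentMetric / targetMetric).
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (the defaults here are assumptions).
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90.0, 60.0))  # load above target: scale out
print(desired_replicas(4, 30.0, 60.0))  # load below target: scale in
```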
What is the work order of the Vertical Pod Autoscaler?
VPA: 1: Read metrics -> 2: Threshold is reached -> 3: Change CPU/MEM values -> 4: Adjust resource allocation
What is the Multidimensional Pod Autoscaler?
MPA: scales pods horizontally and vertically at the same time
What is the Cluster Autoscaler?
CA: adds or removes nodes in the cluster
Prometheus’s data model is what?
A time series
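A toy illustration of the time series model, assuming (as in the Prometheus documentation) that a series is identified by a metric name plus a set of key/value labels and holds timestamped samples. The class TinySeriesStore and the metric and label values are made up for this sketch; it is not how Prometheus stores data internally.

```python
from collections import defaultdict


class TinySeriesStore:
    """Toy in-memory store: one list of (timestamp, value) samples per series."""

    def __init__(self) -> None:
        self._series = defaultdict(list)

    @staticmethod
    def _key(name: str, labels: dict[str, str]) -> tuple:
        # A series is identified by its metric name plus its full label set.
        return (name, tuple(sorted(labels.items())))

    def append(self, name: str, labels: dict[str, str],
               timestamp: float, value: float) -> None:
        self._series[self._key(name, labels)].append((timestamp, value))

    def samples(self, name: str, labels: dict[str, str]) -> list:
        return list(self._series.get(self._key(name, labels), []))


store = TinySeriesStore()
store.append("http_requests_total", {"method": "GET", "code": "200"}, 1000.0, 41.0)
store.append("http_requests_total", {"method": "GET", "code": "200"}, 1015.0, 57.0)
store.append("http_requests_total", {"method": "POST", "code": "500"}, 1015.0, 2.0)
print(store.samples("http_requests_total", {"code": "200", "method": "GET"}))
```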
What is serverless computing?
A cloud execution model where the provider dynamically manages the infrastructure, allowing developers to focus on writing code that runs in response to events, with automatic scaling and pay-per-use pricing.
Why did people create serverless computing?
To reduce over provisioning
To change from big resource models to smaller ones
Serverless computing is what?
Event based
What is a warm start?
The function is already deployed in a running container, so it can execute right away
What is the order of big resource models to small ones?
Server -> VM -> Containers -> Functions
What is a container timeout?
The amount of time a container can stay running without an application before it is closed
True or false: Serverless computing is a single-language framework
False: it is typically polyglot
What occurs when you have a cold start?
You find a VM or create one
You do not have to pay for what in serverless computing?
Idle time
What is application timeout?
The amount of time an application can stay running
What is a cold start?
The first execution of the program
What are the steps in a cold start?
Find host VM -> Load a container -> Function is loaded -> a Response is given
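The cold start steps above can be sketched as a toy host that keeps loaded functions around. The class FunctionHost and its fields are hypothetical; real platforms do far more work at each step.

```python
class FunctionHost:
    """Toy serverless host that remembers which functions are already loaded."""

    def __init__(self) -> None:
        self.warm_containers = {}  # function name -> loaded handler

    def invoke(self, name, handler, event):
        steps = []
        if name not in self.warm_containers:
            # Cold start: nothing is loaded yet, so walk the full path.
            steps += ["find host VM", "load container", "load function"]
            self.warm_containers[name] = handler
        # Warm start skips straight to running the loaded function.
        steps.append("respond")
        return self.warm_containers[name](event), steps


host = FunctionHost()
print(host.invoke("hello", lambda event: f"hi {event}", "first"))   # cold start
print(host.invoke("hello", lambda event: f"hi {event}", "second"))  # warm start
```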
What is the relationship between cold starts and wasted memory?
The more cold starts, the less wasted memory
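A small simulation of that trade-off, under simplifying assumptions: one function, zero execution time, and a container that stays in memory until the container timeout expires. The function name simulate and the arrival times are invented for illustration.

```python
def simulate(arrivals: list[float], timeout: float) -> tuple[int, float]:
    """Return (cold_starts, idle_seconds) for sorted request arrival times."""
    cold_starts = 0
    idle_seconds = 0.0
    last = None
    for t in arrivals:
        if last is None or t - last > timeout:
            # The container was gone (or never existed): pay a cold start.
            cold_starts += 1
            if last is not None:
                idle_seconds += timeout  # it sat idle until it timed out
        else:
            # The container was kept warm and held memory while idle.
            idle_seconds += t - last
        last = t
    return cold_starts, idle_seconds


arrivals = [0.0, 5.0, 30.0, 32.0, 100.0]
for timeout in (0.0, 10.0, 60.0):
    print(timeout, simulate(arrivals, timeout))
```

A shorter timeout frees memory sooner but forces more cold starts; a longer timeout does the opposite.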
What is serverless computing good at?
Avoiding over provisioning
No infrastructure management
Hiding underlying infrastructure
Scalability/Concurrency
True on-demand cost
Never pay for idle resources
Near unlimited computing resources
What are the drawbacks of serverless computing?
Very new
Limited resources and execution duration
Vendor lock-in
Stateless
How do you store files in serverless computing, given that functions are stateless?
Use external storage such as S3 or Azure Blob Storage
What is Structured Data?
Data that can be represented in a table with schema
What is Unstructured Data?
Data that is not organized in a pre-defined manner
What is Semi-structured Data?
Data that does not fit an RDBMS table schema but still has organizational properties
What is the BLOB Storage Model?
A Flat object model for storing data
What are the three APIs for BLOB storage?
Put, Get, Delete
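A minimal in-memory sketch of the flat object model and its three operations. The class FlatBlobStore is hypothetical and is not the API of S3 or Azure Blob Storage; it only shows that keys are plain strings with no real directories.

```python
class FlatBlobStore:
    """Toy blob store: a flat mapping from key to bytes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Overwrites any existing object stored under the same key.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]  # raises KeyError if the key is absent

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)


blobs = FlatBlobStore()
blobs.put("logs/2024/app.log", b"started\n")  # the slashes are part of the key
print(blobs.get("logs/2024/app.log"))
blobs.delete("logs/2024/app.log")
```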
What does BLOB store?
Unstructured data
What is BLOB good at?
Highly scalable
Automatic backup and replica management
What are the five design assumptions used to design GFS?
System built from many inexpensive commodity machines (prone to failure)
System stores a modest number of large files
Supporting three Google-specific workloads
Concurrent, atomic append
High sustained bandwidth is much more important than low latency
What are the Google-specific workloads?
Large streaming reads
Small random reads
Many large sequential appends
No random writes
What is a typical example for a large streaming read?
Crawled data processing
What is a typical example for a small random read?
Read small pieces from large data
What is a typical example for a large sequential append?
Append new content to the search index
What is the reason for not supporting random write operations?
Simplicity in FS design
Simplicity in failover and data management
What is the GFS architecture?
One master with many chunk servers and many clients
What do the clients do in GFS?
Run programs that access data in chunk servers
What does the master contain in the GFS architecture?
The main controller and the file system metadata
What does the chunk server do in the GFS architecture?
Store data
Which resource on the GFS master stores the unique handle for each data chunk?
The master’s memory
What is a chunk?
A FS data block
What is the default chunk size in GFS?
64 MB
What are the pros of a 64 MB chunk?
Large chunk size == small number of chunks
Reduce size of metadata stored in the memory space of the GFS master
Reduce # of operations between clients and master
A client can perform many operations on a given chunk
What are the cons of a 64 MB chunk?
Waste storage space due to internal fragmentation
High overhead when handling many small files
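The trade-off can be checked with simple arithmetic. This sketch assumes every chunk reserves its full size, which is a simplification made only to show the fragmentation argument; the function name chunk_stats and the 4 KB comparison block size are assumptions.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB


def chunk_stats(file_size: int, chunk_size: int = CHUNK_SIZE) -> tuple[int, int]:
    """Return (number_of_chunks, wasted_bytes) for a file of file_size bytes."""
    chunks = -(-file_size // chunk_size)  # ceiling division
    wasted = chunks * chunk_size - file_size
    return chunks, wasted


one_gb = 1024 ** 3
print(chunk_stats(one_gb))        # large file, 64 MB chunks: few metadata entries
print(chunk_stats(one_gb, 4096))  # same file, 4 KB blocks: many more entries
print(chunk_stats(1024))          # tiny file: almost a whole chunk is unused
```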
Why does the GFS client not have client side caching?
Data is too big to cache
What are the two requests that the GFS client handles?
Control request to master
Data access request to chunk servers
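A toy sketch of the two request types during a read. The classes Master and ChunkServer, the function client_read, and the file and handle names are hypothetical, and the read is assumed to stay inside a single chunk. The tiny chunk size in the demo is only there to keep the example readable.

```python
CHUNK_SIZE = 64 * 1024 * 1024


class Master:
    """Holds metadata only: (filename, chunk index) -> (handle, server name)."""

    def __init__(self) -> None:
        self.chunk_table = {}

    def lookup(self, filename: str, chunk_index: int):
        return self.chunk_table[(filename, chunk_index)]


class ChunkServer:
    """Holds the actual chunk data, keyed by chunk handle."""

    def __init__(self) -> None:
        self.chunks = {}

    def read(self, handle: str, offset: int, length: int) -> bytes:
        return self.chunks[handle][offset:offset + length]


def client_read(master, servers, filename, offset, length, chunk_size=CHUNK_SIZE):
    # Translate the byte offset into a chunk index and an offset in that chunk.
    chunk_index, chunk_offset = divmod(offset, chunk_size)
    # 1) Control request: ask the master where the chunk lives.
    handle, server_name = master.lookup(filename, chunk_index)
    # 2) Data access request: fetch the bytes from the chunk server directly.
    return servers[server_name].read(handle, chunk_offset, length)


master = Master()
master.chunk_table[("/logs/a", 0)] = ("h0", "cs1")
master.chunk_table[("/logs/a", 1)] = ("h1", "cs2")
servers = {"cs1": ChunkServer(), "cs2": ChunkServer()}
servers["cs1"].chunks["h0"] = b"abcdefgh"
servers["cs2"].chunks["h1"] = b"ijklmnop"
print(client_read(master, servers, "/logs/a", 10, 3, chunk_size=8))
```

Note that file data never passes through the master; it only answers the metadata lookup.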
What is HDFS?
Hadoop distributed file system
Open-source implementation of GFS
What is the master in HDFS?
The NameNode
What represents the chunk server in HDFS?
The DataNode
What is MapReduce?
Splitting a large dataset into smaller subsets and running the computation over them
What are the two operations in MapReduce?
Map operation
Reduce operation
What is the Map operation procedure?
Takes a series of key/value pairs and generates intermediate key/value pairs
What is the Reduce operation procedure?
Processes the intermediate key/value pairs from the Map operations
Generates new output
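A single-process word count sketch of the two operations. The names map_fn, reduce_fn, and run_mapreduce are made up here, and a real framework would run the map and reduce tasks in parallel across many machines.

```python
from collections import defaultdict


def map_fn(_key, text):
    """Map: take one (key, value) pair, emit intermediate (word, 1) pairs."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: combine all intermediate values for one key into the output."""
    return word, sum(counts)


def run_mapreduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in records:
        for inter_key, inter_value in map_fn(key, value):
            # Group intermediate values by key between the two phases.
            groups[inter_key].append(inter_value)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())


records = [("doc1", "the cat sat"), ("doc2", "the dog sat")]
print(run_mapreduce(records, map_fn, reduce_fn))
```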