L8 - Cluster Management Flashcards
What is resource allocation?
how much CPU/DRAM/disk/net to allocate to each app
What is resource assignment?
What should run on which physical nodes?
What is private resource allocation? What is its other name?
each app receives a private, static set of resources
static partitioning
Advantages of static partitioning?
- simplicity
- performance isolation
- allows specialised HW (e.g. not everyone needs a GPU)
Disadvantages of static partitioning?
- low utilisation
- hard to solve failures
- hard to maintain
note on points 2 and 3 (failures, maintenance): it is not clear how to migrate an app off its machine
What 3 properties do we want the scheduler to fulfil in case of shared resource assignment?
- Fairness
- Efficient resource usage
- Isolation
List the algorithms from the lecture for shared resource assignment.
- Fair queueing (extends 1) (for a single resource)
- Weighted max-min fair queueing (extends 2)
- Dominant resource fairness
- Token bucket
What does work conserving mean? Which property implies that?
Resources should not remain idle while there are users whose demand is not fully satisfied.
This is implied by “Efficient resource usage”.
Why do we want work conserving schedulers?
It keeps resources well-utilised.
It maximises overall throughput across different users.
Name a strategy that is not work conserving.
time division multiplexing
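A minimal sketch of why time division multiplexing is not work conserving, assuming each user gets a fixed, equal slot of the resource. The function name `tdm_allocate` and the numbers are hypothetical and only illustrate the card above.

```python
def tdm_allocate(capacity, demands):
    """Give every user a fixed slot of capacity / n, regardless of demand."""
    slot = capacity / len(demands)
    # A user can never use more than its own slot, even if others leave theirs idle.
    allocation = [min(demand, slot) for demand in demands]
    idle = capacity - sum(allocation)
    return allocation, idle


allocation, idle = tdm_allocate(12, [1, 10, 10])
print(allocation, idle)
```

Here the first user leaves most of its slot unused, yet the other two users stay unsatisfied while capacity sits idle.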
What are the different notions of fairness?
- Max-min fairness
- Dominant resource fairness
What are the properties of max-min fairness?
share guarantee: each user gets at least 1/n of the resource, unless their demand is less
strategy-proof: users are not better off by asking for more than they need
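A small sketch of max-min fair allocation for a single resource, using the usual "water-filling" idea: split the remaining capacity equally, fully satisfy users who need less than the equal share, and repeat. The function name `max_min_fair` is hypothetical, and this is one possible way to compute the allocation, not necessarily the lecture's formulation.

```python
def max_min_fair(capacity, demands):
    """Return a max-min fair allocation of one resource among users."""
    allocation = [0.0] * len(demands)
    unsatisfied = set(range(len(demands)))
    remaining = float(capacity)
    while unsatisfied and remaining > 1e-12:
        share = remaining / len(unsatisfied)
        # Users whose demand fits within the equal share are fully satisfied.
        satisfied = {i for i in unsatisfied if demands[i] <= share}
        if not satisfied:
            # Everyone left wants more than the equal share: split evenly and stop.
            for i in unsatisfied:
                allocation[i] = share
            break
        for i in satisfied:
            allocation[i] = float(demands[i])
            remaining -= demands[i]
        unsatisfied -= satisfied
    return allocation


print(max_min_fair(10, [2, 2.6, 4, 5]))
```

Capacity left over by small users is redistributed to the larger ones, which is what makes the scheme work conserving.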
What does DRF try to achieve?
identify the dominant resource share of each user and maximise the minimum dominant share across all users
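A simplified sketch of DRF at task granularity, assuming every user runs identical tasks with a fixed, positive per-task demand vector: repeatedly give one more task to the user with the smallest dominant share, and stop once that user's next task no longer fits. The function name `drf` and the example numbers are illustrative assumptions.

```python
def drf(capacity, demands):
    """capacity: total per resource; demands: user -> per-task demand vector."""
    resources = range(len(capacity))
    used = [0.0] * len(capacity)
    tasks = {user: 0 for user in demands}
    dominant_share = {user: 0.0 for user in demands}
    while True:
        # Pick the user with the lowest dominant share (ties broken by name).
        user = min(sorted(demands), key=lambda u: dominant_share[u])
        need = demands[user]
        if any(used[r] + need[r] > capacity[r] for r in resources):
            break  # the next task of the poorest user does not fit: stop
        for r in resources:
            used[r] += need[r]
        tasks[user] += 1
        # Dominant share = largest fraction of any single resource the user holds.
        dominant_share[user] = max(tasks[user] * need[r] / capacity[r] for r in resources)
    return tasks, dominant_share


# 9 CPUs and 18 GB of memory; A's tasks are memory-heavy, B's are CPU-heavy.
tasks, shares = drf([9, 18], {"A": [1, 4], "B": [3, 1]})
print(tasks, shares)
```

In this example both users end up with the same dominant share: A in memory, B in CPU.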
What is the drawback of DRF?
not work conserving
What is the issue with max-min fairness?
With max-min fairness, a user’s allocation depends on the demands of the other users sharing the resource -> no performance predictability
What is the goal of token buckets?
guarantee a baseline bandwidth, but also allow bounded bursts
How does the token bucket idea work?
Control traffic by delaying requests until they accumulate sufficient tokens.
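A minimal token bucket sketch, assuming tokens refill at a constant `rate` per second up to a capacity of `burst`, and that each request costs at most `burst` tokens. The class and method names are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate        # tokens added per second (baseline bandwidth)
        self.burst = burst      # bucket capacity (bound on burst size)
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # time of the last refill

    def delay_for(self, now, cost):
        """Return how long a request arriving at `now` must wait."""
        # Refill for the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        wait = max(0.0, (cost - self.tokens) / self.rate)
        # The balance may go negative: the debt is repaid while the request waits.
        self.tokens -= cost
        return wait


bucket = TokenBucket(rate=10, burst=5)
print(bucket.delay_for(0.0, 5))  # the burst is served immediately
print(bucket.delay_for(0.0, 5))  # delayed until enough tokens accumulate
```

The bucket size bounds the burst, and the refill rate sets the guaranteed baseline.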
What does resource assignment try to optimise?
- performance
- resource utilisation
Explain the first step of resource assignment.
Filter machines that satisfy hard constraints
e.g., VM may need a machine with a GPU
Explain the second step of resource assignment.
Rank candidate nodes to find the machine that best satisfies soft constraints
e.g., best-fit to avoid resource fragmentation
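The two steps can be sketched as a filter followed by a ranking. This is a toy version: the dictionary fields (`needs_gpu`, `has_gpu`, `cpu`, `free_cpu`) and the function name `assign` are hypothetical, with a GPU requirement as the hard constraint and best-fit on CPU as the soft constraint.

```python
def assign(task, nodes):
    """Pick a node for the task, or None if no node satisfies the hard constraints."""
    # Step 1: filter out nodes that violate hard constraints.
    candidates = [
        node for node in nodes
        if (node["has_gpu"] or not task["needs_gpu"])
        and node["free_cpu"] >= task["cpu"]
    ]
    if not candidates:
        return None
    # Step 2: rank by a soft constraint. Best-fit picks the node with the
    # least CPU left over, which limits resource fragmentation.
    best = min(candidates, key=lambda node: node["free_cpu"] - task["cpu"])
    return best["name"]


nodes = [
    {"name": "a", "has_gpu": False, "free_cpu": 4},
    {"name": "b", "has_gpu": True, "free_cpu": 16},
    {"name": "c", "has_gpu": True, "free_cpu": 6},
]
print(assign({"needs_gpu": True, "cpu": 4}, nodes))
```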
List different methods for cluster management system architecture.
- centralised
- distributed
- hierarchical e.g. two-level
Next questions are about Borg. First, what is Borg?
Google’s centralised cluster manager
What does Borgmaster do?
It is the main scheduler.
It polls Borglets every few seconds
extra: 5 replicas
What does Borglet do?
Manages and monitors the tasks and resources on the machine it runs on.
extra: a cell has about 10k heterogeneous machines, each running a Borglet
What strategies does Borg deploy to achieve high utilisation?
- admission control
- efficient task-packing
- over-commitment
- machine sharing
What is Kubernetes?
Cluster management for containerised applications;
- manage complexity of container lifecycle and allocating/setting up hardware resources for the containers.
- like an OS for your cloud cluster
List container orchestration primitives!
- Resource scaling
- Resource allocation
- Load balancing
- Lifecycle and health
- Naming and discovery
- Storage volumes
- Logging and monitoring
- Debugging and introspection
- Identity and authorization
Resource scaling
make sets of containers bigger or smaller
Resource allocation
decide where my containers should run
Load balancing
distribute traffic across a set of containers
Lifecycle and health
keep my containers running despite failures
Naming and discovery
find where my containers are now
Storage volumes
provide data to containers
Logging and monitoring
track what’s happening with my containers
Debugging and introspection
enter or attach to containers
Identity and authorization
control who can do things to my containers
What do the Kubernetes containers do?
Handle package dependencies
What is a pod?
A pod is the unit of scheduling and migration in Kubernetes.
a bunch of containers with same properties
List those properties!
- Lifecycle: live together, die together
- Network: same IP address, same routes, iptables
- Storage volumes: can share data
- Intended to run a common task
Kubernetes service?
A group of pods that work together
extra: provides load balancing among pod replicas
How do you control pod placement in Kubernetes?
use labels and selectors
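A rough sketch of the idea behind equality-based selectors, assuming labels are plain key/value pairs and a selector matches when all of its pairs appear in the labels. The helper names `matches` and `eligible_nodes` are hypothetical and are not Kubernetes APIs.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def eligible_nodes(node_selector, nodes):
    """Return the names of nodes whose labels satisfy the pod's selector."""
    return [name for name, labels in nodes.items() if matches(node_selector, labels)]


nodes = {
    "n1": {"disk": "ssd", "zone": "a"},
    "n2": {"disk": "hdd", "zone": "a"},
}
print(eligible_nodes({"disk": "ssd"}, nodes))
```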
How do you keep N pods running?
use ReplicaSets: a layer on top of the Pod API that ensures N copies of a pod are running
What does the Horizontal Pod Autoscaler do?
automatically scale pods as needed
- based on CPU utilisation (or custom metrics)
- can set user-defined min/max bounds
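As a sketch of the scaling rule, the Kubernetes documentation describes the core calculation as desired = ceil(current replicas * current metric / target metric), clamped to the configured bounds. The function below is a simplification that ignores the tolerance band and stabilisation behaviour of the real autoscaler; the function and parameter names are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_util, target_util, min_replicas, max_replicas):
    """Scale the replica count in proportion to how far utilisation is from target."""
    desired = math.ceil(current_replicas * current_util / target_util)
    # Respect the user-defined min/max bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 pods at 90% CPU with a 60% target: scale out.
print(desired_replicas(4, 90, 60, min_replicas=1, max_replicas=10))
```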
What is a potential problem with relying only on CPU utilization as a scaling metric?
good for compute bound apps but maybe I/O is the bottleneck
What other metrics would you consider for auto-scaling besides CPU utilization?
- memory capacity
- memory BW
- network BW
What properties does resource isolation try to achieve?
- Applications must not be able to affect each other’s performance
- Repeated runs of the same application should see similar behaviour
What are the resource allocation mechanisms in Kubernetes?
Request: How much of a resource (CPU, RAM) the container is asking to use, with a strong guarantee of availability
Limit: Max amount of a resource the container can access
Does the scheduler overcommit to requests?
No.
List 3 Kubernetes Quality of Service classes.
- Guaranteed: highest protection
- Burstable: medium protection
- Best effort: lowest protection
Relation of request and limit for Guaranteed class?
request > 0 && limit == request
Relation of request and limit for Burstable class?
request > 0 && limit > request
Relation of request and limit for Best effort class?
request == 0
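The three rules above can be written as one small function. This follows the cards' simplified view of a single resource on a single container; real Kubernetes derives the class from the CPU and memory settings of all containers in a pod. The function name `qos_class` is hypothetical.

```python
def qos_class(request, limit):
    """Classify a container by the simplified request/limit rules from the cards."""
    if request == 0:
        return "BestEffort"
    if limit == request:
        return "Guaranteed"
    if limit > request:
        return "Burstable"
    raise ValueError("limit must not be smaller than request")


print(qos_class(2, 2), qos_class(1, 4), qos_class(0, 0))
```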
What are the advantages of centralised design?
can make globally optimal decisions
What are the drawbacks of centralised design?
scalability: hard to enforce consistency
Name 2 two-level cluster managers
Mesos and YARN
How does Mesos work?
Lecture on 05.04
Min: 3.5
List two distributed cluster management algorithms.
Omega and Sparrow
List two new challenges serverless brings to the cluster management besides resource allocation and assignment.
- resource scaling: How many containers (“slots”) to keep warm for a function?
- request routing: To which node and “slot” do we send a particular invocation?
What does Quasar try to solve?
Over-provisioning
How does Quasar solve over-provisioning?
Don’t ask users for allocation request/resource demand.
They don’t really know it anyway.
What do the users specify in this case? (Quasar)
performance goals
What does the cluster manager do in this case? (Quasar)
profiles applications and dynamically adjusts resource allocations
How does the cluster manager understand resource/performance tradeoffs? (Quasar)
It combines the following:
- Small signal from a short run of a new application
- Large signal from previously run applications
What does the cluster manager do at the end? (still Quasar)
For each new application, it needs to recommend a resource allocation and assignment.
How does one build a recommender system?
collaborative filtering
What is collaborative filtering?
Predict the preferences of new users from the preferences of other users, using SVD and PQ reconstruction.
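A toy sketch of PQ reconstruction, assuming a small score matrix with missing entries (`None`) and plain stochastic gradient descent on the known entries. This is a generic matrix-factorisation example, not Quasar's actual implementation; the function names, hyperparameters, and data are all illustrative assumptions.

```python
import random


def pq_factorize(ratings, rank=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn P (rows) and Q (columns) so that P[i] . Q[j] approximates known entries."""
    rng = random.Random(seed)
    n, m = len(ratings), len(ratings[0])
    P = [[rng.uniform(0, 1) for _ in range(rank)] for _ in range(n)]
    Q = [[rng.uniform(0, 1) for _ in range(rank)] for _ in range(m)]
    known = [(i, j, ratings[i][j]) for i in range(n) for j in range(m)
             if ratings[i][j] is not None]
    for _ in range(steps):
        for i, j, value in known:
            err = value - predict(P, Q, i, j)
            for f in range(rank):
                p, q = P[i][f], Q[j][f]
                # Gradient step on the squared error with L2 regularisation.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, i, j):
    """Reconstructed score for row i and column j, including missing cells."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


# Rows: applications, columns: configurations, None: not yet profiled.
scores = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, 4, 4],
]
P, Q = pq_factorize(scores)
print(predict(P, Q, 1, 1))  # estimate for a cell that was never measured
```

The missing cells are filled in from the learned factors, which is how sparse profiling data can be extended to configurations that were never run.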
What needs to be considered to recommend resource allocations to applications? (4)
- scale-out
- scale-up
- HW heterogeneity
- Interference
What does scale out mean?
Use 4 nodes or a single node?
What does scale up mean?
Use an 8-core VM or a single-core VM?
List 3 steps of Quasar’s functionality.
Step 1: short profiling runs produce initial performance data.
Step 2: collaborative filtering techniques fill in missing data
Step 3: Greedy scheduler uses output to find the number and type of resources that maximise utilisation and performance.
To summarise, what are the challenges of using shared clusters?
- Resource allocation: how many resources should an app get?
- Resource assignment: which specific resources does an app get?
- Variability: within an app (different phases), within datasets, and load