L8 - Cluster Management Flashcards
What is resource allocation?
how much CPU/DRAM/disk/net to allocate to each app
What is resource assignment?
What should run on which physical nodes?
What is private resource allocation? What is its other name?
each app receives a private, static set of resources
static partitioning
Advantages of static partitioning?
- simplicity
- performance isolation
- allows specialised HW (e.g. not everyone needs a GPU)
Disadvantages of static partitioning?
- low utilisation
- hard to handle failures
- hard to maintain
Note on the last two points: it is not clear how to migrate an app off a machine.
What 3 properties do we want the scheduler to fulfil in case of shared resource assignment?
- Fairness
- Efficient resource usage
- Isolation
List the algorithms from the lecture for shared resource assignment.
- Fair queueing (extends 1) (for a single resource)
- Weighted max-min fair queueing (extends 2)
- Dominant resource fairness
- Token bucket
What does work conserving mean? Which property implies that?
Resources should not remain idle while there are users whose demand is not fully satisfied.
This is implied by “Efficient resource usage”.
Why do we want work conserving schedulers?
It keeps resources well-utilised.
It maximises overall throughput across different users.
Name a strategy that is not work conserving.
time division multiplexing
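A minimal sketch of the difference, assuming a single divisible resource and demands given as plain numbers. The function names are made up for illustration, and the redistribution here is greedy in list order, so it is work conserving but not necessarily fair.

```python
def tdm_allocate(capacity, demands):
    """Time division multiplexing: each user owns a fixed 1/n slot, used or not."""
    slot = capacity / len(demands)
    return [min(d, slot) for d in demands]


def work_conserving_allocate(capacity, demands):
    """Start from the fixed slots, then hand idle capacity to unmet demand."""
    alloc = tdm_allocate(capacity, demands)
    spare = capacity - sum(alloc)
    for i, d in enumerate(demands):
        extra = min(spare, d - alloc[i])
        alloc[i] += extra
        spare -= extra
    return alloc


demands = [1.0, 8.0]
print(tdm_allocate(10.0, demands))              # [1.0, 5.0]: 4 units stay idle
print(work_conserving_allocate(10.0, demands))  # [1.0, 8.0]: demand fully met
```

With TDM the second user is capped at its slot even though the first user leaves most of its slot unused.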
What are the different notions of fairness?
- Max-min fairness
- Dominant resource fairness
What are the properties of max-min fairness?
share guarantee: each user gets at least 1/n of the resource, unless their demand is less
strategy-proof: users are not better off by asking for more than they need
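Both properties can be checked on a small example. The sketch below uses the usual "water-filling" formulation of max-min fairness for one divisible resource; the function name and the numbers are illustrative assumptions, not taken from the lecture.

```python
def max_min_fair(capacity, demands):
    """Water-filling: split what is left equally among unsatisfied users."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    unsatisfied = set(range(len(demands)))
    while unsatisfied and remaining > 1e-12:
        share = remaining / len(unsatisfied)
        # Users whose outstanding demand fits inside the equal share.
        done = {i for i in unsatisfied if demands[i] - alloc[i] <= share}
        if not done:
            # Nobody can be fully satisfied: everyone gets the equal share.
            for i in unsatisfied:
                alloc[i] += share
            remaining = 0.0
        else:
            # Satisfy the small demands; their leftover is shared next round.
            for i in done:
                remaining -= demands[i] - alloc[i]
                alloc[i] = float(demands[i])
            unsatisfied -= done
    return alloc


print(max_min_fair(12, [2, 4, 10]))   # [2.0, 4.0, 6.0]
print(max_min_fair(12, [2, 4, 100]))  # [2.0, 4.0, 6.0]
```

Share guarantee: 1/n of the capacity is 4, and every user gets either 4 or more, or their full demand. Strategy-proofness: the third user inflating their demand from 10 to 100 does not change their allocation.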
What does DRF try to achieve?
identify the dominant resource share of each user and maximise the minimum dominant share across all users
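A simplified sketch of how this can be computed by progressive filling: repeatedly give one more task to the user with the smallest dominant share, and stop when that user's next task no longer fits. The function name, the per-task demand vectors and the example cluster (9 CPUs, 18 GB) are assumptions chosen for illustration.

```python
def drf(capacity, task_demands):
    """capacity: list of totals per resource.
    task_demands: dict user -> per-task demand list (same resource order)."""
    n_res = len(capacity)
    used = [0.0] * n_res
    tasks = {u: 0 for u in task_demands}
    dom_share = {u: 0.0 for u in task_demands}
    while True:
        # Pick the user with the lowest dominant share (ties: by name).
        user = min(sorted(task_demands), key=lambda u: dom_share[u])
        demand = task_demands[user]
        if any(used[r] + demand[r] > capacity[r] for r in range(n_res)):
            break  # the next task of the neediest user does not fit
        for r in range(n_res):
            used[r] += demand[r]
        tasks[user] += 1
        # Dominant share = largest fraction of any resource the user holds.
        dom_share[user] = max(
            tasks[user] * demand[r] / capacity[r] for r in range(n_res)
        )
    return tasks, dom_share


# Resources are [CPUs, GB of memory].
tasks, shares = drf([9, 18], {"A": [1, 4], "B": [3, 1]})
print(tasks)  # {'A': 3, 'B': 2}
```

User A's dominant resource is memory (12 of 18 GB) and user B's is CPU (6 of 9 CPUs), so both end up with a dominant share of 2/3.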
What is the drawback of DRF?
not work conserving
What is the issue with max-min fairness?
With max-min fairness, a user’s allocation depends on the demands of the other users sharing the resource, so there is no performance predictability.
What is the goal of token buckets?
guarantee a baseline bandwidth, but also allow bounded bursts
How does the token bucket idea work?
Control traffic by delaying requests until they accumulate sufficient tokens.
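A minimal sketch of the idea, assuming timestamps are passed in explicitly and never decrease. The class and method names are hypothetical. `rate` is the guaranteed baseline (tokens per second) and `capacity` bounds the burst size; a negative token count represents requests that are still waiting.

```python
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = float(rate)          # tokens added per second (baseline)
        self.capacity = float(capacity)  # maximum burst size
        self.tokens = float(capacity)    # bucket starts full
        self.last = 0.0                  # time of the last refill

    def admit(self, now, cost=1.0):
        """Return the time at which a request arriving at `now` may proceed."""
        # Refill for the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= cost
        if self.tokens >= 0:
            return now  # enough tokens: no delay
        # Not enough tokens: delay until the deficit has been refilled.
        return now + (-self.tokens) / self.rate


bucket = TokenBucket(rate=1, capacity=3)
print([bucket.admit(0.0) for _ in range(4)])  # [0.0, 0.0, 0.0, 1.0]
print(bucket.admit(10.0))                     # 10.0
```

The first three requests form an allowed burst, the fourth is delayed by one second, and after a long idle period the bucket is full again (but not fuller than `capacity`).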
What does resource assignment try to optimise?
- performance
- resource utilisation
Explain the first step of resource assignment.
Filter machines that satisfy hard constraints
e.g., VM may need a machine with a GPU
Explain the second step of resource assignment.
Rank candidate nodes to find the machine that best satisfies soft constraints
e.g., best-fit to avoid resource fragmentation
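The two steps (filter on hard constraints, then rank on soft constraints) can be sketched as follows. All names are hypothetical, and the best-fit score simply adds leftover CPU and leftover memory, which is a crude assumption; a real scheduler would normalise the resources and combine more criteria.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    free_cpu: float
    free_mem_gb: float
    has_gpu: bool = False


@dataclass
class Task:
    cpu: float
    mem_gb: float
    needs_gpu: bool = False


def feasible(machine, task):
    """Step 1: hard constraints (enough free resources, GPU if required)."""
    return (machine.free_cpu >= task.cpu
            and machine.free_mem_gb >= task.mem_gb
            and (machine.has_gpu or not task.needs_gpu))


def leftover(machine, task):
    """Step 2 score: less leftover means a tighter (better) fit."""
    return (machine.free_cpu - task.cpu) + (machine.free_mem_gb - task.mem_gb)


def assign(machines, task):
    candidates = [m for m in machines if feasible(m, task)]
    if not candidates:
        return None  # no machine satisfies the hard constraints
    return min(candidates, key=lambda m: leftover(m, task))


machines = [
    Machine("a", free_cpu=16, free_mem_gb=64, has_gpu=True),
    Machine("b", free_cpu=4, free_mem_gb=8),
    Machine("c", free_cpu=2, free_mem_gb=4),
]
print(assign(machines, Task(cpu=2, mem_gb=4)).name)                  # c
print(assign(machines, Task(cpu=2, mem_gb=4, needs_gpu=True)).name)  # a
```

Best-fit places the small task on machine c and keeps the large GPU machine free for tasks that actually need it.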
List different methods for cluster management system architecture.
- centralised
- distributed
- hierarchical e.g. two-level
Next questions are about Borg. First, what is Borg?
Google’s centralised cluster manager
What does Borgmaster do?
It is the main scheduler.
It polls Borglets every few seconds
extra: 5 replicas
What does Borglet do?
It is the local agent on each machine: it manages and monitors the tasks and resources on that machine.
extra: a Borglet runs on every machine; a cell contains about 10k heterogeneous machines
What strategies does Borg deploy to achieve high utilisation?
- admission control
- efficient task-packing
- over-commitment
- machine sharing
What is Kubernetes?
Cluster management for containerised applications:
- manages the complexity of the container lifecycle and of allocating/setting up hardware resources for the containers
- like an OS for your cloud cluster
List container orchestration primitives!
- Resource scaling
- Resource allocation
- Load balancing
- Lifecycle and health
- Naming and discovery
- Storage volumes
- Logging and monitoring
- Debugging and introspection
- Identity and authorization
What is resource scaling?
make sets of containers bigger or smaller