Resource Management I - Week 4 Flashcards
Resources in distributed systems
Hardware
CPU, storage, network links
Software
Web services
Data
- Datasets
Applications need different shares of resources. A distributed system has to optimise allocation given its limited resources and the applications' requests for them.
Centralized computing
Primarily used until the 1990s. About 200K hosts were connected to the internet in 1990.
Data transmission in the order of Kb/s
Distribution was still used for compute-intensive applications, e.g. long integer factorisation.
The application was written in C, ran on a machine when it was idle, and used email to communicate with a server, sending results or requesting data:
- Factorising 100-digit integers would otherwise take months; with a good implementation, the distributed technique would take just a few days.
Grid computing
The grid concept and its hardware were too heavyweight
CERN is the main user, with data generated in the order of 1 PB/s
By 2010 most people were talking about cloud computing; the grid didn’t seem to take off
Cloud computing
Largely evolved from the grid
Relies on virtualisation of resources, creating virtual machines as opposed to physical machines
The idea is on-demand resource provisioning: users don’t own the resources; providers supply them as a service
Elasticity
Resources that can be added on demand
Horizontal: add or remove instances
Vertical: increase or decrease resource attributes
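A minimal sketch of the two scaling directions. The `Instance` record, its `cpus` and `memory_gb` attributes, and both function names are hypothetical, chosen only for illustration:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instance:
    # Hypothetical resource attributes of one virtual machine.
    cpus: int
    memory_gb: int


def scale_horizontally(fleet, target_count, template):
    """Add or remove instances until the fleet has target_count members."""
    if target_count >= len(fleet):
        return fleet + [template] * (target_count - len(fleet))
    return fleet[:target_count]


def scale_vertically(instance, cpus, memory_gb):
    """Keep a single instance but change its resource attributes."""
    return replace(instance, cpus=cpus, memory_gb=memory_gb)


small = Instance(cpus=2, memory_gb=4)
fleet = scale_horizontally([small], 3, small)            # three identical instances
bigger = scale_vertically(small, cpus=8, memory_gb=32)   # one larger instance
```

Horizontal scaling changes how many instances exist; vertical scaling changes what one instance is made of.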
Software as a service (SaaS)
The cloud provider hosts some software which is provided to users.
E.g. Microsoft Office 365, Dropbox
Platform as a Service (PaaS)
The cloud provider offers an environment that allows the user to create applications (e.g. access to some programming functionality, databases, etc.). E.g. Google App Engine
Infrastructure as a Service (IaaS)
The cloud provider offers pay-as-you-go access to storage, networking and other resources.
Not everything on the cloud can be classified this way, e.g. Hadoop + MapReduce
Remember this
Cloud provider
The owner of resources:
Objectives:
- Maximise income
- Minimise costs (e.g. energy)
Needs to allocate virtual machines to physical machines
- Initial allocation (placement)
- Reallocation
Cloud consumer
Objective:
- Obtain resources to complete a task as cheaply (and quickly) as possible
Needs to choose what resource to rent (resource provisioning):
- Overprovisioning: too many resources for the needs of the task
- Underprovisioning: resources are not sufficient for the task
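The provisioning choice can be sketched as below. This is an illustration only: the offer names, prices, the CPU-only view of a task's needs, and the 20% `slack` tolerance are all assumptions, not part of the notes.

```python
def classify_provisioning(required_cpus, rented_cpus, slack=0.2):
    """Label a rental as under-, over-, or adequately provisioned.

    slack is an assumed tolerance: renting up to 20% more than needed
    still counts as adequate.
    """
    if rented_cpus < required_cpus:
        return "underprovisioned"
    if rented_cpus > required_cpus * (1 + slack):
        return "overprovisioned"
    return "adequate"


def cheapest_sufficient(offers, required_cpus):
    """Pick the cheapest offer with enough CPUs, or None if none fits.

    offers maps a hypothetical instance name to (cpus, price_per_hour).
    """
    fitting = {name: spec for name, spec in offers.items() if spec[0] >= required_cpus}
    if not fitting:
        return None
    return min(fitting, key=lambda name: fitting[name][1])


offers = {"small": (2, 0.05), "medium": (4, 0.10), "large": (16, 0.40)}
choice = cheapest_sufficient(offers, required_cpus=3)
```

Here the consumer avoids underprovisioning by filtering out offers that are too small, and limits overprovisioning by taking the cheapest of what remains.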
Virtual Machine (VM) Allocation
Done by the cloud provider
Many virtual machines (VMs) need to share physical machines (PMs), and the allocation needs to meet these objectives:
- Service level agreement with the user (translates to money)
- Maximise resource utilisation
- Minimise energy costs
Needs to do:
- Initial placement and reallocation, including deciding when to reallocate
Mapping VMs to PMs is a more complicated version of the bin packing problem
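To make the bin packing link concrete, here is a sketch of the classic first-fit decreasing heuristic applied to initial placement. It is a simplification under stated assumptions: one resource dimension (CPU), identical PMs, and no reallocation. Real VM allocation has several dimensions and changes over time, which is what makes it more complicated. The function and variable names are hypothetical.

```python
def first_fit_decreasing(vm_demands, pm_capacity):
    """Place VMs on PMs, opening a new PM only when no opened one has room.

    vm_demands: dict of VM name -> CPU demand (single dimension, an assumption).
    pm_capacity: CPU capacity of every PM (all identical, an assumption).
    Returns a list of PMs, each a list of VM names.
    """
    free = []        # remaining capacity of each opened PM
    placement = []   # VM names hosted on each opened PM
    # Consider the largest VMs first; ties keep their original order.
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True):
        if demand > pm_capacity:
            raise ValueError(f"{vm} does not fit on any PM")
        for i, remaining in enumerate(free):
            if demand <= remaining:
                free[i] -= demand
                placement[i].append(vm)
                break
        else:
            # No opened PM has room, so switch on a new one.
            free.append(pm_capacity - demand)
            placement.append([vm])
    return placement


vms = {"a": 4, "b": 8, "c": 1, "d": 4, "e": 2, "f": 1}
placement = first_fit_decreasing(vms, pm_capacity=10)
# placement == [["b", "e"], ["a", "d", "c", "f"]]
```

Fewer opened PMs means higher utilisation and lower energy cost, which matches the provider's objectives; first-fit decreasing is a heuristic and does not guarantee the minimum number of PMs.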
Volunteer Computing
Software that runs in the background of networked resources, downloading tasks and sending the results back. E.g. finding prime numbers.
The concern is that it’s not very safe.
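The download, compute, and send-back loop of a volunteer client can be sketched as follows. Everything here is hypothetical: `TaskServer` is an in-process stand-in for a project server that would really sit across the network, and `machine_is_idle` is a placeholder for whatever idleness check a real client would use.

```python
from collections import deque


def is_prime(n):
    """Trial division; fine for a sketch, far too slow for real searches."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


class TaskServer:
    """Stand-in for the project server that hands out ranges to search."""

    def __init__(self, start, stop, chunk):
        self.tasks = deque((lo, min(lo + chunk, stop)) for lo in range(start, stop, chunk))
        self.primes = []

    def download_task(self):
        return self.tasks.popleft() if self.tasks else None

    def upload_result(self, found):
        self.primes.extend(found)


def volunteer(server, machine_is_idle=lambda: True):
    """Fetch tasks while the machine is idle and send each result back."""
    while machine_is_idle():
        task = server.download_task()
        if task is None:
            break
        lo, hi = task
        server.upload_result([n for n in range(lo, hi) if is_prime(n)])


server = TaskServer(start=1, stop=50, chunk=10)
volunteer(server)
```

After the run, `server.primes` holds the primes found in the searched range, collected from each completed task.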