Exam Questions Flashcards
Define Grid Computing
A Grid coordinates resources that are not subject to centralized control using standard, open, general-purpose protocols and interfaces to deliver nontrivial qualities of service.
Explain the three-point checklist
1) Coordinate resources that are not subject to centralized control
- integrates and coordinates resources and users that live in different control domains
- different administrative units of one or different companies
- address: security, policy, payment, membership
- SPPM
2) Using standard, open, general-purpose protocols and interfaces
- a grid is built from multi-purpose protocols and interfaces
- important because otherwise we are dealing with an application-specific system
- address: authentication, authorization, discovery, access
- DAAAc
3) To deliver nontrivial qualities of service
- resources can be used in a coordinated fashion
- offers various qualities of service (SLA)
- response time, throughput, availability, security
- the utility of the combined system is greater than the sum of its parts
- address: response time, throughput, availability, security.
Can we apply the three-point checklist to Clouds?
No, not all points. A cloud normally manages its resources in a centralized fashion, e.g. a cloud provider hosts all its cloud services inside a data center that is entirely under its control. This is different from a Grid, where resource management is distributed.
The second point does not apply to Clouds either, because Cloud Computing still has the data lock-in problem: APIs for Cloud Computing are still proprietary. It is hard to extract data and programs from one site to run on another.
The only point that is shared between Grid and Cloud computing is the third one. Both definitions agree on the requirement of a certain level of quality of service or the negotiation of service level agreements (SLAs), but additionally Cloud Computing should be able to adjust to a required level of QoS.
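The comparison above can be sketched as a small check. This is only an illustration: the `System` class, its field names, and the two example systems are assumptions made for this sketch, and the boolean values simply restate the answer given on this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Hypothetical description of a distributed system (illustrative fields)."""
    name: str
    decentralized_control: bool    # point 1: resources span several control domains
    open_standard_protocols: bool  # point 2: standard, open, general-purpose interfaces
    nontrivial_qos: bool           # point 3: delivers nontrivial qualities of service


def checklist(system: System) -> dict[str, bool]:
    """Return which of the three checklist points the system satisfies."""
    return {
        "decentralized control": system.decentralized_control,
        "open standard protocols": system.open_standard_protocols,
        "nontrivial QoS": system.nontrivial_qos,
    }


def is_grid(system: System) -> bool:
    """A system counts as a grid only if all three points hold."""
    return all(checklist(system).values())


grid = System("science grid", True, True, True)
cloud = System("typical cloud", False, False, True)

for s in (grid, cloud):
    print(s.name, checklist(s), "grid" if is_grid(s) else "not a grid")
```

Only the third entry is true for both systems, which matches the point that QoS is the shared requirement.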
Can you give an example of a grid used in science? And in business?
Grids are important for science nowadays because of the huge amount of data that experiments produce. Some examples are: the LHC at CERN, monitoring of industrial equipment (airplane flights), and high-throughput sensor networks.
In business, the IT industry is moving toward a mass-adoption stage, in which people begin to take a kind of post-technology perspective. Two grid examples:
- GDI-Grid: efficient mining and processing of spatial data for simulation of noise dispersion, flood simulation and disaster management.
- EGEE: provides large computational and storage resources.
Give an overview of each layer: FABRIC
FABRIC: interfaces for local control. Provides the resources to which shared access is mediated by grid protocols. Implements local, resource-specific operations that occur on specific resources - physical or logical.
Richer fabric functionality enables more sophisticated sharing operations. However, if we place few demands on Fabric elements, then deployment of Grid infrastructure is simplified.
- Introspection/Enquiry mechanisms: permit discovery of a resource's structure, state, and capabilities (e.g. does it support advance reservation?);
- Resource Management: provide some control of delivery of QoS.
Resources:
- Computational: start programs, monitor and control the execution. Enquire load, queue, software characteristics.
- Storage: put/get files. High performance transfers. Control of data transfers: disk space, bandwidth, network, …
- Network: control over the resources allocated to network transfers (prioritization/reservation)
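As a rough illustration of the two Fabric requirements (enquiry and resource management), here is a minimal sketch for a storage resource. The class and method names (`FabricResource`, `enquire`, `reserve`, `StorageResource`) are hypothetical and chosen for this example; they do not come from any grid toolkit.

```python
from abc import ABC, abstractmethod


class FabricResource(ABC):
    """Hypothetical minimal Fabric-layer interface: enquiry plus management."""

    @abstractmethod
    def enquire(self) -> dict:
        """Report structure, state, and capabilities of the resource."""

    @abstractmethod
    def reserve(self, amount: int) -> bool:
        """Try to set aside capacity; return True on success."""


class StorageResource(FabricResource):
    def __init__(self, capacity_gb: int) -> None:
        self.capacity_gb = capacity_gb
        self.reserved_gb = 0
        self.files: dict[str, bytes] = {}

    def enquire(self) -> dict:
        return {
            "type": "storage",
            "free_gb": self.capacity_gb - self.reserved_gb,
            "supports_reservation": True,
        }

    def reserve(self, amount: int) -> bool:
        # Grant the reservation only if enough unreserved space remains.
        if amount <= self.capacity_gb - self.reserved_gb:
            self.reserved_gb += amount
            return True
        return False

    def put(self, name: str, data: bytes) -> None:
        self.files[name] = data

    def get(self, name: str) -> bytes:
        return self.files[name]


disk = StorageResource(capacity_gb=100)
print(disk.enquire())
print(disk.reserve(60), disk.reserve(60))  # second request exceeds free space
```

Keeping the interface this small reflects the trade-off on this card: fewer demands on Fabric elements make deployment simpler, at the cost of less sophisticated sharing.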
Give an overview of each layer: CONNECTIVITY
CONNECTIVITY: talking to things. Defines core communication and authentication protocols.
- Communication protocols enable the exchange of data between fabric layer resources (~ TCP/IP).
- Authentication protocols build on communication services to provide cryptographically secure mechanisms for verifying the identity of users and resources. They should provide:
  - SSO (single sign-on): users must log on (authenticate) just once and then have access to multiple resources without further user intervention
  - Delegation: a user must be able to endow a program with the ability to run on that user’s behalf, so that the program is able to access the resources on which the user is authorized (possibly with limited rights; delegation can be chained)
  - Integration with various local security solutions
  - User-based trust relationships
PKI/GSI
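Single sign-on and restricted, chained delegation can be illustrated with a toy model. This sketch involves no real cryptography and is not how PKI or GSI is implemented; the `Credential` class and the right names are assumptions made purely to show that a delegated credential never exceeds the rights of its issuer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    """Toy stand-in for a proxy credential; no real cryptography involved."""
    subject: str
    rights: frozenset[str]
    chain: tuple[str, ...] = ()

    def delegate(self, to: str, rights: set[str]) -> "Credential":
        """Issue a credential for `to` that cannot exceed the delegator's rights."""
        return Credential(
            subject=self.subject,  # the holder still acts on the user's behalf
            rights=self.rights & frozenset(rights),
            chain=self.chain + (to,),
        )


def authorize(cred: Credential, right: str) -> bool:
    return right in cred.rights


# Single sign-on: the user authenticates once and obtains one credential...
user = Credential("alice", frozenset({"compute", "storage"}))
# ...then delegates a restricted subset to a job, which delegates further.
job = user.delegate("job-42", {"storage", "network"})
helper = job.delegate("transfer-helper", {"storage"})

print(helper.chain, sorted(helper.rights))
print(authorize(job, "compute"))
```

The job asked for "network" but does not receive it, because the user never held that right.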
Give an overview of each layer: RESOURCE
RESOURCE: sharing single resources. Concerned entirely with individual resources - ignores issues of global state. Protocols for secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources.
- Information protocols: used to obtain information about the structure and state of a resource - config, load, usage, policy, cost, …
- Management protocols: used to negotiate access to a shared resource. Specify resource requirements (advance reservation and QoS) and the operations to be performed (process creation or data access).
Since management protocols are responsible for instantiating sharing relationships, they must serve as a “policy application point”: they ensure that the requested protocol operations are consistent with the policy under which the resource is to be shared.
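The "policy application point" idea can be sketched as follows. The `SharedResource` class, the community names, and the operation names are hypothetical; the point is only that every management request is checked against the local sharing policy before any allocation happens.

```python
from dataclasses import dataclass, field


@dataclass
class SharedResource:
    """Hypothetical single resource with information and management operations."""
    name: str
    cpus_free: int
    # Local sharing policy: operations each community is allowed to request.
    policy: dict[str, set[str]] = field(default_factory=dict)

    def info(self) -> dict:
        """Information protocol: report state and configuration."""
        return {"name": self.name, "cpus_free": self.cpus_free}

    def request(self, community: str, operation: str, cpus: int) -> str:
        """Management protocol: check policy first, then try to allocate."""
        if operation not in self.policy.get(community, set()):
            return "denied: policy"
        if cpus > self.cpus_free:
            return "denied: capacity"
        self.cpus_free -= cpus
        return "granted"


cluster = SharedResource("cluster-a", cpus_free=8,
                         policy={"physics": {"process_creation"}})
print(cluster.request("physics", "process_creation", 4))
print(cluster.request("physics", "data_access", 1))
print(cluster.request("biology", "process_creation", 1))
print(cluster.info())
```

Note that the resource only reasons about itself; nothing here looks at the state of other resources, which is left to the next layer up.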
Give an overview of each layer: COLLECTIVE
COLLECTIVE: coordinating multiple resources, not associated with any one specific resource. Spans from general purpose to highly application/domain specific. Captures interactions across collections of resources.
Sharing behaviors: directory services; co-allocation, scheduling and brokering services; monitoring and diagnostics; data replication.
Address security, policy and accounting issues: community authorization services (CAS - enforce community policies) and community accounting and payment.
Open Grid Service Architecture.
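As one example of a Collective-layer behavior, co-allocation across several resources might look like the sketch below. The greedy strategy, the function name `co_allocate`, and the site names are assumptions for illustration; real brokers use far richer scheduling policies.

```python
def co_allocate(resources: dict[str, int], demand: int) -> dict[str, int]:
    """Greedy co-allocation sketch: split `demand` CPUs across resources.

    `resources` maps a resource name to its free CPUs (as a directory
    service might report). Returns an allocation plan, or raises if the
    combined capacity is insufficient, so the plan is all-or-nothing.
    """
    if sum(resources.values()) < demand:
        raise ValueError("insufficient combined capacity")
    plan: dict[str, int] = {}
    remaining = demand
    # Prefer the resources with the most free capacity.
    for name, free in sorted(resources.items(), key=lambda kv: -kv[1]):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            plan[name] = take
            remaining -= take
    return plan


directory = {"site-a": 4, "site-b": 10, "site-c": 2}
print(co_allocate(directory, 12))
```

No single site can satisfy the request for 12 CPUs, so the plan combines two of them; this is the kind of interaction that the Resource layer alone cannot express.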
Give an overview of each layer: APPLICATION
APPLICATION: execute grid applications. Grid service compositions (workflows) combine grid services into new grid applications. User applications are constructed by utilizing the services defined at each lower layer. Each of the previous layers must provide APIs and SDKs for integration by the higher layers.
Define Cloud Computing
Clouds are a large pool of VIRTUALIZED RESOURCES. These resources (hardware, development, platform, services, …) can be DYNAMICALLY RECONFIGURED to adjust to a variable load, exploited by a PAY-PER-USE MODEL, and they offer guarantees by means of CUSTOMIZED SLAs.
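Dynamic reconfiguration and the pay-per-use model can be illustrated with a small calculation. The per-instance capacity, the hourly price, and the load figures below are made-up numbers for this sketch, not data from any provider.

```python
import math


def instances_needed(load_rps: float, capacity_rps: float, minimum: int = 1) -> int:
    """Instances required to serve the load, never fewer than `minimum`."""
    return max(minimum, math.ceil(load_rps / capacity_rps))


def bill(hourly_load_rps: list[float], capacity_rps: float, price_per_hour: float) -> float:
    """Pay-per-use: charge only for the instance-hours actually provisioned."""
    hours = sum(instances_needed(load, capacity_rps) for load in hourly_load_rps)
    return hours * price_per_hour


load = [50, 120, 400, 90]  # requests per second, one value per hour
print([instances_needed(x, 100) for x in load])
print(bill(load, capacity_rps=100, price_per_hour=0.10))
```

With a fixed pool sized for the peak hour, the same four hours would cost 16 instance-hours instead of 8, which is the economic argument for elasticity.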
What are the enablers for Cloud Computing?
Cloud computing has two main enablers. The first is technological: secure and efficient HARDWARE VIRTUALIZATION technology. The second is economic: economies of scale in infrastructure; lightweight, contract-less, inexpensive computing.
What is the main technology behind the cloud?
The HYPERVISOR - it allows multiple guest operating systems to run concurrently on a host computer. It presents a virtual operating platform to each guest OS and monitors the execution of that guest OS. Multiple instances of a variety of OSs may share the virtualized hardware resources.
Which techniques of virtualization exist?
Four main techniques: hardware-assisted virtualization, full virtualization, paravirtualization, and operating-system-assisted virtualization.
Techniques of virtualization: hardware-assisted virtualization
Any critical operation can be detected on the fly, because the hardware allows the automatic trapping of all critical operations. Therefore, no scanning is needed a priori.
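A toy simulation of on-the-fly trapping, under the assumption that `HLT` and `OUT` stand in for critical operations. The instruction names and the `Trap` exception are invented for this sketch; real hardware extensions work at a much lower level.

```python
CRITICAL = {"HLT", "OUT"}  # hypothetical set of critical instructions


class Trap(Exception):
    """Raised by the simulated hardware when a guest issues a critical operation."""


def cpu_execute(instr: str) -> str:
    """Simulated hardware: traps critical operations automatically at run time."""
    if instr in CRITICAL:
        raise Trap(instr)
    return f"ran {instr}"


def run_guest(program: list[str]) -> list[str]:
    """Hypervisor loop: no prior scan; it only reacts when the hardware traps."""
    log = []
    for instr in program:
        try:
            log.append(cpu_execute(instr))
        except Trap as trap:
            log.append(f"emulated {trap.args[0]}")
    return log


print(run_guest(["ADD", "OUT", "MOV", "HLT"]))
```

The guest program is handed to the simulated CPU unmodified; detection happens only at the moment a critical instruction is executed.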
Techniques of virtualization: full virtualization
All executable code has to be scanned for critical operations. All of them have to be replaced by a trap, and the original operation has to be stored in a special directory so that the real operation can be retrieved if the trap is executed. The hypervisor then has to emulate the behavior expected by the virtual machine.
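The scan-and-replace step can be sketched as a toy simulation. The instruction names, the `TRAP n` encoding, and the dictionary used as the "special directory" are all assumptions made for this example; real binary translation operates on machine code, not strings.

```python
CRITICAL = {"HLT", "OUT"}  # hypothetical set of critical instructions


def scan_and_patch(code: list[str]) -> tuple[list[str], dict[int, str]]:
    """Replace each critical operation with a trap and remember the original."""
    patched, trap_table = [], {}
    for instr in code:
        if instr in CRITICAL:
            trap_id = len(trap_table)
            trap_table[trap_id] = instr  # original op, retrievable later
            patched.append(f"TRAP {trap_id}")
        else:
            patched.append(instr)
    return patched, trap_table


def run_patched(patched: list[str], trap_table: dict[int, str]) -> list[str]:
    """Execute patched code; on a trap, look up and emulate the real operation."""
    log = []
    for instr in patched:
        if instr.startswith("TRAP "):
            original = trap_table[int(instr.split()[1])]
            log.append(f"emulated {original}")
        else:
            log.append(f"ran {instr}")
    return log


patched, table = scan_and_patch(["ADD", "OUT", "MOV", "HLT"])
print(patched)
print(run_patched(patched, table))
```

Here the whole program is rewritten before it runs, which is the a priori scanning cost this technique pays.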