Block 4 part 1 Flashcards

Question 1

Q

What is the key metric availability and how do we calculate it

Answer

A

Availability is the probability that an application, service or system is available for use

A = uptime ÷ (uptime + downtime)
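
A quick worked example of the formula above, as a minimal sketch. The function name and the uptime and downtime figures are made-up assumptions, not values from the slides.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return availability as a fraction between 0 and 1."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


# Hypothetical figures: up for 8,750 hours and down for 10 hours in a year.
a = availability(8750, 10)
print(f"A = {a:.2%}")  # about 99.89%
```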

Question 2

Q

Give me examples of planned downtime

Answer

A

Backup and restoration, hardware, OS and network upgrades, application and database maintenance

Question 3

Q

Give me examples of unplanned downtime

Answer

A

Environmental factors, application errors, operator and user errors

Question 4

Q

What is the primary cause of downtime in data centers

Answer

A

UPS system failure

Question 5

Q

What is the average annual loss per company according to size

Answer

A

Small: 221 817
Medium: 450 000
Large: 927 823

Question 6

Q

How do we calculate reliability

Answer

A

Check slide 13
Reliability is the ability of a system to perform its function under given conditions for a specified period of time

Question 7

Q

What is mean time to failure (MTTF)

Answer

A

A measure of reliability for items that cannot be repaired
MTTF = (test period × number of items under test) ÷ number of items that fail

See example on slide 14
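
The MTTF formula can be checked with a short sketch. The numbers below are hypothetical and are not the ones from slide 14.

```python
def mttf(test_period_hours: float, items_under_test: int, items_failed: int) -> float:
    """MTTF = (test period x items under test) / items that failed."""
    if items_failed <= 0:
        raise ValueError("at least one failure is needed to estimate MTTF")
    return test_period_hours * items_under_test / items_failed


# Hypothetical test: 1,000 drives run for 1,000 hours and 2 of them fail.
print(mttf(1000, 1000, 2))  # 500000.0 hours
```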

Question 8

Q

What is annualised failure rate (AFR)

Answer

A

AFR = (8760 hours ÷ MTTF in hours) × 100%
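
A minimal sketch of the AFR calculation, assuming the common approximation that AFR is the 8,760 hours in a year divided by the MTTF, expressed as a percentage. The 500,000-hour MTTF is a made-up figure.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def afr_percent(mttf_hours: float) -> float:
    """Approximate annualised failure rate, as a percentage."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return HOURS_PER_YEAR / mttf_hours * 100


# Hypothetical item with an MTTF of 500,000 hours.
print(f"AFR = {afr_percent(500_000):.3f}%")  # AFR = 1.752%
```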

Question 9

Q

How do we calculate how many drives will fail in 1 year using AFR

Answer

A

Number of failures = number of drives × AFR
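
As a sketch of this estimate, assuming the AFR is given as a percentage (so it is divided by 100 first). The fleet size and AFR are hypothetical.

```python
def expected_failures(num_drives: int, afr_percent: float) -> float:
    """Expected number of drive failures in one year."""
    return num_drives * afr_percent / 100


# Hypothetical fleet: 10,000 drives with an AFR of 1.752%.
print(f"{expected_failures(10_000, 1.752):.1f} drives per year")  # 175.2 drives per year
```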

Question 10

Q

What is inherent availability and how do we calculate it

Answer

A

Inherent availability is the predicted availability of a system that has not been built yet

Ai = MTTF ÷ (MTTF + MTTR)

MTTR is the mean time to replace (a duration)
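
A small sketch of the Ai formula. The MTTF and MTTR values are hypothetical, chosen only to show the arithmetic.

```python
def inherent_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Ai = MTTF / (MTTF + MTTR), as a fraction between 0 and 1."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical component: fails every 10,000 hours on average, 4 hours to replace.
print(f"Ai = {inherent_availability(10_000, 4):.2%}")  # about 99.96%
```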

Question 11

Q

What is operational availability

Answer

A

Ao = MTBM ÷ (MTBM + MDT)

MTBM is the mean time between maintenance and MDT is the mean downtime
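
The same kind of check works for Ao. This is a sketch with hypothetical MTBM and MDT values.

```python
def operational_availability(mtbm_hours: float, mdt_hours: float) -> float:
    """Ao = MTBM / (MTBM + MDT), as a fraction between 0 and 1."""
    if mtbm_hours <= 0 or mdt_hours < 0:
        raise ValueError("MTBM must be positive and MDT non-negative")
    return mtbm_hours / (mtbm_hours + mdt_hours)


# Hypothetical system: maintained every 500 hours on average, down 5 hours each time.
print(f"Ao = {operational_availability(500, 5):.2%}")  # about 99.01%
```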

Question 12

Q

How do we increase availability

Answer

A

Load sharing: sharing the workload across a number of computers.

Requests from the internet arrive at the load share monitor, which distributes them to multiple nodes
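
A minimal sketch of a load share monitor. The card does not say how the monitor picks a node, so round-robin is an assumption here, and the class, method and node names are hypothetical.

```python
from itertools import cycle


class LoadShareMonitor:
    """Hands each incoming request to the next node in turn (round-robin)."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._next_node = cycle(nodes)

    def dispatch(self, request):
        # Pick the next node in the rotation and pair it with the request.
        node = next(self._next_node)
        return node, request


monitor = LoadShareMonitor(["node-a", "node-b", "node-c"])
assignments = [monitor.dispatch(f"req-{i}")[0] for i in range(5)]
print(assignments)  # ['node-a', 'node-b', 'node-c', 'node-a', 'node-b']
```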

Question 13

Q

What are the disadvantages of load sharing and the solutions

Answer

A

The monitor cannot track responses if a node fails
The monitor represents a single point of failure
Updates must be applied to each node independently
No guarantee that multiple requests from a client are directed to the same node

Solutions:
Incorporate cookies
Add a form of shared storage
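
One way the cookie idea could work, as a sketch only: the monitor records the chosen node in a cookie so that later requests from the same client go back to that node. The cookie name, class name and round-robin choice are all assumptions, not taken from the slides.

```python
from itertools import cycle


class StickyMonitor:
    """Routes a client to the node named in its cookie, if there is one."""

    COOKIE = "node"  # hypothetical cookie name

    def __init__(self, nodes):
        self._nodes = set(nodes)
        self._next_node = cycle(nodes)

    def dispatch(self, cookies):
        node = cookies.get(self.COOKIE)
        if node not in self._nodes:
            # First visit (or unknown node): choose one and remember it.
            node = next(self._next_node)
        # Return the node plus the cookies the client should send next time.
        return node, {**cookies, self.COOKIE: node}


monitor = StickyMonitor(["node-a", "node-b"])
first_node, cookies = monitor.dispatch({})
second_node, _ = monitor.dispatch(cookies)
print(first_node == second_node)  # True
```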

Question 14

Q

What is a heartbeat in load sharing

Answer

A

A small message sent regularly from each node to the monitor; if it is not received, the monitor assumes the node has failed
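
The heartbeat check can be sketched as follows. The 3-second timeout, the class name and the node names are hypothetical, and the timestamps are passed in by hand so the example is easy to follow.

```python
import time


class HeartbeatMonitor:
    """Marks a node as failed when its last heartbeat is older than the timeout."""

    def __init__(self, timeout_seconds=3.0):
        self.timeout = timeout_seconds
        self._last_seen = {}

    def heartbeat(self, node, now=None):
        # Record when this node last reported in.
        self._last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now=None):
        now = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.heartbeat("node-a", now=0.0)
monitor.heartbeat("node-b", now=0.0)
monitor.heartbeat("node-a", now=2.5)   # node-b stays silent
print(monitor.failed_nodes(now=4.0))   # ['node-b']
```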

Question 15

Q

What is clustering

Answer

A

A collection of independent computer nodes presented to the user as a single logical server
Its goal is to increase availability

There are two forms: active-active (each node runs a copy of the application) and active-passive (a standby node takes over on failure)

Question 16

Q

What is fault tolerance

Answer

A

The system continues to operate in the event of a single hardware failure

Question 17

Q

What are the potential benefits of virtualisation

Answer

A

Reduced cost of ownership, as two virtual servers operate on one computer
Additional protection by executing untrusted applications on a guest OS (sandboxing)
Legacy systems: emulate older peripherals that are no longer manufactured
Quick installation
Quick recovery

Question 18

Q

What is cloud computing

Answer

A

Combines high-availability hardware with virtual servers

Infrastructure as a service: provides a business with a complete set of computers

Platform as a service: provides a business with a computing platform

Software as a service: provides a business with an entire application

Does not increase availability; it gives better utilisation of processor performance

Question 19

Q

What is disaster recovery

Answer

A

Putting in place a plan that will enable a company to recover its IT systems from a disaster and to keep functioning during a disaster

It has to describe:
What activities should be done
Who should do the activities
When and in what sequence they should be done