Lecture Two - System Availability Flashcards
Information Technology (IT):
Encompasses technologies related to storage, retrieval, manipulation, and communication of information.
Includes computers, networks, phones, and fax machines.
Infrastructure Definition:
The underlying framework or features of a system or organization.
Fundamental facilities and systems serving a country, city, or area, such as transportation and communication systems.
Benefits of IT Infrastructure
Commonly Accepted Benefits:
Automates manual activities.
Handles increased volumes of data efficiently.
Extends the range of tasks that can be performed.
Enhances customer service quality.
Increases the quality of finished products.
Improves information sharing and manipulation capabilities.
Components of IT Infrastructure - Elements of IT Infrastructure
Business Process: Operations that support business goals.
Information and Data: Key resources for decision-making.
Applications and Servers: Software and hardware systems.
Buildings and Electricity Providers: Physical and power resources.
Hardware and Software: Essential computing equipment and programs.
Data & Storage: Systems for data management and retention.
Network Services: Connectivity and communication services.
IT System Model - System Layers
Process/Information: Core business processes and data handling.
Applications: Software tools and systems.
Application Integration: Ensures seamless operation and data flow.
Infrastructure: Physical and virtual resources.
IT System Model - Considerations
Availability: System uptime and reliability.
Performance: Efficiency and speed of operations.
Security: Protection against threats and vulnerabilities.
End User Devices: Interfaces for user interaction.
Operating Systems, Servers, Networks, Virtualisation, Data Centres: Core components for IT operation.
System Availability - Formula
Availability % = (Uptime / Measured Time Period) × 100
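A minimal sketch of the calculation, taking availability as uptime divided by the measured time period, times 100. The function name, the choice of hours as the unit, and the example figures are illustrative assumptions, not part of the lecture.

```python
def availability_percent(uptime_hours: float, period_hours: float) -> float:
    """Return availability as a percentage of the measured time period."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= uptime_hours <= period_hours:
        raise ValueError("uptime_hours must lie between 0 and period_hours")
    return uptime_hours / period_hours * 100


# Hypothetical example: a 30-day month (720 hours) with 1.5 hours of downtime.
print(round(availability_percent(720 - 1.5, 720), 3))  # prints 99.792
```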
System Availability and SLAs - Common Availability Levels
99.0%, 99.9%, 99.95% typically specified in SLAs.
99.999% known as carrier-grade availability.
System Availability and SLAs - Downtime Estimates
99.8%: 17.5 hours/year, 86.4 minutes/month, 20.2 minutes/week.
99.9%: 8.8 hours/year, 43.2 minutes/month, 10.1 minutes/week.
99.99%: 52.6 minutes/year, 4.3 minutes/month, 1.0 minute/week.
99.999%: 5.3 minutes/year, 25.9 seconds/month, 6.1 seconds/week.
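The downtime figures above can be derived from the availability percentage. The sketch below assumes a 365-day year and a 30-day month (the convention that reproduces the 43.2 minutes/month figure for 99.9%); the names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600
MINUTES_PER_MONTH = 30 * 24 * 60   # 43,200 (assumes a 30-day month)
MINUTES_PER_WEEK = 7 * 24 * 60     # 10,080


def allowed_downtime_minutes(availability_percent: float, period_minutes: float) -> float:
    """Return the downtime, in minutes, that the availability level permits."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return (100 - availability_percent) / 100 * period_minutes


for level in (99.8, 99.9, 99.99, 99.999):
    per_year = allowed_downtime_minutes(level, MINUTES_PER_YEAR)
    per_month = allowed_downtime_minutes(level, MINUTES_PER_MONTH)
    per_week = allowed_downtime_minutes(level, MINUTES_PER_WEEK)
    print(f"{level}%: {per_year:.1f} min/year, {per_month:.2f} min/month, {per_week:.2f} min/week")
```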
Unavailability Intervals - Definition
Used in conjunction with availability percentage to define acceptable downtime.
Example for 99.9% Availability:
The roughly 525 minutes (about 8.8 hours) of allowed downtime per year should not occur as a single event.
Downtime can be spread across many short events.
Unavailability Intervals - Interval Specifications
0 - 5 minutes: ≤ 35 times/year
5 - 10 minutes: ≤ 10 times/year
10 - 20 minutes: ≤ 5 times/year
20 - 30 minutes: ≤ 2 times/year
> 30 minutes: ≤ 1 time/year
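One way to check a year's outage log against the interval specification above is sketched below. The notes do not say which band an outage of exactly 5, 10, 20, or 30 minutes belongs to, so this sketch assumes the upper bound of each band is inclusive; the outage log is hypothetical.

```python
import math

# (lower bound in minutes, exclusive; upper bound, inclusive; max events per year)
INTERVAL_SPEC = [
    (0, 5, 35),
    (5, 10, 10),
    (10, 20, 5),
    (20, 30, 2),
    (30, math.inf, 1),
]


def violated_bands(outage_minutes):
    """Return (lower, upper, count, limit) for each band whose limit is exceeded."""
    violations = []
    for lower, upper, limit in INTERVAL_SPEC:
        count = sum(1 for duration in outage_minutes if lower < duration <= upper)
        if count > limit:
            violations.append((lower, upper, count, limit))
    return violations


# Hypothetical outage durations for one year, in minutes per event.
outages = [2, 3, 12, 25, 28, 29, 45]
print(violated_bands(outages))  # prints [(20, 30, 3, 2)]
```

In this example the total downtime (144 minutes) is well inside the 99.9% yearly budget, yet the log still breaks the specification because three outages fall in the 20 - 30 minute band.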
Estimating System Availability - SLAs
Provide upfront availability guarantees; actual availability is computed afterward.
Estimating System Availability - Estimation Factors
Mean Time to Repair (MTTR): Average time to repair/recover failed components.
Mean Time Between Failures (MTBF): Average time between failures.
Estimating System Availability - Timeline
Failure: Time when a system component fails.
Recovery: Time taken to repair the system.
MTTR and MTBF: Key metrics for assessing system reliability.
Estimating Availability with MTBF and MTTR
Estimated Availability % = (MTBF / (MTBF + MTTR)) × 100
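A short sketch of the estimate, computing MTBF / (MTBF + MTTR) × 100. The function name and the example values (an MTBF of 1,000 hours and an MTTR of 1 hour) are assumptions chosen for illustration.

```python
def estimated_availability_percent(mtbf_hours: float, mttr_hours: float) -> float:
    """Estimate availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if mttr_hours < 0:
        raise ValueError("mttr_hours must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours) * 100


# Hypothetical component: fails on average every 1,000 hours, takes 1 hour to repair.
print(round(estimated_availability_percent(1000, 1), 2))  # prints 99.9
```

Halving the MTTR has roughly the same effect on the estimate as doubling the MTBF, which is why fast recovery is often the cheaper route to higher availability.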
Observed Availability and Failures
Failure Probability: Changes over time, typically following a bathtub curve.
Failure Phases:
Early Failures: Initial phase with higher failure rates.
Random Failures: Stable phase with constant failure rate.
Wear-Out Failures: Increased failures as components age.
Observed Availability: Influenced by component reliability and failure rates.
Multi-Component Availability
Systems comprise multiple components, each with its own availability.
A(system) = A1 × A2 × A3 × … where A1, A2, A3, … are the availabilities of the individual components, all of which must be working for the system to work (components in series).
System Availability and Components - Graphical Representation
System availability decreases as the number of components increases.
Visualizes availability for different component reliability (99%, 95%, 90%).
System Availability with Multiple Components - Insight
Increasing the number of components increases the likelihood of system failures.
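The drop in availability as components are added can be reproduced by multiplying the individual availabilities together. This sketch assumes every component is required for the system to work and that failures are independent; the component counts (1, 5, 10) are arbitrary illustrative choices.

```python
import math


def series_availability(component_availabilities):
    """Multiply availabilities given as fractions between 0 and 1."""
    return math.prod(component_availabilities)


# Availability of a chain of identical components at 99%, 95%, and 90% each.
for single in (0.99, 0.95, 0.90):
    row = [round(series_availability([single] * n) * 100, 1) for n in (1, 5, 10)]
    print(f"{single:.0%} per component, n = 1, 5, 10 -> {row}")
```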
Redundancy in IT Systems - Purpose
Improves system availability and robustness by duplicating components/functions.
Acts as a backup to mitigate failures.
Redundancy in IT Systems - Cost Implications
Pros: Enhances reliability and reduces downtime.
Cons: Increases overall system cost.
Parallel System Availability
Availability improves as the number of systems/components in parallel increases.
Formula -
A (Parallel) = 1 - (1 - A)^m
A is the availability of a single system/component, m is the number in parallel.
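A minimal sketch of the parallel formula, 1 - (1 - A)^m. It assumes the m units are identical, fail independently, and that one working unit is enough to keep the service up; the function name is illustrative.

```python
def parallel_availability(a: float, m: int) -> float:
    """Availability of m identical units in parallel, with a given as a fraction."""
    if not 0 <= a <= 1:
        raise ValueError("a must be between 0 and 1")
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1 - (1 - a) ** m


# One, two, and three units that are each 99% available.
for m in (1, 2, 3):
    print(m, f"{parallel_availability(0.99, m):.6f}")
```

With these assumptions, each added 99% unit multiplies the unavailability by 0.01, so two units in parallel give 99.99% and three give 99.9999%.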
Business Continuity
Disaster Events: Potential incidents like fires, natural disasters, or social unrest.
Preparedness: Businesses must prepare for contingencies to ensure continuity.
Disaster Recovery Plan (DRP):
Outlines procedures to protect and recover IT infrastructure.
Ensures minimal disruption and swift recovery from incidents.
Business Continuity Concepts
Downtime and Data Loss Metrics:
Recovery Time Objective (RTO):
Time needed to restore a business process.
Indicates the maximum allowable downtime.
Recovery Point Objective (RPO):
Maximum acceptable age of the data restored after an incident.
Commonly set at 24 hours, meaning up to 24 hours of data created between the last backup and the incident may be lost.
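The two objectives can be checked against an incident timeline, as in the sketch below. It treats the time from the incident to restoration as the downtime compared with the RTO, and the time from the last backup to the incident as the data-loss window compared with the RPO. All timestamps and the 4-hour RTO are hypothetical values.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, incident, restored, rpo, rto):
    """Return (rpo_met, rto_met) for a single incident."""
    data_loss_window = incident - last_backup   # data created since the last backup
    downtime = restored - incident              # time the business process was down
    return data_loss_window <= rpo, downtime <= rto


# Hypothetical incident: nightly backup at 02:00, failure at 20:00, restored at 01:00.
rpo_met, rto_met = meets_objectives(
    last_backup=datetime(2024, 1, 10, 2, 0),
    incident=datetime(2024, 1, 10, 20, 0),
    restored=datetime(2024, 1, 11, 1, 0),
    rpo=timedelta(hours=24),
    rto=timedelta(hours=4),
)
print(rpo_met, rto_met)  # prints True False
```

Here 18 hours of data are lost, which is inside the 24-hour RPO, but the 5 hours of downtime exceed the assumed 4-hour RTO.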
Trends in IT Infrastructure
Emerging Trends:
Cloud Computing
Bring Your Own Device (BYOD)
Green IT
Big Data Analytics
Cloud Computing
Paradigm Shift: Enabled by virtualization technologies.
Shared Resources: Applications utilize resources from a virtualized pool.
On-Demand: Resources scale up/down based on demand.
Deployment Models:
Public Cloud: Available to the general public.
Private Cloud: Dedicated to a single organization, managed internally or by a third party.
Hybrid Cloud: Combines public and private models.
Cloud Computing Service Models
Service Models:
Software-as-a-Service (SaaS): Provides software applications over the internet.
Platform-as-a-Service (PaaS): Offers hardware and software tools over the internet.
Infrastructure-as-a-Service (IaaS): Delivers computing infrastructure over the internet.
Future Lecture Topic: Further details on Cloud Computing to be covered later.
Bring Your Own Device (BYOD) - Trend Description
Employees using personal devices (smartphones, tablets, laptops) for work.
Access organizational applications and data for information processing and communication.
Benefits of Bring Your Own Device (BYOD)
No commitment to maintaining devices.
Low capital expenditure.
Agile decision-making and minimal operational expenditure.
Familiarity increases efficiency and productivity.
Potential cost savings from using employees' own device and service contracts.
Risks of Bring Your Own Device (BYOD)
Security and Management Risks:
Multiple devices not fully controlled by system managers.
Data privacy and confidentiality concerns.
Devices may carry malware in from external networks.
Ownership Concerns:
Employee-owned phone numbers might be dialled by business contacts.
Green IT
Environmental Goal: Reduce the environmental impact of IT infrastructures.
Strategies:
Reduce electricity usage and CO2 emissions.
Use greener equipment and increase efficiency.
Examples:
Flash disks instead of rotating disks.
Blade servers sharing power supplies to reduce consumption.
Power Usage Effectiveness (PUE)
Efficiency Metric:
Measures the energy efficiency of a data centre.
PUE = Total Facility Energy / IT Equipment Energy
Power Usage Effectiveness (PUE) - Interpretation
PUE of 2.0 indicates that for every watt of IT power, an additional watt is used for cooling and distribution.
PUE closer to 1.0 indicates higher efficiency, with more energy used for computing.
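A small sketch of the metric, taking PUE as total facility energy divided by IT equipment energy, which is the definition that matches the interpretation of a PUE of 2.0 given above. The function names and the kWh figures are illustrative assumptions.

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy per unit of IT energy."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("it_equipment_energy_kwh must be positive")
    if total_facility_energy_kwh < it_equipment_energy_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_energy_kwh / it_equipment_energy_kwh


def overhead_per_it_unit(pue_value: float) -> float:
    """Energy spent on cooling, distribution, etc. per unit of IT energy."""
    return pue_value - 1


# Hypothetical data centre: 2,000 kWh drawn in total, 1,000 kWh of it by IT equipment.
value = pue(2000, 1000)
print(value, overhead_per_it_unit(value))  # prints 2.0 1.0
```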
Big Data Analytics - Data Generation
Generated by ubiquitous sensors, mobile telephony, surveillance cameras, RFID tags, and social networks.
Big Data Analytics - Data Characteristics
Raw data is often unstructured.
Investments focus on managing and maintaining large datasets.
Big Data Analytics
Analyzing large datasets to discover meaningful patterns, trends, and associations.
Enhances decision-making and strategic planning.