Applying site reliability engineering principles to a service Flashcards
What is the primary goal of site reliability engineering (SRE)?
The primary goal of SRE is to keep a service reliable (highly available, performant, and scalable) by applying software engineering practices to operations work.
What is the difference between availability and reliability?
Availability is the percentage of time a service is operational and reachable, while reliability is the ability of a service to perform its intended function correctly over a given period.
What is the “four nines” standard for availability?
The “four nines” standard for availability is 99.99%, which means that a service may be unavailable for at most 0.01% of the time, or roughly 52.6 minutes per year.
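The arithmetic behind an availability percentage can be sketched as follows. This is a minimal illustration, not a standard tool: the function name `allowed_downtime_minutes` and the 365-day year are assumptions made for the example.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the minutes of downtime permitted over a period at a given availability."""
    # Fraction of the period during which the service may be unavailable.
    unavailable_fraction = 1.0 - availability_pct / 100.0
    # Convert the period from days to minutes, then take the unavailable share.
    return unavailable_fraction * period_days * 24 * 60


# Assumed reporting windows: one 365-day year and one 30-day month.
four_nines_year = allowed_downtime_minutes(99.99)
four_nines_30d = allowed_downtime_minutes(99.99, period_days=30)
print(f"{four_nines_year:.2f} min/year, {four_nines_30d:.2f} min/30 days")
```

Each additional nine divides the permitted downtime by ten, which is why moving from three nines to four is usually far more expensive than the small percentage change suggests.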
What is the difference between a service level objective (SLO) and a service level agreement (SLA)?
An SLO is an internal target, such as an availability level, set by the SRE team, while an SLA is a contract with an external customer or service provider.
What are the three main components of a site reliability engineering (SRE) workflow?
The three main components of an SRE workflow are incident management, service improvement, and capacity planning.
What is incident management in SRE?
Incident management in SRE is the process of identifying, triaging, and resolving service disruptions.
What is service improvement in SRE?
Service improvement in SRE is the process of identifying and addressing service-related issues to improve overall performance and reliability.
What is capacity planning in SRE?
Capacity planning in SRE is the process of forecasting and managing the resources needed to support a service’s growth and performance.
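One simple way to picture capacity forecasting is a compound-growth projection with a safety margin. The sketch below is an illustration under stated assumptions: the function `instances_needed`, the constant monthly growth rate, and the 30% headroom are all hypothetical choices, and real planning would be based on measured load and load tests.

```python
import math


def instances_needed(
    peak_rps: float,
    monthly_growth: float,
    months_ahead: int,
    rps_per_instance: float,
    headroom: float = 0.3,
) -> int:
    """Estimate how many instances cover projected peak load plus headroom."""
    # Project the peak load forward, assuming constant compound monthly growth.
    projected_rps = peak_rps * (1 + monthly_growth) ** months_ahead
    # Add spare capacity for traffic spikes and instance failures.
    required_rps = projected_rps * (1 + headroom)
    # Round up: a fraction of an instance cannot be provisioned.
    return math.ceil(required_rps / rps_per_instance)


# Hypothetical inputs: 1000 requests/s today, 10% monthly growth,
# a six-month horizon, and 200 requests/s per instance.
print(instances_needed(1000, 0.10, 6, 200))
```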
What is the difference between reactive and proactive incident management?
Reactive incident management is the process of addressing service disruptions after they occur, while proactive incident management is the process of identifying and addressing potential service disruptions before they occur.
What is a post-incident review?
A post-incident review is the process of analyzing a service disruption to determine its root cause and identify ways to prevent it from happening again.
What is a service-level indicator (SLI)?
A service-level indicator (SLI) is a metric used to measure the performance and availability of a service.
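A common way to express an availability SLI is the ratio of successful requests to total requests over a window. The following is a minimal sketch; the function name `availability_sli` and the choice to treat a window with no traffic as fully available are assumptions made for illustration.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Return the fraction of requests served successfully in a window."""
    if total_requests == 0:
        # Assumption: a window with no traffic is treated as fully available.
        return 1.0
    return good_requests / total_requests


# Hypothetical window: 10,000 requests, 10 of which failed.
sli = availability_sli(good_requests=9_990, total_requests=10_000)
print(f"availability SLI: {sli:.3%}")
```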
What is a service-level objective (SLO)?
A service-level objective (SLO) is a target value for a service-level indicator, such as an availability level, set by the SRE team.
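Checking a measured indicator against its objective can be sketched as below. The names `meets_slo` and `allowed_failures_consumed`, and the 99.9% target, are hypothetical; the second function reports how much of the failure allowance implied by the target has been used, which is one possible way to track progress against an SLO.

```python
def meets_slo(good: int, total: int, slo_target: float) -> bool:
    """Return True if the measured success ratio is at or above the target."""
    if total == 0:
        # Assumption: no traffic means the objective is trivially met.
        return True
    return good / total >= slo_target


def allowed_failures_consumed(good: int, total: int, slo_target: float) -> float:
    """Return the share of the permitted failures that has been used (1.0 = all)."""
    allowed_bad = (1 - slo_target) * total
    actual_bad = total - good
    if allowed_bad == 0:
        # With a 100% target or no traffic, any failure exceeds the allowance.
        return 0.0 if actual_bad == 0 else float("inf")
    return actual_bad / allowed_bad


# Hypothetical window: 1,000,000 requests, 500 failures, 99.9% target.
print(meets_slo(999_500, 1_000_000, 0.999))
print(f"{allowed_failures_consumed(999_500, 1_000_000, 0.999):.0%} of allowance used")
```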
What is a service-level agreement (SLA)?
A service-level agreement (SLA) is a contract with an external customer or service provider.
What is the difference between a service-level indicator (SLI) and a service-level agreement (SLA)?
An SLI is a metric used to measure the performance and availability of a service, while an SLA is a contract with an external customer or service provider.
What is a service-level target (SLT)?
A service-level target (SLT) is a specific availability level that a service is expected to meet, as defined by the SLO.