Service Level Objectives Flashcards

1
Q

What is an SLI?

A

Service Level Indicator - a carefully defined quantitative measure of some aspect of the level of service that is provided.

2
Q

What are some examples of an SLI?

A

Latency, error rate, and system throughput.

3
Q

What is request latency?

A

How long it takes to return a response to a request

4
Q

What is availability?

A

The fraction of the time that a service is usable

5
Q

How is availability often defined in practice?

A

In terms of the fraction of well-formed requests that succeed, sometimes called yield.
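As a rough illustration of the yield definition, here is a minimal Python sketch. The `Request` record and its two flags are hypothetical names chosen for this example, not part of any particular monitoring system; a real service would derive them from its own request logs.

```python
from dataclasses import dataclass


@dataclass
class Request:
    well_formed: bool  # hypothetical flag: the request itself was valid
    succeeded: bool    # hypothetical flag: the service answered it successfully


def availability_yield(requests):
    """Return the fraction of well-formed requests that succeeded."""
    valid = [r for r in requests if r.well_formed]
    if not valid:
        return None  # no well-formed traffic, so yield is undefined
    return sum(r.succeeded for r in valid) / len(valid)


log = [
    Request(True, True),
    Request(True, True),
    Request(True, False),
    Request(False, False),  # malformed: excluded from the denominator
    Request(True, True),
]
print(availability_yield(log))  # 0.75 (3 of 4 well-formed requests succeeded)
```

Malformed requests are left out of the denominator, so client mistakes do not count against the service.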

6
Q

What is durability?

A

The likelihood that data will be retained over a long period of time

7
Q

What is an SLO?

A

Service Level Objective - a target value or range of values for a service level that is measured by an SLI.

A natural structure for SLOs is SLI ≤ target, or lower bound ≤ SLI ≤ upper bound.
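The two SLO shapes can be sketched as a single check. This is only an illustration: the function name `meets_slo` and the numbers are assumptions made up for the example.

```python
def meets_slo(sli, upper=None, lower=None):
    """Return True when lower <= sli <= upper; a bound left as None is not checked."""
    if lower is not None and sli < lower:
        return False
    if upper is not None and sli > upper:
        return False
    return True


# Hypothetical latency SLO: SLI <= 100 ms.
print(meets_slo(87.0, upper=100.0))   # True
print(meets_slo(120.0, upper=100.0))  # False

# Hypothetical two-sided SLO: 1.0 <= SLI <= 10.0.
print(meets_slo(0.5, lower=1.0, upper=10.0))  # False
```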

8
Q

What is an SLA?

A

Service Level Agreement - an explicit or implicit contract with your users that includes consequences of meeting (or missing) the SLOs it contains.

9
Q

What SLIs do user-facing systems care about?

A

Availability, latency, and throughput.

In other words: Could we respond to the request? How long did it take to respond? How many requests could be handled?

10
Q

What SLIs do storage systems care about?

A

Latency, availability, and durability.

In other words: How long does it take to read or write data? Can we access the data on demand? Is the data still there when we need it?

11
Q

What SLIs do big data systems care about?

A

Throughput and end-to-end latency.

In other words: How much data is being processed? How long does it take the data to progress from ingestion to completion?

12
Q

What SLIs should all systems care about?

A

Correctness: was the right answer returned, the right data retrieved, the right analysis done?

Correctness is important to track as an indicator of system health, even though it’s often a property of the data in the system rather than the infrastructure per se, and so usually not an SRE responsibility to meet.

13
Q

What are some examples of modern monitoring systems that provide metrics for SLIs?

A

Prometheus and Borgmon

14
Q

When is an average a poor way to summarize a metric?

A

For metrics like request latency, where a few very slow requests can be hidden by many fast ones. Percentiles often do a better job of showing the shape of the latency distribution.
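A small worked example may make the point concrete. The latency values are invented for illustration, and `percentile` is a simple nearest-rank helper written for this sketch rather than the method any particular monitoring system uses.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical sample: 98 fast requests and 2 very slow ones (milliseconds).
latencies_ms = [20] * 98 + [5000] * 2

mean = sum(latencies_ms) / len(latencies_ms)
print(mean)                          # 119.6
print(percentile(latencies_ms, 50))  # 20
print(percentile(latencies_ms, 99))  # 5000
```

The mean of 119.6 ms describes no request in the sample. The median shows that the typical request takes 20 ms, and the 99th percentile exposes the 5-second tail.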
