l10 observability Flashcards

Question 1

Q

Observability

Answer

A

Telemetry Collection:
* Collecting metrics from various sources across all layers (L3–L7)
* Needs metrics from the infrastructure as well as from the application elements (deployments & services)
Analytics and Visibility:
* Visualizing and analyzing metrics
* Mechanism for reporting anomalies
* Packet capture from pod to pod is necessary
Security and Troubleshooting:
* Tracing or a service mesh
* Mechanisms for prediction and analytics

Question 2

Q

eBPF

Answer

A

Extended Berkeley Packet Filter:
* Sandboxed programs that are loaded from user space and run inside the kernel
* Cilium (as one example) uses eBPF for better networking instead of iptables

Question 3

Q

Target of Observability, Critical Path

Answer

A

A span A is part of the critical path at time t if and only if:
* A’s parent is blocked on A’s completion at time t
* A is not blocked on any child span’s completion at time t
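The two conditions above can be sketched in code. This is a minimal illustration, not a tracing library API: the `Span` class and all helper names are hypothetical, and it assumes purely synchronous, sequential calls, so a parent counts as "blocked" on a child exactly while that child is running. A root span has no parent; the sketch assumes its caller is waiting on it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical span record: a name plus start and end timestamps."""
    name: str
    start: float
    end: float
    children: List["Span"] = field(default_factory=list)
    parent: Optional["Span"] = field(default=None, repr=False, compare=False)

    def add_child(self, child: "Span") -> "Span":
        child.parent = self
        self.children.append(child)
        return child


def is_running(span: Span, t: float) -> bool:
    return span.start <= t < span.end


def blocked_on_any_child(span: Span, t: float) -> bool:
    # Assumption: synchronous calls, so the span waits while a child runs.
    return any(is_running(child, t) for child in span.children)


def parent_blocked_on(span: Span, t: float) -> bool:
    # Assumption: the caller of a root span is always waiting on it.
    if span.parent is None:
        return True
    return is_running(span.parent, t) and is_running(span, t)


def on_critical_path(span: Span, t: float) -> bool:
    """Both conditions from the card, evaluated at time t."""
    return (
        is_running(span, t)
        and parent_blocked_on(span, t)
        and not blocked_on_any_child(span, t)
    )


# Example trace: request -> (db -> query), then render.
root = Span("request", 0, 100)
db = root.add_child(Span("db", 10, 40))
query = db.add_child(Span("query", 15, 30))
render = root.add_child(Span("render", 50, 90))
```

At t = 20 only `query` is on the critical path, because `db` and `request` are both waiting on a child. At t = 45 no child is running, so `request` itself is on the critical path.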

Question 4

Q

SLI

Answer

A

Service Level Indicator
What are we measuring?
E.g. How long does it take to return the search results?

Basis for defining availability
For one specific action/attribute
Of one specific service
Examples to be defined:
* Golden signals of one specific operation of one specific service (the more concise the better) for a network service
* Durability for storage
* Correctness for computation
Just the metrics, no thresholds / rules to meet
Must be derivable automatically
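As a rough illustration of "just the metrics, no thresholds", the sketch below derives two SLIs automatically from raw measurements of one operation. The function names, the nearest-rank percentile method, and the rule that status codes below 500 count as successful are assumptions made for this example, not part of the card.

```python
import math
from typing import Sequence


def latency_percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile of latency samples (p between 1 and 100)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def success_ratio(status_codes: Sequence[int]) -> float:
    """Share of requests without a server error (assumption: code < 500)."""
    if not status_codes:
        raise ValueError("need at least one request")
    ok = sum(1 for code in status_codes if code < 500)
    return ok / len(status_codes)


# Hypothetical measurements for one operation of one service.
search_latencies = [120, 80, 95, 400, 110, 90, 105, 85, 100, 130]
search_statuses = [200, 200, 500, 200]

median_latency = latency_percentile(search_latencies, 50)
tail_latency = latency_percentile(search_latencies, 95)
availability = success_ratio(search_statuses)
```

Both values are plain measurements. Whether they are good enough is decided separately, by an SLO.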

Question 5

Q

SLO

Answer

A

Service Level Objective
How well do we perform on the SLI?
E.g. Queries should return results within 500 ms

Threshold to be held for the defined SLIs:
SLI <= target threshold
Technically hard to define, needs to be refined
Wrong SLI → no use
Threshold too low → customers/services affected
Threshold too high → too many incidents, false alarms
Must be simple yet holistic
Avoid absolutes (always available, for all data accesses, etc.)
Organizationally hard to develop
Must be defined together with product management
Have as few SLOs as possible (but as many as necessary)
Example: p95(http_latency{path=webappl/impressum}) < 50
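The p95 example can be read as "an SLI plus a threshold". A minimal sketch of that idea follows; the `SLO` class and the sample data are hypothetical. The card does not state a unit for the threshold of 50, so the samples are simply assumed to be in the same unit as the threshold.

```python
from dataclasses import dataclass
from statistics import quantiles
from typing import Callable, Sequence


def p95(samples: Sequence[float]) -> float:
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples, n=100)[94]


@dataclass
class SLO:
    """Hypothetical SLO: an SLI function plus the threshold it must stay below."""
    name: str
    sli: Callable[[Sequence[float]], float]
    threshold: float

    def is_met(self, samples: Sequence[float]) -> bool:
        return self.sli(samples) < self.threshold


impressum_latency = SLO("p95 latency of webappl/impressum", p95, 50)

# Made-up latency samples, same unit as the threshold.
fast = [20, 22, 25, 21, 30, 28, 24, 26, 23, 27,
        29, 31, 22, 25, 24, 26, 28, 30, 27, 35]
slow = fast[:15] + [80, 90, 120, 150, 200]

fast_ok = impressum_latency.is_met(fast)
slow_ok = impressum_latency.is_met(slow)
```

Keeping the SLI function separate from the threshold mirrors the split between cards: the SLI only measures, and the SLO adds the rule.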

Question 6

Q

SLA

Answer

A

Service Level Agreement
Consequences for missing objectives
E.g. apology, refund, …

Result if the SLO is not met
Legal yet simple language with fixed, defined consequences
Promise to the customer, defined by product management (no longer by DevOps)
Not of interest for the rest of the lecture, since it is defined in contracts rather than in source code

Question 7

Q

4 causes for failure

Answer

A

Internal system changes,
Changes in user behaviour,
Changes in dependencies,
Changes in platform

These are the system boundaries

Question 8

Q

Availability, Parallel vs Serial / Sequential

Answer

A

Parallel:
* HA setup of the same service
* E.g. horizontal scaling
Parallel Component = 1 - (1 - C1) * (1 - C2)
Serial:
* Different services of different kinds
* E.g. database, messaging, and load balancer
Series Component = C1 * C2 * C3 * C4 …
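Both formulas are easy to check numerically. The sketch below generalizes the parallel formula to any number of redundant components and assumes that failures are independent. The function names and the availability figures are illustrative only.

```python
from math import prod


def parallel_availability(*components: float) -> float:
    """Redundant components: the group is down only if every component is down."""
    return 1 - prod(1 - c for c in components)


def serial_availability(*components: float) -> float:
    """Chained components: the chain is up only if every component is up."""
    return prod(components)


# Two replicas of the same service, each 99% available.
ha_pair = parallel_availability(0.99, 0.99)      # about 0.9999

# Database, messaging, and load balancer in series.
chain = serial_availability(0.99, 0.999, 0.995)  # about 0.98406
```

Redundancy pushes availability above that of any single replica, while a serial chain is always less available than its weakest component.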


(8 cards)