Cloud Native Observability Flashcards
Overarching goal of observability
Enabling analysis of the collected data, to better understand the system and react to error states
Observability should give answers to questions:
Is the system stable or does it change its state when manipulated?
Is the system sensitive to change, e.g. if some services have high latency?
Do certain metrics in the system exceed their limits?
Why does a request to the system fail?
Are there any bottlenecks in the system?
The term observability stems from
control theory
Logs
Messages that an application emits to present errors, warnings, or debug information. A simple log entry could record the start and end of a specific task that the application performed.
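As a minimal sketch of the "start and end of a task" log entries described above, the following uses Python's standard `logging` module. The logger name `flashcards.backup`, the task name, and the message format are assumptions made for illustration only.

```python
import io
import logging


def run_task(name: str, stream: io.StringIO) -> None:
    """Log the start and end of a (hypothetical) task to the given stream."""
    logger = logging.getLogger(f"flashcards.{name}")
    logger.setLevel(logging.DEBUG)   # let debug messages through
    logger.propagate = False         # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    try:
        logger.info("task %s started", name)
        logger.debug("task %s is doing its work", name)
        logger.info("task %s finished", name)
    finally:
        logger.removeHandler(handler)


buffer = io.StringIO()
run_task("backup", buffer)
print(buffer.getvalue(), end="")
```

Writing to an in-memory stream keeps the example self-contained; a real application would more likely log to stdout or a file and let a collector pick the entries up.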
Metrics
Metrics are quantitative measurements taken over time. This could be the number of requests or an error rate.
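To make "measurements taken over time" concrete, here is a small sketch that derives an error rate and a request rate from two samples of running totals. The `Sample` type and both helper functions are hypothetical names invented for this example, not part of any monitoring library.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    timestamp: float  # seconds since some reference point
    requests: int     # total requests handled so far
    errors: int       # total failed requests so far


def error_rate(first: Sample, last: Sample) -> float:
    """Fraction of requests that failed between the two samples."""
    handled = last.requests - first.requests
    if handled <= 0:
        return 0.0
    return (last.errors - first.errors) / handled


def requests_per_second(first: Sample, last: Sample) -> float:
    """Average request throughput between the two samples."""
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        return 0.0
    return (last.requests - first.requests) / elapsed


first = Sample(timestamp=0.0, requests=100, errors=2)
last = Sample(timestamp=60.0, requests=400, errors=17)
print(error_rate(first, last), requests_per_second(first, last))
```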
Traces
Traces track the progression of a request as it passes through the system. In a distributed system, they show when each service processed the request and how long it took.
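The idea of a trace can be sketched as a list of timed spans that share one trace ID, with each span pointing to its parent. This is an illustration in plain Python, not the API of a real tracing library; the `Span` class, the `span` helper, and the service names `frontend` and `orders` are all assumptions.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str
    parent_id: Optional[str]  # None for the root span
    service: str
    start: float
    duration: float = 0.0


@contextmanager
def span(spans: List[Span], trace_id: str, service: str,
         parent: Optional[Span] = None) -> Iterator[Span]:
    """Time the enclosed block and record it as a span when it ends."""
    current = Span(trace_id, uuid.uuid4().hex[:8],
                   parent.span_id if parent else None,
                   service, time.perf_counter())
    try:
        yield current
    finally:
        current.duration = time.perf_counter() - current.start
        spans.append(current)


spans: List[Span] = []
trace_id = uuid.uuid4().hex
with span(spans, trace_id, "frontend") as root:
    with span(spans, trace_id, "orders", parent=root):
        time.sleep(0.01)  # stand-in for real work in the downstream service

for s in spans:
    print(f"{s.service}: {s.duration * 1000:.1f} ms (parent={s.parent_id})")
```

Spans are recorded when they finish, so the inner `orders` span appears in the list before the `frontend` span that encloses it.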
Telemetry
Collecting data points and transferring them to external systems
___ is an open source monitoring system
Prometheus
Four core metric types in Prometheus
Counter
Gauge
Histogram
Summary
Prometheus Counter
Cumulative value that only goes up, or resets to zero on restart, e.g. the number of requests served
Prometheus Gauge
Value that can both increase and decrease
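The difference between a counter and a gauge can be sketched in a few lines of plain Python. These classes only mimic the semantics for illustration; they are not the Prometheus client library, and the variable names `requests_total` and `in_flight` are made up for the example.

```python
class Counter:
    """Value that can only increase (a restart would begin again at zero)."""

    def __init__(self) -> None:
        self._value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter can only be incremented")
        self._value += amount

    @property
    def value(self) -> float:
        return self._value


class Gauge:
    """Value that can be set freely and can go up or down."""

    def __init__(self) -> None:
        self._value = 0.0

    def set(self, value: float) -> None:
        self._value = value

    def inc(self, amount: float = 1.0) -> None:
        self._value += amount

    def dec(self, amount: float = 1.0) -> None:
        self._value -= amount

    @property
    def value(self) -> float:
        return self._value


requests_total = Counter()
in_flight = Gauge()
for _ in range(3):
    requests_total.inc()  # one more request served
    in_flight.inc()       # request starts
    in_flight.dec()       # request ends
print(requests_total.value, in_flight.value)
```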
Prometheus Histogram
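Samples observations (e.g. request durations or response sizes) and counts them in configurable buckets; also exposes the sum and total count of observations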
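Prometheus histograms count observations in buckets defined by upper bounds and expose the buckets cumulatively, alongside a total count and a sum. The sketch below mimics that bookkeeping in plain Python. The `Histogram` class and the bucket bounds are assumptions chosen for illustration, not the client library's implementation.

```python
import math
from bisect import bisect_left
from typing import List, Sequence, Tuple


class Histogram:
    def __init__(self, upper_bounds: Sequence[float]) -> None:
        # A final +Inf bucket catches every observation.
        self.upper_bounds: List[float] = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.count = 0
        self.sum = 0.0

    def observe(self, value: float) -> None:
        # First bucket whose upper bound is >= value ("less than or equal").
        index = bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.count += 1
        self.sum += value

    def cumulative(self) -> List[Tuple[float, int]]:
        """Return (upper_bound, observations <= upper_bound) pairs."""
        running = 0
        result = []
        for bound, bucket_count in zip(self.upper_bounds, self.bucket_counts):
            running += bucket_count
            result.append((bound, running))
        return result


latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.1, 0.3, 2.0):
    latency.observe(seconds)
print(latency.cumulative(), latency.count, latency.sum)
```

Because the buckets are cumulative, the count in the `+Inf` bucket always equals the total number of observations.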
Prometheus Summary
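Similar to a histogram: exposes the total count and sum of observations, but calculates configurable quantiles over a sliding time window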
Tool to create alerts from Prometheus
Alertmanager
Used to build Prometheus dashboards
Grafana