Why monitor Flashcards
Name the main reasons for monitoring
- detect issues fast, before a client calls
- get a clear picture of resource usage, which helps optimise spending
What to monitor
- resource usage: CPU, memory, disk space
- request rate
- error rate
- number of running instances of each app
What are a sample and a timeseries?
- sample: a timestamp plus a value (example: the temperature in New York on a given day)
- timeseries: a list of samples ordered by time (example: the temperature in New York for a few days in a row)
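The two definitions above can be sketched as a small data structure. This is only an illustration: the names `Sample` and `make_timeseries` and the readings are made up, and timestamps are assumed to be Unix seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    timestamp: int  # assumed to be Unix time in seconds
    value: float    # the measurement taken at that moment


def make_timeseries(samples):
    """Return the samples ordered by timestamp; the ordering is what makes them a timeseries."""
    return sorted(samples, key=lambda s: s.timestamp)


# Hypothetical temperature readings for New York, taken one day (86400 s) apart
# and deliberately listed out of order.
readings = [
    Sample(timestamp=1_700_086_400, value=31.0),
    Sample(timestamp=1_700_000_000, value=33.0),
    Sample(timestamp=1_700_172_800, value=29.5),
]
temperature_new_york = make_timeseries(readings)
```

Each element of `temperature_new_york` is one sample; the ordered list as a whole is the timeseries.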
What is a TSDB?
- timeseries database (TSDB): a special type of DB used to store timeseries, optimised for querying by time
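Explain quantiles and percentiles
They express the same thing on different scales: a quantile is a fraction between 0 and 1, a percentile is that fraction written as a percentage, so the 0.9 quantile is the 90th percentile. For request duration, the following statements are equivalent:
- 90% of requests take no more than 1 s, OR
- the 0.9 quantile equals 1 s, OR
- the 90th percentile equals 1 s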
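To make the equivalence concrete, here is a minimal sketch that computes both on a made-up list of request durations. It uses the simple nearest-rank definition, which is an assumption chosen for readability; monitoring systems that estimate quantiles from histogram buckets will generally give slightly different numbers. The function names and the data are hypothetical.

```python
import math


def quantile(q, values):
    """Nearest-rank quantile: the smallest value such that at least a fraction q of values are <= it."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered))  # 1-based position in the sorted list
    return ordered[rank - 1]


def percentile(p, values):
    """The p-th percentile is the (p / 100) quantile."""
    return quantile(p / 100, values)


# Hypothetical request durations in seconds.
durations = [0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.0, 4.0]
q90 = quantile(0.9, durations)
p90 = percentile(90, durations)
```

Both `q90` and `p90` come out as 1.0 here: nine of the ten requests finished within 1 s, and the single 4 s outlier does not affect the result.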
Give an example of a metric name and its labels
temperature{city="New York"} 33
temperature is the metric name and city is a label. 33 is the sample value; it is stored with the sample rather than as a label
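What are the job and instance labels?
Prometheus attaches these two labels to every scraped timeseries out of the box. job is the configured job name, typically the service name, for example location-service; instance is the target the service was scraped from, usually the host and port of the node it runs on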
What is an instant vector?
A set of samples, one per timeseries, that all share the same timestamp, for example the temperature in different places, with different values, but at the same time!
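As a rough illustration, an instant vector can be built from in-memory timeseries by picking, for each series, the latest sample at or before the evaluation time. The `series` data and the `instant_vector` helper are made up for this sketch, and real query engines add rules (such as ignoring samples that are too old) that are left out here.

```python
# Each timeseries is keyed by (metric name, city) and holds (timestamp, value)
# samples ordered by time. All data here is hypothetical.
series = {
    ("temperature", "New York"): [(100, 33.0), (110, 32.5), (120, 32.0)],
    ("temperature", "Los Angeles"): [(100, 27.0), (110, 27.5), (120, 28.0)],
}


def instant_vector(all_series, at):
    """For every series, take the most recent value at or before `at` and stamp it with `at`."""
    result = {}
    for key, samples in all_series.items():
        earlier = [value for ts, value in samples if ts <= at]
        if earlier:  # series with no sample yet are simply absent from the result
            result[key] = (at, earlier[-1])
    return result


now = instant_vector(series, at=115)
```

Every entry in `now` carries the timestamp 115: one value per city, all for the same moment.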
What is a range vector?
For each selected timeseries, the subset of its samples that falls within a given time range, for example the temperature in Los Angeles over the last 20 s
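A minimal sketch of the same idea for a single timeseries: slice out the samples whose timestamps fall inside the window. The `range_vector` helper and the readings are hypothetical, and treating both ends of the window as inclusive is an assumption of this sketch.

```python
from bisect import bisect_left, bisect_right


def range_vector(samples, start, end):
    """Return the samples with start <= timestamp <= end; `samples` must be sorted by time."""
    timestamps = [ts for ts, _ in samples]
    lo = bisect_left(timestamps, start)   # first sample at or after start
    hi = bisect_right(timestamps, end)    # one past the last sample at or before end
    return samples[lo:hi]


# Hypothetical (timestamp, value) readings for Los Angeles, one every 5 s.
temperature_los_angeles = [
    (100, 27.0), (105, 27.2), (110, 27.5), (115, 27.8), (120, 28.0), (125, 28.1),
]
last_20s = range_vector(temperature_los_angeles, start=125 - 20, end=125)
```

Because the samples are ordered by time, the window can be found with two binary searches instead of a full scan, which is the kind of access pattern a TSDB is optimised for.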