L7 - Cloud Monitoring Flashcards by Paolo Oppelt

Why monitor?

To make best use of your rented resources to reduce your costs and increase satisfaction of the users of your service

observable system

a system that exposes enough data about itself that generating information (finding answers to questions not yet formulated) and accessing that information are both simple

monitoring

process of collecting status information of applications and resources; the data can be used to observe the application and the infrastructure

monitoring system

consists of all components for gathering monitoring data at runtime

2 ways to create information

proactively: through continuous analysis for triggering alarms or to give an overview of the status of the system
reactively: triggered by events such as incidents (e.g. root cause analysis and autoscaling)
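
A minimal sketch of the two modes, assuming a hypothetical series of CPU utilization samples and an alarm threshold that is not part of the original cards. The proactive function scans every sample; the reactive function only runs once an incident has been reported.

```python
from typing import Iterable, List

CPU_ALARM_THRESHOLD = 0.9  # hypothetical threshold, chosen for illustration


def proactive_check(samples: Iterable[float]) -> List[int]:
    """Continuously analyse samples and return the indices that raise an alarm."""
    return [i for i, value in enumerate(samples) if value > CPU_ALARM_THRESHOLD]


def reactive_analysis(incident_index: int, samples: List[float], window: int = 3) -> List[float]:
    """Triggered by an incident: return the samples leading up to it for root cause analysis."""
    start = max(0, incident_index - window)
    return samples[start:incident_index + 1]


cpu = [0.40, 0.55, 0.70, 0.95, 0.60]
alarms = proactive_check(cpu)                # continuous analysis finds the spike
context = reactive_analysis(alarms[0], cpu)  # the incident triggers a look at recent history
```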

What is the purpose of monitoring at the infrastructure level?

resource management
incident detection
root cause analysis
metering for payment
auditing

What is the purpose of monitoring at application level?

performance analysis
resource management
failure detection and resolution
SLA verification
auditing

How does monitoring take place in a parallel system?

batch system
data are collected during an application run
analysis happens post mortem
execution is reproducible

How does monitoring take place in the cloud?

interactive system
data are continuously produced - realtime data
realtime analysis
data used for immediate action or to study past system behavior

3 pillars of monitoring

metrics
logs
traces

4 important metrics in monitoring

latency
throughput or traffic
error rate
utilization or saturation

What is latency?

time it takes to service a request
measured selectively for successful or failed requests
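
A minimal sketch of measuring latency separately for successful and failed requests. The handler functions and the `latencies` store are hypothetical names used only for illustration.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List

# Latency samples in seconds, keyed by request outcome.
latencies: Dict[str, List[float]] = defaultdict(list)


def timed_request(handler: Callable[[], None]) -> None:
    """Run a request handler and record its latency under 'success' or 'error'."""
    start = time.perf_counter()
    try:
        handler()
    except Exception:
        outcome = "error"
    else:
        outcome = "success"
    latencies[outcome].append(time.perf_counter() - start)


def ok_handler() -> None:
    """Hypothetical handler that succeeds."""


def failing_handler() -> None:
    """Hypothetical handler that fails."""
    raise RuntimeError("simulated failure")


timed_request(ok_handler)
timed_request(ok_handler)
timed_request(failing_handler)
```

Keeping the two series apart matters because fast failures would otherwise pull the average latency down and hide slow successful requests.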

What is throughput or traffic?

web services: requests/second
streaming system: network I/O rate or concurrent sessions
database: transactions/second or retrievals/second
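
As a rough illustration for the web service case, requests/second can be computed from request timestamps over a time window. The function name and the sample timestamps are assumptions made for this sketch.

```python
from typing import List


def throughput(timestamps: List[float], window_start: float, window_end: float) -> float:
    """Requests per second for events whose timestamp falls in [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window must have positive length")
    count = sum(1 for t in timestamps if window_start <= t < window_end)
    return count / (window_end - window_start)


# Hypothetical arrival times of requests, in seconds.
request_times = [0.1, 0.4, 0.9, 1.2, 1.8, 2.5]
rate = throughput(request_times, 0.0, 2.0)  # 5 requests in 2 seconds
```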

What is the error rate?

rate of requests that fail
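
A small sketch of the calculation, assuming (this is not stated on the card) that a request counts as failed when its HTTP status code is 500 or above.

```python
from typing import Sequence


def error_rate(status_codes: Sequence[int]) -> float:
    """Fraction of requests that failed; here a failure is assumed to be HTTP status >= 500."""
    if not status_codes:
        return 0.0
    failed = sum(1 for code in status_codes if code >= 500)
    return failed / len(status_codes)


# Hypothetical status codes observed in one measurement interval.
observed = [200, 200, 500, 404, 503]
rate_of_errors = error_rate(observed)  # 2 of 5 requests failed
```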

What is utilization or saturation?

percentage of capacity in use
e.g. of CPU, memory, I/O
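
Expressed as arithmetic, utilization is the used amount divided by the capacity. The memory figures below are hypothetical.

```python
def utilization_percent(used: float, capacity: float) -> float:
    """Utilization as a percentage of capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used / capacity


# Hypothetical example: 6 GiB of memory in use on a machine with 8 GiB.
memory_utilization = utilization_percent(6, 8)
```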

What are metrics collected for at the microservice level?

Autoscaling, performance tuning

What are metrics collected for at the platform level (e.g. K8s or Docker)?

container distribution, autoscaling VM cluster

What are metrics collected for at the infrastructure level?

root cause analysis

What are metrics collected for at the hardware level?

management of VMs

Monitoring system requirements

comprehensive
low intrusion
extensibility
scalability
elasticity
accuracy

What is Blackbox Monitoring?

the monitored system is treated as a black box
no data are gained from inside of the system
e.g. only the request interface of a service is visible. Nothing about the internal structure

from internet:
Black box monitoring refers to the monitoring of servers with a focus on areas such as disk space, CPU usage, memory usage, load averages, etc.

What is whitebox monitoring?

data are also gathered from inside the system
this gives more context and more detailed insights
e.g. internal organization of a service is visible

e.g. Performing advanced detection of behavior we don’t expect to see, such as a user not going through the normal steps you’d expect when signing into your application or resetting a password.

Why is there overhead in monitoring?

instrumentation
computation for aggregations
memory overhead for buffering
time to push to disk or transfer to collector
storage overhead for long-term storage

What is instrumentation?

Instrumentation is the process of adding code to your application so you can understand its inner state
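
One possible way to add such code is a decorator that counts calls and errors. This is a sketch only; `instrumented`, `call_counts` and `divide` are hypothetical names, not part of any monitoring library.

```python
import functools
from collections import Counter

# In-process metric store; a real setup would export these values to a collector.
call_counts: Counter = Counter()


def instrumented(func):
    """Wrap a function so every call and every raised exception is counted."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[f"{func.__name__}.calls"] += 1
        try:
            return func(*args, **kwargs)
        except Exception:
            call_counts[f"{func.__name__}.errors"] += 1
            raise
    return wrapper


@instrumented
def divide(a: float, b: float) -> float:
    return a / b


divide(4, 2)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass  # the error was counted before being re-raised
```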

What does overhead lead to?

intrusion

How can overhead in monitoring be reduced?

reduce the number of metrics
lower the measurement frequency
use a more compact representation
batching
sampling
long-term coarsening
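
Three of these techniques can be sketched on a plain list of measurements. The helper names and the sample values are assumptions made for illustration, and averaging is only one possible way to coarsen old data.

```python
from statistics import mean
from typing import List


def sample_every(values: List[float], n: int) -> List[float]:
    """Sampling: keep only every n-th measurement."""
    return values[::n]


def batch(values: List[float], size: int) -> List[List[float]]:
    """Batching: group measurements so they can be transferred in fewer messages."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def coarsen(values: List[float], factor: int) -> List[float]:
    """Long-term coarsening: replace each group of `factor` values by its average."""
    return [mean(chunk) for chunk in batch(values, factor)]


raw = [10, 20, 30, 40, 50, 60]
sampled = sample_every(raw, 2)
batches = batch(raw, 4)
coarse = coarsen(raw, 3)
```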

What is a log?

sequence of immutable records of discrete events

What can an event log be composed of?

plaintext: the most common log format
structured: much evangelized, typically JSON
binary: e.g. logs in the Protobuf format
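
The same event can be rendered in each of the three formats. The event fields are hypothetical, and the binary variant uses Python's `struct` module as a stand-in; it is not Protobuf, which would need a schema and a third-party library.

```python
import json
import struct

# Hypothetical event: Unix timestamp, severity and HTTP status of a failed request.
event = {"ts": 1700000000, "level": "ERROR", "status": 503}

# Plaintext: easy for humans to read, harder for machines to parse.
plaintext = f'{event["ts"]} {event["level"]} status={event["status"]}'

# Structured: one JSON object per event, directly machine-readable.
structured = json.dumps(event)

# Binary stand-in (NOT Protobuf): unsigned 32-bit timestamp plus unsigned 16-bit status,
# packed in network byte order without padding.
binary = struct.pack("!IH", event["ts"], event["status"])
```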

What is ELK Stack?

ELK is the acronym for three open-source projects:
Elasticsearch: search and analytics engine
Logstash: server-side data processing pipeline
Kibana: lets users visualize data with charts and graphs

What is the Elastic Stack?

the next evolution of the ELK Stack: ELK plus Beats and X-Pack
it grew out of Elasticsearch, the open source, distributed, RESTful, JSON-based search engine, which is easy to use, scalable and flexible and became so popular that a company formed around it
