Google Cloud Operations Suite - Resource Monitoring, Logging... Flashcards
About Cloud Monitoring
Cloud Monitoring dynamically configures monitoring after resources are deployed and has intelligent defaults that allow you to easily create charts for basic monitoring activities.
This allows you to monitor your platform, system, and application metrics by ingesting data, such as metrics, events, and metadata.
You can then generate insights from this data through dashboards, charts, and alerts.
What is the root entity of Cloud Monitoring?
A metrics scope is the root entity that holds monitoring and configuration information in Cloud Monitoring.
How many metrics scopes can monitor one Google Cloud project?
Only one.
You can have as many metrics scopes as you want, but Google Cloud projects and AWS accounts can’t be monitored by more than one metrics scope.
Each metrics scope can have between 1 and 100 monitored projects.
What does a metrics scope contain?
A metrics scope contains
the custom dashboards,
alerting policies,
uptime checks,
notification channels,
and group definitions that you use with your monitored projects.
Does a metrics scope import the data it monitors?
No. A metrics scope can access metric data from its monitored projects, but the metrics data and log entries remain in the individual projects.
What is the root project in a metrics scope?
The first monitored Google Cloud project in a metrics scope is called the hosting project, and it must be specified when you create the metrics scope.
The name of that project becomes the name of your metrics scope.
What do you need to configure in your project to be able to access an AWS account?
To access an AWS account, you must configure a project in Google Cloud to hold the AWS Connector.
What does it mean that a metrics scope is a “single pane of glass”?
Because a metrics scope can monitor all your Google Cloud projects in a single place, it is a “single pane of glass” through which you can view resources from multiple Google Cloud projects and AWS accounts.
What do you need to do to give people different visibility into monitoring data?
To give people different roles per project and to control visibility to data, consider placing the monitoring of those projects in separate metrics scopes.
This is because a role assigned to one person on one project applies equally to all projects monitored by that metrics scope.
Does Cloud Monitoring allow you to create custom dashboards?
Cloud Monitoring allows you to create custom dashboards that contain charts of the metrics that you want to monitor.
What is recommended for configuring custom charts and dashboards?
Charts can be customized with filters to remove noise, groups to reduce the number of time series, and aggregates to combine multiple time series.
You can also create alerting policies that notify you when specific conditions are met.
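The three chart customizations above (filter, group, aggregate) can be illustrated with a small, self-contained sketch. It does not call Cloud Monitoring; the sample values, instance names, and the `filter_group_aggregate` helper are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (instance name, zone, CPU utilization). Values are invented.
SAMPLES = [
    ("web-1", "us-central1-a", 0.40),
    ("web-2", "us-central1-a", 0.60),
    ("web-3", "europe-west1-b", 0.30),
    ("batch-1", "us-central1-a", 0.95),
]


def filter_group_aggregate(samples, name_prefix):
    """Keep matching instances, group them by zone, and average each group."""
    by_zone = defaultdict(list)
    for instance, zone, value in samples:
        if instance.startswith(name_prefix):  # filter: drop unrelated series
            by_zone[zone].append(value)       # group: one bucket per zone
    # aggregate: collapse each bucket into a single mean value
    return {zone: mean(values) for zone, values in by_zone.items()}


averages = filter_group_aggregate(SAMPLES, "web-")
print(averages)
```

Four input series become two output series, one per zone, and the noisy batch instance is excluded by the filter.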
Best practices when creating alerts
We recommend:
- alerting on symptoms, and not necessarily causes.
- using multiple notification channels, like email and SMS.
- customizing your alerts to the audience’s needs by describing what actions need to be taken or what resources need to be examined.
- avoiding noise, because noisy alerts get dismissed over time.
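As a rough sketch of how these recommendations might translate into an alerting policy, the snippet below builds a dictionary shaped like the Cloud Monitoring `AlertPolicy` resource as I understand it. Nothing is sent to any API. The metric type, threshold, project ID, and channel IDs are hypothetical, and the `build_alert_policy` helper is an assumption for illustration only.

```python
import json


def build_alert_policy(project_id, channel_ids):
    """Return an AlertPolicy-shaped dict; this sketch never calls an API."""
    return {
        # Alert on a symptom users feel (slow requests), not on a cause.
        "displayName": "Players see high request latency",
        # Tell the audience what to do and where to look.
        "documentation": {
            "content": "Check the game-server instances and recent deployments.",
            "mimeType": "text/markdown",
        },
        "combiner": "OR",
        "conditions": [{
            "displayName": "Request latency above 500 ms for 5 minutes",
            "conditionThreshold": {
                # Hypothetical custom metric; replace with a real one.
                "filter": 'metric.type = "custom.googleapis.com/game/request_latency_ms"'
                          ' AND resource.type = "global"',
                "comparison": "COMPARISON_GT",
                "thresholdValue": 500,
                # A sustained duration helps avoid noisy, short-lived alerts.
                "duration": "300s",
            },
        }],
        # Multiple channels, for example one email and one SMS channel.
        "notificationChannels": [
            f"projects/{project_id}/notificationChannels/{cid}"
            for cid in channel_ids
        ],
    }


policy = build_alert_policy("my-project", ["1111", "2222"])
print(json.dumps(policy, indent=2))
```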
Custom metric example: imagine a game server that has a capacity of 50 users.
What metric indicator might you use to trigger scaling events?
From an infrastructure perspective, you might consider using CPU load or perhaps network traffic load as values that are somewhat correlated with the number of users.
But with a Custom Metric, you could actually pass the current number of users directly from your application into Cloud Monitoring.
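A minimal sketch of what passing the user count could look like: the function below only builds a request body in the shape I understand the Monitoring API's `timeSeries.create` method to accept, and does not send it. The metric name `custom.googleapis.com/game/current_users`, the project ID, and the `build_user_count_series` helper are hypothetical.

```python
import json
from datetime import datetime, timezone

SERVER_CAPACITY = 50  # capacity of the game server in the example


def build_user_count_series(project_id, current_users, now=None):
    """Build a timeSeries.create-style body holding one user-count point."""
    now = now or datetime.now(timezone.utc)
    return {
        "timeSeries": [{
            # Hypothetical custom metric type chosen for this sketch.
            "metric": {"type": "custom.googleapis.com/game/current_users"},
            "resource": {"type": "global", "labels": {"project_id": project_id}},
            "points": [{
                "interval": {"endTime": now.strftime("%Y-%m-%dT%H:%M:%SZ")},
                # int64 values are written as strings in JSON.
                "value": {"int64Value": str(current_users)},
            }],
        }]
    }


body = build_user_count_series(
    "my-project", 42, now=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
)
utilization = 42 / SERVER_CAPACITY  # fraction of capacity in use
print(json.dumps(body, indent=2))
```

A scaling rule could then act on the user count directly instead of inferring it from CPU or network load.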
Why is monitoring important (to Google)?
It is at the base of site reliability engineering, which incorporates aspects of software engineering and applies them to operations, with the goal of creating ultra-scalable and highly reliable software systems.
What does Google Cloud’s operations suite provide?
Monitoring is the basis of Google Cloud’s operation suite, but the service also provides logging, error reporting, and tracing.
About Cloud logging
Cloud Logging allows you to store, search, analyze, monitor, and alert on log data and events from Google Cloud and AWS.
It is a fully managed service that performs at scale and can ingest application and system log data from thousands of VMs.
Google Cloud’s operations suite: logging architecture
Logging includes storage for logs, a user interface called Logs Explorer, and an API to manage logs programmatically.
How long are logs retained?
Logs are only retained for 30 days, but you can export your logs to Cloud Storage buckets, BigQuery datasets, and Pub/Sub topics.
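The three export targets can be sketched as log sink definitions. The snippet below only builds dictionaries and creates nothing; the destination string formats reflect my understanding of how Cloud Logging sinks name their targets, and the project, bucket, dataset, topic, and sink names are hypothetical.

```python
def sink_destination(kind, project_id, name):
    """Return a sink destination string for one of the three export targets."""
    formats = {
        "storage": "storage.googleapis.com/{name}",
        "bigquery": "bigquery.googleapis.com/projects/{project}/datasets/{name}",
        "pubsub": "pubsub.googleapis.com/projects/{project}/topics/{name}",
    }
    return formats[kind].format(project=project_id, name=name)


def build_sink(sink_name, kind, project_id, target, log_filter):
    """Build a sink-shaped dict: which entries to export and where to send them."""
    return {
        "name": sink_name,
        "destination": sink_destination(kind, project_id, target),
        "filter": log_filter,
    }


# Hypothetical sink exporting VM logs to a BigQuery dataset for analysis.
sink = build_sink(
    "vm-logs-to-bq", "bigquery", "my-project", "vm_logs",
    'resource.type="gce_instance"',
)
print(sink)
```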
Why should you export to BigQuery or Pub/Sub?
Exporting logs to BigQuery allows you to analyze logs and even visualize them in Looker Studio.
For example, you can analyze network traffic logs to understand traffic growth and forecast capacity, to understand network usage and optimize network traffic expenses, or to perform network forensics when analyzing incidents.
Example of how to analyze logs
I queried my logs to identify the top IP addresses that have exchanged traffic with my web server.
Depending on where these IP addresses are, and who they belong to, I could relocate part of my infrastructure to save on networking costs or deny some of these IP addresses if I don’t want them to access my web server.
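The same kind of "top IP addresses" analysis can be sketched locally. The snippet below stands in for a query over exported logs: the record fields (`src_ip`, `dest_ip`, `bytes_sent`), the IP addresses, and the byte counts are all invented for illustration and are not a real log schema.

```python
from collections import Counter

SERVER_IP = "10.0.0.2"  # hypothetical web server address

# Hypothetical flow records, standing in for exported network logs.
RECORDS = [
    {"src_ip": "203.0.113.7", "dest_ip": SERVER_IP, "bytes_sent": 1200},
    {"src_ip": SERVER_IP, "dest_ip": "203.0.113.7", "bytes_sent": 5400},
    {"src_ip": "198.51.100.9", "dest_ip": SERVER_IP, "bytes_sent": 300},
    {"src_ip": "203.0.113.7", "dest_ip": SERVER_IP, "bytes_sent": 800},
    {"src_ip": "192.0.2.44", "dest_ip": SERVER_IP, "bytes_sent": 2500},
]


def top_talkers(records, server_ip, n=3):
    """Rank remote IPs by total bytes exchanged with the server."""
    totals = Counter()
    for rec in records:
        if rec["src_ip"] == server_ip:
            totals[rec["dest_ip"]] += rec["bytes_sent"]
        elif rec["dest_ip"] == server_ip:
            totals[rec["src_ip"]] += rec["bytes_sent"]
    return totals.most_common(n)


ranking = top_talkers(RECORDS, SERVER_IP)
for ip, total_bytes in ranking:
    print(ip, total_bytes)
```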
Looker Studio
You can connect your BigQuery tables to Looker Studio.
Looker Studio transforms your raw data into the metrics and dimensions that you can use to create easy-to-understand reports and dashboards.
Pub/Sub
enables you to stream logs to applications or endpoints.
Google Cloud’s operations suite: Error Reporting.
Error Reporting counts, analyzes, and aggregates the errors in your running cloud services.
A centralized error management interface displays the results with sorting and filtering capabilities, and you can even set up real-time notifications when new errors are detected.
Currently, Error Reporting is generally available for
App Engine on both standard and flexible environments, Apps Script, Compute Engine, Cloud Functions, Cloud Run, Google Kubernetes Engine, and Amazon EC2.
In terms of programming languages, the Error Reporting exception stack trace parser is able to process
Go, Java, .NET, Node.js, PHP, Python, and Ruby.
Cloud Trace
is a distributed tracing system that collects latency data from your applications and displays it in the Google Cloud console.
Measuring the amount of time it takes for your application to handle incoming requests and perform operations is an important part of managing overall application performance.
Applications supported by Cloud Trace
App Engine, HTTP(S) load balancers, and applications instrumented with the Cloud Trace API.
Cloud Profiler
uses statistical techniques and extremely low-impact instrumentation that runs across all production application instances to provide a complete picture of an application’s performance without slowing it down.
It has support for Java, Go, Node.js, and Python.