Operational Excellence Flashcards

1
Q

What are the benefits of infrastructure as code?

A

IaC provides consistency - admins run the same automated steps for each deploy and can add comments in source control, making it self-documenting as well.

It also simplifies the deployment of complex infrastructure and provides scalability (you can push out a new instance within minutes or seconds).
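
The consistency benefit can be illustrated with a minimal sketch, assuming infrastructure is described as a dictionary of desired resources. The `plan` function and the resource names are hypothetical and do not correspond to any real IaC tool's API; the point is only that the same declared state always yields the same set of actions.

```python
def plan(desired: dict, current: dict) -> dict:
    """Compare declared (desired) state with actual (current) state.

    Hypothetical helper for illustration only: returns the resource names
    that would need to be created, updated, or deleted.
    """
    to_create = sorted(desired.keys() - current.keys())
    to_delete = sorted(current.keys() - desired.keys())
    to_update = sorted(
        name
        for name in desired.keys() & current.keys()
        if desired[name] != current[name]
    )
    return {"create": to_create, "update": to_update, "delete": to_delete}


# Illustrative resource definitions (names are assumptions).
desired = {
    "web-1": {"machine_type": "e2-small"},
    "web-2": {"machine_type": "e2-small"},
}
current = {
    "web-1": {"machine_type": "e2-micro"},
    "old-1": {"machine_type": "e2-micro"},
}

print(plan(desired, current))
```

Once the actual state matches the declared state, the plan is empty, which is why re-running the same definition is safe.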

2
Q

What are the benefits of microservices?

A

Microservices are smaller and more modular than monolithic applications, so each service can be developed, deployed, and scaled independently.

3
Q

What is meant by canary testing? What tools does GC provide to help with this?

A

Canary testing is the practice of rolling out changes to just a small sample of users to reduce risk and validate functionality.

You can use GCE managed instance groups (collections of VM instances that are managed as a single entity) to roll a new version out to only a subset of instances.
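
As a rough sketch of the idea, independent of any Google Cloud API, the following assigns a small, stable share of users to the new version by hashing a user ID. The function names and the 5% default are assumptions chosen for illustration.

```python
import hashlib


def in_canary(user_id: str, canary_percent: float) -> bool:
    """Return True if this user falls into the canary cohort.

    Hashing the ID keeps the assignment stable: the same user always
    lands in the same bucket, so they see a consistent version.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10000  # bucket in 0..9999
    return bucket < canary_percent * 100


def pick_version(user_id: str, canary_percent: float = 5.0) -> str:
    """Route a user to the 'canary' or 'stable' version (hypothetical labels)."""
    return "canary" if in_canary(user_id, canary_percent) else "stable"
```

Raising `canary_percent` step by step widens the rollout; setting it to 0 sends everyone back to the stable version.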

4
Q

What are questions to ask around release engineering?

A

How does your dev team manage builds and releases?

What’s your process for rolling back changes?

How do you test your applications before deployment?

5
Q

What are some strategies for achieving operational excellence?

A

Automate your build, test, and deploy processes - perform operations as code; make frequent, small, reversible changes (CI/CD practices)

Monitor business-driven metrics (and system health metrics that align)

Refine operations/processes frequently

Conduct DR testing regularly

Review lessons learned

6
Q

What tools help with CI/CD?

A

Cloud Source Repositories, Container Registry, Cloud Build

7
Q

What are the four golden signals for monitoring your system?

A

Latency - time it takes to service a request
Traffic - how much demand is being placed on your system
Errors - rate of requests that fail
Saturation - how full your service is (e.g. I/O-constrained or memory-constrained)
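
A minimal sketch of how the four signals could be computed from one window of request records. The `Request` type, the choice of p95 for latency, and CPU as the saturation resource are all assumptions for illustration; real systems would normally read these values from a monitoring service.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float  # time taken to serve the request
    ok: bool           # False if the request failed


def golden_signals(requests, window_seconds, cpu_used, cpu_capacity):
    """Summarize one observation window into the four golden signals."""
    latencies = sorted(r.latency_ms for r in requests)
    count = len(latencies)
    if count:
        # Nearest-rank 95th percentile, using integer math for the rank.
        rank = (95 * count + 99) // 100
        p95 = latencies[rank - 1]
        error_rate = sum(1 for r in requests if not r.ok) / count
    else:
        p95 = 0.0
        error_rate = 0.0
    return {
        "latency_p95_ms": p95,                     # Latency
        "traffic_rps": count / window_seconds,     # Traffic
        "error_rate": error_rate,                  # Errors
        "saturation": cpu_used / cpu_capacity,     # Saturation
    }
```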

8
Q

What tools help with monitoring business/system health?

A

Cloud Monitoring - metrics collection/aggregation, dashboards, alerts

Cloud Logging - search and export to BigQuery, Cloud Storage, or Pub/Sub

Cloud Trace - distributed tracing that collects request latency data

9
Q

What metrics are key to DR planning?

A

Recovery time objective (RTO) - the maximum acceptable length of time that your application can be offline. Usually defined as part of an SLA.

Recovery point objective (RPO) - the maximum acceptable length of time during which data might be lost from your application
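
A small worked check of both metrics, under two simplifying assumptions: the worst-case data loss is about one backup interval, and recovery time is the sum of detection, restore, and verification time. The function and parameter names are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Assumes worst-case data loss is roughly one backup interval."""
    return backup_interval <= rpo


def meets_rto(detect: timedelta, restore: timedelta,
              verify: timedelta, rto: timedelta) -> bool:
    """Assumes recovery time is detection + restore + verification."""
    return detect + restore + verify <= rto


# Hourly backups satisfy a 4-hour RPO; daily backups do not.
print(meets_rpo(timedelta(hours=1), timedelta(hours=4)))
print(meets_rpo(timedelta(hours=24), timedelta(hours=4)))

# 10 + 40 + 5 minutes of recovery fits within a 1-hour RTO.
print(meets_rto(timedelta(minutes=10), timedelta(minutes=40),
                timedelta(minutes=5), timedelta(hours=1)))
```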

10
Q

What’s the relationship between cost to run an application and RTO/RPO values?

A

The smaller the RTO/RPO values, the more your application costs to run.

11
Q

What’s the difference between SLAs and SLOs?

A

An SLA is the entire agreement that specifies what services are to be provided and details around support, cost, performance, penalties, times, etc.

SLOs are specific, measurable characteristics of the SLA, such as availability, throughput, response time, quality.
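
To show what "measurable" means in practice, here is a minimal sketch that checks a measured availability against an SLO target. The request counts and targets are made-up illustrative numbers, and the function names are assumptions.

```python
def availability(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully in the measurement period."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def slo_met(good_requests: int, total_requests: int, target: float) -> bool:
    """True if measured availability reaches the SLO target (e.g. 0.999)."""
    return availability(good_requests, total_requests) >= target


# 99,950 good requests out of 100,000 is 99.95% availability.
print(slo_met(99_950, 100_000, target=0.999))   # meets a 99.9% SLO
print(slo_met(99_950, 100_000, target=0.9999))  # misses a 99.99% SLO
```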

12
Q

How does GC help with DR planning?

A

GCE offers incremental backups/snapshots of Persistent Disk that you can copy across regions in the event of a disaster.

Live Migration keeps your VMs running even when a host system event occurs, such as a software or hardware update.

Cloud Storage offers object storage in different classes, such as Nearline and Coldline, for backup.

Cloud DNS uses Google’s global network to serve DNS zones from redundant locations around the world. It lets you manage DNS entries during the recovery process.

13
Q

The Story

A

Cloud service providers can help make sure your people, processes, and technology run effectively to meet your business objectives. GC provides services that help reduce the complexities and costs that are common with application deployments, monitoring your business- and system-level KPIs, managing risk, and business continuity planning.
