Reliable Cloud Infrastructure: Design and Process Flashcards
The course introduces design using a three-tier design model. The three tiers are:
Presentation layer, Business-logic layer, and Data layer.
The design process includes which steps?
Begin simple and iterate. Plan for failure. Measure.
What is it called when information must be preserved to perform a subsequent step?
Stateful. State information is stored.
What is the focus of measurement?
Measure what the user cares about.
How does a microservices design complicate business logic?
Key business logic is implemented as cross-services communication.
Which GCP platform services are identified as useful for the 12-factor principle of “store configuration information in the environment”?
Google Cloud Storage and the Metadata Server.
What tradeoff occurs with the 12-factor principle of “store state information in the environment”?
Storing state information in the environment is slower than storing it locally on SSD.
Which platform processing service is designed to offer the lowest IT overhead so you can focus on the application?
Google App Engine (GAE)
What advice is given on horizontal scaling design?
Prefer small stateless servers. Keep servers simple; do one thing well.
What does Data Integrity mean?
That users have access to their data and that the data persists without being corrupted or lost.
What is the difference between a proxied and a pass-through load balancer?
A proxied load balancer terminates the incoming connection and initiates a separate connection to the backend; a pass-through load balancer forwards traffic without terminating the connection.
Which form of load balancing enables you to load balance behind an IP address that is only accessible to instances within your Virtual Private Cloud (VPC)?
Internal load balancing.
What is the service provided by a third party (such as an ISP) that enables you to connect another cloud directly to your Google Cloud resources to create hybrid cloud solutions?
Dedicated Interconnect.
What are the categories of requirements described in gathering requirements?
Quantitative, qualitative, scaling, and size.
What reason is given for the design advice to “design first and dimension later”?
Trying to optimize cost or optimize for size (dimension) before the design is fully developed can lead to confusion and ambiguities in the design process.
What is the key advice presented about GCP deployment?
Automate everything you can, because launch and release automation influences reliability.
What is the difference between black box monitoring and white box monitoring?
In black box monitoring you can monitor only externally observable events, whereas in white box monitoring you can also monitor the system’s internal events.
From the bottom up, what are the first three layers in the Site Reliability Engineering pyramid?
Monitoring. Incident Response. Post Mortem / Root Cause Analysis.
What are the steps in the capacity planning cycle?
Allocate. Approve. Deploy. Forecast.
What are three methods for reducing the price of virtual machines (VMs) in GCP?
Sustained use discounts. Committed use discounts. Preemptible VMs.
What does “pervasive defense in depth” mean?
Segregation of duties; Google handles some things, others are your responsibility.
Most network devices, such as firewalls, can be overloaded when traffic exceeds the capacity of their physical interface. What is the overload capacity of a firewall in Google’s network?
The firewall is virtual, implemented through software defined networking, so there is no physical interface to be overloaded.
Which edge features of Google’s networking provide automatic protection against Distributed Denial of Service (DDoS) attacks?
TCP/SSL proxy, Global Load Balancing, and Cloud CDN.
What is Cross-project VPC network peering?
Projects are isolated in separate VPCs, but using network peering they can communicate over a private address space.
When would you use CSEK (Customer Supplied Encryption Keys)?
When you have a requirement to use your own AES-256 keys rather than those automatically generated by Google.
What is the “principle of least privilege” as it relates to IAM Policies?
Grant roles at the smallest scope needed for the individual or service account to be functional with the services they require.
What are the two main categories of failures described?
Failure due to loss of resources, and failure due to overload.
To design to overcome a single point of failure, the N+2 strategy is recommended. What is N+2?
Two spare units beyond the N required: one to cover a unit that is down for an upgrade, and a second to cover a service outage.
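The N+2 arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name, the fixed per-instance throughput, and the sample numbers are assumptions, not figures from the course.

```python
import math


def instances_for_n_plus_2(peak_load_rps: float, per_instance_rps: float) -> int:
    """Return N + 2, where N is the minimum instance count for the peak load.

    Assumes (hypothetically) that every instance serves the same fixed
    number of requests per second.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    # N: smallest whole number of instances that covers the peak load.
    n = math.ceil(peak_load_rps / per_instance_rps)
    # +1 spare for an instance taken out for an upgrade,
    # +1 spare for an instance lost to an outage.
    return n + 2


if __name__ == "__main__":
    # 900 rps at 200 rps per instance needs N = 5, so N+2 = 7.
    print(instances_for_n_plus_2(900, 200))
```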
What is a correlated failure?
When a group of related items fail at the same time; the group is a failure domain.
How can a design to improve reliability through failover create an opportunity for overload failure?
If load grows and capacity is not increased to match, the resources that remain after a failover cannot absorb the new, greater load.
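A small worked example may make the failover risk concrete. The sketch below assumes identical servers that share load evenly; the function names and numbers are hypothetical.

```python
def load_per_server_after_failover(total_load: float, servers: int, failed: int = 1) -> float:
    """Load each surviving server carries once `failed` servers drop out."""
    survivors = servers - failed
    if survivors <= 0:
        raise ValueError("no servers left to carry the load")
    return total_load / survivors


def overloaded_on_failover(total_load: float, servers: int,
                           capacity_per_server: float, failed: int = 1) -> bool:
    """True if the survivors would exceed their capacity after a failover."""
    per_server = load_per_server_after_failover(total_load, servers, failed)
    return per_server > capacity_per_server


if __name__ == "__main__":
    # Three servers, each able to carry 100 units (illustrative numbers).
    print(overloaded_on_failover(180, 3, 100))  # 90 per survivor: fits
    print(overloaded_on_failover(240, 3, 100))  # 120 per survivor: overload
```

In the second case the system looks healthy in normal operation (80 units per server), so the shortfall only shows up when a failover actually happens.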
What is a cascading failure?
When, due to an overload failure, the system seeks additional resources and spreads the overload until the system loses integrity.
What is a fan-in or incast failure?
When many individual requests are responded to multiple times in error.
What is it called when you are trying to make a system more reliable by adding retries and it creates the opportunity for an overload failure?
Positive feedback cycle overload failure.
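The feedback loop can be illustrated with a toy simulation. It assumes a deliberately simple model that is not from the course: time moves in ticks, the service completes at most `capacity` requests per tick, and every failed request is retried on the next tick on top of the steady new load.

```python
from typing import List


def simulate_retry_load(base_load: int, capacity: int, ticks: int) -> List[int]:
    """Return the offered load per tick when all failed requests are retried.

    Hypothetical model: unlimited retries, no backoff, no request ever
    gives up.
    """
    offered = base_load
    history = []
    for _ in range(ticks):
        history.append(offered)
        # Requests beyond capacity fail on this tick...
        failed = max(0, offered - capacity)
        # ...and return as retries on the next tick, added to the new load.
        offered = base_load + failed
    return history


if __name__ == "__main__":
    print(simulate_retry_load(90, 100, 4))   # under capacity: load stays flat
    print(simulate_retry_load(110, 100, 4))  # 10% over: load keeps growing
```

Under this model a 10% overload never recovers on its own, because each tick's failures are added to the next tick's demand.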
What is the recommended action to help cope with failure that involves Objectives and Indicators?
Incorporate failure planning including a margin of safety and scheduled downtime into the SLOs and SLIs.
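One way to picture this is as a downtime budget. The sketch below is an assumption about how such a budget might be computed, not a formula given in the course: the availability target sets the total allowed downtime, a safety margin is held back, and scheduled maintenance is subtracted from what remains.

```python
def unplanned_downtime_budget_minutes(
    slo_availability: float,
    period_minutes: float,
    scheduled_downtime_minutes: float = 0.0,
    safety_margin: float = 0.0,
) -> float:
    """Minutes of unplanned downtime still allowed in the period.

    slo_availability: target as a fraction, e.g. 0.999 (hypothetical value).
    safety_margin: fraction of the total budget held in reserve.
    """
    if not 0.0 <= slo_availability <= 1.0:
        raise ValueError("slo_availability must be between 0 and 1")
    if not 0.0 <= safety_margin <= 1.0:
        raise ValueError("safety_margin must be between 0 and 1")
    total_budget = period_minutes * (1.0 - slo_availability)
    usable = total_budget * (1.0 - safety_margin)
    # A negative result means the plan already exceeds the SLO.
    return usable - scheduled_downtime_minutes


if __name__ == "__main__":
    thirty_days = 30 * 24 * 60  # 43,200 minutes
    # A 99.9% target allows roughly 43.2 minutes in total; holding back 25%
    # and scheduling 20 minutes of maintenance leaves roughly 12.4 minutes.
    print(unplanned_downtime_budget_minutes(0.999, thirty_days, 20, 0.25))
```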
Why is DNS recommended for business continuity and disaster recovery?
Because you can use it to redirect client requests to an alternate backup service by changing the DNS definition.
What is a lazy deletion strategy?
When a client deletes an object, it is not annihilated immediately, but concealed and preserved for a period. There may be multiple tiers in the deletion strategy that permit different kinds of recovery of the object.
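A minimal single-tier version of lazy deletion might look like the following. The class and method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class LazyDeleteStore:
    """In-memory store whose deletes hide objects instead of destroying them."""

    retention_seconds: float
    _live: Dict[str, bytes] = field(default_factory=dict)
    # key -> (value, time of deletion)
    _trash: Dict[str, Tuple[bytes, float]] = field(default_factory=dict)

    def put(self, key: str, value: bytes) -> None:
        self._live[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Deleted objects are concealed from normal reads.
        return self._live.get(key)

    def delete(self, key: str, now: float) -> None:
        if key in self._live:
            self._trash[key] = (self._live.pop(key), now)

    def restore(self, key: str, now: float) -> bool:
        entry = self._trash.get(key)
        if entry is None:
            return False
        value, deleted_at = entry
        if now - deleted_at > self.retention_seconds:
            return False  # retention window has passed
        del self._trash[key]
        self._live[key] = value
        return True

    def purge(self, now: float) -> int:
        """Permanently remove objects past retention; return how many."""
        expired = [k for k, (_, t) in self._trash.items()
                   if now - t > self.retention_seconds]
        for k in expired:
            del self._trash[k]
        return len(expired)
```

A multi-tier strategy, as the card mentions, could chain several such stages with different retention periods and different restore permissions.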
What is a key technology for scalable and resilient design that enables both scaling of capacity and redirecting traffic to alternate resources in the event of a failure?
Load balancing.