PCA Review Deck Flashcards

1
Q

What is an AWS connector project?

A

An AWS connector project is a Google Cloud project that lets Cloud Monitoring read metrics for a specific AWS account. A Google Cloud project can add an AWS connector project as a monitored project; the AWS connector project reads the metrics from the AWS account and then stores those metrics.

The AWS connector project is created when you connect your AWS account to Google Cloud. For information about these steps, see Connect your AWS account to Google Cloud.

To display your AWS account metrics in multiple Google Cloud projects, connect your AWS account to Google Cloud, and then follow the steps in Add AWS connector projects to a metrics scope.

2
Q

How can you expand the set of metrics that a Google Cloud project can access?

A

By default, a Google Cloud project has visibility only to the metrics it stores. However, you can expand the set of metrics that a project can access by adding other Google Cloud projects to the project’s metrics scope. The metrics scope defines the set of Google Cloud projects whose metrics the current Google Cloud project can access.

3
Q

What are the best practices for scoping projects when you have multiple projects you want to monitor?

A

We recommend that you use a new Cloud project or one without resources as the scoping project when you want to view metrics for multiple Cloud projects or AWS accounts.

When a metrics scope contains monitored projects, to chart or monitor only those metrics stored in the scoping project, you must specify filters that exclude metrics from the monitored projects. The requirement to use filters increases the complexity of charts and alerting policies, and it increases the possibility of a configuration error. The recommendation ensures that these scoping projects don't generate metrics, so there are no metrics in the projects to chart or monitor.

For example, suppose a scoping project named AllEnvironments is created, and the Staging and Production projects are added as monitored projects. To view or monitor the combined metrics for all projects, you use the metrics scope of the AllEnvironments project. To view or monitor only the metrics stored in the Staging project, you use the metrics scope of that project.

4
Q

What are Stackdriver Groups?

A

You can use Stackdriver Groups to organize a subset of the resources your team cares about, such as one microservice.
Users within a Workspace all have common view permissions, so that everyone on the team collaborating on an application’s dashboard or debugging an incident generated from an alerting policy will have the same view.

5
Q

How do you organize Cloud Operations Workspaces by environment?

A

Organizing by environment means that Workspaces are aligned to environments such as development, staging, and production. In this case, projects are included in separate Workspaces based on their function in the environment. For example, splitting the projects along development and staging/production environments would result in two Workspaces: one for development and one for staging/production.

6
Q

What is a metric?

A

Operations Suite supports creating alerts based on predefined metrics.

A metric is a defined measurement of a resource taken at regular intervals. Metrics leverage mathematical calculations to measure outcomes.
Examples of the calculations available through Operations Suite, and specifically the Stackdriver API, include maximum, minimum, average, and mean. Each of these calculations might evaluate CPU utilization, memory usage, or network activity.

7
Q

What is a workspace?

A

Workspaces
Cloud Monitoring requires an organizational tool to monitor and collect information. In GCP, that tool is called a Workspace.

The Workspace brings together Cloud Monitoring resources from one or more GCP projects. It can even bring in third-party account data from other cloud providers, including Amazon Web Services.

The Workspace collects metric data from one or more monitored projects; however, the data remains project bound. The data is pulled into the Workspace and then displayed.

8
Q

What are the rules regarding provisioning a workspace?

A

A Workspace can manage and monitor data for one or more GCP projects.

A project, however, can only be associated with a single Workspace.

Before you create a new Workspace, you need to identify who in the organization holds one of the following roles in a given project:

*Monitoring Editor

*Monitoring Admin

*Project Owner

9
Q

What are the GCP best practices for workspaces when you have to monitor multiple projects?

A

Create a separate project to manage all the activity across multiple Workspaces.
You can add or merge Workspaces, but each project can only be assigned to a single Workspace.

10
Q

What are the 3 types of zonal/regional clusters?

A

Single-zone clusters
A single-zone cluster has a single control plane running in one zone. This control plane manages workloads on nodes running in the same zone.

Multi-zonal clusters
A multi-zonal cluster has a single replica of the control plane running in a single zone, and has nodes running in multiple zones. During an upgrade of the cluster or an outage of the zone where the control plane runs, workloads still run. However, the cluster, its nodes, and its workloads cannot be configured until the control plane is available. Multi-zonal clusters balance availability and cost for consistent workloads. If you want to maintain availability and the number of your nodes and node pools are changing frequently, consider using a regional cluster.

Regional clusters
A regional cluster has multiple replicas of the control plane, running in multiple zones within a given region. Nodes in a regional cluster can run in multiple zones or a single zone depending on the configured node locations. By default, GKE replicates each node pool across three zones of the control plane’s region. When you create a cluster or when you add a new node pool, you can change the default configuration by specifying the zone(s) in which the cluster’s nodes run. All zones must be within the same region as the control plane.
https://cloud.google.com/kubernetes-engine/docs/concepts/types-of-clusters
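For example, a hedged sketch of creating a regional cluster with explicit node locations (the cluster name, region, and zones are placeholders):

gcloud container clusters create my-regional-cluster \
--region=us-central1 \
--node-locations=us-central1-a,us-central1-b,us-central1-c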

11
Q

What Cloud Storage systems are there for granting users permission to access your buckets and objects?

A

Cloud Storage offers two systems for granting users permission to access your buckets and objects: IAM and Access Control Lists (ACLs). These systems act in parallel: in order for a user to access a Cloud Storage resource, only one of the systems needs to grant the user permission.
IAM: grants permissions at the bucket and project levels.
ACLs: used only by Cloud Storage, with limited permission options, granted on a per-object basis.

Uniform bucket-level access disables ACLs, so access to resources is granted exclusively through IAM. After you enable uniform bucket-level access, you can reverse your decision for 90 days.
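For illustration, a minimal sketch of enabling uniform bucket-level access with gsutil (the bucket name is a placeholder):

gsutil uniformbucketlevelaccess set on gs://my-example-bucket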

12
Q

What do you need to do to protect your org after you create a billing account and set up projects?
Why?

A

When an organization resource is created, all users in your domain are granted the Billing Account Creator and Project Creator roles by default. These default roles allow your users to start using Google Cloud immediately, but they are not intended for use in the regular operation of your organization resource.

Removing default roles from the organization resource:
After you designate your own Billing Account Creator and Project Creator roles, you can remove the default roles from the organization resource to restrict those permissions to specifically designated users.
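As an illustrative sketch, the default roles can be removed with commands like the following (the organization ID and domain are placeholders):

gcloud organizations remove-iam-policy-binding ORGANIZATION_ID \
--member=domain:example.com --role=roles/resourcemanager.projectCreator
gcloud organizations remove-iam-policy-binding ORGANIZATION_ID \
--member=domain:example.com --role=roles/billing.creator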

13
Q

You want to deploy an application to a Kubernetes Engine cluster using a manifest file called my-app.yaml.

What command would you use?

A

kubectl apply -f my-app.yaml
kubectl apply -k dir

Explanation
Part of the app management commands.

The correct answer is to use the “kubectl apply -f” with the name of the deployment file. Deployments are Kubernetes abstractions and are managed using kubectl, not gcloud. The other options are not valid commands. For more information, see https://kubernetes.io/docs/reference/kubectl/overview/.

The command set kubectl apply is used at a terminal’s command-line window to create or modify Kubernetes resources defined in a manifest file. This is called a declarative usage. The state of the resource is declared in the manifest file, then kubectl apply is used to implement that state.

In contrast, the command set kubectl create is the command you use to create a Kubernetes resource directly at the command line. This is an imperative usage. You can also use kubectl create against a manifest file to create a new instance of the resource. However, if the resource already exists, you will get an error.

14
Q

Kubernetes Engine collects application logs by default when the log data is written where?

A

app logs: STDOUT and STDERR

In addition to cluster audit logs and logs for the worker nodes, GKE automatically collects application logs written to either STDOUT or STDERR. If you'd prefer not to collect application logs, you can choose to collect only system logs. Collecting system logs is critical for production clusters, as it significantly accelerates the troubleshooting process. No matter how you plan to use logs, GKE and Cloud Logging make it simple and easy: start your cluster, deploy your applications, and your logs appear in Cloud Logging!

15
Q

Where does GKE collect Cluster logs?

A

By default, GKE clusters are natively integrated with Cloud Logging (and Monitoring). When you create a GKE cluster, both Monitoring and Cloud Logging are enabled by default. That means you get a monitoring dashboard specifically tailored for Kubernetes and your logs are sent to Cloud Logging’s dedicated, persistent datastore, and indexed for both searches and visualization in the Cloud Logs Viewer.

If you have an existing cluster with Cloud Logging and Monitoring disabled, you can still enable logging and monitoring for the cluster. That’s important because with Cloud Logging disabled, a GKE-based application temporarily writes logs to the worker node, which may be removed when a pod is removed, or overwritten when log files are rotated. Nor are these logs centrally accessible, making it difficult to troubleshoot your system or application.

16
Q

Where would you view your GKE logs?

A

Cloud Logging, and its companion tool Cloud Monitoring, are full-featured products that are both deeply integrated into GKE. You can view your GKE logs in several places:

Cloud Logging console – You can see your logs directly from the Cloud Logging console by using the appropriate logging filters to select the Kubernetes resources such as cluster, node, namespace, pod or container logs. Here are some sample Kubernetes-related queries to help get you started.

GKE console – In the Kubernetes Engine section of the Google Cloud Console, select the Kubernetes resources listed in Workloads, and then the Container or Audit Logs links.

Monitoring console – In the Kubernetes Engine section of the Monitoring console, select the appropriate cluster, nodes, pod or containers to view the associated logs.

gcloud command line tool – Using the gcloud logging read command, select the appropriate cluster, node, pod and container logs.
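For example, a hedged sketch of reading container logs for one cluster with the gcloud CLI (the cluster name is a placeholder):

gcloud logging read 'resource.type="k8s_container" AND resource.labels.cluster_name="my-cluster"' --limit=10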

17
Q

What is the difference between Regional and global IP addresses?

A

When you list or describe IP addresses in your project, Google Cloud labels addresses as global or regional, which indicates how a particular address is being used. When you associate an address with a regional resource, such as a VM, Google Cloud labels the address as regional. Regions are Google Cloud regions, such as us-east4 or europe-west2.

For more information about global and regional resources, see Global, regional, and zonal resources in the Compute Engine documentation.

18
Q

As a developer using GCP, you will need to set up a local development environment. You will want to authorize the use of gcloud commands to access resources. What commands could you use to authorize access?

A

gcloud init
Explanation
gcloud init will authorize access and perform other common setup steps. gcloud auth login will authorize access only. gcloud login and gcloud config login are not valid commands.

You can also run gcloud init to change your settings or create a new configuration.

gcloud init performs the following setup steps:

Authorizes the gcloud CLI to use your user account credentials to access Google Cloud, or lets you select an account if you have previously authorized access
Sets up a gcloud CLI configuration and sets a base set of properties, including the active account from the step above, the current project, and if applicable, the default Compute Engine region and zone
https://cloud.google.com/sdk/docs/initializing

19
Q

gcloud auth login
Authorize with a user account without setting up a configuration.

A

gcloud auth login [ACCOUNT] [--no-activate] [--brief] [--no-browser] [--cred-file=CRED_FILE] [--enable-gdrive-access] [--force] [--no-launch-browser] [--update-adc] [GCLOUD_WIDE_FLAG ...]

Obtains access credentials for your user account via a web-based authorization flow. When this command completes successfully, it sets the active account in the current configuration to the account specified. If no configuration exists, it creates a configuration named default.
If valid credentials for an account are already available from a prior authorization, the account is set to active without rerunning the flow.

20
Q

You have a Cloud Datastore database that you would like to backup. You’d like to issue a command and have it return immediately while the backup runs in the background. You want the backup file to be stored in a Cloud Storage bucket named my-datastore-backup. What command would you use?

A

gcloud datastore export gs://my-datastore-backup --async

Explanation
The correct command is gcloud datastore export gs://my-datastore-backup --async. Export, not backup, is the datastore command to save data to a Cloud Storage bucket. gsutil is used to manage Cloud Storage, not Cloud Datastore. For more information, see https://cloud.google.com/datastore/docs/export-import-entities.

21
Q

How do you set up a database for export?

A

Before you begin
Before you can use the managed export and import service, you must complete the following tasks.

Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the export and import functionality.

Create a Cloud Storage bucket in the same location as your Firestore in Datastore mode database. You cannot use a Requester Pays bucket for export and import operations.

Assign an IAM role to your user account that grants the datastore.databases.export permission, if you are exporting data, or the datastore.databases.import permission, if you are importing data. The Datastore Import Export Admin role, for example, grants both permissions.

If the Cloud Storage bucket is in another project, give your project's default service account access to the bucket.
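For illustration, a minimal sketch of granting the export permission before running an export (the project ID and user are placeholders):

gcloud projects add-iam-policy-binding PROJECT_ID \
--member=user:alex@example.com --role=roles/datastore.importExportAdmin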

22
Q

Authorize with a user account
Use the following gcloud CLI commands to authorize access with a user account:

A

gcloud init: Authorizes access and performs other common setup steps.
gcloud auth login: Authorizes access only.

During authorization, these commands obtain account credentials from Google Cloud and store them on the local system.
The specified account becomes the active account in your configuration.
The gcloud CLI uses the stored credentials to access Google Cloud. You can have any number of accounts with stored credentials for a single gcloud CLI installation, but only one account is active at a time.

23
Q

A manager in your company is having trouble tracking the use and cost of resources across several projects. In particular, they do not know which resources are created by different teams they manage. What would you suggest the manager use to help better understand which resources are used by which team?

A

Labels are key-value pairs attached to resources and used to manage them. The manager could use a key-value pair with the key ‘team-name’ and the value the name of the team that created the resource. Audit logs do not necessarily have the names of teams that own a resource. Traces are used for performance monitoring and analysis. IAM policies are used to control access to resources, not to track which team created them.
For more information, see
https://cloud.google.com/resource-manager/docs/creating-managing-labels
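For example, a hedged sketch of attaching a team label to a VM (the instance name, zone, and label value are placeholders):

gcloud compute instances update my-vm --zone=us-central1-a \
--update-labels=team-name=search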

24
Q

You have created a target pool with instances in two zones which are in the same region. The target pool is not functioning correctly. What could be the cause of the problem?

A

The target pool is missing a health check.
Target pools must have a health check to function properly. Nodes can be in different zones but must be in the same region. Cloud Monitoring and Cloud Logging are useful but they are not required for the target pool to function properly. Nodes in a pool have the same configuration. For more information, see https://cloud.google.com/load-balancing/docs/target-pools
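For illustration, a minimal sketch of adding a legacy HTTP health check to an existing target pool (the names and region are placeholders):

gcloud compute http-health-checks create basic-check
gcloud compute target-pools add-health-checks my-pool \
--http-health-check=basic-check --region=us-central1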

25
Q

What does an external NLB (target pool-based) load balancer look like?

A

Google Cloud external TCP/UDP Network Load Balancing (after this referred to as Network Load Balancing) is a regional, pass-through load balancer. A network load balancer distributes external traffic among virtual machine (VM) instances in the same region.

You can configure a network load balancer for TCP, UDP, ESP, GRE, ICMP, and ICMPv6 traffic.

A network load balancer can receive traffic from:

Any client on the internet
Google Cloud VMs with external IPs
Google Cloud VMs that have internet access through Cloud NAT or instance-based NAT

26
Q

What is a target pool?

A

Target pools
A target pool resource defines a group of instances that should receive incoming traffic from forwarding rules. When a forwarding rule directs traffic to a target pool, Cloud Load Balancing picks an instance from these target pools based on a hash of the source IP and port and the destination IP and port. Each target pool operates in a single region and distributes traffic to the first network interface (nic0) of the backend instance. For more information about how traffic is distributed to instances, see the Load distribution algorithm section in this topic.

The network load balancers are not proxies. Responses from the backend VMs go directly to the clients, not back through the load balancer. The load balancer preserves the source IP addresses of packets. The destination IP address for incoming packets is the regional external IP address associated with the load balancer’s forwarding rule.

For architecture details, see network load balancer with a target pool backend.

27
Q

What are Health checks?

A

Health checks ensure that Compute Engine forwards new connections only to instances that are up and ready to receive them. Compute Engine sends health check requests to each instance at the specified frequency. After an instance exceeds its allowed number of health check failures, it is no longer considered an eligible instance for receiving new traffic.

To allow for graceful shutdown and closure of TCP connections, existing connections are not actively terminated. However, existing connections to an unhealthy backend are not guaranteed to remain viable for long periods of time. If possible, you should begin a graceful shutdown process as soon as possible for your unhealthy backend.

The health checker continues to query unhealthy instances, and returns an instance to the pool when the specified number of successful checks occur. If all instances are marked as UNHEALTHY, the load balancer directs new traffic to all existing instances.

Network Load Balancing relies on legacy HTTP health checks to determine instance health. Even if your service does not use HTTP, you must run a basic web server on each instance that the health check system can query.

Legacy HTTPS health checks aren’t supported for network load balancers and cannot be used with most other types of load balancers.

28
Q

A client has asked for your advice about building a data transformation pipeline. The pipeline will read data from Cloud Storage and Cloud Spanner, merge data from the two sources and write the data to a BigQuery data set. The client does not want to manage servers or other infrastructure, if possible. What GCP service would you recommend?

A

Cloud Data Fusion

Cloud Data Fusion is a managed service that is designed for building data transformation pipelines. https://cloud.google.com/data-fusion/docs/how-to
What is Cloud Data Fusion?

Cloud Data Fusion is a fully managed, cloud-native, enterprise data integration service for quickly building and managing data pipelines.

The Cloud Data Fusion web UI lets you build scalable data integration solutions to clean, prepare, blend, transfer, and transform data, without having to manage the infrastructure.

Cloud Data Fusion is powered by the open source project CDAP. Throughout this page, there are links to the CDAP documentation site, where you can find more detailed information.

29
Q

What is Firewall Rules Logging for?
How do you enable it?

A

Firewall Rules Logging lets you audit, verify, and analyze the effects of your firewall rules. For example, you can determine if a firewall rule designed to deny traffic is functioning as intended. Firewall Rules Logging is also useful if you need to determine how many connections are affected by a given firewall rule.

You enable Firewall Rules Logging individually for each firewall rule whose connections you need to log. Firewall Rules Logging is an option for any firewall rule, regardless of the action (allow or deny) or direction (ingress or egress) of the rule.

Firewall Rules Logging logs traffic to and from Compute Engine virtual machine (VM) instances. This includes Google Cloud products built on Compute Engine VMs, such as Google Kubernetes Engine (GKE) clusters and App Engine flexible environment instances.

When you enable logging for a firewall rule, Google Cloud creates an entry called a connection record each time the rule allows or denies traffic. You can view these records in Cloud Logging, and you can export logs to any destination that Cloud Logging export supports.

Each connection record contains the source and destination IP addresses, the protocol and ports, date and time, and a reference to the firewall rule that applied to the traffic.

Firewall Rules Logging is available for both VPC firewall rules and hierarchical firewall policies.
https://cloud.google.com/vpc/docs/firewall-rules-logging
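For example, a hedged sketch of enabling logging on an existing firewall rule (the rule name is a placeholder):

gcloud compute firewall-rules update my-rule --enable-logging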

30
Q

What is the difference between cloud logging and cloud monitoring?

A

Cloud Logging and Cloud Monitoring provide your IT Ops/SRE/DevOps teams with out-of-the box observability needed to monitor your infrastructure and applications.
Cloud Logging automatically ingests Google Cloud audit and platform logs so that you can get started right away.

Cloud Monitoring provides a view of all Google Cloud metrics at zero cost and integrates with a variety of providers for non-Google Cloud monitoring.

31
Q

A client of yours wants to deploy a stateless application to Kubernetes cluster. The replication controller is named my-app-rc. The application should scale based on CPU utilization; specifically when CPU utilization exceeds 80%. There should never be fewer than 2 pods or more than 6. What command would you use to implement autoscaling with these parameters?

A

kubectl autoscale rc my-app-rc --min=2 --max=6 --cpu-percent=80

The correct command is to use kubectl autoscale specifying the appropriate min, max, and cpu percent.
When you use kubectl autoscale, you specify a maximum and minimum number of replicas for your application, as well as a CPU utilization target.
For example, to set the maximum number of replicas to six and the minimum to four, with a CPU utilization target of 50% utilization, run the following command:
kubectl autoscale deployment my-app --max 6 --min 4 --cpu-percent 50
In this command, the --max flag is required. The --cpu-percent flag is the target CPU utilization over all the Pods. This command does not immediately scale the Deployment to six replicas, unless there is already a systemic demand.

After running kubectl autoscale, the HorizontalPodAutoscaler object is created and targets the application. When there is a change in load, the object increases or decreases the application’s replicas.

32
Q

Horizontal Pod Autoscaling

A

In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.

Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload.

If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or other similar resource) to scale back down.

Horizontal pod autoscaling does not apply to objects that can’t be scaled (for example: a DaemonSet.)

33
Q

How to scale a deployed application in Google Kubernetes Engine (GKE).

A

When you deploy an application in GKE, you define how many replicas of the application you’d like to run. When you scale an application, you increase or decrease the number of replicas.

Each replica of your application represents a Kubernetes Pod that encapsulates your application’s container(s).
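For illustration, a minimal sketch of scaling a deployment with kubectl (the deployment name and replica count are placeholders):

kubectl scale deployment my-app --replicas=5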

34
Q

How do you create a cloud billing account - what are the prerequisites?

A

If you manage your Google Cloud resources using an Organization node, and you are a member of that Google Cloud Organization, then you must be a Billing Account Creator (which grants the billing.accounts.create permission) to create a new Cloud Billing account.

If you are not a member of a Google Cloud Organization but instead are managing your Google Cloud resources or Google Maps Platform APIs using projects, you do not need any specific role or permission to create a Cloud Billing account.
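As an illustrative sketch, a member of an Organization could be granted the Billing Account Creator role like this (the organization ID and user are placeholders):

gcloud organizations add-iam-policy-binding ORGANIZATION_ID \
--member=user:alex@example.com --role=roles/billing.creator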

35
Q

A photographer wants to share images they have stored in a Cloud Storage bucket called free-photos-on-gcp. What command would you use to allow all users to read these files?

A

gsutil iam ch allUsers:objectViewer gs://free-photos-on-gcp

gsutil, not gcloud, is used with Cloud Storage, so the gcloud ch option is wrong. The term objectViewer is the correct way to grant read access to objects in a bucket.

https://cloud.google.com/storage/docs/gsutil/commands/iam

ch
The iam ch command incrementally updates Cloud IAM policies. You can specify multiple access grants or removals in a single command. The access changes are applied as a batch to each url in the order in which they appear in the command line arguments. Each access change specifies a principal and a role that is either granted or revoked.

You can use gsutil -m to handle object-level operations in parallel.

36
Q

An auditor is reviewing your GCP use. They have asked for access to any audit logs available in GCP. What audit logs are available for each project, folder, and organization?

A

Types of audit logs
Cloud Audit Logs provides the following audit logs for each Cloud project, folder, and organization:

Admin Activity audit logs
Data Access audit logs
System Event audit logs
Policy Denied audit logs

Cloud Audit Logs maintains four audit logs: Admin Activity, Data Access, System Event, and Policy Denied logs. There is no such thing as a Policy Access log, a User Login log, or a Performance Metric log in GCP Audit Logs. For more information, see https://cloud.google.com/logging/docs/audit
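For example, a hedged sketch of reading Admin Activity audit logs with the gcloud CLI (the project ID is a placeholder):

gcloud logging read 'logName="projects/PROJECT_ID/logs/cloudaudit.googleapis.com%2Factivity"' --limit=5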

37
Q

Before you can use your domain with Cloud Identity, you need to verify that you own it.
What is a domain, why verify, how verify?

A

Cloud Identity provides domain verification records, which are added to DNS settings for the domain. IAM is used to control access granted to identities, it is not a place to manage domains. The billing account is used for payment tracking, it is not a place to manage domains. Resources do have metadata, but that metadata is not used to manage domains. For more information on verifying domains, see https://cloud.google.com/identity/docs/verify-domain.

Your domain is your web address, as in your-company.com. Verifying your domain prevents anyone else from using it with Cloud Identity.

Why verify?
Verifying your domain is the first step in setting up Cloud Identity for your business. If you are the person who signed up for Cloud Identity, this makes you the administrator of your new account. You need to verify that you own your business domain before you can use Cloud Identity. This ensures your account is secure and that no one else can use services from your domain.

How do I verify?
You verify your domain through your domain host (typically where you purchased your domain name). Your domain host maintains records (DNS settings) that direct internet traffic to your domain name. (Go to Identify your domain host.)

Cloud Identity gives you a verification record to add to your domain’s DNS settings. When Cloud Identity sees the record exists, your domain ownership is confirmed. The verification record doesn’t affect your website or email.

38
Q

How can you setup an organizational policy restriction on geographic location?

A

Restricting Resource Locations

Create a policy at the organization level of the resource hierarchy
that includes a constraint using a Resource Location Restriction.

This guide describes how to set an organization policy that includes the resource locations constraint.

You can limit the physical location of a new resource with the Organization Policy Service resource locations constraint.

You can use the location property of a resource to identify where it is deployed and maintained by the service. For data-containing resources of some Google Cloud services, this property also reflects the location where data is stored. This constraint allows you to define the allowed Google Cloud locations where the resources for supported services in your hierarchy can be created.

After you define resource locations, this limitation will apply only to newly-created resources. Resources you created before setting the resource locations constraint will continue to exist and perform their function.
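For illustration, a minimal sketch of allowing a location value group at the organization level (the organization ID is a placeholder):

gcloud resource-manager org-policies allow constraints/gcloud.resourceLocations \
in:us-locations --organization=ORGANIZATION_ID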

39
Q

A startup is implementing an IoT application that will ingest data at high speeds. The architect for the startup has decided that data should be ingested in a queue that can store the data until the processing application is able to process it. The architect also wants to use a managed service in Google Cloud. What service would you recommend?

A

Cloud Pub/Sub is a queuing service that is used to ingest data and store it until it can be processed. Bigtable is a NoSQL database, not a queueing service. Cloud Dataflow is a stream and batch processing service, not a queueing service. Cloud Dataproc is a managed Spark/Hadoop service.

For more information, see https://cloud.google.com/pubsub/docs/overview.
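As an illustrative sketch, an ingestion topic and a pull subscription could be created like this (the topic and subscription names are placeholders):

gcloud pubsub topics create iot-ingest
gcloud pubsub subscriptions create iot-processor --topic=iot-ingest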

40
Q

You have a set of snapshots that you keep as backups of several persistent disks. You want to know the source disk for each snapshot.
What commands would you use to get that information?

A

gcloud compute snapshots list (find the name of the snapshot)
gcloud compute snapshots describe SNAPSHOT_NAME

To run gcloud compute snapshots describe, you’ll need the name of a snapshot. To list existing snapshots by name, run:

gcloud compute snapshots list
To display specific details of an existing Compute Engine snapshot (like its creation time, status, and storage details), run:

gcloud compute snapshots describe SNAPSHOT_NAME --format="table(creationTimestamp, status, storageBytesStatus)"

The correct command is gcloud compute snapshots describe which shows information about the snapshot, including source disk, creation time, and size. The other options are not valid gcloud commands. For more information, see https://cloud.google.com/sdk/gcloud/reference/compute/snapshots/describe

41
Q

You have deployed a sole tenant node in Compute Engine. How will this restrict what VMs run on that node?

A

Only VMs from the same project will run on the node.

Explanation
On a sole tenant node, only VMs from the same project will run on that node. They do not need to use the same operating system. Sole tenant nodes are not restricted to a single VM. VMs from the same organization but different projects will not run on the same sole tenant instance. For more information, see https://cloud.google.com/compute/docs/nodes/sole-tenant-nodes
Sole-tenancy lets you have exclusive access to a sole-tenant node, which is a physical Compute Engine server that is dedicated to hosting only your project’s VMs. Use sole-tenant nodes to keep your VMs physically separated from VMs in other projects, or to group your VMs together on the same host hardware, as shown in the following diagram.

42
Q

A group of developers are creating a multi-tiered application. Each tier is in its own project. The developer would like to work with a common VPC network. What would you use to implement this?

A

Create a shared VPC

A shared VPC allows projects to share a common VPC network. VPNs are used to link VPCs to on premises networks. Routes and firewall rules are not sufficient for implementing a common VPC. Firewall rules are not used to load balance, they are used to control the ingress and egress of traffic on a network.

https://cloud.google.com/vpc/docs/shared-vpc and https://cloud.google.com/composer/docs/how-to/managing/configuring-shared-vpc.
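For example, a hedged sketch of enabling Shared VPC on a host project and attaching a service project (the project IDs are placeholders):

gcloud compute shared-vpc enable HOST_PROJECT_ID
gcloud compute shared-vpc associated-projects add SERVICE_PROJECT_ID \
--host-project=HOST_PROJECT_ID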

43
Q

A new team member has just created a new project in GCP. What role is automatically granted to them when they create the project?

A

roles/owner

Explanation
When you create a project, you are automatically granted the roles/owner role. The owner role includes permissions granted by roles/editor, roles/viewer, and roles/browser. For more information, see
https://cloud.google.com/resource-manager/docs/access-control-proj

44
Q

What is a Cloud Function?

A

Cloud Functions lets you deploy snippets of code (functions) written in a limited set of programming languages, to natively handle HTTP requests or events from many GCP sources.

Cloud Functions lets you establish triggers on a wide variety of events that can come from a variety of Cloud and Firebase products.

Cloud Functions are limited with respect to the libraries, languages, and runtimes supported.

45
Q

How is Cloud Functions different from Cloud Run and App Engine?

A

Cloud Functions server instances handle requests in a serial manner, which is not configurable, whereas Cloud Run instances handle requests in a parallel manner, and the level of parallelism is configurable.

Cloud Functions allows you to choose from a fixed set of programming languages and runtimes and requires nothing more than deploying your code, whereas Cloud Run allows any kind of backend configuration but requires that you supply a Docker configuration that creates the runtime environment (which is more work).

App Engine is more suitable for applications that have numerous interrelated (or even unrelated) functionalities, e.g. microservices, while Cloud Functions are event-based and perform a single-purpose action.

It is easy to replicate Cloud Functions on Google App Engine, but replicating an App Engine application on Cloud Functions would be complicated.
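For illustration, a minimal sketch of the configurable parallelism on Cloud Run (the service name, image path, and concurrency value are placeholders):

gcloud run deploy my-service --image=gcr.io/PROJECT_ID/my-image \
--concurrency=80 --region=us-central1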

46
Q

What is Auto Scaling?

A

Let's understand autoscaling with an example. Imagine you are a web developer, you have developed a web application, and you are ready to go live on a single front-end server.
Your application has different layers, such as the web layer (front end), business layer, and database layer. On day 1, you assume 10 concurrent users, which would ideally use 50% of your CPU utilization. As demand increases, you might see users grow from 10 to 20 or more during peak times, while at other points there might be very few users. Adding and removing front-end servers manually is a huge overhead if your application is big. To overcome this, an autoscaler is used: you define an instance template (the configuration of every server) and an instance group (where you define your scaling policy), for example scaling when CPU utilization exceeds 80%. Autoscaling is mostly used with a load balancer to provide a single IP for all the running instances.

Compute Engine offers both managed and unmanaged instance groups, only managed instance groups can be used for Autoscaling.

47
Q

What are the different autoscaling policies available for the different instance groups?

A

While creating an Instance group, you must specify which autoscaling policy and utilization level the Autoscaler should use to determine when to scale the group. There are three policies:

Average CPU utilization.

HTTP load balancing.

Cloud Monitoring metrics.

The Autoscaler keeps on collecting usage details based on the chosen policy, and then compares actual utilization to your target utilization, and uses this information to determine whether the instance group needs to remove instances or add instances.
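For example, a hedged sketch of configuring CPU-based autoscaling on a managed instance group (the group name, zone, and limits are placeholders):

gcloud compute instance-groups managed set-autoscaling my-mig \
--zone=us-central1-a --min-num-replicas=2 --max-num-replicas=10 \
--target-cpu-utilization=0.8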

48
Q

High availability in Compute Engine is ensured by several different mechanisms and practices, what are they?

A

Hardware Redundancy and Live Migration
Live migration is not available for preemptible VMs; however, preemptible VMs are not designed to be highly available. At the time of this writing, VMs with GPUs attached cannot be live migrated.
Managed Instance Groups
High availability also comes from the use of redundant VMs. Managed instance groups are the best way to create a cluster of VMs, all running the same services in the same configuration. A managed instance group uses an instance template to specify the configuration of each VM in the group. Instance templates specify machine type, boot disk image, and other VM configuration details.
Multiple Regions and Global Load Balancing
Beyond the regional instance group level, you can further ensure high availability by running your application in multiple regions and using a global load balancer to distribute workload. This would have the added advantage of allowing users to connect to an application instance in the closest region, which could reduce latency. You would have the option of using the HTTP(S), SSL Proxy, or TCP Proxy load balancers for global load balancing.

49
Q

High Availability in Kubernetes Engine
Kubernetes Engine is a managed Kubernetes service and how is it highly available?

A

VMs in a GKE Kubernetes cluster are members of a managed instance group, so they have all the high availability features described previously.
Kubernetes continually monitors the state of containers and pods. Pods are the smallest unit of deployment in Kubernetes; they usually have one container, but in some cases a pod may have two or more tightly coupled containers. If pods are not functioning correctly, they will be shut down and replaced.
Kubernetes Engine clusters can be zonal or regional. To improve availability, you can create a regional cluster in GKE, the managed service that distributes the underlying VMs across multiple zones within a region. GKE replicates control plane servers and nodes across zones.
Control plane servers run several services including the API server, scheduler, and resource controller and, when deployed to multiple zones, provide for continued availability in the event of a zone failure.

50
Q

High Availability in App Engine and Cloud Functions
App Engine and Cloud Functions are fully managed compute services how do they become highly available?

A

Users of these services are not responsible for maintaining the availability of the computing resources. The Google Cloud Platform ensures the high availability of these services.

51
Q

AVAILABILITY VS. DURABILITY

A

Availability should not be confused with durability, which is a measure of the probability that a stored object will be inaccessible at some point in the future. A storage system can be highly available but not durable.

For example, in Compute Engine, locally attached storage is highly available because of the way Google manages VMs. If there was a problem with the local storage system, VMs would be live migrated to other physical servers. Locally attached drives are not durable, though. If you need durable drives, you could use Persistent Disk or Cloud Filestore, the fully managed file storage service

52
Q

How is availability improved for persistent disks (PDs), the SSDs and hard disk drives that can be attached to VMs?

A

These disks provide block storage so that they can be used to implement filesystems and database storage.

Persistent disks continue to exist even after the VMs shut down.

One of the ways in which persistent disks enable high availability is by supporting online resizing.

GCP offers both zonal persistent disks and regional persistent disks. Regional persistent disks are replicated in two zones within a region.
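For illustration, a minimal sketch of the online resizing mentioned above (the disk name, zone, and size are placeholders):

gcloud compute disks resize my-disk --zone=us-central1-a --size=200GB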

53
Q

How is high availability achieved for self-managed databases?

A

When running and managing a database, you will need to consider how to maintain availability if the database server or underlying VM fails.
Redundancy is the common approach to ensuring availability in databases. How you configure multiple database servers will depend on the database system you are using.
Cloud SQL uses replicas: read replicas, and replicas in additional regions.
Bigtable has support for regional replication, which improves availability.
EHR Healthcare uses a combination of relational and NoSQL databases.
Cloud Memorystore is a high availability cache service in Google Cloud that supports both Memcached and Redis. This managed cache service can be used to improve availability of data that requires low latency access.
Cloud Spanner: add additional nodes.

54
Q

Network Availability
When network connectivity is down, applications are unavailable. What are the two primary ways to improve network availability?

A

Use redundant network connections
Use Premium Tier networking
Redundant network connections can be used to increase the availability of the network between an on-premises data center and Google’s data center.
One type of connection is a Dedicated Interconnect, which can be used with a minimum of 10 Gbps throughput and does not traverse the public internet.
A Dedicated Interconnect is possible when both your network and the Google Cloud network have a point of presence in a common location, such as a data center.
Partner Interconnect. When your network does not share a common point of presence with the Google Cloud network, you have the option of using a Partner Interconnect. When using a Partner Interconnect, you provision a network link between your data center and a Google network point of presence.
Data within the GCP can be transmitted among regions using the public internet or Google's internal network. The latter is available as the Premium Network Tier, which costs more than the Standard Network Tier, which uses the public internet.

55
Q

How can you scale applications?

A

Scalability
Scalability is the process of adding and removing infrastructure resources to meet workload demands efficiently. Different kinds of resources have different scaling characteristics.
VMs in a managed instance group scale by adding or removing instances from the group.
Autoscaling can be configured to scale based on several attributes, including the following:

  • Average CPU utilization
  • HTTP load balancing utilization
  • Custom monitoring metrics

Kubernetes scales pods based on load and configuration parameters.
NoSQL databases scale horizontally, but this introduces issues around consistency.
Relational databases can scale horizontally, but that requires server clock synchronization if strong consistency is required among all nodes.
Cloud Spanner uses the TrueTime service, which depends on atomic clocks and GPS signals to ensure a low upper bound on the difference in time reported by clocks in a distributed system.
56
Q

What is a GKE Deployment?

A

A deployment specifies updates for pods and ReplicaSets, which are sets of identically configured pods running at some point in time.

An application may be run in more than one deployment at a time. This is commonly done to roll out new versions of code. A new deployment can be run in a cluster, and a small amount of traffic can be sent to it to test the new code in a production environment without exposing all users to the new code.

57
Q

How can you scale managed data?

A

Managed services, such as Cloud Storage and BigQuery, ensure that storage is available as needed.
In the case of BigQuery, even if you do not scale storage directly, you may want to consider partitioning data to improve query performance. Partitioning organizes data in a way that allows the query processor to scan smaller amounts of data to answer a query.

58
Q

How do we manage reliability in GCP?

A

Reliability
Reliability is a measure of the likelihood of a system being available and able to meet the needs of the load on the system. When analyzing technical requirements, it is important to look for reliability requirements. As with availability and scalability, these requirements may be explicit or implicit.

Designing for reliability requires that you consider how to minimize the chance of system failures. For example, we employ redundancy to mitigate the risk of a hardware failure leaving a crucial component unavailable. We also use DevOps best practices to manage risks with configuration changes and when managing infrastructure as code. These are the same practices that we employ to ensure availability.

59
Q

What is reliability engineering?

A

As an architect, you should consider ways to support reliability early in the design stage. This should include the following:

Identifying how to monitor services. Will they require custom metrics?

Considering alerting conditions. How do you balance the need for early indication that a problem may be emerging with the need to avoid overloading DevOps teams with unactionable alerts?

Using existing incident response procedures with the new system. Does this system require any specialized procedures during an incident? For example, if this is the first application to store confidential, personally identifying information, you may need to add procedures to notify the information security team if an incident involves a failure in access controls.

Implementing a system for tracking outages and performing post-mortems to understand why a disruption occurred.

60
Q

What are the differences between availability, scalability, and reliability?

A

* High availability is the continuous operation of a system at sufficient capacity to meet the demands of ongoing workloads. Availability is usually measured as a percentage of time that a system is available.

* Scalability is the process of adding and removing infrastructure resources to meet workload demands efficiently.

* Reliability is a measure of how likely it is that a system will be available and capable of meeting the needs of the load on the system.

61
Q

Understand how redundancy is used to improve availability.

A

Compute, storage, and network services all use redundancy combined with autohealing or other forms of autorepair to improve availability.
Clusters of identically configured VMs behind a load balancer is an example of using redundancy to improve availability.
Making multiple copies of data is an example of redundancy used to improve storage availability.
Using multiple direct connections between a data center and Google Cloud is an example of redundancy in networking.

62
Q

What predefined roles are available for Monitoring?

A

Monitoring Viewer
View Monitoring data and configuration information. For example, principals with this role can view custom dashboards and alerting policies.

Monitoring Editor
View Monitoring data, and create and edit configurations. For example, principals with this role can create custom dashboards and alerting policies.

Monitoring Admin
View Monitoring data, create and edit configurations, and modify the metrics scope.
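As an illustrative sketch, one of these roles could be granted like this (the project ID and user are placeholders):

gcloud projects add-iam-policy-binding PROJECT_ID \
--member=user:alex@example.com --role=roles/monitoring.editor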

63
Q

Why would you choose a TCP/UDP internal load balancer?

A

First, it isn't a website; that would call for an HTTP(S) load balancer.
It is some type of service that has an open TCP/UDP port, for example a database.
Finally, it is not a proxy; it is a pass-through load balancer.

64
Q

How do you list the services in a Kubernetes cluster?

A

Use the kubectl get services command to list services.

65
Q

How could you create a compute resource to take on a temporary job?

A

Create a cluster or node pool with preemptible VMs
You can use the Google Cloud CLI to create a cluster or node pool with preemptible VMs.

To create a cluster with preemptible VMs, run the following command:

gcloud container clusters create CLUSTER_NAME \
--preemptible
Replace CLUSTER_NAME with the name of your new cluster.

To create a node pool with preemptible VMs, run the following command:

gcloud container node-pools create POOL_NAME \
--cluster=CLUSTER_NAME \
--preemptible
Replace POOL_NAME with the name of your new node pool.
Preemptible VM instances are available at a much lower price (a 60-91% discount) compared to the price of standard VMs. However, Compute Engine might stop (preempt) these instances if it needs to reclaim the compute capacity for allocation to other VMs. Preemptible instances use excess Compute Engine capacity, so their availability varies with usage.

If your apps are fault-tolerant and can withstand possible instance preemptions, then preemptible instances can reduce your Compute Engine costs significantly. For example, batch processing jobs can run on preemptible instances. If some of those instances stop during processing, the job slows but does not completely stop. Preemptible instances complete your batch processing tasks without placing additional workload on your existing instances and without requiring you to pay full price for additional normal instances.

66
Q

How does a Horizontal Pod Autoscaler manage a workload in GKE?

A

The Horizontal Pod Autoscaler changes the shape of your Kubernetes workload by automatically increasing or decreasing the number of Pods in response to the workload’s CPU or memory consumption, or in response to custom metrics reported from within Kubernetes or external metrics from sources outside of your cluster.

Horizontal Pod autoscaling cannot be used for workloads that cannot be scaled, such as DaemonSets.

Overview
When you first deploy your workload to a Kubernetes cluster, you may not be sure about its resource requirements and how those requirements might change depending on usage patterns, external dependencies, or other factors. Horizontal Pod autoscaling helps to ensure that your workload functions consistently in different situations, and allows you to control costs by only paying for extra capacity when you need it.

It’s not always easy to predict the indicators that show whether your workload is under-resourced or under-utilized. The Horizontal Pod Autoscaler can automatically scale the number of Pods in your workload based on one or more metrics of the following types:

Actual resource usage: when a given Pod’s CPU or memory usage exceeds a threshold. This can be expressed as a raw value or as a percentage of the amount the Pod requests for that resource.

Custom metrics: based on any metric reported by a Kubernetes object in a cluster, such as the rate of client requests per second or I/O writes per second.

This can be useful if your application is prone to network bottlenecks, rather than CPU or memory.

External metrics: based on a metric from an application or service external to your cluster.

For example, your workload might need more CPU when ingesting a large number of requests from a pipeline such as Pub/Sub. You can create an external metric for the size of the queue, and configure the Horizontal Pod Autoscaler to automatically increase the number of Pods when the queue size reaches a given threshold, and to reduce the number of Pods when the queue size shrinks.

You can combine a Horizontal Pod Autoscaler with a Vertical Pod Autoscaler, with some limitations.

How horizontal Pod autoscaling works
Each configured Horizontal Pod Autoscaler operates using a control loop. A separate Horizontal Pod Autoscaler exists for each workload. Each Horizontal Pod Autoscaler periodically checks a given workload's metrics against the target thresholds you configure, and changes the shape of the workload automatically.
Limitations
Do not use the Horizontal Pod Autoscaler together with the Vertical Pod Autoscaler on CPU or memory. You can use the Horizontal Pod Autoscaler with the Vertical Pod Autoscaler for other metrics.
If you have a Deployment, don't configure horizontal Pod autoscaling on the ReplicaSet or Replication Controller backing it. When you perform a rolling update on the Deployment or Replication Controller, it is effectively replaced by a new Replication Controller. Instead, configure horizontal Pod autoscaling on the Deployment itself.

67
Q

What is Dataflow SQL?

A

Dataflow SQL lets you use your SQL skills to develop streaming Dataflow pipelines right from the BigQuery web UI. You can join streaming data from Pub/Sub with files in Cloud Storage or tables in BigQuery, write results into BigQuery, and build real-time dashboards using Google Sheets or other BI tools.

68
Q

How do you write a command to create a Cloud Function?

A

gcloud functions deploy helloGreeting --trigger-http --region=us-central1 --runtime=nodejs6

The general form is gcloud functions deploy NAME --runtime RUNTIME --trigger-topic TOPIC_NAME.
Once the function is deployed, we can invoke it with data as given below:

$ gcloud functions call --data '{"name":"Romin"}' helloGreeting
executionId: 36hzafyyt8cj
result: Hello Romin

69
Q

How would you manage a requirement to create an application that performs repetitive tasks on the cloud?

A

Create a service account in IAM for the specific project.
Assign the necessary roles to the specific service account.
Create the instance with the service account attached:

gcloud compute instances create INSTANCE_NAME \
--service-account SERVICE_ACCOUNT_EMAIL \
--scopes SCOPES

Google's best practice is not to use the default Compute Engine service account when utilizing service accounts with a VM instance. You should create a custom service account with only the necessary permissions required. The command line offered in this example also demonstrates the necessary second step once the custom service account is created. This answer illustrates that best practices are followed.

70
Q

What is point-in-time recovery for MySQL, and what does a Cloud SQL for MySQL database use for point-in-time recovery?

A

Point-in-time recovery refers to recovery of data changes made since a given point in time. Typically, this type of recovery is performed after restoring a full backup that brings the server to its state as of the time the backup was made.
Point-in-time recovery uses binary logs. These logs update regularly and use storage space. The binary logs are automatically deleted with their associated automatic backup, which generally happens after about 7 days.

If the size of your binary logs is causing an issue for your instance:

You can increase the instance storage size, but the increase in disk usage from binary logs might be temporary.

We recommend enabling automatic storage increase to avoid unexpected storage issues.

You can disable point-in-time recovery if you want to delete logs and recover storage. Decreasing the storage used does not shrink the size of the storage provisioned for the instance.

Logs are purged once daily, not continuously. Setting log retention to two days means that at least two days of logs, and at most three days of logs, are retained. We recommend setting the number of backups to one more than the days of log retention to guarantee a minimum of specified days of log retention.
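For example, a hedged sketch of enabling the binary logging that point-in-time recovery depends on (the instance name is a placeholder):

gcloud sql instances patch my-instance --enable-bin-log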

71
Q

Why do you need load balancer health checks and Managed Instance Group autohealing?

A

Managed instance group health checks proactively signal to delete and recreate instances that become UNHEALTHY.
Load balancing health checks help direct traffic away from non-responsive instances and toward healthy instances; these health checks do not cause Compute Engine to recreate instances.

You need both to get the job done.