Road to Google Cloud Architect Certification Flashcards
Building for Builders LLC manufactures equipment used in residential and commercial
building. Each of its 500,000 pieces of equipment in use around the globe has IoT devices
collecting data about the state of equipment. The IoT data is streamed from each device
every 10 seconds. On average, 10 KB of data is sent in each message. The data will be used
for predictive maintenance and product development. The company would like to use a
managed database in Google Cloud. What would you recommend?
A. Apache Cassandra
B. Cloud Bigtable
C. BigQuery
D. Cloud SQL
B. Option B is correct. Bigtable is the best option for streaming IoT data, since it supports
low-latency writes and is designed to scale to support petabytes of data. Option A is incorrect because Apache Cassandra is not a managed database in GCP. Option C is incorrect
because BigQuery is an analytics database. While it is a good option for analyzing the data,
Bigtable is a better option for ingesting the data. Option D is incorrect. Cloud SQL is a
managed relational database. The use case does not require a relational database, and Bigtable’s scalability is a better fit with the requirements.
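As a rough check on why scale matters here, the ingest rate implied by the question can be worked out directly. This is a back-of-the-envelope sketch that assumes decimal units (1 MB = 1,000 KB) and that every device reports on schedule; the function name is illustrative only.

```python
# Back-of-the-envelope ingest estimate for the scenario above.
DEVICES = 500_000   # pieces of equipment, each with a reporting IoT device
MESSAGE_KB = 10     # average message size given in the question
INTERVAL_S = 10     # each device sends one message every 10 seconds


def ingest_estimate(devices=DEVICES, message_kb=MESSAGE_KB, interval_s=INTERVAL_S):
    """Return (messages per second, MB per second, TB per day) in decimal units."""
    messages_per_s = devices / interval_s
    mb_per_s = messages_per_s * message_kb / 1_000
    tb_per_day = mb_per_s * 86_400 / 1_000_000
    return messages_per_s, mb_per_s, tb_per_day


if __name__ == "__main__":
    msgs, mb, tb = ingest_estimate()
    print(f"{msgs:,.0f} messages/s, {mb:,.0f} MB/s, {tb:.1f} TB/day")
```

Under these assumptions the fleet produces about 50,000 messages per second, or roughly 43 TB per day, which is the kind of sustained write load the explanation has in mind.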
You have developed a web application that is becoming widely used. The frontend runs in
Google App Engine and scales automatically. The backend runs on Compute Engine in a
managed instance group. You have set the maximum number of instances in the backend
managed instance group to five. You do not want to increase the maximum size of the managed instance group or change the VM instance type, but there are times the frontend sends
more data than the backend can keep up with and data is lost. What can you do to prevent
the loss of data?
A. Use an unmanaged instance group
B. Store ingested data in Cloud Storage
C. Have the frontend write data to a Cloud Pub/Sub topic, and have the backend read
from that topic
D. Store ingested data in BigQuery
C. The correct answer is C. A Cloud Pub/Sub topic would decouple the frontend and
backend, provide a managed and scalable message queue, and store ingested data until the
backend can process it. Option A is incorrect. Switching to an unmanaged instance group
will mean that the instance group cannot autoscale. Option B is incorrect. You could store
ingested data in Cloud Storage, but it would not be as performant as the Cloud Pub/Sub
solution. Option D is incorrect because BigQuery is an analytics database and not designed
for this use case.
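The decoupling idea can be illustrated without any cloud dependency. In the sketch below, a standard-library queue stands in for the Pub/Sub topic and subscription (an assumption for illustration, not the Pub/Sub client API): the frontend publishes a burst faster than the backend can process, and nothing is lost because the buffer holds messages until the backend catches up.

```python
import queue
import threading
import time


def run_demo(num_messages=20, backend_delay_s=0.01):
    """Frontend publishes a burst; a slower backend drains the buffer afterwards."""
    topic = queue.Queue()  # stand-in for a Pub/Sub topic plus subscription
    processed = []

    def backend():
        while True:
            message = topic.get()
            if message is None:  # sentinel: the frontend has finished publishing
                break
            time.sleep(backend_delay_s)  # simulate slow backend processing
            processed.append(message)

    worker = threading.Thread(target=backend)
    worker.start()

    for i in range(num_messages):  # burst that outpaces the backend
        topic.put(f"event-{i}")
    topic.put(None)

    worker.join()
    return processed
```

Every message published in the burst is eventually processed, in order, even though the backend never speeds up.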
You are setting up a cloud project and want to assign members of your team different permissions. What GCP service would you use to do that?
A. Cloud Identity
B. Identity and Access Management (IAM)
C. Cloud Authorizations
D. LDAP
B. The correct answer is B. IAM is used to manage roles and permissions. Option A is
incorrect. Cloud Identity is a service for creating and managing identities. Option C is
incorrect. There is no GCP service with that name at this time. Option D is incorrect.
LDAP is not a GCP service.
You would like to run a custom container in a managed Google Cloud Service. What are
your two options?
A. App Engine Standard and Kubernetes Engine
B. App Engine Flexible and Kubernetes Engine
C. Compute Engine and Kubernetes Engine
D. Cloud Functions and App Engine Flexible
B. The correct answer is B. You can run custom containers in App Engine Flexible and
Kubernetes Engine. Option A is incorrect because App Engine Standard does not support
custom containers. Option C is incorrect because Compute Engine is not a managed service. Option D is incorrect because Cloud Functions does not support custom containers.
PhotosForYouToday prints photographs and ships them to customers. The frontend application uploads photos to Cloud Storage. Currently, the backend runs a cron job that checks
Cloud Storage buckets every 10 minutes for new photos. The product manager would like
to process the photos as soon as they are uploaded. What would you use to cause processing to start when a photo file is saved to Cloud Storage?
A. A Cloud Function
B. An App Engine Flexible application
C. A Kubernetes pod
D. A cron job that checks the bucket more frequently
A. The correct answer is A. A Cloud Function can respond to a create file event in Cloud
Storage and start processing when the file is created. Option B is incorrect because an
App Engine Flexible application cannot respond to a Cloud Storage write event. Option
C is incorrect. Kubernetes pods are the smallest compute unit in Kubernetes and are not
designed to respond to Cloud Storage events. Option D is incorrect because it does not
guarantee that photos will be processed as soon as they are created.
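A minimal sketch of what such a function might look like, written in the style of a 1st-gen Python background function, which receives the Cloud Storage object metadata as a dict. The names `on_photo_uploaded` and `process_photo` are hypothetical, and the processing step is only a placeholder.

```python
def process_photo(bucket, name):
    """Hypothetical placeholder for the real photo-processing step."""
    return f"processing gs://{bucket}/{name}"


def on_photo_uploaded(event, context):
    """Entry point in the style of a 1st-gen background Cloud Function.

    `event` is the Cloud Storage object metadata as a dict; `context`
    carries event metadata and is unused in this sketch.
    """
    bucket = event["bucket"]
    name = event["name"]
    return process_photo(bucket, name)
```

Because the function is invoked per upload event, there is no polling interval: processing starts when the object is written rather than at the next cron run.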
The chief financial officer of your company believes that you are spending too much money
to run an on-premises data warehouse and wants to migrate to a managed cloud solution.
What GCP service would you recommend for implementing a new data warehouse in GCP?
A. Compute Engine
B. BigQuery
C. Cloud Dataproc
D. Cloud Bigtable
B. The correct answer is B. BigQuery is a managed analytics database designed to support
data warehouses and similar use cases. Option A is incorrect. Compute Engine is not a
managed service. Option C is incorrect. Cloud Dataproc is a managed Hadoop and Spark
service. Option D is incorrect. Bigtable is a NoSQL database well suited for large-volume,
low-latency writes and limited ranges of queries. It is not suitable for the kind of ad hoc
querying commonly done with data warehouses.
A government regulation requires you to keep certain financial data for seven years. You
are not likely to ever retrieve the data, and you are only keeping it to be in compliance.
There are approximately 500 GB of financial data for each year that you are required to
save. What is the most cost-effective way to store this data?
A. Cloud Storage multiregional storage
B. Cloud Storage Nearline storage
C. Cloud Storage Coldline storage
D. Cloud Storage persistent disk storage
C. The correct answer is C. Cloud Storage Coldline is the lowest-cost option, and it is
designed for data that is accessed less than once per year. Option A and Option B are incorrect because they cost more than Coldline storage. Option D is incorrect because there is no
such service.
Global Games Enterprises Inc. is expanding from North America to Europe. Some of the
games offered by the company collect personal information. With what additional regulation will the company need to comply when it expands into the European market?
A. HIPAA
B. PCI DSS
C. GDPR
D. SOX
C. The correct answer is C. The GDPR is a European Union regulation protecting the personal data of individuals in the EU. Option A is incorrect. HIPAA is a U.S. healthcare regulation. Option B is incorrect. PCI DSS is a payment card data security standard; if Global
Games Enterprises Inc. is accepting payment cards in North America, it is already subject
to that standard. Option D is incorrect. SOX is a U.S. regulation that applies to publicly traded companies; the
company may be subject to that regulation already, and expanding to Europe will not
change its status.
Your team is developing a Tier 1 application for your company. The application will depend
on a PostgreSQL database. Team members do not have much experience with PostgreSQL
and want to implement the database in a way that minimizes their administrative responsibilities for the database. What managed service would you recommend?
A. Cloud SQL
B. Cloud Dataproc
C. Cloud Bigtable
D. Cloud PostgreSQL
A. The correct answer is A. Cloud SQL is a managed database service that supports PostgreSQL. Option B is incorrect. Cloud Dataproc is a managed Hadoop and Spark service.
Option C is incorrect. Cloud Bigtable is a NoSQL database. Option D is incorrect. There is
no service called Cloud PostgreSQL in GCP at this time.
What is a service-level indicator?
A. A metric collected to indicate how well a service-level objective is being met
B. A type of log
C. A type of notification sent to a sysadmin when an alert is triggered
D. A visualization displayed when a VM instance is down
A. The correct answer is A. A service-level indicator is a metric used to measure how well
a service is meeting its objectives. Options B and C are incorrect. It is not a type of log or a
type of notification. Option D is incorrect. A service-level indicator is not a visualization,
although the same metrics may be used to drive the display of a visualization.
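To make the relationship between an indicator and an objective concrete, here is a small sketch of an availability-style SLI compared against an SLO target. The function names and the choice of "good requests over total requests" as the metric are assumptions for illustration.

```python
def availability_sli(good_requests, total_requests):
    """SLI as the fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # assumed convention: no traffic counts as meeting the objective
    return good_requests / total_requests


def meets_slo(sli, slo_target):
    """True when the measured indicator is at or above the objective."""
    return sli >= slo_target
```

For example, 9,990 good requests out of 10,000 gives an SLI of 0.999, which meets a 99.5% objective but misses a 99.95% one.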
Developers at MakeYouFashionable have adopted agile development methodologies. Which
tool might they use to support CI/CD?
A. Google Docs
B. Jenkins
C. Apache Cassandra
D. Clojure
B. The correct answer is B. Jenkins is a popular CI/CD tool. Option A is incorrect. Google
Docs is a collaboration tool for creating and sharing documents. Option C is incorrect.
Cassandra is a NoSQL database. Option D is incorrect. Clojure is a Lisp-like programming
language that runs on the Java virtual machine (JVM).
You have a backlog of audio files that need to be processed using a custom application.
The files are stored in Cloud Storage. If the files were processed continuously on three
n1-standard-4 instances, the job could complete in two days. You have 30 days to deliver
the processed files, after which they will be sent to a client and deleted from your systems.
You would like to minimize the cost of processing. What might you do to help keep costs
down?
A. Store the files in coldline storage
B. Store the processed files in multiregional storage
C. Store the processed files in Cloud CDN
D. Use preemptible VMs
D. The correct answer is D. Use preemptible VMs, which cost significantly less than
standard VMs. Option A is incorrect. Coldline storage is not appropriate for files that are
actively used. Option B is incorrect. Storing files in multiregional storage will cost more
than regional storage, and there is no indication from the requirements that they should be
stored multiregionally. Option C is incorrect. There is no indication that the processed files
need to be distributed to a global user base.
You have joined a startup selling supplies to visual artists. One element of the company’s
strategy is to foster a social network of artists and art buyers. The company will provide
e-commerce services for artists and earn revenue by charging a fee for each transaction.
You have been asked to collect more detailed business requirements. What might you
expect as an additional business requirement?
A. The ability to ingest streaming data
B. A recommendation system to match buyers to artists
C. Compliance with SOX regulations
D. Natural language processing of large volumes of text
B. The correct answer is B. This is an e-commerce site matching sellers and buyers, so a
system that recommends artists to buyers can help increase sales. Option A is incorrect.
There is no indication of any need for streaming data. Option C is incorrect. This is a
startup, and it is not likely subject to SOX regulations. Option D is incorrect. There is no
indication of a need to process large volumes of text.
You work for a manufacturer of specialty die cast parts for the aerospace industry. The
company has built a reputation as the leader in high-quality, specialty die cast parts, but
recently the number of parts returned for poor quality is increasing. Detailed data about the
manufacturing process is collected throughout every stage of manufacturing. To date, the
data has been collected and stored but not analyzed. There are a total of 20 TB of data. The
company has a team of analysts familiar with spreadsheets and SQL. What service might
you recommend for conducting preliminary analysis of the data?
A. Compute Engine
B. Kubernetes Engine
C. BigQuery
D. Cloud Functions
C. The correct answer is C. BigQuery is an analytics database that supports SQL. Options
A and B are incorrect because, although they could be used to run analytics applications,
such as Apache Hadoop or Apache Spark, it would require more administrative overhead.
Also, the team members working on this are analysts, but there is no indication that they
have the skills or desire to manage analytics platforms. Option D is incorrect. Cloud Functions is for running short programs in response to events in GCP.
A client of yours wants to run an application in a highly secure environment. They want to
use instances that will only run boot components verified by digital signatures. What would
you recommend they use in Google Cloud?
A. Preemptible VMs
B. Managed instance groups
C. Cloud Functions
D. Shielded VMs
D. The correct answer is D. Shielded VMs include secure boot, which only runs digitally verified boot components. Option A is incorrect. Preemptible VMs are interruptible
instances that cost less than standard VMs; they do not verify boot components. Option B is incorrect. Managed instance
groups are sets of identical VMs that are managed as a single entity. Option C is incorrect.
Cloud Functions is a PaaS for running programs in response to events in GCP.
You have installed the Google Cloud SDK. You would now like to work on transferring
files to Cloud Storage. What command-line utility would you use?
A. bq
B. gsutil
C. cbt
D. gcloud
B. The correct answer is B. gsutil is the command-line utility for working with Cloud
Storage. Option A is incorrect. bq is the command-line utility for working with BigQuery.
Option C is incorrect. cbt is the command-line utility for working with Cloud Bigtable.
Option D is incorrect. gcloud is used to work with most GCP services but not Cloud
Storage.
Kubernetes pods sometimes need access to persistent storage. Pods are ephemeral—they
may shut down for reasons not in control of the application running in the pod. What
mechanism does Kubernetes use to decouple pods from persistent storage?
A. PersistentVolumes
B. Deployments
C. ReplicaSets
D. Ingress
A. The correct answer is A. PersistentVolumes is Kubernetes’ way of representing storage
allocated or provisioned for use by a pod. Option B is incorrect. Deployments are a type
of controller consisting of pods running the same version of an application. Option C
is incorrect. A ReplicaSet is a controller that manages the number of pods running in a
deployment. Option D is incorrect. An Ingress is an object that controls external access to
services running in a Kubernetes cluster.
An application that you support has been missing service-level objectives, especially around
database query response times. You have reviewed monitoring data and determined that
a large number of database read operations is putting unexpected load on the system. The
database uses MySQL, and it is running in Compute Engine. You have tuned SQL queries,
and the performance is still not meeting objectives. Of the following options, which would
you try next?
A. Migrate to a NoSQL database.
B. Move the database to Cloud SQL.
C. Use Cloud Memorystore to cache data read from the database to reduce the number of
reads on the database.
D. Move some of the data out of the database to Cloud Storage.
C. The correct answer is C. Use Cloud Memorystore to reduce the number of reads against
the database. Option A is incorrect. The application is designed to work with a relational
database, and there is no indication that a NoSQL database is a better option overall.
Option B is incorrect. Simply moving the database to a managed service will not change the
number of read operations, which is the cause of the poor performance. Option D is incorrect. Moving data to Cloud Storage will not reduce the number of reads.
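The caching approach is often implemented as a cache-aside read path. The sketch below uses an in-memory dict as a stand-in for Memorystore (an assumption; a real deployment would use a Redis client and would need expiry and invalidation, which are omitted here) and counts how many reads actually reach the database.

```python
class CachedReader:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # callable that performs the real query
        self._cache = {}                   # stand-in for Memorystore (Redis)
        self.db_reads = 0                  # number of reads that hit the database

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: no database read
        self.db_reads += 1
        value = self._load_from_db(key)    # cache miss: read from the database
        self._cache[key] = value           # populate the cache for later reads
        return value
```

Repeated reads of the same key reach the database only once, which is the effect the answer relies on.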
You are running a complicated stream processing operation using Apache Beam. You want
to start using a managed service. What GCP service would you use?
A. Cloud Dataprep
B. Cloud Dataproc
C. Cloud Dataflow
D. Cloud Identity
C. The correct answer is C. Cloud Dataflow is an implementation of the Apache Beam
stream processing framework. Cloud Dataflow is a fully managed service. Option A is
incorrect. Cloud Dataprep is used to prepare data for analysis. Option B is incorrect. Cloud
Dataproc is a managed Hadoop and Spark service. Option D is incorrect. Cloud Identity is
an authentication service.
Your team has had a number of incidents in which Tier 1 and Tier 2 services were down for
more than 1 hour. After conducting a few retrospective analyses of the incidents, you have
determined that you could identify the causes of incidents faster if you had a centralized log
repository. What GCP service could you use for this?
A. Stackdriver Logging
B. Cloud Logging
C. Cloud SQL
D. Cloud Bigtable
A. The correct answer is A. Stackdriver Logging is a centralized logging service. Option
B is incorrect. There is no such service at this time. Option C and Option D are incorrect
because those are databases and are not designed for the centralized logging use
case described.
A Global 2000 company has hired you as a consultant to help architect a new logistics system. The system will track the location of parts as they are shipped between company facilities in Europe, Africa, South America, and Australia. Anytime a user queries the database,
they must receive accurate and up-to-date information; specifically, the database must support strong consistency. Users from any facility may query the database using SQL. What
GCP service would you recommend?
A. Cloud SQL
B. BigQuery
C. Cloud Spanner
D. Cloud Dataflow
C. The correct answer is C. Cloud Spanner is a globally scalable, strongly consistent relational database that can be queried using SQL. Option A is incorrect because Cloud SQL will not
scale globally as Cloud Spanner will. Option B is incorrect. The requirements
describe an application that will likely have frequent updates and transactions. BigQuery
is designed for analytics and data warehousing. Option D is incorrect. Cloud Dataflow is a
stream and batch processing service.
A database architect for a game developer has determined that a NoSQL document
database is the best option for storing players’ possessions. What GCP service would you
recommend?
A. Cloud Datastore
B. Cloud Storage
C. Cloud Dataproc
D. Cloud Bigtable
A. The correct answer is A. Cloud Datastore is a managed document NoSQL database
in GCP. Option B is incorrect. Cloud Storage is an object storage system, not a document
NoSQL database. Option C is incorrect. Cloud Dataproc is a managed Hadoop and Spark
service. Option D is incorrect. Cloud Bigtable is a wide-column NoSQL database, not a
document database.
A major news agency is seeing increasing readership across the globe. The CTO is concerned that long page-load times will decrease readership. What might the news agency try
to reduce the page-load time of readers around the globe?
A. Regional Cloud Storage
B. Cloud CDN
C. Fewer firewall rules
D. Virtual private network
B. The correct answer is B. Cloud CDN is GCP’s content delivery network, which distributes static content globally. Option A is incorrect. Reading from regional storage can still
have long latencies for readers outside of the region. Option C is incorrect. Firewall rules do
not impact latency in any discernible way. Option D is incorrect. VPNs are used to link on-premises networks to Google Cloud.
What networking mechanism allows different VPC networks to communicate using private
IP address space, as defined in RFC 1918?
A. ReplicaSets
B. Custom subnets
C. VPC network peering
D. Firewall rules
C. The correct answer is C. VPC peering allows different VPCs to communicate using
private networks. Option A is incorrect. ReplicaSets are used in Kubernetes; they are not
related to VPCs. Option B is incorrect. Custom subnets define network address ranges for
regions. Option D is incorrect. Firewall rules control the flow of network traffic.
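The RFC 1918 ranges mentioned in the question are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. The sketch below uses Python's standard `ipaddress` module to test whether an address falls in those ranges and whether two subnet ranges overlap, which matters because peered VPC networks cannot have overlapping subnet ranges. The helper names are illustrative.

```python
import ipaddress

# The three private IPv4 blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """True if the address lies inside one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


def ranges_overlap(cidr_a, cidr_b):
    """True if two CIDR ranges share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))
```

A quick overlap check of this kind is a reasonable planning step before peering two networks.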
You have been tasked with setting up disaster recovery infrastructure in the cloud that will
be used if the on-premises data center is not available. What network topology would you
use for a disaster recovery environment?
A. Meshed topology
B. Mirrored topology
C. Gated egress topology
D. Gated ingress topology
B. The correct answer is B. With a mirrored topology, the public cloud and private on-premises environments mirror each other. Option A is incorrect. In a mesh topology, all
systems in the cloud and private networks can communicate with each other. Option C is
incorrect. In a gated egress topology, on-premises service APIs are made available to applications running in the cloud without exposing them to the public Internet. Option D is
incorrect. In a gated ingress topology, cloud service APIs are made available to applications
running on-premises without exposing them to the public Internet.
For this question, refer to the TerramEarth case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-terramearth-rev2
Because you do not know every possible future use for the data TerramEarth collects, you have decided to build a system that captures and stores all raw data in case you need it later. How can you most cost-effectively accomplish this goal?
A. Have the vehicles in the field stream the data directly into BigQuery.
B. Have the vehicles in the field pass the data to Cloud Pub/Sub and dump it into a Cloud Dataproc cluster that stores data in Apache Hadoop Distributed File System (HDFS) on persistent disks.
C. Have the vehicles in the field continue to dump data via FTP, adjust the existing Linux machines, and use a collector to upload them into Cloud Dataproc HDFS for storage.
D. Have the vehicles in the field continue to dump data via FTP, and adjust the existing Linux machines to immediately upload it to Cloud Storage with gsutil.
D. Have the vehicles in the field continue to dump data via FTP, and adjust the existing Linux machines to immediately upload it to Cloud Storage with gsutil.
Explanation
A is not correct because TerramEarth has cellular service for 200,000 vehicles, and each vehicle sends at least one row (120 fields) per second. This exceeds BigQuery’s maximum rows per second per project quota. Additionally, there are 20 million total vehicles, most of which upload when connected to a maintenance port, which exceeds the streaming quota even further.
B is not correct because although Cloud Pub/Sub is a fine choice for this application, Cloud Dataproc is probably not. The question posed asks us to optimize for cost. Because Cloud Dataproc is optimized for ephemeral, job-scoped clusters, a long-running cluster with large amounts of HDFS storage could be very expensive to build and maintain when compared to managed and specialized storage solutions like Cloud Storage.
C is not correct because the question asks us to optimize for cost, and because Cloud Dataproc is optimized for ephemeral, job-scoped clusters, a long-running cluster with large amounts of HDFS storage could be very expensive to build and maintain when compared to managed and specialized storage solutions like Cloud Storage.
D is correct because several load-balanced Compute Engine VMs would suffice to ingest 9 TB per day, and Cloud Storage is the cheapest per-byte storage offered by Google. Depending on the format, the data could be available via BigQuery immediately, or shortly after running through an ETL job. Thus, this solution meets business and technical requirements while optimizing for cost.
https://cloud.google.com/blog/products/data-analytics/10-tips-for-building-long-running-clusters-using-cloud-dataproc
https://cloud.google.com/bigquery/quotas#streaming_inserts
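The figures in the explanation can be sanity-checked with simple arithmetic. This sketch assumes decimal units (1 TB = 1,000,000 MB) and an evenly spread load; the resulting row rate would then be compared against whatever streaming limit the BigQuery quotas page currently lists, which is deliberately not hard-coded here.

```python
SECONDS_PER_DAY = 86_400


def sustained_mb_per_s(tb_per_day):
    """Average MB/s needed to move tb_per_day, assuming decimal units."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


def streaming_rows_per_s(connected_vehicles, rows_per_vehicle_per_s=1):
    """Aggregate rows per second if every connected vehicle streams directly."""
    return connected_vehicles * rows_per_vehicle_per_s


if __name__ == "__main__":
    print(f"{sustained_mb_per_s(9):.1f} MB/s sustained for 9 TB/day")
    print(f"{streaming_rows_per_s(200_000):,} rows/s from connected vehicles")
```

Roughly 104 MB/s sustained is well within reach of a few load-balanced VMs, while 200,000 rows per second is the rate option A would push into streaming inserts.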
For this question, refer to the TerramEarth case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-terramearth-rev2
Today, TerramEarth maintenance workers receive interactive performance graphs for the last 24 hours (86,400 events) by plugging their maintenance tablets into the vehicle. The support group wants support technicians to view this data remotely to help troubleshoot problems. You want to minimize the latency of graph loads. How should you provide this functionality?
A. Execute queries against data stored in Cloud SQL.
B. Execute queries against data indexed by vehicle_id.timestamp in Cloud Bigtable.
C. Execute queries against data stored on daily partitioned BigQuery tables.
D. Execute queries against BigQuery with data stored in Cloud Storage via BigQuery federation.
A is not correct because Cloud SQL provides relational database services that are well suited to OLTP workloads, but not storage and low-latency retrieval of time-series data.
B is correct because Cloud Bigtable is optimized for time-series data. It is cost-efficient, highly available, and low-latency. It scales well. Best of all, it is a managed service that does not require significant operations work to keep running.
C is not correct because BigQuery is fast for wide-range queries, but it is not as well optimized for narrow-range queries as Cloud Bigtable is. Latency will be an order of magnitude shorter with Cloud Bigtable for this use.
D is not correct because the objective is to minimize latency, and although BigQuery federation offers tremendous flexibility, it doesn’t perform as well as native BigQuery storage, and will have longer latency than Cloud Bigtable for narrow-range queries.
https://cloud.google.com/bigquery/external-data-sources
https://cloud.google.com/bigtable/docs/schema-design-time-series#time-series-cloud-bigtable
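Why a `vehicle_id.timestamp` key suits this query can be shown with a sorted list standing in for Bigtable's lexicographically ordered rows (an illustration only, not the Bigtable client API). Zero-padding the timestamp is an assumption added here so that string order matches time order.

```python
import bisect


def row_key(vehicle_id, timestamp):
    """Build a key such as 'veh-1.0000000000000020'; padding keeps keys time-ordered."""
    return f"{vehicle_id}.{timestamp:016d}"


def scan_range(sorted_keys, vehicle_id, start_ts, end_ts):
    """Return keys for one vehicle with start_ts <= timestamp < end_ts."""
    lo = bisect.bisect_left(sorted_keys, row_key(vehicle_id, start_ts))
    hi = bisect.bisect_left(sorted_keys, row_key(vehicle_id, end_ts))
    return sorted_keys[lo:hi]
```

Because all rows for one vehicle are contiguous and ordered by time, fetching the last 24 hours is a single narrow range scan rather than a filter over the whole table.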
For this question, refer to the TerramEarth case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-terramearth-rev2
Your agricultural division is experimenting with fully autonomous vehicles. You want your architecture to promote strong security during vehicle operation. Which two architecture characteristics should you consider? (choose two)
A. Use multiple connectivity subsystems for redundancy.
B. Require IPv6 for connectivity to ensure a secure address space.
C. Enclose the vehicle’s drive electronics in a Faraday cage to isolate chips.
D. Use a functional programming language to isolate code execution cycles.
E. Treat every microservice call between modules on the vehicle as untrusted.
F. Use a Trusted Platform Module (TPM) and verify firmware and binaries on boot.
Correct answer
E. Treat every microservice call between modules on the vehicle as untrusted.
F. Use a Trusted Platform Module (TPM) and verify firmware and binaries on boot.
Explanation
A is not correct because this improves system durability, but it doesn’t have any impact on the security during vehicle operation.
B is not correct because IPv6 doesn’t have any impact on the security during vehicle operation, although it improves system scalability and simplicity.
C is not correct because it doesn’t have any impact on the security during vehicle operation, although it improves system durability.
D is not correct because merely using a functional programming language doesn’t guarantee a more secure level of execution isolation. Any impact on security from this decision would be incidental at best.
E is correct because this improves system security by making it more resistant to hacking, especially through man-in-the-middle attacks between modules.
F is correct because this improves system security by making it more resistant to hacking, especially rootkits or other kinds of corruption by malicious actors.
https://en.wikipedia.org/wiki/Trusted_Platform_Module
For this question, refer to the TerramEarth case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-terramearth-rev2
Which of TerramEarth’s legacy enterprise processes will experience significant change as a result of increased Google Cloud Platform adoption?
A. OpEx/CapEx allocation, LAN change management, capacity planning
B. Capacity planning, TCO calculations, OpEx/CapEx allocation
C. Capacity planning, utilization measurement, data center expansion
D. Data center expansion, TCO calculations, utilization measurement
Explanation
A is not correct because LAN change management processes don’t need to change significantly. TerramEarth can easily peer their on-premises LAN with their Google Cloud Platform VPCs, and as devices and subnets move to the cloud, the LAN team’s implementation will change, but the change management process doesn’t have to.
B is correct because all of these tasks are big changes when moving to the cloud. Capacity planning for cloud is different than for on-premises data centers; TCO calculations are adjusted because TerramEarth is using services, not leasing/buying servers; OpEx/CapEx allocation is adjusted as services are consumed vs. using capital expenditures.
C is not correct because measuring utilization can be done in the same way, often with the same tools (along with some new ones). Data center expansion is not a concern for cloud customers; it is part of the undifferentiated heavy lifting that is taken care of by the cloud provider.
D is not correct because data center expansion is not a concern for cloud customers; it is part of the undifferentiated heavy lifting that is taken care of by the cloud provider. Measuring utilization can be done in the same way, often with the same tools (along with some new ones).
https://assets.kpmg/content/dam/kpmg/pdf/2015/11/cloud-economics.pdf
For this question, refer to the TerramEarth case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-terramearth-rev2
You analyzed TerramEarth’s business requirement to reduce downtime and found that they can achieve a majority of time saving by reducing customers’ wait time for parts. You decided to focus on reduction of the 3 weeks’ aggregate reporting time. Which modifications to the company’s processes should you recommend?
A. Migrate from CSV to binary format, migrate from FTP to SFTP transport, and develop machine learning analysis of metrics.
B. Migrate from FTP to streaming transport, migrate from CSV to binary format, and develop machine learning analysis of metrics.
C. Increase fleet cellular connectivity to 80%, migrate from FTP to streaming transport, and develop machine learning analysis of metrics.
D. Migrate from FTP to SFTP transport, develop machine learning analysis of metrics, and increase dealer local inventory by a fixed factor.
Correct answer
C. Increase fleet cellular connectivity to 80%, migrate from FTP to streaming transport, and develop machine learning analysis of metrics.
Explanation
A is not correct because machine learning analysis is a good means toward the end of reducing downtime, but shuffling formats and transport doesn’t directly help at all.
B is not correct because machine learning analysis is a good means toward the end of reducing downtime, and moving to streaming can improve the freshness of the information in that analysis, but changing the format doesn’t directly help at all.
C is correct because using cellular connectivity will greatly improve the freshness of data used for analysis from where it is now, collected when the machines are in for maintenance. Streaming transport instead of periodic FTP will tighten the feedback loop even more. Machine learning is ideal for predictive maintenance workloads.
D is not correct because machine learning analysis is a good means toward the end of reducing downtime, but the rest of these changes don’t directly help at all.
Your company wants to deploy several microservices to help their system handle elastic loads. Each microservice uses a different version of software libraries. You want to enable their developers to keep their development environment in sync with the various production services. Which technology should you choose?
A. RPM/DEB
B. Containers
C. Chef/Puppet
D. Virtual machines
A is not correct because although OS packages are a convenient way to distribute and deploy libraries, they don’t directly help with synchronizing. Even with a common repository, the development environments will probably deviate from production.
B is correct because using containers for development, test, and production deployments abstracts away system OS environments, so that a single host OS image can be used for all environments. Changes that are made during development are captured using a copy-on-write filesystem, and teams can easily publish new versions of the microservices in a repository.
C is not correct because although infrastructure configuration as code can help unify production and test environments, it is very difficult to make all changes during development this way.
D is not correct because virtual machines run their own OS, which will eventually deviate in each environment, just as now.
Your company wants to track whether someone is present in a meeting room reserved for a scheduled meeting. There are 1000 meeting rooms across 5 offices on 3 continents. Each room is equipped with a motion sensor that reports its status every second. You want to support the data ingestion needs of this sensor network. The receiving infrastructure needs to account for the possibility that the devices may have inconsistent connectivity. Which solution should you design?
A. Have each device create a persistent connection to a Compute Engine instance and write messages to a custom application.
B. Have devices poll for connectivity to Cloud SQL and insert the latest messages on a regular interval to a device-specific table.
C. Have devices poll for connectivity to Cloud Pub/Sub and publish the latest messages on a regular interval to a shared topic for all devices.
D. Have devices create a persistent connection to an App Engine application fronted by Cloud Endpoints, which ingest messages and write them to Cloud Datastore.
A is not correct because having a persistent connection does not handle the case where the device is disconnected.
B is not correct because Cloud SQL is a regional, relational database and not the best fit for sensor data. Additionally, the frequency of the writes has the potential to exceed the supported number of concurrent connections.
C is correct because Cloud Pub/Sub can handle the frequency of this data, and consumers of the data can pull from the shared topic for further processing.
D is not correct because having a persistent connection does not handle the case where the device is disconnected.
https://cloud.google.com/sql/
https://cloud.google.com/pubsub/
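As a rough illustration of option C, the sketch below shows one way a device could buffer readings while offline and flush them to the shared topic once connectivity returns. The SensorBuffer class, the publish callable, and the message fields are hypothetical stand-ins, not part of any Google library; a real device would pass in a function that calls the Cloud Pub/Sub client library or REST API.

```python
import json
from collections import deque
from typing import Callable, Deque


class SensorBuffer:
    """Holds readings locally until they can be published (hypothetical helper)."""

    def __init__(self, publish: Callable[[bytes], None], max_size: int = 3600):
        # publish is an assumed stand-in for a Pub/Sub publish call; it is
        # expected to raise ConnectionError when the device is offline.
        self._publish = publish
        # Bounded queue: when it is full, the oldest readings are dropped first.
        self._pending: Deque[bytes] = deque(maxlen=max_size)

    def record(self, room_id: str, occupied: bool, timestamp: int) -> None:
        """Serialize one motion-sensor reading and queue it for publishing."""
        message = {"room_id": room_id, "occupied": occupied, "ts": timestamp}
        self._pending.append(json.dumps(message).encode("utf-8"))

    def flush(self) -> int:
        """Publish queued readings in order and return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._publish(self._pending[0])
            except ConnectionError:
                break  # still offline; keep the remaining readings for later
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

With 1000 rooms each reporting once per second, the shared topic would see on the order of 1000 messages per second. The bounded queue is a design choice for this sketch: it caps device memory at the cost of losing the oldest readings during a long outage.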
Your company wants to try out the cloud with low risk. They want to archive approximately 100 TB of their log data to the cloud and test the serverless analytics features available to them there, while also retaining that data as a long-term disaster recovery backup. Which two steps should they take? (Choose two.)
A. Load logs into BigQuery.
B. Load logs into Cloud SQL.
C. Import logs into Cloud Logging.
D. Insert logs into Cloud Bigtable.
E. Upload log files into Cloud Storage.
A is correct because BigQuery is a serverless warehouse for analytics and supports the volume and analytics requirement.
B is not correct because Cloud SQL does not support the expected 100 TB. Additionally, Cloud SQL is a relational database and not the best fit for time-series log data formats.
C is not correct because Cloud Logging is optimized for monitoring, error reporting, and debugging instead of analytics queries.
D is not correct because Cloud Bigtable is optimized for read-write latency and analytics throughput, not analytics querying and reporting.
E is correct because Cloud Storage provides the Coldline and Archive storage classes to support long-term storage with infrequent access, which would support the long-term disaster recovery backup requirement.
https://cloud.google.com/storage/docs/storage-classes#coldline
https://cloud.google.com/bigtable/
https://cloud.google.com/products/operations
https://cloud.google.com/sql/
https://cloud.google.com/bigquery/
You set up an autoscaling managed instance group to serve web traffic for an upcoming launch. After configuring the instance group as a backend service to an HTTP(S) load balancer, you notice that virtual machine (VM) instances are being terminated and re-launched every minute. The instances do not have a public IP address. You have verified that the appropriate web response is coming from each instance using the curl command. You want to ensure that the backend is configured correctly. What should you do?
A. Ensure that a firewall rule exists to allow source traffic on HTTP/HTTPS to reach the load balancer.
B. Assign a public IP to each instance, and configure a firewall rule to allow the load balancer to reach the instance public IP.
C. Ensure that a firewall rule exists to allow load balancer health checks to reach the instances in the instance group.
D. Create a tag on each instance with the name of the load balancer. Configure a firewall rule with the name of the load balancer as the source and the instance tag as the destination.
A is not correct because the issue to resolve is the VMs being terminated, not access to the load balancer.
B is not correct because this introduces a security vulnerability without addressing the primary concern of the VM termination.
C is correct because health check failures lead to a VM being marked unhealthy and can result in termination if the health check continues to fail. Because you have already verified that the instances are functioning properly, the next step would be to determine why the health check is continuously failing.
D is not correct because the source of the firewall rule that allows load balancer and health check access to instances is defined IP ranges, and not a named load balancer. Tagging the instances for the purpose of firewall rules is appropriate but would probably be a descriptor of the application, and not the load balancer.
https://cloud.google.com/load-balancing/docs/https/
https://cloud.google.com/load-balancing/docs/health-check-concepts
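The point made for options C and D, that the firewall rule must name IP ranges as its source, can be checked with a small sketch. It assumes the probe source ranges that Google documents for load balancer health checks (35.191.0.0/16 and 130.211.0.0/22); the function name and the idea of passing a rule's source ranges as a list of CIDR strings are hypothetical, chosen only for illustration.

```python
import ipaddress
from typing import List

# Source ranges documented by Google for load balancer health check probes.
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("35.191.0.0/16"),
    ipaddress.ip_network("130.211.0.0/22"),
]


def rule_allows_health_checks(allowed_source_ranges: List[str]) -> bool:
    """Return True if the rule's source ranges cover every health check range."""
    allowed = [ipaddress.ip_network(cidr) for cidr in allowed_source_ranges]
    # Only IPv4 sources are compared, because the probe ranges above are IPv4.
    allowed_v4 = [net for net in allowed if net.version == 4]
    return all(
        any(probe.subnet_of(net) for net in allowed_v4)
        for probe in HEALTH_CHECK_RANGES
    )
```

For example, a rule whose only source range is 10.0.0.0/8 returns False here, which matches the symptom in the question: the instances respond to curl from inside the network, but the probes never reach them.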
You are designing a large distributed application with 30 microservices. Each of your distributed microservices needs to connect to a database backend. You want to store the credentials securely. Where should you store the credentials?
A. In the source code
B. In an environment variable
C. In a secret management system
D. In a config file that has restricted access through ACLs
A is not correct because credentials stored in source code and source control are discoverable, in plain text, by anyone with access to the source code. This also introduces the requirement to update code and do a deployment each time the credentials are rotated.
B is not correct because consistently populating environment variables would require the credentials to be available, in plain text, when the session is started.
C is correct because a secret management system such as Secret Manager is a secure and convenient storage system for API keys, passwords, certificates, and other sensitive data. Secret Manager provides a central place and single source of truth to manage, access, and audit secrets across Google Cloud.
D is not correct because instead of managing access to the config file and updating manually as keys are rotated, it would be better to leverage a key management system. Additionally, there is increased risk if the config file contains the credentials in plain text.
https://cloud.google.com/kubernetes-engine/docs/concepts/secret
https://cloud.google.com/secret-manager
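A minimal sketch of how one of the microservices might fetch its database password from Secret Manager at startup is shown below. The helper names and the example project and secret IDs are hypothetical. The client is passed in so that the sketch has no hard dependency; in practice it would be a secretmanager.SecretManagerServiceClient from the google-cloud-secret-manager package.

```python
def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the resource name that Secret Manager uses for a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_secret(client, project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch a secret payload and decode it as UTF-8 text.

    client is assumed to expose access_secret_version(request={"name": ...}),
    as SecretManagerServiceClient does.
    """
    name = secret_version_name(project_id, secret_id, version)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")


# Assumed usage inside a service (requires google-cloud-secret-manager and
# credentials with permission to access the secret):
#   from google.cloud import secretmanager
#   client = secretmanager.SecretManagerServiceClient()
#   db_password = read_secret(client, "my-project", "db-password")
```

Reading the secret at runtime means a rotated credential is picked up by requesting a new version, without a code change or redeployment.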
Mountkirk Games wants to set up a real-time analytics platform for their new game. The new platform must meet their technical requirements. Which combination of Google technologies will meet all of their requirements?
A. Kubernetes Engine, Cloud Pub/Sub, and Cloud SQL
B. Cloud Dataflow, Cloud Storage, Cloud Pub/Sub, and BigQuery
C. Cloud SQL, Cloud Storage, Cloud Pub/Sub, and Cloud Dataflow
D. Cloud Dataproc, Cloud Pub/Sub, Cloud SQL, and Cloud Dataflow
E. Cloud Pub/Sub, Compute Engine, Cloud Storage, and Cloud Dataproc
A is not correct because Cloud SQL is the only storage listed, is limited to 30,720 GB of storage, and is better suited for transactional workloads. Mountkirk Games needs queries to access at least 10 TB of historical data for analytic purposes.
B is correct because:
-Cloud Dataflow dynamically scales up or down, can process data in real time, and is ideal for processing data that arrives late using Beam windows and triggers.
-Cloud Storage can be the landing space for files that are regularly uploaded by users’ mobile devices.
-Cloud Pub/Sub can ingest the streaming data from the mobile users.
-BigQuery can query more than 10 TB of historical data.
C is not correct because Cloud SQL is the only database listed, is limited to 30,720 GB of storage, and is better suited for transactional workloads. Mountkirk Games needs queries to access at least 10 TB of historical data for analytic purposes.
D is not correct because Cloud SQL is limited to 30,720 GB of storage and is better suited for transactional workloads. Mountkirk Games needs queries to access at least 10 TB of historical data for analytics purposes.
E is not correct because Mountkirk Games needs the ability to query historical data. While this might be possible using workarounds, such as BigQuery federated queries for Cloud Storage or Hive queries for Cloud Dataproc, these approaches are more complex. BigQuery is a simpler and more flexible product that fulfills those requirements.
https://cloud.google.com/sql/docs/quotas#fixed-limits
https://beam.apache.org/documentation/programming-guide/#windowing
https://beam.apache.org/documentation/programming-guide/#triggers
https://cloud.google.com/bigquery/external-data-sources
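The reason event-time windows handle late data well can be shown with a small sketch. This is plain Python, not the Beam API: it only illustrates the fixed-window assignment idea, and the function names, the 60-second window size, and the sample events are assumptions. Triggers and allowed lateness, which decide when results are emitted and how long a window stays open, are left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def window_start(event_time: int, window_size: int) -> int:
    """Return the start of the fixed window that contains event_time (seconds)."""
    return event_time - (event_time % window_size)


def count_per_window(
    events: Iterable[Tuple[int, str]], window_size: int = 60
) -> Dict[int, int]:
    """Count events per window, keyed by event time rather than arrival order."""
    counts: Dict[int, int] = defaultdict(int)
    for event_time, _payload in events:
        counts[window_start(event_time, window_size)] += 1
    return dict(counts)


# The third event arrives last but carries an early timestamp, so it is still
# counted in the first window.
events = [(5, "login"), (65, "score"), (10, "late upload")]
print(count_per_window(events))
```

Because each event is assigned by its own timestamp, a reading uploaded late from a mobile device lands in the window where it actually happened.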
For this question, refer to the Mountkirk Games case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-mountkirkgames-rev2
Mountkirk Games has deployed their new backend on Google Cloud Platform (GCP). You want to create a thorough testing process for new versions of the backend before they are released to the public. You want the testing environment to scale in an economical way. How should you design the process?
A. Create a scalable environment in Google Cloud for simulating production load.
B. Use the existing infrastructure to test the Google Cloud-based backend at scale.
C. Build stress tests into each component of your application and use resources from the already deployed production backend to simulate load.
D. Create a set of static environments in Google Cloud to test different levels of load—for example, high, medium, and low.
A is correct because simulating production load in Google Cloud can scale in an economical way.
B is not correct because one of the pain points about the existing infrastructure was precisely that the environment did not scale well.
C is not correct because it is a best practice to have a clear separation between test and production environments. Generating test load should not be done from a production environment.
D is not correct because Mountkirk Games wants the testing environment to scale as needed. Defining several static environments for specific levels of load goes against this requirement.
https://cloud.google.com/community/tutorials/load-testing-iot-using-gcp-and-locust
https://github.com/GoogleCloudPlatform/distributed-load-testing-using-kubernetes
For this question, refer to the Mountkirk Games case study. https://cloud.google.com/certification/guides/cloud-architect/casestudy-mountkirkgames-rev2
Mountkirk Games wants to set up a continuous delivery pipeline. Their architecture includes many small services that they want to be able to update and roll back quickly. Mountkirk Games has the following requirements: (1) Services are deployed redundantly across multiple regions in the US and Europe, (2) Only frontend services are exposed on the public internet, (3) They can reserve a single frontend IP for their fleet of services, and (4) Deployment artifacts are immutable. Which set of products should they use?
A. Cloud Storage, Cloud Dataflow, Compute Engine
B. Cloud Storage, App Engine, Cloud Load Balancing
C. Container Registry, Google Kubernetes Engine, Cloud Load Balancing
D. Cloud Functions, Cloud Pub/Sub, Cloud Deployment Manager
A is not correct because Mountkirk Games wants to set up a continuous delivery pipeline, not a data processing pipeline. Cloud Dataflow is a fully managed service for creating data processing pipelines.
B is not correct because a Cloud Load Balancer distributes traffic to Compute Engine instances. App Engine and Cloud Load Balancer are parts of different solutions.
C is correct because:
-Google Kubernetes Engine is ideal for deploying small services that can be updated and rolled back quickly. It is a best practice to manage services using immutable containers.
-Cloud Load Balancing supports globally distributed services across multiple regions. It provides a single global IP address that can be used in DNS records. Using URL Maps, the requests can be routed to only the services that Mountkirk wants to expose.
-Container Registry is a single place for a team to manage Docker images for the services.
D is not correct because you cannot reserve a single frontend IP for Cloud Functions. When deployed, an HTTP-triggered Cloud Function creates an endpoint with an automatically assigned IP.
https://cloud.google.com/sql/docs/quotas#fixed-limits
https://beam.apache.org/documentation/programming-guide/#windowing
https://beam.apache.org/documentation/programming-guide/#triggers
https://cloud.google.com/bigquery/external-data-sources
https://cloud.google.com/solutions/using-apache-hive-on-cloud-dataproc