25 Free Questions – Google Cloud Certified Professional Machine Learning Engineer
By Jeevitha TP
The Google Cloud Certified Professional Machine Learning Engineer exam requires good knowledge of Google Cloud and a working understanding of proven ML models and techniques. If you are already an experienced Machine Learning Engineer, this exam may look easy to you.
Practicing with real exam questions will make you familiar with the Google ML engineer exam pattern. Whizlabs offers one of the best sets of practice questions for this certification exam (you can also try the Whizlabs free test). Below are 25 sample questions to help you understand the exam format and the type of questions.
Google Cloud Certified Professional Machine Learning Engineer Questions
Here is the list of 25 questions for the Google Cloud Certified Professional Machine Learning Engineer exam.
Frame ML problems
Q 1. Your team works on a smart city project with wireless sensor networks and a set of gateways for transmitting sensor data. You have to cope with many design choices. You want, for each of the problems under study, to find the simplest solution.
For example, it is necessary to decide on the placement of nodes so that the result is the most economical and inclusive. An algorithm without data tagging must be used.
Which of the following choices do you think is the most suitable?
A. K-means
B. Q-learning
C. K-Nearest Neighbors
D. Support Vector Machine (SVM)
Correct answers: B
Q-learning is a Reinforcement Learning (RL) algorithm. RL provides a software agent that evaluates possible solutions through a progressive reward over repeated attempts. It does not need labeled data, but it requires a lot of data, many trials, and a way to evaluate the validity of each attempt. (A minimal Q-learning sketch follows the answer explanations below.)
Well-known deep RL algorithms are the deep Q-network (DQN) and the deep deterministic policy gradient (DDPG).

A is wrong because K-means is an unsupervised learning algorithm used for clustering problems. It is useful when you have to create groups of similar entities. So, even though it does not need labeled data, it is not suitable for our purpose.
C is wrong because K-NN is a supervised classification algorithm, therefore it needs labels. New classifications are made by finding the closest known examples.
D is wrong because SVM is a supervised ML algorithm, too. As in K-NN, distances are computed, but not between data points: they are measured against a hyperplane that best separates the different classes.
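For intuition, here is a minimal, self-contained sketch of the tabular Q-learning update on a toy environment. The grid, reward, and all names are illustrative assumptions, not part of any GCP service:

```python
# A toy tabular Q-learning sketch (illustrative only, not tied to any GCP service).
# An agent on a 5-cell line learns, by trial and reward alone, to move right
# towards the goal cell: no labels are provided, only a reward signal.
import random

n_states, n_actions = 5, 2                 # positions 0..4, actions: 0 = left, 1 = right
q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.3      # learning rate, discount, exploration rate

def step(state, action):
    """Toy transition: move left or right; reward 1 only when the goal is reached."""
    nxt = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == n_states - 1 else 0.0)

def greedy(values):
    """Pick the best action, breaking ties at random."""
    best = max(values)
    return random.choice([i for i, v in enumerate(values) if v == best])

for episode in range(300):
    s = 0
    for _ in range(200):                                   # cap the episode length
        a = random.randrange(n_actions) if random.random() < epsilon else greedy(q[s])
        s2, r = step(s, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
        s = s2
        if s == n_states - 1:
            break

print(q)   # after training, "move right" has the higher value in every state
```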
Q 2. Your client has an e-commerce site for commercial spare parts for cars, with competitive prices. It started with the small-car sector but is continually adding products. Since 80% of its customers operate in a B2B market, the client wants to ensure they are quickly and profitably encouraged to adopt the new products gradually added to the site.
Which GCP service can be valuable in this regard and in what way?
A. Create a Tensorflow model using Matrix factorization
B. Use Recommendations AI
C. Import the Product Catalog
D. Record / Import User events
Correct answers: B
Recommendations AI is a ready-to-use service for all the requirements shown in the question. You don't need to create, tune, or train models; all of that is done by the service with your data. Delivery is also automatic, with high-quality recommendations served via web, mobile, and email, so it can be used directly on websites during user sessions.
A could work, but it requires a lot of effort.
C and D deal only with data management, not with creating recommendations.
Q 3. You are working on an NLP model. So, you are dealing with words and sentences, not numbers. Your problem is to categorize these words and make sense of them. Your manager told you that you have to use embeddings.
Which of the following techniques is not related to embeddings?
A. Count Vector
B. TF-IDF Vector
C. Co-Occurrence Matrix
D. CoVariance Matrix
Correct Answer: D
A covariance matrix is a square matrix containing the covariance between each pair of elements.
It measures how changes in one variable relate to changes in another.
All the others are embeddings:
A Count Vector gives a matrix with the count of every single word in every example (0 if there is no occurrence). It is fine for small vocabularies.
TF-IDF vectorization weights word counts across the entire corpus, not just a single example or sentence.
A Co-Occurrence Matrix counts words that occur together, so it is more useful for text understanding.
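A small scikit-learn / NumPy sketch can make the contrast concrete: Count and TF-IDF vectors are text representations, while a covariance matrix is a statistical object. The two sample sentences and feature values are invented for illustration:

```python
# Count vectors and TF-IDF vectors are text representations (simple embeddings);
# a covariance matrix is a statistical object. Sentences and numbers are made up.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the sensor sends data", "the gateway sends sensor data fast"]

count_matrix = CountVectorizer().fit_transform(docs)   # raw word counts per document
tfidf_matrix = TfidfVectorizer().fit_transform(docs)   # counts reweighted across the corpus
print(count_matrix.toarray())
print(tfidf_matrix.toarray())

# A covariance matrix instead measures how numeric features vary together.
features = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 6.2]])
print(np.cov(features, rowvar=False))                  # 2x2 matrix of pairwise covariances
```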
Q 4. You are a junior Data Scientist and are working on a deep neural network model with Tensorflow to optimize the level of customer satisfaction for after-sales services with the goal of creating greater client loyalty.
You are struggling with your model's learning rate and the selection of hidden layers and nodes, trying to optimize processing and make it converge as quickly as possible.
Which is your problem, in ML language?
A. Cross-Validation
B. Regularization
C. Hyperparameter tuning
D. Drift detection management
Correct Answer: C
ML training manages three main categories of data:
Training data, also called examples or records, is the main input for model configuration; in supervised learning it carries labels, that is, the correct answers based on past experience. Input data is used to build the model but will not be part of it.
Parameters are instead the variables the training process has to find. They are part of the final model, and they make the difference between similar models of the same type.
Hyperparameters are configuration variables that influence the training process itself: learning rate, number of hidden layers, number of epochs, regularization, and batch size are all examples of hyperparameters (see the sketch after the answer explanations below).
Hyperparameter tuning happens during the training job; it used to be a manual and tedious process, done by running multiple trials with different values.
The time required to train and test a model can depend upon the choice of its hyperparameters.
With Vertex AI you just need to prepare a simple YAML configuration, without coding.
A is wrong because cross-validation relates to how input data is organized into training, test, and validation sets.
B is wrong because regularization relates to feature management and overfitting.
D is wrong because drift management applies when the data distribution changes and you have to adjust the model.
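As a minimal illustration of the distinction above, the Keras sketch below separates hyperparameters (chosen before training and tuned across trials) from parameters (the weights the training job learns). The values and the random dataset are purely illustrative:

```python
# A minimal Keras sketch contrasting hyperparameters (configure the training
# process) with parameters (learned weights). Data here is random noise.
import numpy as np
import tensorflow as tf

# Hyperparameters: chosen before training, candidates for tuning across trials.
learning_rate = 0.01
hidden_units = 32
batch_size = 16
epochs = 5

x = np.random.rand(200, 4).astype("float32")
y = np.random.rand(200, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_units, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
model.fit(x, y, batch_size=batch_size, epochs=epochs, verbose=0)

# Parameters: the weights found by training; they are part of the final model.
print([w.shape for w in model.get_weights()])
```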
Architect ML solutions
Q 5. You work in a major banking institution. The Management has decided to rapidly launch a bank loan service, as the Government has created a series of “first home” facilities for the younger population.
The goal is to automate the management of the required documents (certificates, origin documents, legal information) so that each loan file can be built and verified automatically from the data and documents provided by customers, and can be handled in a short time with minimal involvement of the scarce specialized personnel.
Which of these GCP services can you use?
A. Dialogflow
B. Document AI
C. Cloud Natural Language API
D. AutoML
Correct answers: B
Document AI is the perfect solution because it is a complete service for the automatic understanding of documents and their management.
It integrates natural language processing, OCR, and computer vision, and provides pre-trained processors (templates) aimed at intelligent document administration.
A is wrong because Dialogflow is for conversational interfaces, not written documents.
C is wrong because the Natural Language API's capabilities are integrated into Document AI.
D is wrong because AutoML-style functions are integrated into Document AI, too.
Q 6. You work for a large retail company. You are preparing a marketing model. The model will have to make predictions based on the historical and analytical data of the e-commerce site (analytics-360). In particular, customer loyalty and remarketing possibilities should be studied. You work on historical tabular data. You want to quickly create an optimal model, both from the point of view of the algorithm used and the tuning and life cycle of the model.
What are the two best services you can use?
A. AutoML Tables
B. BigQuery ML
C. Vertex AI
D. GKE
Correct answers: A and C
AutoML Tables can select the best model for your needs without having to experiment.
The architectures it currently evaluates (several are tried at the same time) are:
Linear
Feedforward deep neural network
Gradient Boosted Decision Tree
AdaNet
Ensembles of various model architectures
In addition, AutoML Tables automatically performs feature engineering tasks such as:
Normalization
Encoding and embeddings for categorical features
Timestamp column management (important in our case)
So, it has special handling for time columns: for example, it can correctly split the input data into training, validation, and test sets.
Vertex AI is a newer platform that combines AutoML and AI Platform. You can use both AutoML training and custom training in the same environment.
B is wrong because AutoML Tables has additional automated feature engineering and is integrated into Vertex AI.
D is wrong because GKE doesn't supply all the ML features of Vertex AI; it is an advanced managed Kubernetes environment.
Q 7. Your company operates an innovative auction site for furniture from all periods. You have to create a series of ML models that, starting from the photos, establish the period, style, and type of the piece of furniture depicted.
Furthermore, the model must be able to determine whether a piece is interesting and should be flagged for a more detailed appraisal. You want Google Cloud to help you reach this ambitious goal faster.
Which of the following services do you think is the most suitable?
A. AutoML Vision Edge
B. Vision AI
C. Video AI
D. AutoML Vision
Correct Answer: D
Vision AI uses pre-trained models trained by Google. This is powerful, but not enough here.
AutoML Vision, instead, lets you train models to classify your images with your own characteristics and labels, so you can tailor the work to your needs.
A is wrong because AutoML Vision Edge is for local (edge) devices.
C is wrong because Video AI manages videos, not pictures. It can extract metadata from any streaming video, get insights in a far shorter time, and trigger events.

Q 8. You are using AI Platform, and you are working with a series of demanding training jobs. So, you want to use TPUs instead of CPUs. You are not using Docker images or custom containers.
What is the simplest configuration to indicate if you do not have particular needs to customize in the YAML configuration file?
A. Use scale-tier to BASIC_TPU
B. Set Master-machine-type
C. Set Worker-machine-type
D. Set parameterServerType
Correct Answer: A
AI Platform lets you perform distributed training and serving with accelerators (TPUs and GPUs).
You usually must specify the number and types of machines you need for master and worker VMs. But you can also use scale tiers that are predefined cluster specifications.
In our case,
scale-tier=BASIC_TPU
covers all the given requirements.
B, C and D are wrong because they are not the easiest way. Moreover, settings such as workerType, parameterServerType, evaluatorType, workerCount, parameterServerCount, and evaluatorCount are meant for jobs that use custom containers and for TensorFlow jobs.

Q 9. You work for an industrial company that wants to improve its quality system. It has developed its own deep neural network model with TensorFlow to identify, from images taken on the production lines in the various production phases, the semi-finished products to be discarded.
You need to monitor the performance of your models and make them faster.
Which is the best solution that you can adopt?
A. TFProfiler
B. TF function
C. TF Trace
D. TF Debugger
E. TF Checkpoint
Correct Answer: A
TensorFlow Profiler is a tool for checking the performance of your TensorFlow models and helping you obtain an optimized version.
In TensorFlow 2, eager execution is the default: one-off operations are faster, but recurring ones may be slower, so the model needs to be optimized (see the profiling sketch after the answer explanations below).
B is wrong because tf.function is a transformation tool used to make graphs out of your programs. It helps to create performant and portable models, but it is not a tool for performance analysis.
C is wrong because TF tracing lets you record TensorFlow Python operations in a graph.
D is wrong because TF debugging refers to Debugger V2 and creates a log of debug information.
E is wrong because checkpoints capture the value of all parameters in a serialized SavedModel format.
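A hedged sketch of how this is typically done in TensorFlow 2: the repeated training step is wrapped in tf.function so it runs as a graph, and the profiler records a trace that can be inspected in TensorBoard. The model, data shapes, and log directory are illustrative assumptions:

```python
# Profiling a recurring training step in TensorFlow 2 (toy model and data).
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(8, input_shape=(4,)),
                             tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()
x = tf.constant(np.random.rand(64, 4), dtype=tf.float32)
y = tf.constant(np.random.rand(64, 1), dtype=tf.float32)

@tf.function  # compiles the repeated step into a graph instead of eager ops
def train_step(inputs, targets):
    with tf.GradientTape() as tape:
        loss = loss_fn(targets, model(inputs, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

tf.profiler.experimental.start("logs/profile")   # start collecting a trace
for _ in range(10):
    train_step(x, y)
tf.profiler.experimental.stop()                  # inspect the trace in TensorBoard
```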
Q 10. Your team needs to create a model for managing security in restricted areas of a campus.
Everything that happens in these areas is filmed and, instead of having a physical surveillance service, the videos must be managed by a model capable of intercepting unauthorized people and vehicles, especially at particular times.
What are the GCP services that allow you to achieve all this with minimal effort?
A. AI Infrastructure
B. Cloud Video Intelligence AI
C. AutoML Video Intelligence Classification
D. Vision AI
Correct Answer: C
AutoML Video Intelligence is a service that allows you to customize the pre-trained Video intelligence GCP system according to your specific needs.
In particular, AutoML Video Intelligence Object Tracking allows you to identify and locate particular entities of interest to you with your specific tags.

A is wrong because AI Infrastructure lets you manage hardware configurations for ML systems, in particular the processors used to accelerate machine learning workloads.
B is wrong because Cloud Video Intelligence AI is a pre-configured, ready-to-use service and therefore not customizable for specific needs.
D is wrong because Vision AI is for images, not video.
Q 11. With your team you have to decide the strategy for implementing an online forecasting model in production.
This model needs to work with a web interface as well as Dialogflow and Google Assistant, and a lot of requests are expected.
You are concerned that the final system is not efficient and scalable enough, and you are looking for the simplest and most managed GCP solution.
Which of these can be the solution?
A. AI Platform Prediction
B. GKE and TensorFlow
C. VMs and Autoscaling Groups with Application LB
D. Kubeflow
Correct Answer: A
The AI Platform Prediction service is fully managed and automatically scales machine learning models in the cloud.
The service supports both online prediction and batch prediction.
B and C are wrong because they are not managed services.
D is wrong because Kubeflow is not a managed service; it is used within AI Platform and lets you deploy ML systems to various environments.
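As a rough illustration of how a client might call the service, here is a hedged sketch of an online prediction request through the AI Platform REST API via the Google API Python client. The project, model, and payload names are hypothetical placeholders, and credentials are assumed to come from the environment:

```python
# A hedged online-prediction sketch against AI Platform Prediction.
# Project, model, and feature names below are hypothetical placeholders.
import googleapiclient.discovery

def online_predict(project, model, instances, version=None):
    """Send instances to a deployed AI Platform model and return its predictions."""
    service = googleapiclient.discovery.build("ml", "v1")
    name = f"projects/{project}/models/{model}"
    if version is not None:
        name = f"{name}/versions/{version}"
    response = service.projects().predict(name=name, body={"instances": instances}).execute()
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["predictions"]

# Example call with placeholder values:
# online_predict("my-project", "forecast_model", [{"feature_1": 3.2, "feature_2": 0.7}])
```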
Design data preparation and processing systems
Q 12. You work for a digital publishing website with an excellent technical and cultural level, where you have both famous authors and unknown experts who express ideas and insights.
You, therefore, have an extremely demanding audience with strong interests that can be of various types.
Users have a small set of articles that they can read for free every month. Then they need to sign up for a paid subscription.
You have been asked to prepare an ML training model that processes user readings and article preferences. You need to predict trends and topics that users will prefer.
But when you train your DNN with TensorFlow, your input data does not fit into RAM.
What can you do in the simplest way?
A. Use tf.data.Dataset
B. Use a queue with tf.train.shuffle_batch
C. Use pandas.DataFrame
D. Use a NumPy array
Correct Answer: A
tf.data.Dataset lets you manage a collection of complex elements made up of several inner components.
It is designed to build efficient input pipelines and to iterate over the data for processing.
Iteration happens in a streaming fashion, so it works even if the input data is very large and doesn't fit in memory.
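A minimal sketch of such a streaming pipeline, where a Python generator stands in for lazily reading user/article records from disk; the feature shape and record count are illustrative assumptions:

```python
# Streaming training data with tf.data instead of loading everything into RAM.
import numpy as np
import tensorflow as tf

def record_generator():
    """Yield (features, label) pairs one at a time, as if read lazily from disk."""
    for _ in range(10_000):
        yield np.random.rand(20).astype("float32"), int(np.random.randint(0, 2))

dataset = (
    tf.data.Dataset.from_generator(
        record_generator,
        output_signature=(
            tf.TensorSpec(shape=(20,), dtype=tf.float32),
            tf.TensorSpec(shape=(), dtype=tf.int32),
        ),
    )
    .shuffle(1_000)                  # shuffle within a buffer, not the whole dataset
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)      # overlap preprocessing with training
)

for features, labels in dataset.take(1):
    print(features.shape, labels.shape)   # (64, 20) (64,)
```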
Q 13. You are working on a deep neural network model with TensorFlow. Your model is complex, and you work with very large datasets full of numbers.
You want to increase performance, but you cannot use additional resources.
You are afraid that you are not going to deliver your project in time.
Your mentor said that normalization could be a solution.
Which of the following choices do you think is not for data normalization?
A. Scaling to a range
B. Feature Clipping
C. Z-test
D. log scaling
E. Z-score
Correct Answer: C
The z-test is not a normalization technique: it is a statistic used to determine whether a sample mean belongs to a specific population. For example, it is used in medical trials to prove whether a new drug is effective or not.
A is OK because scaling to a range converts numbers into a standard range (0 to 1 or -1 to 1).
B is OK because feature clipping caps all numbers outside a certain range.
D is OK because log scaling uses the logarithm of your values to change the shape of the distribution. This works because the log function preserves monotonicity.
E is OK because Z-score is a variation of scaling: after subtracting the mean, the values are divided by the standard deviation, yielding a distribution with mean = 0 and std = 1.
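A short NumPy sketch of the four normalization techniques mentioned above, applied to an arbitrary example vector:

```python
# Scaling to a range, clipping, log scaling, and z-score on made-up values.
import numpy as np

x = np.array([1.0, 2.0, 5.0, 10.0, 200.0])

# Scaling to a range: map values linearly into [0, 1].
scaled = (x - x.min()) / (x.max() - x.min())

# Feature clipping: cap values outside a chosen range (here 0..20).
clipped = np.clip(x, 0.0, 20.0)

# Log scaling: compress a long-tailed distribution while preserving order.
log_scaled = np.log(x)

# Z-score: subtract the mean and divide by the standard deviation,
# so the result has mean 0 and standard deviation 1.
z_score = (x - x.mean()) / x.std()

print(scaled, clipped, log_scaled, z_score, sep="\n")
```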
Q 14. You need to develop and train a model capable of analyzing snapshots taken from a moving vehicle and detecting whether obstacles arise. Your work environment is AI Platform (now Vertex AI).
Which technique or algorithm do you think is best to use?
A. TabNet algorithm with TensorFlow
B. A linear learner with Tensorflow Estimator API
C. XGBoost with BigQueryML
D. TensorFlow Object Detection API
Correct Answer: D
TensorFlow Object Detection API is designed to identify and localize multiple objects within an image. So it is the best solution.
Q 15. You are starting to operate as a Data Scientist and are working on a deep neural network model with TensorFlow to optimize the level of customer satisfaction for after-sales services with the goal of creating greater client loyalty.
You are doing Feature Engineering, and your focus is to minimize bias and increase accuracy. Your coordinator has told you that by doing so you risk having problems. He explained to you that, in addition to the bias, you must consider another factor to be optimized. Which one?
A. Blending
B. Learning Rate
C. Feature Cross
D. Bagging
E. Variance
Correct Answer: E
Variance indicates how much the learned function f(X) can change with a different training dataset. Different training sets will obviously produce different estimates, but a good model should keep this gap to a minimum.
The bias-variance dilemma is the attempt to minimize both bias and variance.
The bias error comes from the simplifying assumptions of the learning algorithm: the higher it is, the more underfitting there is.
Variance is the sensitivity to differences in the training set: the higher it is, the more overfitting there is.
Q 16. You have a Linear Regression model for the optimal management of supplies to a sales network based on a large number of different driving factors. You want to simplify the model to make it more efficient and faster. Your first goal is to synthesize the features without losing the information content that comes from them.
Which of these is the best technique?
A. Feature Crosses
B. Principal component analysis (PCA)
C. Embeddings
D. Functional Data Analysis
Correct Answer: B
Principal component analysis is a technique to reduce the number of features by creating new variables, obtained from linear combinations or mixes of the original variables, which can then replace them while retaining most of the information useful for the model. In addition, the new features are all independent of each other.
The new variables are called principal components.
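A brief scikit-learn sketch of the idea: many correlated driving factors are compressed into a few independent components. The synthetic data is invented purely for illustration:

```python
# PCA reducing 10 correlated features to 3 independent principal components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
# 10 features built as mixes of 3 underlying factors plus a little noise.
features = base @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(500, 10))

pca = PCA(n_components=3)
components = pca.fit_transform(features)   # new, uncorrelated variables
print(components.shape)                    # (500, 3)
print(pca.explained_variance_ratio_)       # information retained per component
```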
Q 17. You work for a digital publishing website with an excellent technical and cultural level, where you have both famous authors and unknown experts who express ideas and insights. You, therefore, have an extremely demanding audience with strong interests of various types. Users have a small set of articles that they can read for free every month; after that, they need to sign up for a paid subscription.
You aim to point your audience to articles that they will genuinely find interesting.
Which of these models can be useful to you?
A. Hierarchical Clustering
B. Autoencoder and self-encoder
C. Convolutional Neural Network
D. Collaborative filtering using Matrix Factorization
Correct Answer: D
Collaborative filtering works on the idea that a user may like the same things as people with similar profiles and preferences.
So, by exploiting the choices of other users, the recommendation system makes a guess and can suggest items that a user has not yet rated.
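As a toy illustration of collaborative filtering via matrix factorization, the sketch below learns low-dimensional user and article embeddings whose dot product approximates the observed ratings; the ratings matrix and all values are invented:

```python
# Toy matrix factorization for collaborative filtering (invented ratings).
import numpy as np

ratings = np.array([   # rows = users, columns = articles, 0 = not rated yet
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

n_users, n_items, k = ratings.shape[0], ratings.shape[1], 2
rng = np.random.default_rng(0)
user_f = rng.normal(scale=0.1, size=(n_users, k))   # user embeddings
item_f = rng.normal(scale=0.1, size=(n_items, k))   # article embeddings
lr, reg = 0.01, 0.02

for _ in range(2000):
    for u in range(n_users):
        for i in range(n_items):
            if ratings[u, i] > 0:                    # train only on observed ratings
                err = ratings[u, i] - user_f[u] @ item_f[i]
                user_f[u] += lr * (err * item_f[i] - reg * user_f[u])
                item_f[i] += lr * (err * user_f[u] - reg * item_f[i])

predicted = user_f @ item_f.T   # fills in the unrated cells with guesses
print(np.round(predicted, 1))
```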
Q 18. You work for an important banking group.
The purpose of your current project is the automatic and smart acquisition of data from documents and modules of different types.
You work on big datasets with a lot of private information that cannot be distributed and disclosed.
You are asked to replace sensitive data with specific surrogate characters.
Which of the following techniques do you think is best to use?
Correct Answer: D
Masking replaces sensitive values with a given surrogate character, like hash (#) or asterisk (*).
Format-preserving encryption (FPE) encrypts in the same format as the plaintext data.
For example, a 16-digit credit card number becomes another 16-digit number.
k-anonymity is a way to anonymize data so that it is impossible to identify person-specific information, while still retaining the information contained in the record.
Replacement just substitutes a sensitive element with a specified value.
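A plain-Python sketch of what masking means in practice; Cloud DLP provides this as a managed transformation, and the helper below, including its name and parameters, is only a hypothetical illustration:

```python
# Masking: replace sensitive characters with a surrogate while keeping the format.
def mask(value: str, surrogate: str = "#", keep_last: int = 4) -> str:
    """Replace all but the last `keep_last` alphanumeric characters with a surrogate."""
    chars = list(value)
    positions = [i for i, c in enumerate(chars) if c.isalnum()]
    to_mask = positions[:-keep_last] if keep_last else positions
    for i in to_mask:
        chars[i] = surrogate
    return "".join(chars)

print(mask("4111 2222 3333 4444"))                        # '#### #### #### 4444'
print(mask("mario.rossi@example.com", surrogate="*", keep_last=0))
```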
Q 19. Your company traditionally deals with statistical analysis of data. For some years its services have been integrated with ML models for forecasting, but analyses and simulations of all kinds are still carried out.
So you are using two types of tools, but you have been told that it is possible to achieve deeper integration between traditional statistical methodologies and AI/ML processes.
Which tool is the best one for your needs?
A. TensorFlow Hub
B. TensorFlow Probability
C. TensorFlow Enterprise
D. TensorFlow Statistics
Correct answers: B
TensorFlow Probability is a Python library for statistical analysis and probability, which can be processed on TPU and GPU, too.
The main features of TensorFlow Probability are:
Probability distributions and differentiable, injective (one-to-one) functions
Tools for building deep probabilistic models
Inference and simulation methods, including Markov chain Monte Carlo
Optimizers such as Nelder-Mead, BFGS, and SGLD
All the other answers are wrong because they don’t deal with traditional statistical methodologies.
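A hedged TensorFlow Probability sketch showing how a classical statistical object (a Normal distribution) mixes with differentiable TensorFlow code; it assumes the tensorflow-probability package is installed, and the values are arbitrary:

```python
# Sampling, log-likelihoods, and gradients with TensorFlow Probability.
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# A probability distribution that supports sampling and log-likelihoods.
prior = tfd.Normal(loc=0.0, scale=1.0)
samples = prior.sample(5)
print(prior.log_prob(samples))

# Distributions are differentiable, so they can sit inside a TF training loop.
loc = tf.Variable(0.5)
with tf.GradientTape() as tape:
    nll = -tf.reduce_mean(tfd.Normal(loc=loc, scale=1.0).log_prob([0.1, 0.2, 0.3]))
print(tape.gradient(nll, loc))   # gradient of the negative log-likelihood
```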
Q 20. Your customer has an online dating platform that, among other things, analyzes the degree of affinity between people. It already uses ML models, in particular XGBoost, the gradient-boosting decision tree algorithm, and is obtaining excellent results.
All its development processes follow CI/CD specifications and use Docker containers. The requirement is to classify users in various ways and to update models frequently, based on new parameters entered into the platform by the users themselves.
So, the problem you are called to solve is how to optimize frequently re-trained operations with an optimized workflow system. Which solution among these proposals best meets your needs?
A. Deploy the model on BigQuery ML and setup a job
B. Use Kubeflow Pipelines to design and execute your workflow
C. Use AI Platform
D. Orchestrate activities with Google Cloud Workflows
E. Develop procedures with Pub/Sub and Cloud Run
F. Schedule processes with Cloud Composer
Correct Answer: B
Kubeflow Pipelines is the ideal solution because it is a platform designed specifically for creating and deploying ML workflows based on Docker containers. So, it is the only answer that meets all requirements.
The main functions of Kubeflow Pipelines are:
Using packaged templates in Docker images in a K8s environment
Managing your various tests/experiments
Simplifying the orchestration of ML pipelines
Reusing components and pipelines
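As a rough sketch (assuming the KFP 1.x SDK), the snippet below chains two lightweight Python components into a pipeline that could be run on a schedule for frequent retraining. The component bodies and names are placeholders; real steps would run the XGBoost training inside Docker images:

```python
# A toy Kubeflow Pipelines definition: preprocessing followed by training.
import kfp
from kfp import dsl
from kfp.components import create_component_from_func

def preprocess(message: str) -> str:
    return message + " preprocessed"

def train(data: str) -> str:
    return "model trained on: " + data

preprocess_op = create_component_from_func(preprocess)
train_op = create_component_from_func(train)

@dsl.pipeline(name="retraining-pipeline", description="Toy retraining workflow")
def retraining_pipeline(message: str = "new user parameters"):
    prep_task = preprocess_op(message=message)
    train_op(data=prep_task.output)          # runs after preprocessing completes

# Compile to a package that Kubeflow Pipelines can execute on a schedule.
kfp.compiler.Compiler().compile(retraining_pipeline, "retraining_pipeline.yaml")
```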
Q 21. You are working with Vertex AI, the managed ML platform in GCP.
You are dealing with custom training, and you are studying how a job progresses through the training service lifecycle.
Which of the following states is not correct?
A. JOB_STATE_ACTIVE
B. JOB_STATE_RUNNING
C. JOB_STATE_QUEUED
D. JOB_STATE_ENDED
Correct answer: A
This is a brief description of the lifecycle of a custom training service.
Queueing a new job
When you create a CustomJob or HyperparameterTuningJob, the job is in the JOB_STATE_QUEUED state.
When a training job starts, Vertex AI schedules as many workers as the configuration requires, in parallel.
So Vertex AI starts running code as soon as a worker becomes available.
When all the workers are available, the job state will be: JOB_STATE_RUNNING.
A training job ends successfully when its primary replica exits with exit code 0.
Therefore all the other workers will be stopped. The state will be: JOB_STATE_ENDED.
So A is wrong simply because this state doesn’t exist. All the other answers are correct.
Q 22. Your team works for an international company with Google Cloud, and you develop, train, and deploy several ML models with TensorFlow. You use many tools and techniques and want to make your work leaner, faster, and more efficient.
You would like engineer-to-engineer assistance from both Google Cloud and Google's TensorFlow teams.
Which service makes this possible?
A. AI Platform
B. Kubeflow
C. Tensorflow Enterprise
D. TFX
Correct Answer: C
TensorFlow Enterprise is a distribution of the open-source ML platform, linked to specific versions of TensorFlow and tailored for enterprise customers.
It is free, but only for large enterprises with many services in GCP. It is prepackaged and optimized for use with containers and VMs.
It works in Google Cloud, from VM images to managed services like GKE and Vertex AI.
The TensorFlow Enterprise library is integrated in the following products:
Deep Learning VM Images
Deep Learning Containers
Notebooks
AI Platform / Vertex AI Training
It is ready for automatic provisioning and scaling with any kind of processor.
It has a premium level of support from Google.
A is wrong because AI Platform is a managed service without the kind of support required
B and D are wrong because they are open source libraries with standard support from the community
Q 23. You work for an important organization, and your manager has tasked you with building a new classification model using lots of data drawn from the company data lake.
The big problem is that you have labels for only a subset of the data, and you have very little time to complete the task.
Which of the following services could help you?
A. Vertex Data Labeling
B. Mechanical Turk
C. GitLab ML
D. Tag Manager
Correct Answer: A
In supervised learning, the correctness of the labels, together with the quality of all your training data, is crucial for the resulting model and the quality of future predictions.
If you cannot get your data correctly labeled yourself, you can request professional labelers to complete your training data.
GCP has a service for this: Vertex AI data labeling. Human labelers will prepare correct labels following your directions.
You have to set up a data labeling job with:
The dataset
A list (vocabulary) of the possible labels
An instruction document for the human labelers
B is wrong because Mechanical Turk is an Amazon service
C is wrong because GitLab is a DevOps lifecycle tool
D is wrong because Tag Manager is in the Google Analytics ecosystem
Q 24. Your team is working on a large number of ML projects, especially with TensorFlow.
You recently prepared a DNN model for image recognition that works well and is about to be rolled out in production.
Your manager has asked you to demonstrate the inner workings of the model.
This is a big problem for you because you know the model works well, but you cannot explain how it reaches its predictions.
Which of these techniques could help you?
A. Integrated Gradient
B. LIT
C. WIT
D. PCA
Correct Answer: A
Integrated Gradient is an explainability technique for deep neural networks which gives info about what contributes to the model’s prediction.
Integrated Gradients works by highlighting feature importance. It computes the gradient of the model's prediction output with respect to its input features, without any modification to the original model.
It interpolates the inputs from a baseline and accumulates attributions along that path, producing the feature importances for the input image.
You can use tf.GradientTape to compute the gradients.
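A hedged sketch of the Integrated Gradients approximation with tf.GradientTape: gradients of the prediction are accumulated along a straight path from a baseline (a black image) to the input, giving per-pixel attributions. The toy model and image are placeholders:

```python
# Approximate Integrated Gradients for a toy classifier and random image.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(8, 8, 3)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
image = tf.random.uniform((8, 8, 3))
baseline = tf.zeros_like(image)          # "black image" baseline
target_class = 3
steps = 50

# Interpolate between the baseline and the input image.
alphas = tf.reshape(tf.linspace(0.0, 1.0, steps), (steps, 1, 1, 1))
interpolated = baseline + alphas * (image - baseline)

with tf.GradientTape() as tape:
    tape.watch(interpolated)
    probs = model(interpolated)[:, target_class]
grads = tape.gradient(probs, interpolated)       # gradients along the path

avg_grads = tf.reduce_mean(grads, axis=0)        # approximate path integral
attributions = (image - baseline) * avg_grads    # feature importances per pixel
print(attributions.shape)                        # (8, 8, 3)
```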
B is wrong because LIT is only for NLP models
C is wrong because What-If Tool is only for classification and regression models with structured data.
D is wrong because Principal component analysis (PCA) transforms and reduces the number of features by creating new variables, from linear combinations of the original variables.
The new features will be all independent of each other.
Q 25. You work as a Data Scientist in a startup, and you work on several projects with Python and TensorFlow.
You need to increase the performance of the training sessions, and you already use caching and prefetching.
So now you want to use GPUs, but on a single machine, to reduce costs and for experimentation.
Which of the following is the correct strategy?
A. tf.distribute.MirroredStrategy
B. tf.distribute.TPUStrategy
C. tf.distribute.MultiWorkerMirroredStrategy
D. tf.distribute.OneDeviceStrategy
Correct Answer: A
tf.distribute.Strategy is an API explicitly for training distribution among different processors and machines.
tf.distribute.MirroredStrategy lets you use multiple GPUs in a single VM, with a replica of the model on each GPU.
B is wrong because tf.distribute.TPUStrategy lets you use TPUs, not GPUs.
C is wrong because tf.distribute.MultiWorkerMirroredStrategy is for multiple machines.
D is wrong because tf.distribute.OneDeviceStrategy, like the default strategy, places everything on a single device, so it would not use the multiple GPUs.
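A short sketch of the strategy in use: the model and optimizer are created inside the strategy scope, and Keras replicates them across the GPUs visible on the single machine (the same code also runs on a CPU-only machine for testing). Shapes and the random data are illustrative:

```python
# Single-machine, multi-GPU training with MirroredStrategy.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():                    # model/optimizer variables are mirrored
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

x = np.random.rand(512, 10).astype("float32")
y = np.random.rand(512, 1).astype("float32")
model.fit(x, y, epochs=2, batch_size=64, verbose=0)  # batches split across replicas
```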
You are building an ML model to detect anomalies in real-time sensor data. You will use Pub/Sub to handle incoming requests. You want to store the results for analytics and visualization. How should you configure the pipeline?
A. 1 = Dataflow, 2 = AI Platform, 3 = BigQuery
B. 1 = DataProc, 2 = AutoML, 3 = Cloud Bigtable
C. 1 = BigQuery, 2 = AutoML, 3 = Cloud Functions
D. 1 = BigQuery, 2 = AI Platform, 3 = Cloud Storage
C. 1 = BigQuery, 2 = AutoML, 3 = Cloud Functions
Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?
A. 1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station. 2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.
B. 1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station. 2. Dispatch an available shuttle and provide the map with the required stops based on the prediction.
C. 1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints. 2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.
D. 1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents and a reward function around a distance-based metric. 2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.
C. 1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints. 2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.
You were asked to investigate failures of a production line component based on sensor readings. After receiving the dataset, you discover that less than 1% of the readings are positive examples representing failure incidents. You have tried to train several classification models, but none of them converge. How should you resolve the class imbalance problem?
A. Use the class distribution to generate 10% positive examples.
B. Use a convolutional neural network with max pooling and softmax activation.
C. Downsample the data with upweighting to create a sample with 10% positive examples.
D. Remove negative examples until the numbers of positive and negative examples are equal.
C. Downsample the data with upweighting to create a sample with 10% positive examples.
You want to rebuild your ML pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over 12 hours to run. To speed up development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting the speed and processing requirements?
A. Use Data Fusion’s GUI to build the transformation pipelines, and then write the data into BigQuery.
B. Convert your PySpark into SparkSQL queries to transform the data, and then run your pipeline on Dataproc to write the data into BigQuery.
C. Ingest your data into Cloud SQL, convert your PySpark commands into SQL queries to transform the data, and then use federated queries from BigQuery for machine learning.
D. Ingest your data into BigQuery using BigQuery Load, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table.
D. Ingest your data into BigQuery using BigQuery Load, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table.
You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use many different frameworks, including Keras, PyTorch, theano, Scikit-learn, and custom libraries. What should you do?
A. Use the AI Platform custom containers feature to receive training jobs using any framework.
B. Configure Kubeflow to run on Google Kubernetes Engine and receive training jobs through TF Job.
C. Create a library of VM images on Compute Engine, and publish these images on a centralized repository.
D. Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.
A. Use the AI Platform custom containers feature to receive training jobs using any framework.
You work for an online retail company that is creating a visual search engine. You have set up an end-to-end ML pipeline on Google Cloud to classify whether an image contains your company’s product. Expecting the release of new products in the near future, you configured a retraining functionality in the pipeline so that new data can be fed into your ML models. You also want to use AI Platform’s continuous evaluation service to ensure that the models have high accuracy on your test dataset. What should you do?
A. Keep the original test dataset unchanged even if newer products are incorporated into retraining.
B. Extend your test dataset with images of the newer products when they are introduced to retraining.
C. Replace your test dataset with images of the newer products when they are introduced to retraining.
D. Update your test dataset with images of the newer products when your evaluation metrics drop below a pre-decided threshold.
B. Extend your test dataset with images of the newer products when they are introduced to retraining.
You need to build classification workflows over several structured datasets currently stored in BigQuery. Because you will be performing the classification several times, you want to complete the following steps without writing code: exploratory data analysis, feature selection, model building, training, and hyperparameter tuning and serving. What should you do?
A. Configure AutoML Tables to perform the classification task.
B. Run a BigQuery ML task to perform logistic regression for the classification.
C. Use AI Platform Notebooks to run the classification model with the pandas library.
D. Use AI Platform to run the classification model job configured for hyperparameter tuning.
A. Configure AutoML Tables to perform the classification task.
You work for a public transportation company and need to build a model to estimate delay times for multiple transportation routes. Predictions are served directly to users in an app in real time. Because different seasons and population increases impact the data relevance, you will retrain the model every month. You want to follow Google-recommended best practices. How should you configure the end-to-end architecture of the predictive model?
A. Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.
B. Use a model trained and deployed on BigQuery ML, and trigger retraining with the scheduled query feature in BigQuery.
C. Write a Cloud Functions script that launches a training and deploying job on AI Platform that is triggered by Cloud Scheduler.
D. Use Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model.
A. Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.
You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?
A. Use Cloud Functions to identify changes to your code in Cloud Storage and trigger a retraining job.
B. Use the gcloud command-line tool to submit training jobs on AI Platform when you update your code.
C. Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository.
D. Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.
C. Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository.
You are designing an ML recommendation model for shoppers on your company’s ecommerce website. You will use Recommendations AI to build, test, and deploy your system. How should you develop recommendations that increase revenue while following best practices?
A. Use the “Other Products You May Like” recommendation type to increase the click-through rate.
B. Use the “Frequently Bought Together” recommendation type to increase the shopping cart size for each order.
C. Import your user events and then your product catalog to make sure you have the highest quality event stream.
D. Because it will take time to collect and record product data, use placeholder values for the product catalog to test the viability of the model.
B. Use the “Frequently Bought Together” recommendation type to increase the shopping cart size for each order.
You have trained a deep neural network model on Google Cloud. The model has low loss on the training data, but is performing worse on the validation data. You want the model to be resilient to overfitting. Which strategy should you use when retraining the model?
A. Apply a dropout parameter of 0.2, and decrease the learning rate by a factor of 10.
B. Apply a L2 regularization parameter of 0.4, and decrease the learning rate by a factor of 10.
C. Run a hyperparameter tuning job on AI Platform to optimize for the L2 regularization and dropout parameters.
D. Run a hyperparameter tuning job on AI Platform to optimize for the learning rate, and increase the number of neurons by a factor of 2.
C. Run a hyperparameter tuning job on AI Platform to optimize for the L2 regularization and dropout parameters.
You built and manage a production system that is responsible for predicting sales numbers. Model accuracy is crucial, because the production model is required to keep up with market changes. Since being deployed to production, the model hasn’t changed; however the accuracy of the model has steadily deteriorated.
What issue is most likely causing the steady decline in model accuracy?
A. Poor data quality
B. Lack of model retraining
C. Too few layers in the model for capturing information
D. Incorrect data split ratio during model training, evaluation, validation, and test
B. Lack of model retraining
You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at a low latency. You discover that your input data does not fit in memory. How should you create a dataset following Google-recommended best practices?
A. Create a tf.data.Dataset.prefetch transformation.
B. Convert the images to tf.Tensor objects, and then run Dataset.from_tensor_slices().
C. Convert the images to tf.Tensor objects, and then run tf.data.Dataset.from_tensors().
D. Convert the images into TFRecords, store the images in Cloud Storage, and then use the tf.data API to read the images for training.
D. Convert the images into TFRecords, store the images in Cloud Storage, and then use the tf.data API to read the images for training.
You are an ML engineer at a large grocery retailer with stores in multiple regions. You have been asked to create an inventory prediction model. Your model’s features include region, location, historical demand, and seasonal popularity. You want the algorithm to learn from new inventory data on a daily basis. Which algorithms should you use to build the model?
A. Classification
B. Reinforcement Learning
C. Recurrent Neural Networks (RNN)
D. Convolutional Neural Networks (CNN)
C. Recurrent Neural Networks (RNN)
You are building a real-time prediction engine that streams files which may contain Personally Identifiable Information (PII) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the PII is not accessible by unauthorized individuals?
A. Stream all files to Google Cloud, and then write the data to BigQuery. Periodically conduct a bulk scan of the table using the DLP API.
B. Stream all files to Google Cloud, and write batches of the data to BigQuery. While the data is being written to BigQuery, conduct a bulk scan of the data using the DLP API.
C. Create two buckets of data: Sensitive and Non-sensitive. Write all data to the Non-sensitive bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket.
D. Create three buckets of data: Quarantine, Sensitive, and Non-sensitive. Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-Sensitive bucket.
D. Create three buckets of data: Quarantine, Sensitive, and Non-sensitive. Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-Sensitive bucket.
You have written unit tests for a Kubeflow Pipeline that require custom libraries. You want to automate the execution of unit tests with each new push to your development branch in Cloud Source Repositories. What should you do?
A. Write a script that sequentially performs the push to your development branch and executes the unit tests on Cloud Run.
B. Using Cloud Build, set an automated trigger to execute the unit tests when changes are pushed to your development branch.
C. Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Configure a Pub/Sub trigger for Cloud Run, and execute the unit tests on Cloud Run.
D. Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Execute the unit tests using a Cloud Function that is triggered when messages are sent to the Pub/Sub topic.
B. Using Cloud Build, set an automated trigger to execute the unit tests when changes are pushed to your development branch.
You are an ML engineer at a global shoe store. You manage the ML models for the company’s website. You are asked to build a model that will recommend new products to the user based on their purchase behavior and similarity with other users. What should you do?
A. Build a classification model
B. Build a knowledge-based filtering model
C. Build a collaborative-based filtering model
D. Build a regression model using the features as predictors
C. Build a collaborative-based filtering model
You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model’s final layer softmax threshold to increase precision?
A. Increase the recall.
B. Decrease the recall.
C. Increase the number of false positives.
D. Decrease the number of false negatives.
B. Decrease the recall.
You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members on your team prefer a codeless interface for building Extract, Transform, Load (ETL) processes. Which service should you use?
A. Dataflow
B. Dataprep
C. Apache Flink
D. Cloud Data Fusion
D. Cloud Data Fusion
You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?
A. Validate the accuracy of the model that you trained on preprocessed data. Create a new model that uses the raw data and is available in real time. Deploy the new model onto AI Platform for online prediction.
B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
C. Stream incoming prediction request data into Cloud Spanner. Create a view to abstract your preprocessing logic. Query the view every second for new records. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
D. Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
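A minimal streaming sketch of option B with Apache Beam, assuming hypothetical topic names, project ID, model name, and record schema; the preprocessing function stands in for whatever expensive transformations were applied at training time.

```python
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Placeholder names for illustration only.
PROJECT = "my-project"
INPUT_TOPIC = "projects/my-project/topics/prediction-requests"
OUTPUT_TOPIC = "projects/my-project/topics/prediction-results"


def preprocess(record: dict) -> dict:
    # Re-implement the same feature engineering used at training time.
    record["features"] = [float(v) for v in record["raw_values"]]
    return record


class Predict(beam.DoFn):
    def setup(self):
        from googleapiclient import discovery
        self._service = discovery.build("ml", "v1")

    def process(self, record):
        name = f"projects/{PROJECT}/models/my_model"  # placeholder model name
        response = self._service.projects().predict(
            name=name, body={"instances": [record["features"]]}
        ).execute()
        yield json.dumps(response).encode("utf-8")


options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as p:
    (p
     | "Read" >> beam.io.ReadFromPubSub(topic=INPUT_TOPIC)
     | "Parse" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
     | "Preprocess" >> beam.Map(preprocess)
     | "Predict" >> beam.ParDo(Predict())
     | "Write" >> beam.io.WriteToPubSub(topic=OUTPUT_TOPIC))
```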
Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?
A. Create alerts to monitor for skew, and retrain the model.
B. Perform feature selection on the model, and retrain the model with fewer features.
C. Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service.
D. Perform feature selection on the model, and retrain the model on a monthly basis with fewer features.
A. Create alerts to monitor for skew, and retrain the model.
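One simple way to monitor for skew is to compare the distribution of each feature in serving traffic against its training distribution and alert when they diverge. The sketch below uses a two-sample Kolmogorov-Smirnov test on synthetic data; the alert threshold is an assumed policy choice, not a fixed rule.

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical feature samples: what the model saw at training time
# versus what it is receiving in production today.
training_feature = np.random.normal(loc=0.0, scale=1.0, size=5000)
serving_feature = np.random.normal(loc=0.8, scale=1.2, size=5000)

stat, p_value = ks_2samp(training_feature, serving_feature)
ALERT_THRESHOLD = 0.01  # assumed alerting policy; tune for your use case

if p_value < ALERT_THRESHOLD:
    print(f"Distribution skew detected (KS={stat:.3f}); trigger retraining.")
else:
    print("No significant skew detected.")
```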
You are developing a Kubeflow pipeline on Google Kubernetes Engine. The first step in the pipeline is to issue a query against BigQuery. You plan to use the results of that query as the input to the next step in your pipeline. You want to achieve this in the easiest way possible. What should you do?
A. Use the BigQuery console to execute your query, and then save the query results into a new BigQuery table.
B. Write a Python script that uses the BigQuery API to execute queries against BigQuery. Execute this script as the first step in your Kubeflow pipeline.
C. Use the Kubeflow Pipelines domain-specific language to create a custom component that uses the Python BigQuery client library to execute queries.
D. Locate the Kube ow Pipelines repository on GitHub. Find the BigQuery Query Component, copy that component’s URL, and use it to load the component into your pipeline. Use the component to execute queries against BigQuery.
D. Locate the Kubeflow Pipelines repository on GitHub. Find the BigQuery Query Component, copy that component’s URL, and use it to load the component into your pipeline. Use the component to execute queries against BigQuery.
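A sketch of option D with the Kubeflow Pipelines SDK: load the prebuilt BigQuery component by URL and wire it into a pipeline. The component URL, parameter names, project ID, and GCS path below are placeholders; check the component's YAML in the kubeflow/pipelines repository for the exact interface.

```python
import kfp
from kfp import components, dsl

# Placeholder URL; the real component.yaml lives in the kubeflow/pipelines repo.
bigquery_query_op = components.load_component_from_url(
    "https://raw.githubusercontent.com/kubeflow/pipelines/.../bigquery/query/component.yaml"
)


@dsl.pipeline(name="bq-example", description="Run a BigQuery query as step one.")
def pipeline(query: str = "SELECT * FROM `my_dataset.my_table` LIMIT 100"):
    # The query results become the input artifact for the next pipeline step.
    bigquery_query_op(
        query=query,
        project_id="my-project",                        # placeholder
        output_gcs_path="gs://my-bucket/results.csv",   # placeholder
    )


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(pipeline, "pipeline.yaml")
```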
You are building a model to predict daily temperatures. You split the data randomly and then transformed the training and test datasets. Temperature data for model training is uploaded hourly. During testing, your model performed with 97% accuracy; however, after deploying to production, the model’s accuracy dropped to 66%. How can you make your production model more accurate?
A. Normalize the data for the training and test datasets as two separate steps.
B. Split the training and test data based on time rather than a random split to avoid leakage.
C. Add more data to your test set to ensure that you have a fair distribution and sample for testing.
D. Apply data transformations before splitting, and cross-validate to make sure that the transformations are applied to both the training and test sets.
B. Split the training and test data based on time rather than a random split to avoid leakage.
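A minimal pandas sketch of a time-based split, assuming a hypothetical CSV with timestamp and temperature columns: train on everything before a cutoff, test on everything after, and fit any transformations (here, normalization) on the training portion only, so no statistics from the "future" leak into training.

```python
import pandas as pd

# Hypothetical hourly temperature data; file and column names are illustrative.
df = pd.read_csv("temperatures.csv", parse_dates=["timestamp"])
df = df.sort_values("timestamp")

# Time-based split: earliest 80% of the timeline trains, the rest tests.
cutoff = df["timestamp"].quantile(0.8)
train = df[df["timestamp"] <= cutoff].copy()
test = df[df["timestamp"] > cutoff].copy()

# Fit the transformation on the training set only, then apply it to both.
mean, std = train["temperature"].mean(), train["temperature"].std()
train["temperature_norm"] = (train["temperature"] - mean) / std
test["temperature_norm"] = (test["temperature"] - mean) / std
```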
You are developing models to classify customer support emails. You created models with TensorFlow Estimators using small datasets on your on-premises system, but you now need to train the models using large datasets to ensure high performance. You will port your models to Google Cloud and want to minimize code refactoring and infrastructure overhead for easier migration from on-prem to cloud. What should you do?
A. Use AI Platform for distributed training.
B. Create a cluster on Dataproc for training.
C. Create a Managed Instance Group with autoscaling.
D. Use Kubeflow Pipelines to train on a Google Kubernetes Engine cluster.
A. Use AI Platform for distributed training.
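The reason this minimizes refactoring is that Estimator code is already distribution-aware: AI Platform Training sets the TF_CONFIG environment variable on each node, and tf.estimator.train_and_evaluate reads it to coordinate the workers, so roughly the same code runs on-prem and in the cloud. The sketch below uses a synthetic input_fn and a placeholder Cloud Storage path.

```python
import numpy as np
import tensorflow as tf


def make_dataset(split):
    # Placeholder input_fn; in practice, read training files from Cloud Storage.
    x = np.random.rand(256, 10).astype("float32")
    y = np.random.randint(0, 2, 256)
    ds = tf.data.Dataset.from_tensor_slices(({"x": x}, y)).batch(32)
    return ds.repeat() if split == "train" else ds


feature_columns = [tf.feature_column.numeric_column("x", shape=(10,))]
estimator = tf.estimator.DNNClassifier(
    hidden_units=[64, 32],
    feature_columns=feature_columns,
    model_dir="gs://my-bucket/model",   # placeholder bucket
)

train_spec = tf.estimator.TrainSpec(
    input_fn=lambda: make_dataset("train"), max_steps=10_000)
eval_spec = tf.estimator.EvalSpec(input_fn=lambda: make_dataset("eval"))

# On AI Platform this call picks up TF_CONFIG and trains across workers.
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```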
You have trained a text classification model in TensorFlow using AI Platform. You want to use the trained model for batch predictions on text data stored in BigQuery while minimizing computational overhead. What should you do?
A. Export the model to BigQuery ML.
B. Deploy and version the model on AI Platform.
C. Use Dataflow with the SavedModel to read the data from BigQuery.
D. Submit a batch prediction job on AI Platform that points to the model location in Cloud Storage.
A. Export the model to BigQuery ML.
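A sketch of option A using the BigQuery Python client: import the SavedModel into BigQuery ML, then run batch prediction where the data already lives. Dataset, table, bucket, and column names are placeholders, and the input column must match the SavedModel's input signature name.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

# Import the TensorFlow SavedModel into BigQuery ML (model path is a placeholder).
client.query("""
CREATE OR REPLACE MODEL `my_dataset.text_classifier`
OPTIONS (MODEL_TYPE='TENSORFLOW',
         MODEL_PATH='gs://my-bucket/saved_model/*')
""").result()

# Batch prediction runs inside BigQuery, with no serving infrastructure to manage.
results = client.query("""
SELECT *
FROM ML.PREDICT(MODEL `my_dataset.text_classifier`,
                (SELECT text AS input_text FROM `my_dataset.support_emails`))
""").result()

for row in results:
    print(dict(row))
```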
You work with a data engineering team that has developed a pipeline to clean your dataset and save it in a Cloud Storage bucket. You have created an ML model and want to use the data to refresh your model as soon as new data is available. As part of your CI/CD workflow, you want to automatically run a Kubeflow Pipelines training job on Google Kubernetes Engine (GKE). How should you architect this workflow?
A. Configure your pipeline with Dataflow, which saves the files in Cloud Storage. After the file is saved, start the training job on a GKE cluster.
B. Use App Engine to create a lightweight Python client that continuously polls Cloud Storage for new files. As soon as a file arrives, initiate the training job.
C. Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster.
D. Use Cloud Scheduler to schedule jobs at a regular interval. For the first step of the job, check the timestamp of objects in your Cloud Storage bucket. If there are no new files since the last run, abort the job.
C. Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster.
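A sketch of the Pub/Sub-triggered Cloud Function in option C, assuming a placeholder Kubeflow Pipelines endpoint and a pre-compiled pipeline package; it decodes the Cloud Storage notification carried in the Pub/Sub message and kicks off a pipeline run with the new file as an argument.

```python
# main.py for a Pub/Sub-triggered Cloud Function; a minimal sketch with
# placeholder endpoint, package path, and pipeline parameter names.
import base64
import json

import kfp


def start_training(event, context):
    """Triggered by the Pub/Sub message published by the Cloud Storage notification."""
    message = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    new_file = f"gs://{message['bucket']}/{message['name']}"

    client = kfp.Client(host="https://<kfp-endpoint>.pipelines.googleusercontent.com")
    client.create_run_from_pipeline_package(
        pipeline_file="training_pipeline.yaml",   # pre-compiled pipeline (placeholder)
        arguments={"input_path": new_file},       # hypothetical pipeline parameter
    )
```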
You have a functioning end-to-end ML pipeline that involves tuning the hyperparameters of your ML model using AI Platform, and then using the best-tuned parameters for training. Hypertuning is taking longer than expected and is delaying the downstream processes. You want to speed up the tuning job without significantly compromising its effectiveness. Which actions should you take? (Choose two.)
A. Decrease the number of parallel trials.
B. Decrease the range of floating-point values.
C. Set the early stopping parameter to TRUE.
D. Change the search algorithm from Bayesian search to random search.
E. Decrease the maximum number of trials during subsequent training phases.
C. Set the early stopping parameter to TRUE.
E. Decrease the maximum number of trials during subsequent training phases.
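Both chosen actions map to fields in the AI Platform hyperparameter spec: enableTrialEarlyStopping abandons clearly unpromising trials (option C) and maxTrials caps the total number of trials (option E). The job spec below is a sketch submitted through the Google API client; project, bucket, package, and parameter names are placeholders.

```python
from googleapiclient import discovery

# Placeholder training job spec with early stopping and a bounded trial count.
training_input = {
    "scaleTier": "STANDARD_1",
    "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],
    "pythonModule": "trainer.task",
    "region": "us-central1",
    "hyperparameters": {
        "goal": "MAXIMIZE",
        "hyperparameterMetricTag": "accuracy",
        "maxTrials": 20,                    # cap trials in later phases (option E)
        "maxParallelTrials": 5,
        "enableTrialEarlyStopping": True,   # stop unpromising trials early (option C)
        "params": [{
            "parameterName": "learning_rate",
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",
        }],
    },
}

ml = discovery.build("ml", "v1")
ml.projects().jobs().create(
    parent="projects/my-project",
    body={"jobId": "hptuning_job_1", "trainingInput": training_input},
).execute()
```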