MLE Flashcards
What are the two types of quantization in TensorFlow?
Post Training Quantization and Quantization Aware Training.
What is Post Training Quantization?
Post-training quantization is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, with little degradation in model accuracy. You can quantize an already-trained float TensorFlow model when you convert it to TensorFlow Lite format using the TensorFlow Lite Converter. It reduces the model size through integer casting, reduced float precision, or dynamic range quantization.
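The integer casting idea can be sketched in plain Python. This is only an illustration of affine int8 quantization, not the TensorFlow Lite Converter's implementation; the function names and the sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Map float weights to int8 values using a scale and a zero point."""
    lo, hi = min(weights), max(weights)
    # Include 0.0 in the range so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in quantized]


weights = [-1.2, -0.3, 0.0, 0.45, 2.1]  # hypothetical float32 weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now fits in one byte instead of four, and the rounding error per weight stays within half a quantization step.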
What is Quantization Aware Training?
Training that simulates reduced bit precision for the weights of a neural network while it trains, so the model learns to tolerate quantization and takes up less space afterwards. It is more accurate than post-training quantization, but it is harder to use and requires retraining.
What are the four available options for model training on Vertex AI?
1.) AutoML
2.) Custom Training
3.) Model Garden
4.) Generative AI
What is AutoML? When is it recommended?
AutoML is a no-code solution for training models on tabular, image, text, or video data without preparing data splits. It is best for:
- Automatically tuning a model with some input data.
- Teams that have little to no coding experience.
- Teams that want to quickly get a model running.
- Teams that do not want control of hyperparameter tuning aside from early stopping.
- Teams that are solving a problem in the defined problem types offered.
- Models served on an edge device or on Google Cloud.
- Models with latency greater than 100ms.
What is BigQueryML? When is it recommended?
BigQueryML is Google’s built-in set of SQL commands for creating ML models directly in BigQuery (BQ). It is recommended for:
- Those comfortable in SQL.
- Those with data already in BQ.
- Those whose problems are covered by BQ’s model set.
What is custom training? When is it recommended?
Custom training is complete freedom to optimize all aspects of an ML pipeline. It is recommended for:
- Problems outside the scope of BQML and AutoML.
- Problems whose solutions are already written in code elsewhere, such as on premises.
What are the three custom training methods on Vertex AI?
1.) Custom Jobs
2.) Hyperparameter tuning jobs
3.) Training pipelines
What are custom training custom jobs?
A basic way to run a custom machine learning model on Vertex AI. It needs a pre-built or custom container to run in.
What are custom training hyperparameter tuning jobs?
This runs multiple trials of custom jobs to tune hyperparameters. It requires a metric to evaluate performance against, a maximum number of trials to perform, a maximum number of parallel trials, the maximum number of trials that can fail, the machine type and any accelerators (GPUs/TPUs) it uses, and the custom or pre-built container it runs in.
What is a training pipeline in custom training?
A training pipeline can run a custom job or hyperparameter tuning job and output your model to a Google Cloud Storage bucket.
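How the trial limits interact can be sketched with a small local loop. This is an illustration only: it uses random search and threads in place of Vertex AI's service, and `train_trial`, `run_tuning`, and the learning-rate range are hypothetical names and values.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def train_trial(params):
    """Hypothetical stand-in for one custom job; returns the metric to maximize."""
    if params["learning_rate"] > 0.09:
        raise RuntimeError("training diverged")  # simulate a failed trial
    return 1.0 - abs(params["learning_rate"] - 0.01)


def run_tuning(max_trials, parallel_trials, max_failed_trials, seed=0):
    rng = random.Random(seed)
    best, launched, failed = None, 0, 0
    with ThreadPoolExecutor(max_workers=parallel_trials) as pool:
        # Launch trials in batches until the budget or the failure limit is hit.
        while launched < max_trials and failed < max_failed_trials:
            batch_size = min(parallel_trials, max_trials - launched)
            batch = [{"learning_rate": rng.uniform(0.0001, 0.1)} for _ in range(batch_size)]
            launched += batch_size
            futures = [pool.submit(train_trial, params) for params in batch]
            for params, future in zip(batch, futures):
                try:
                    metric = future.result()
                except RuntimeError:
                    failed += 1
                    continue
                if best is None or metric > best[0]:
                    best = (metric, params)
    return best, launched, failed


best, launched, failed = run_tuning(max_trials=8, parallel_trials=2, max_failed_trials=3)
```

More parallel trials finish sooner, but each batch is chosen without seeing the results of the trials running alongside it.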
What frameworks have pre-built containers for training?
TensorFlow, XGBoost, Scikit-Learn, PyTorch
What is Model Garden?
Model Garden in the Google Cloud console is an ML model library that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. Many of these are pretrained and allow fine tuning or transfer learning to adapt them to a related problem.
What is fine tuning and transfer learning? When should one be used over the other?
Transfer learning is the process of reusing a pre-trained model and retraining only its final layers. Fine tuning is an extension of transfer learning that also allows the weights of the pre-trained layers to be retrained. Fine tuning is recommended for larger datasets, while transfer learning is recommended for smaller ones, because fine tuning is more likely to overfit on a small dataset.
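The difference comes down to which layers are allowed to update. A minimal sketch, using a made-up `Layer` record in place of any real framework's layer class:

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    trainable: bool = True


def prepare_transfer_learning(layers, num_head_layers=1):
    """Freeze every layer except the last num_head_layers layers."""
    cutoff = len(layers) - num_head_layers
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return layers


def prepare_fine_tuning(layers):
    """Unfreeze all layers so the pre-trained weights can also be updated."""
    for layer in layers:
        layer.trainable = True
    return layers


# Hypothetical three-layer model.
model = [Layer("conv1"), Layer("conv2"), Layer("dense_head")]
transfer_flags = [layer.trainable for layer in prepare_transfer_learning(model)]
fine_tune_flags = [layer.trainable for layer in prepare_fine_tuning(model)]
```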
What is the AutoML workflow?
1.) Prepare your training data.
2.) Create a dataset.
3.) Train a model.
4.) Evaluate and iterate on your model.
5.) Get predictions from your model.
6.) Interpret prediction results.
What are best practices for ML Environment Setup in custom training?
1.) Use Vertex AI Workbench notebooks for experimentation and development.
2.) Create notebook instances for each team member.
3.) Store ML resources and artifacts the same way as datasets, with access controlled by IAM permissions.
4.) Use Vertex AI SDK for Python.
What are best practices for ML Development in custom training?
1.) Store structured and semi-structured data in BigQuery.
2.) Store image, video, audio and unstructured data on Cloud Storage.
3.) Use Vertex AI Data Labeling for unstructured data.
4.) Use Vertex AI Feature Store with structured data.
5.) Avoid storing data in block storage.
6.) Use Vertex AI TensorBoard and Vertex AI Experiments for analyzing experiments.
7.) Train a model within a notebook instance for small datasets.
8.) Maximize your model’s predictive accuracy with hyperparameter tuning.
9.) Use feature attributions (importances) to gain insights into model predictions.
What are best practices for Data Processing in custom training?
1.) Use BigQuery to process structured and semi-structured data or if data is in BQ already.
2.) Use Dataflow to process data.
3.) Use Dataproc for serverless Spark data processing.
What are best practices for operationalized training in custom training?
1.) Run code in a managed service like Vertex AI Training (container-based solutions with a task.py file) or Vertex AI Pipelines.
2.) Operationalize job execution with training pipelines.
3.) Use training checkpoints to save the current state of your experiment.
4.) Prepare model artifacts for serving in Cloud Storage.
5.) Regularly compute new feature values and push them to feature store.
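The checkpoint practice above can be sketched as a loop that saves its state after every step and resumes from the saved state if one exists. The JSON file, the `train` function, and the fake weight update are assumptions for illustration; real jobs would use their framework's checkpoint format and typically write to Cloud Storage.

```python
import json
import os
import tempfile


def train(total_steps, checkpoint_path, stop_after=None):
    """Run a toy training loop that resumes from a checkpoint when one exists."""
    state = {"step": 0, "weight": 0.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    while state["step"] < total_steps:
        state["weight"] += 0.5  # stand-in for a real parameter update
        state["step"] += 1
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)
        if stop_after is not None and state["step"] >= stop_after:
            break  # simulate the job being interrupted
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    interrupted = train(10, path, stop_after=4)
    resumed = train(10, path)  # picks up at step 4 instead of starting over
```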
What is operationalized training?
Operationalized training refers to the process of making model training repeatable, tracking repetitions, and managing performance.
What is Dataproc?
A managed Apache Spark/Hadoop service that allows batch processing, querying, streaming and ML.
What is Dataflow?
Dataflow is a serverless service built on Apache Beam for setting up automated data processing pipelines. It can be used with TFX and Kubeflow Pipelines as they have integrated Dataflow runners. Since Vertex AI Pipelines supports both, it can also be used there.
What is Vertex AI TensorBoard?
A tool for measuring and visualizing aspects of a TF ML workflow.
What is a Vertex AI Managed Dataset?
Vertex AI offers a central repo for datasets which can be used for AutoML and custom models on Vertex AI. It accepts image, tabular, text and video data.
What file formats should be used for model artifacts from a Vertex AI pre-built container?
1.) TensorFlow: saved_model.pb
2.) Scikit-Learn: model.joblib or model.pkl
3.) XGBoost: model.bst
4.) PyTorch: model.mar (a Torch model archive that packages saved weights such as model.pth)
What are best practices for model deployment and serving in custom training?
1.) Specify the number and type of machines you need.
2.) Plan inputs to the model using batch or online serving techniques.
3.) Turn on autoscaling by defining the minimum and maximum number of nodes, with a bare minimum of 2 nodes.
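The effect of the minimum and maximum node settings can be sketched as a clamp around a load-proportional target. This is not Vertex AI's actual autoscaling algorithm; the function name, the 60% utilization target, and the node limits are assumptions for illustration.

```python
def target_node_count(current_nodes, utilization_pct, target_pct=60, min_nodes=2, max_nodes=10):
    """Scale the node count in proportion to load, then clamp it to the configured bounds."""
    # Ceiling division on integers avoids floating-point rounding surprises.
    desired = -(-(current_nodes * utilization_pct) // target_pct)
    return max(min_nodes, min(max_nodes, desired))


scale_up = target_node_count(2, 90)      # load above target: add a node
scale_floor = target_node_count(2, 10)   # load is low, but never drop below min_nodes
scale_cap = target_node_count(8, 100)    # load is high, but never exceed max_nodes
```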
What is batch prediction?
Prediction on batches of data brought in at a regular interval. Requests are asynchronous and are made directly against the model rather than an endpoint. It requires an input source and an output location in either GCS or BigQuery. It can also be done by reading batch features with the Feature Store API, but this would be slower because the features would first need to be ingested.
What are ML workflow orchestration best practices?
1.) Use Vertex AI Pipelines for running DAGs created with Kubeflow Pipelines and TFX.
2.) Use Kubeflow pipelines to author your pipelines.
What are recommended Artifact Organization best practices?
1.) Organize ML artifacts.
2.) Use version control for pipeline and custom component code.
What artifacts should be stored in the source control repo?
- Vertex AI Workbench notebooks
- Pipeline source code
- Preprocessing functions
- Model source code
- Model training packages
- Serving functions
What is an Artifact?
An artifact is the output resulting from each step of an ML workflow.
What artifacts should be stored in Experiments and ML Metadata?
- Experiments
- Hyperparameters
- Metaparameters
- Metrics
- Data Artifacts
- Model Artifacts
- Pipeline Metadata
What artifacts should be stored in Vertex AI Model Registry?
Trained models from AutoML, custom training or BigQueryML. They can be versioned.
What artifacts should be stored in the Artifact Registry?
- Pipeline containers
- Custom training environments
- Custom prediction environments
What artifacts should be stored in Vertex AI Prediction?
Deployed models
What are best practices for model monitoring?
1.) Use drift and skew detection at an endpoint. It uses TFDV under the hood to determine data drift and skew.
2.) Fine tune alert thresholds.
3.) Use feature attributions as an early warning sign for data drift or skew through Vertex Explainable AI.
What is data skew?
The degree of distortion between your training data and production data.
What is data drift?
The change in production data over time, which shifts the underlying statistical distribution of the inputs and the target.
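Both skew and drift come down to comparing two distributions of the same feature: training versus serving for skew, older versus newer serving data for drift. A minimal sketch for a categorical feature, using the L-infinity distance (the largest gap in category frequency); the sample data and the 0.1 alert threshold are made up, and both samples are assumed to be non-empty.

```python
from collections import Counter


def l_infinity_distance(baseline_values, current_values):
    """Largest absolute difference in category frequency between two samples."""
    baseline, current = Counter(baseline_values), Counter(current_values)
    n_baseline, n_current = len(baseline_values), len(current_values)
    categories = set(baseline) | set(current)
    return max(abs(baseline[c] / n_baseline - current[c] / n_current) for c in categories)


training = ["red"] * 50 + ["blue"] * 50  # hypothetical training sample
serving = ["red"] * 80 + ["blue"] * 20   # hypothetical production sample
distance = l_infinity_distance(training, serving)
alert = distance > 0.1  # hypothetical alert threshold
```

Numerical features need a different measure, such as a divergence computed over histogram buckets.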
What is an online prediction?
A synchronous request made to a model endpoint for serving predictions with low latency and/or streaming data.
What are the guidelines for experimentation?
1.) Have fixed thresholds for optimizing metrics and satisficing metrics like latency and model size.
2.) Implement an evaluation routine that is model-agnostic.
3.) Ensure you have a baseline model to compare against.
4.) Track every experiment and incremental improvement.
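The split between optimizing and satisficing metrics can be sketched as a filter followed by a maximum: satisficing metrics only need to clear a fixed threshold, and the optimizing metric picks the winner among whatever passes. The candidate models and thresholds below are made up.

```python
def pick_model(candidates, max_latency_ms, max_size_mb):
    """Return the most accurate candidate that meets the latency and size thresholds."""
    eligible = [
        c for c in candidates
        if c["latency_ms"] <= max_latency_ms and c["size_mb"] <= max_size_mb
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c["accuracy"])


# Hypothetical experiment results.
candidates = [
    {"name": "large", "accuracy": 0.95, "latency_ms": 180, "size_mb": 400},
    {"name": "medium", "accuracy": 0.93, "latency_ms": 80, "size_mb": 120},
    {"name": "small", "accuracy": 0.89, "latency_ms": 20, "size_mb": 15},
]
best = pick_model(candidates, max_latency_ms=100, max_size_mb=200)
```

Here the most accurate model loses because it misses the latency threshold.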
What are the guidelines for data quality?
1.) Address class imbalance early.
2.) Automate data preprocessing.
3.) Prevent data leakage with a test-train split that isolates test data from the tuning process.
4.) Generate a data schema that includes feature statistics.
5.) Ensure training data is properly shuffled in batches.
6.) Use a validation set for model/hyperparameter tuning.
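A shuffled three-way split covers several of the points above: the test set is held out from tuning, the validation set is what tuning uses, and the shuffle avoids ordered batches. A minimal sketch with made-up fractions and a fixed seed for repeatability:

```python
import random


def split_dataset(rows, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle the rows, then carve out disjoint test, validation, and training sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    n_val = int(len(rows) * val_fraction)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
```

This random split is not suitable for time series, where the split should follow time order instead.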
What are the guidelines for model quality?
1.) For DNNs, monitor for NaN values in the loss and the percentage of weights that are zero, as these can indicate errors or vanishing/exploding gradients.
2.) Use validation and test data to check for overfitting/underfitting.
3.) Analyze misclassified instances to check for mislabeling, outliers or pre-processing that is needed.
4.) Analyze feature importance and remove those that have little importance.
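The first check in the list can be sketched as a small health report. This assumes that "percentage of weights" refers to the share of weights that are zero; the function name, the tolerance, and the sample values are made up.

```python
import math


def training_health(loss_history, weights, zero_tolerance=1e-12):
    """Report the steps with a NaN or infinite loss and the fraction of zero weights."""
    bad_steps = [
        step for step, loss in enumerate(loss_history)
        if math.isnan(loss) or math.isinf(loss)
    ]
    zero_count = sum(1 for w in weights if abs(w) <= zero_tolerance)
    return {"bad_loss_steps": bad_steps, "zero_weight_fraction": zero_count / len(weights)}


# Hypothetical loss curve and flattened weights.
report = training_health([0.9, 0.5, float("nan")], [0.0, 0.2, -0.1, 0.0])
```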
What are the guidelines for data validation?
1.) Verify features match the expected schema.
2.) Verify data is in expected ranges and distributions.
3.) Validate the maximum fraction of missing values.
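The three checks can be sketched as one function over a batch of rows. The schema layout (`type`, `min`, `max`, `max_missing_fraction`) is a made-up format for illustration, not the schema of any particular validation library.

```python
def validate(rows, schema):
    """Return a list of problems found; an empty list means the batch passed."""
    problems = []
    for name, rule in schema.items():
        values = [row.get(name) for row in rows]
        missing = sum(v is None for v in values)
        if missing / len(rows) > rule["max_missing_fraction"]:
            problems.append(f"{name}: too many missing values")
        for v in values:
            if v is None:
                continue
            if not isinstance(v, rule["type"]):
                problems.append(f"{name}: unexpected type {type(v).__name__}")
                break
            if not rule["min"] <= v <= rule["max"]:
                problems.append(f"{name}: value {v} out of range")
                break
    return problems


schema = {"age": {"type": int, "min": 0, "max": 120, "max_missing_fraction": 0.25}}
good_batch = [{"age": 31}, {"age": 58}, {"age": None}, {"age": 44}]
bad_batch = [{"age": 31}, {"age": 400}, {"age": None}, {"age": None}]
good_problems = validate(good_batch, schema)
bad_problems = validate(bad_batch, schema)
```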
What are the guidelines for model validation?
1.) Validate the model on unseen test data.
2.) Ensure the test data is representative of the full dataset and that, for time series, the test data is more recent than the training data.
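For time series, keeping the test data more recent than the training data means splitting on a cutoff date rather than at random. A minimal sketch with made-up monthly rows:

```python
from datetime import date


def time_based_split(rows, cutoff):
    """Train on rows before the cutoff date and test on rows on or after it."""
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test


# Hypothetical monthly observations for one year.
rows = [{"date": date(2024, month, 1), "sales": 100 + month} for month in range(1, 13)]
train, test = time_based_split(rows, cutoff=date(2024, 10, 1))
```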
What are guidelines for model deployment?
1.) Verify the model can be called.
2.) Validate satisficing requirements.
3.) Unit test the model for edge cases and typical cases.
4.) Test in a staging environment where you can roll back to a previous version if needed.
5.) Use A/B or multi-armed bandit testing before fully rolling out a new model.
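Multi-armed bandit testing can be sketched with an epsilon-greedy router: most requests go to the model with the best observed reward so far, and a small share is spent exploring the alternatives. The class, the model names, and the simulated success rates are assumptions for illustration; epsilon-greedy is only one of several bandit strategies.

```python
import random


class EpsilonGreedyRouter:
    """Route requests between models, favoring the best observed mean reward."""

    def __init__(self, model_names, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {name: 0 for name in model_names}
        self.rewards = {name: 0.0 for name in model_names}

    def _mean_reward(self, name):
        return self.rewards[name] / self.counts[name] if self.counts[name] else 0.0

    def choose(self):
        # Explore with probability epsilon, or when nothing has been tried yet.
        if self.rng.random() < self.epsilon or not any(self.counts.values()):
            return self.rng.choice(list(self.counts))
        return max(self.counts, key=self._mean_reward)

    def record(self, name, reward):
        self.counts[name] += 1
        self.rewards[name] += reward


router = EpsilonGreedyRouter(["current", "candidate"])
success_rate = {"current": 0.5, "candidate": 0.8}  # hypothetical ground truth
simulator = random.Random(1)
for _ in range(2000):
    name = router.choose()
    router.record(name, 1.0 if simulator.random() < success_rate[name] else 0.0)
```

Unlike a fixed A/B split, the traffic share shifts toward the better model while the test is still running.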
What are guidelines for model serving?
1.) Regularly profile request data for tracking data drift or skew and set alerts for skew/drift thresholds.
2.) Identify concept drift by checking how feature importance changes over time.
3.) Determine outliers with respect to the training data.
4.) Perform continuous evaluation where true labels are available.
5.) Monitor service efficiency.
6.) Monitor predictive performance.
What are the 3 continuous parts of MLOps?
1.) Integration (CI): Testing and validating code, components, data, schemas and models.
2.) Delivery (CD): Deployment of an end-to-end pipeline that automatically pushes to a prediction service.
3.) Training (CT): Models are automatically retrained and served as they improve.
What is MLOps maturity level 0?
Completely manual process that is script driven and experimental. It has no CI, CD or CT.
What is MLOps maturity level 1?
A step up from level 0 that adds CT by automating the ML pipeline. It needs automated data and model validation.
What is MLOps maturity level 2?
It adds CI/CD on top of CT for rapid, automated pipeline experimentation and integration. It requires source control, test/build services, deployment services, a model registry, a feature store, an ML metadata store and ML pipeline orchestration, all of which are automated.
What is Vertex AI Pipelines?
Vertex AI Pipelines is a managed service for MLOps. It supports both the Kubeflow Pipelines and TensorFlow Extended frameworks, and it manages the compute cluster for you in a containerized environment.
When should TFX be used?
When the pipeline you are creating runs TensorFlow code.
When should Kubeflow Pipelines be used?
When not using TensorFlow code, or when off-premises/multicloud solutions are needed.
How can BigQueryML serve models?
BQML can natively serve batch predictions, but by integrating with Vertex AI a model can be deployed to an endpoint through Model Registry and perform online prediction. This deployment does not work for ARIMA+ or XGBoost models.
What is Vertex AI Vizier?
A black-box optimization service that helps tune hyperparameters. It is used when the objective/loss function is unknown or too costly to evaluate. By default it uses Bayesian optimization, but it can also use grid search, random search or an unspecified mode that chooses a best solution.
What is Neural Architecture Search?
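Grid search, the simplest of the strategies named above, treats the objective as a black box and evaluates every combination of candidate values. This is a local sketch, not the Vizier API; the objective and the grid are made up.

```python
import itertools


def grid_search(objective, grid):
    """Evaluate every combination in the grid and return the lowest (score, params)."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if best is None or score < best[0]:
            best = (score, params)
    return best


def objective(params):
    # Hypothetical stand-in for an expensive training run that returns a loss.
    return (params["learning_rate"] - 0.01) ** 2 + (params["depth"] - 4) ** 2


grid = {"learning_rate": [0.001, 0.01, 0.1], "depth": [2, 4, 8]}
best_score, best_params = grid_search(objective, grid)
```

The number of evaluations grows multiplicatively with each added hyperparameter, which is why model-based strategies such as Bayesian optimization are preferred when each evaluation is costly.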
The process by which AutoML searches for and finds the best model architecture for a given problem and tunes its parameters.