W4 - Machine Learning Modeling Pipelines in Production Flashcards
At a high level, there are two main ways to analyze the performance of your model: ____ and ____.
Black-box evaluation,
Model introspection
What’s the difference between black-box evaluation and model introspection for performance analysis?
In black-box evaluation, you generally don’t consider the internal structure of the model; you’re just interested in quantifying its performance through metrics and losses. This is often sufficient during the normal course of development. But if you’re interested in how the model works internally, perhaps to find ways to improve it, you can apply various model introspection methods.
____ is an example of a tool for black-box evaluation.
TensorBoard
Top-level metrics can hide problems with specific parts of the data. True/False
True
What does TFMA do?
TFMA (TensorFlow Model Analysis) is an open-source, scalable framework that includes built-in capabilities to
* check model quality standards,
* visualize evaluation metrics,
* and inspect performance on different slices of data.
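As a minimal sketch, here is how you might inspect a finished TFMA run in a notebook; the output path and slicing column are hypothetical:

```python
import tensorflow_model_analysis as tfma

# Load the results of a completed TFMA run (path is hypothetical).
eval_result = tfma.load_eval_result(output_path='tfma_output')

# Visualize the evaluation metrics, broken down by a slice feature.
tfma.view.render_slicing_metrics(eval_result, slicing_column='country')
```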
What’s the difference between TensorBoard and TFMA?
TensorBoard is used for analyzing the training process, while TFMA is used for deep analysis of the finished, trained model.
To use TFMA, you don’t need to train your model and generate a saved model object. True/False
False, you do need to train the model and save it first.
If you train your model in a TFX pipeline, then the pre-processing and feature engineering done during training will already be included in your saved model. However, if you did not train in TFX, you will need to apply the same pre-processing manually.
To use TFMA, you need to create an eval_config object that encapsulates its requirements. This includes defining the ____ and the ____.
slices of your dataset,
metrics that you want to use for analyzing your model
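The eval_config is usually written as a protobuf. A minimal sketch of defining it and running the analysis, where the label key, slice feature, model path, and data location are all hypothetical:

```python
import tensorflow_model_analysis as tfma
from google.protobuf import text_format

# Define the slices and metrics TFMA should compute.
eval_config = text_format.Parse("""
  model_specs { label_key: "label" }
  slicing_specs {}                             # the overall (unsliced) dataset
  slicing_specs { feature_keys: ["country"] }  # per-country slices
  metrics_specs {
    metrics { class_name: "BinaryAccuracy" }
    metrics { class_name: "AUC" }
  }
""", tfma.EvalConfig())

eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path='serving_model_dir',  # trained, saved model
    eval_config=eval_config)

eval_result = tfma.run_model_analysis(
    eval_shared_model=eval_shared_model,
    eval_config=eval_config,
    data_location='eval_data.tfrecord',  # evaluation examples
    output_path='tfma_output')
```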
What’s model robustness?
Model robustness refers to the ability of a model to consistently provide accurate results even when one or more of its input features change drastically.
Robustness cannot be measured during training, and a separate validation or test set should be used for measuring robustness. True/False
True
What’s sensitivity analysis?
Sensitivity analysis involves examining the impact of individual features on a machine learning model’s prediction by changing a single feature’s value while holding the others constant.
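A minimal sketch of single-feature sensitivity analysis, assuming a scikit-learn-style model with a predict method; the model, baseline, and feature index are hypothetical:

```python
import numpy as np

def sensitivity_analysis(model, baseline, feature_idx, values):
    """Vary one feature across `values` while holding all other
    features fixed at the baseline, and return the predictions."""
    rows = np.tile(baseline, (len(values), 1))
    rows[:, feature_idx] = values
    return model.predict(rows)

# Example usage: sweep feature 2 over [0, 1] around the mean input.
# predictions = sensitivity_analysis(model, X_train.mean(axis=0),
#                                    feature_idx=2,
#                                    values=np.linspace(0.0, 1.0, 20))
```

A large swing in the predictions signals that the model is highly sensitive to that feature.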
What are adversarial examples?
Adversarial examples are inputs deliberately crafted to confuse a neural network into misclassifying them, simulating an adversarial attack.
A typical attack adds noise to an image that is imperceptible to the human eye but causes the model to misclassify the image.
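A minimal sketch of the Fast Gradient Sign Method (FGSM), one common way to generate such examples, assuming a Keras image classifier with pixel values in [0, 1]:

```python
import tensorflow as tf

def fgsm_example(model, image, label, epsilon=0.01):
    """Perturb a batched `image` tensor by epsilon in the direction
    that increases the loss; the result often looks identical to a
    human but is misclassified by the model."""
    image = tf.convert_to_tensor(image)
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, prediction)
    gradient = tape.gradient(loss, image)
    adversarial = image + epsilon * tf.sign(gradient)
    return tf.clip_by_value(adversarial, 0.0, 1.0)
```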
What are some ways to improve model robustness?
Ways to improve model robustness:
* Ensure the training data accurately mirrors the requests the model will receive in production
* Use data augmentation techniques to help the model generalize and to correct for unbalanced data
* Understand the inner workings of the model to improve interpretability and robustness
* Use model editing to tweak the model for better performance and robustness
* Apply model assertions to ensure results meet business rules and sanity checks (see the sketch after this list)
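As a sketch, a model assertion is simply a runtime check on the model's outputs before they are used downstream; the specific rules below are hypothetical:

```python
def assert_valid_prediction(features, probs):
    """Sanity-check a prediction before it is acted on."""
    # Probabilities must form a valid distribution.
    assert all(0.0 <= p <= 1.0 for p in probs), "probability outside [0, 1]"
    assert abs(sum(probs) - 1.0) < 1e-6, "probabilities do not sum to 1"
    # Hypothetical business rule: never act on a negative price.
    if "price" in features:
        assert features["price"] >= 0, "price must be non-negative"
```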
What’s model remediation?
Model remediation refers to techniques used to improve the robustness, performance, and fairness of machine learning models. This includes methods such as data augmentation, model interpretation, model editing, model assertions, bias reduction techniques, and constant monitoring of models to ensure they remain accurate, fair, and secure. The goal of model remediation is to make sure that machine learning models are reliable and useful in real-world applications, and that they don’t perpetuate or exacerbate biases and discrimination.
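One concrete remediation technique is MinDiff from the TensorFlow Model Remediation library, which penalizes differences in the model's score distributions between two groups of examples. A minimal sketch, assuming you already have a compiled Keras model and tf.data datasets for the original training data and for each group:

```python
from tensorflow_model_remediation import min_diff

# Wrap the original model with a MinDiff loss that penalizes score
# distribution differences between the two groups during training.
min_diff_model = min_diff.keras.MinDiffModel(
    original_model=model,                 # hypothetical trained Keras model
    loss=min_diff.losses.MMDLoss(),
    loss_weight=1.0)
min_diff_model.compile(optimizer='adam', loss='binary_crossentropy')

# Pack the original training data together with the two group datasets.
train_data = min_diff.keras.utils.pack_min_diff_data(
    original_dataset=original_train_ds,          # hypothetical tf.data.Dataset
    sensitive_group_dataset=sensitive_ds,        # examples from the sensitive group
    nonsensitive_group_dataset=nonsensitive_ds)  # examples from everyone else
min_diff_model.fit(train_data, epochs=3)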
It’s important to identify slices of data that are sensitive to fairness and measure model performance on those slices to avoid hiding fairness problems with particular groups of people. True/False
True
What are some best practices for evaluating model fairness?
* Conduct fairness tests on all available slices of data.
* Evaluate fairness metrics across multiple classification thresholds.
* Consider reporting the rate at which the label is predicted for predictions that lack a good margin of separation from the decision boundary.
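In TFMA, evaluating across multiple thresholds per slice is commonly done with the Fairness Indicators metric. A minimal sketch, where the label key and slice feature are hypothetical:

```python
import tensorflow_model_analysis as tfma
from google.protobuf import text_format

# Compute fairness metrics at several thresholds, per slice.
eval_config = text_format.Parse("""
  model_specs { label_key: "label" }
  metrics_specs {
    metrics {
      class_name: "FairnessIndicators"
      config: '{"thresholds": [0.25, 0.5, 0.75]}'
    }
  }
  slicing_specs {}                            # overall
  slicing_specs { feature_keys: ["gender"] }  # hypothetical sensitive slice
""", tfma.EvalConfig())
```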