MLOps Question Bank Flashcards
(45 cards)
How many row-level predictions are saved each day?
Up to 2.4M. We save the first 100k scored records per hour (100k x 24 hours = 2.4M per day).
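To see where the 2.4M figure comes from, here is a minimal sketch of the hourly cap. The names HOURLY_CAP and rows_saved_per_day are made up for illustration and are not part of any DataRobot API; the only fact taken from the card is the 100k-per-hour limit.

```python
# Hypothetical illustration of the per-hour storage cap described in the card.
HOURLY_CAP = 100_000  # first 100k scored records per hour are saved


def rows_saved_per_day(hourly_scored_counts):
    """Return how many row-level predictions are saved for one day.

    hourly_scored_counts: iterable of 24 integers, records scored in each hour.
    """
    return sum(min(count, HOURLY_CAP) for count in hourly_scored_counts)


# A deployment that scores 150k records every hour still saves only 100k of them.
busy_day = rows_saved_per_day([150_000] * 24)
quiet_day = rows_saved_per_day([10_000] * 24)
```

A busy deployment tops out at the daily maximum, while a quiet one saves everything it scores.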
How does DataRobot calculate data drift?
The data drift plot defaults to population stability index (PSI) scores. You can also get KL divergence, Hellinger divergence, Jensen–Shannon divergence, and histogram intersection scores through the Python client.
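For intuition, here is a minimal sketch of the textbook PSI formula, sum over bins of (q - p) * ln(q / p), where p is the baseline proportion and q is the scoring proportion in each bin. The card does not say how DataRobot bins features or handles empty bins, so the function name psi, the pre-binned inputs, and the eps floor are all assumptions made for illustration.

```python
import math


def psi(baseline_counts, scoring_counts, eps=1e-6):
    """Textbook population stability index over pre-binned counts.

    Both lists must describe the same bins, in the same order.
    eps is an assumed floor that avoids log(0) when a bin is empty.
    """
    if len(baseline_counts) != len(scoring_counts):
        raise ValueError("both histograms must use the same bins")
    baseline_total = sum(baseline_counts)
    scoring_total = sum(scoring_counts)
    score = 0.0
    for b, s in zip(baseline_counts, scoring_counts):
        # Convert counts to proportions before comparing the two distributions.
        p = max(b / baseline_total, eps)
        q = max(s / scoring_total, eps)
        score += (q - p) * math.log(q / p)
    return score


no_drift = psi([10, 20, 30], [10, 20, 30])
some_drift = psi([50, 50], [90, 10])
```

Identical histograms score zero, and the score grows as the scoring distribution moves away from the baseline.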
What types of features do we calculate data drift for?
The Feature Drift vs. Feature Importance chart monitors the 10 most impactful numerical or categorical features in your data. This chart excludes any text, percentage, or currency features, which means that you can have fewer than 10 features plotted.
How do you change the cutoff for accuracy drift?
You can change the cutoff in the Monitoring tab of the Settings section inside a deployment.
How does the AssociationID work? What happens when you upload an actual with a duplicate AssociationID?
The actuals payload must contain the column names associationId and actualValue. Use the optional column wasActedOn to indicate whether the prediction was acted on in a way that could have affected the actual outcome. If you submit multiple actuals with the same association ID value, either in the same or a subsequent request, DataRobot uses the latest actuals value.
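The "latest value wins" rule can be sketched as follows. This is an illustration of the behavior described in the card, not DataRobot's implementation; resolve_actuals is a hypothetical helper, and only the column names associationId, actualValue, and wasActedOn come from the card.

```python
def resolve_actuals(rows):
    """Keep the most recently submitted actual for each associationId.

    rows: list of dicts in submission order. Each row needs associationId
    and actualValue; wasActedOn is optional.
    """
    latest = {}
    for row in rows:
        if "associationId" not in row or "actualValue" not in row:
            raise ValueError("each row needs associationId and actualValue")
        # A later row with the same ID overwrites the earlier one.
        latest[row["associationId"]] = row
    return latest


rows = [
    {"associationId": "a1", "actualValue": 0},
    {"associationId": "a2", "actualValue": 1, "wasActedOn": True},
    {"associationId": "a1", "actualValue": 1},  # duplicate ID, submitted later
]
resolved = resolve_actuals(rows)
```

Here the second submission for a1 replaces the first, so a1 ends up with an actual value of 1.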
At what granularity can we see drift and service health? Hourly, daily, weekly, monthly?
At what granularity can we see drift and service health? Hourly, daily, weekly, monthly?
What roles are associated with deployments? How are they different from roles associated with projects?
User/Admin/Consumer? Need to look up.
Do we support any language?
We support models built in most languages. Two notable exceptions in 5.2 are SAS and DataRobot models (however, you can use Agents to monitor codegen).
Do we support any modeling type (e.g., regression, classification, multiclass)?
We currently support only regression and binary classification. We plan to relax these limitations in future releases.
What is the maximum number of environments we can support?
10
Can you adjust the threshold for drift?
Yes - MLOps Agent. The MLOps agent can report predictions data and metrics for all external deployments, whether fully connected, intermittently connected, or completely disconnected from MLOps. If you have deployments running in isolated environments and disconnected from the network, for example, you can provide their data to the agent, and then view and manage them from MLOps.
What is the lowest resolution that you can display Data Drift and Accuracy?
Daily
Do we offer tracking of externally deployed models?
Yes - MLOps Agent. The MLOps agent can report predictions data and metrics for all external deployments, whether fully connected, intermittently connected, or completely disconnected from MLOps. If you have deployments running in isolated environments and disconnected from the network, for example, you can provide their data to the agent, and then view and manage them from MLOps.
What are the levels of Role Based Access Control (or User Management) provided by MLOps?
Deployment Admin, Owner, User, Consumer
How do you change the prediction threshold in MLOps, and how does this differ in MMM for AutoML/AutoTS?
You currently cannot change the prediction threshold in AutoML/AutoTS for DataRobot models, but you can change it for Custom Models inside MLOps.
What access does the Deployment Admin user offer users?
A deployment administrator role, assigned by the system administrator, has User role permissions for all existing and newly created deployments within their organization. The deployment admin can also approve new deployments, which facilitates governance.
Does MLOps offer governance capabilities? Explain.
Yes: Role Based Access Control (User Management), Deployment Admin, Workflow Approvals, Materiality Score (Importance), and Real Time Monitoring.
What is the alert notification structure for MLOps, and what are you able to customize?
Email. You can customize when alerts are scheduled, who receives them, and the thresholds for accuracy.
How does DataRobot calculate accuracy drift?
How does DataRobot calculate accuracy drift?
What are some components of service health that MLOps will monitor?
Number of predictions, number of requests, execution time, response time, predictions over certain time, data error, system error, consumers, cache hit rate.
What is Association ID and how is it used by MLOps?
An association ID is a unique identifier for each prediction request. It can be an optional or mandatory field passed with a prediction request. Once actuals are available, they are sent back to DataRobot along with the association ID so that DataRobot can tie the actuals back to the predictions.
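As a rough sketch of how an association ID ties actuals back to predictions, the snippet below joins two dictionaries keyed by ID and computes a simple accuracy figure over the matched pairs. The helper match_actuals and the sample IDs are hypothetical, and the card does not describe how DataRobot performs this join internally.

```python
def match_actuals(predictions, actuals):
    """Pair each prediction with its actual, using the association ID as the key.

    predictions, actuals: dicts mapping association ID -> value.
    Predictions whose actuals have not arrived yet are skipped.
    """
    matched = []
    for association_id, predicted in predictions.items():
        if association_id in actuals:
            matched.append((association_id, predicted, actuals[association_id]))
    return matched


predictions = {"order-1": 1, "order-2": 0, "order-3": 1}
actuals = {"order-1": 1, "order-3": 0}  # order-2 has no actual yet

pairs = match_actuals(predictions, actuals)
correct = sum(1 for _, predicted, actual in pairs if predicted == actual)
accuracy = correct / len(pairs)
```

Only predictions with a matching actual contribute to the accuracy figure, which is why the association ID must be sent with both the prediction request and the actual.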
What are some reasons for Red/Failing model health (service, data, or accuracy)?
Service: at least one 5xx error. Data: at least one higher-importance attribute’s distribution has shifted since the model was deployed. Accuracy: accuracy has severely declined since the model was deployed.
Does MLOps support external deployments?
Yes, through two components: (1) the MLOps Agent and (2) adding an external deployment. You can use the deployment management tool to analyze historical predictions and continuously assess model quality based on prediction output. Note that while not all the tools of model management are available to externally imported datasets, the inventory, data drift, and accuracy tabs provide an excellent starting point.
What data from training is used as a baseline to calculate Model Health?
Model Health uses the holdout data distribution as the baseline. Where the model is trained on holdout, it goes back to the same blueprint (BP) on non-holdout data and uses that for the baseline. For Custom Models, a baseline is not currently provided, but it is on the roadmap.