Monitoring ML solutions Flashcards

1
Q

Why is privacy a critical consideration in AI, and how does it relate to Google’s AI principles?

A

Privacy is integral to ethical AI design because:

Adheres to legal and regulatory standards.

Aligns with social norms and individual expectations.

Safeguards sensitive information.

Privacy is a cornerstone of Google’s fifth AI principle: Incorporate privacy design principles, ensuring AI systems respect user data.

2
Q

What are sensitive attributes, and how do they impact AI system design?

A

Sensitive attributes include personally identifiable information (PII) and other critical data, such as:

PII: Names, addresses, SSNs.
Social Data: Ethnicity, religion.
Health Data: Diagnoses, genetic information.
Financial Data: Credit card details, income.
Biometric Data: Fingerprints, facial recognition.
AI systems must handle sensitive data with heightened security and legal compliance, as misuse can result in privacy violations and user mistrust.

3
Q

What are common de-identification techniques in AI, and their benefits and drawbacks?

A

Redaction: Deletes sensitive data; irreversible but may reduce model utility.

Replacement: Substitutes values; irreversible, can impact learning.

Masking: Hides parts of data; retains structure but not the original value.

Tokenization: Maps data to unique tokens; reversible, vulnerable to attacks.

Bucketing: Groups numeric data into ranges; reduces granularity.

Shifting: Randomizes timestamps; preserves sequence but is reversible.

Each technique balances privacy and utility based on context.
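
A minimal Python sketch of four of these techniques (masking, tokenization, bucketing, shifting); the helper names and fixed secret are illustrative assumptions, not a production de-identification pipeline:

```python
import hashlib
import random
from datetime import datetime, timedelta

def mask_email(email: str) -> str:
    """Masking: hide part of the value but keep its structure."""
    local, domain = email.split("@", 1)
    return local[0] + "***@" + domain

def tokenize(value: str, secret: str = "demo-secret") -> str:
    """Tokenization sketch: map a value to a stable pseudonym. Real
    tokenization keeps a secure lookup table so tokens can be reversed."""
    return hashlib.sha256((secret + value).encode()).hexdigest()[:12]

def bucket_age(age: int, width: int = 10) -> str:
    """Bucketing: trade granularity for privacy, e.g. 37 -> '30-39'."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def shift_timestamp(ts: datetime, max_days: int = 30) -> datetime:
    """Shifting: randomize a timestamp; applying the same shift to all of
    one user's records preserves the order of their events."""
    return ts + timedelta(days=random.randint(-max_days, max_days))

print(mask_email("jane.doe@example.com"))  # j***@example.com
print(bucket_age(37))                      # 30-39
```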

4
Q

Explain k-anonymity and l-diversity. How do they enhance privacy?

A

k-Anonymity: Ensures each record is indistinguishable from at least k-1 others, reducing re-identification risks.

l-Diversity: Ensures that each anonymized group has l distinct sensitive values, addressing homogeneity in k-anonymized data.

These methods collectively enhance privacy while maintaining data utility.
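
A toy pandas check of both properties, assuming age ranges and 3-digit ZIP prefixes are the quasi-identifiers (all values invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "age_range": ["30-39", "30-39", "30-39", "40-49", "40-49", "40-49"],
    "zip3":      ["941",   "941",   "941",   "100",   "100",   "100"],
    "diagnosis": ["flu",   "flu",   "cold",  "flu",   "flu",   "flu"],
})

groups = df.groupby(["age_range", "zip3"])

k = groups.size().min()                  # k-anonymity: smallest group size
l = groups["diagnosis"].nunique().min()  # l-diversity: fewest distinct values

# k = 3, but l = 1: the second group is all 'flu', so k-anonymity alone
# still leaks the diagnosis of anyone matching those quasi-identifiers.
print(k, l)
```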

5
Q

How does differential privacy protect individual data during analysis?

A

Differential privacy ensures that the inclusion or exclusion of any individual’s data minimally affects the analysis outcome by:

Adding calibrated noise.

Preventing sensitive attribute identification.

Providing strong, mathematically proven privacy guarantees through parameters like epsilon (privacy strength).
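
A sketch of the Laplace mechanism behind a differentially private mean; the bounds and epsilon below are assumptions chosen for illustration:

```python
import numpy as np

def dp_mean(values, lower, upper, epsilon):
    """Clip values to [lower, upper] so the mean's sensitivity is
    (upper - lower) / n, then add Laplace noise of scale sensitivity / epsilon."""
    clipped = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(clipped)
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clipped.mean() + noise

ages = np.array([34, 45, 29, 62, 51])
print(dp_mean(ages, lower=18, upper=90, epsilon=1.0))  # noisy average
```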

6
Q

What are the trade-offs involved in setting epsilon for differential privacy?

A

Lower Epsilon: Stronger privacy, but higher noise can degrade data utility.

Higher Epsilon: Less privacy, but better model accuracy.

Selecting epsilon involves balancing privacy with analytical and model performance.

7
Q

What is DP-SGD, and how does it enhance model training security?

A

Differentially Private Stochastic Gradient Descent (DP-SGD) integrates differential privacy into SGD by:

Gradient Clipping: Limits the influence of individual samples.

Noise Addition: Protects data during updates.

DP-SGD is straightforward to implement with libraries like TensorFlow Privacy, as sketched below.
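
A hedged sketch following the TensorFlow Privacy tutorial pattern; module and argument names can differ across library versions:

```python
import tensorflow as tf
import tensorflow_privacy

# DP-SGD: clip each microbatch's gradient to l2_norm_clip, then add
# Gaussian noise proportional to noise_multiplier before the update.
optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=1.0,       # bound any single example's influence
    noise_multiplier=1.1,   # noise std relative to the clipping norm
    num_microbatches=32,    # must evenly divide the batch size
    learning_rate=0.05,
)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# Per-example losses (reduction NONE) are required so gradients can be
# clipped before averaging.
loss = tf.keras.losses.BinaryCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.NONE
)
model.compile(optimizer=optimizer, loss=loss)
```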

8
Q

Describe federated learning and its advantages for privacy.

A

Federated learning trains models locally on user devices, sharing only gradients with central servers:

Preserves data privacy by avoiding raw data transfer.
Supports personalization, e.g., Gboard predictions.
Updates central models without exposing sensitive user inputs.

9
Q

What are key privacy challenges in federated learning?

A

Membership Inference Attacks: Revealing whether specific data points were used in training.

Sensitive Property Breaches: Exposing private attributes.

Model Poisoning: Malicious users manipulate training data to degrade models.

10
Q

How does secure aggregation enhance privacy in federated learning?

A

Secure aggregation encrypts user gradients before sharing with central servers:

Ensures gradients are only decrypted after aggregation.
Protects individual data contributions.

11
Q

How does Google Cloud prevent training data extraction attacks in generative AI?

A

Google Cloud:

Excludes customer data from training foundation models.

Encrypts data at rest and in transit.

Ensures generated content cannot reveal specific training data.

12
Q

What are the risks of training data extraction attacks, and how do they occur?

A

Risks:

Revealing sensitive information (e.g., addresses).

Violating user privacy.

These occur through iterative prompt crafting to extract memorized training examples from generative models.

13
Q

How does Google ensure privacy compliance in its AI/ML systems?

A

Privacy by Default: No customer data in foundation models.

Encryption: TLS in transit, Customer-Managed Encryption Keys (CMEK).

Access Control: IAM for minimal privilege.

14
Q

How does the Cloud Data Loss Prevention API support sensitive data protection?

A

The API:

Detects PII in structured/unstructured data.

Applies de-identification techniques like masking and tokenization.

Monitors re-identification risks.

15
Q

Why is encryption critical for AI systems, and how does Google implement it?

A

Encryption ensures data security:

Default Encryption: For data at rest and in transit.

Cloud KMS: Centralized management of cryptographic keys.

16
Q

What rules does IAM enforce to ensure secure access control in Google Cloud?

A

IAM enforces:

Least-privilege access.
Fine-grained roles for resources.
Audit trails to monitor actions.

17
Q

What is differential privacy’s role in federated learning?

A

It prevents gradient leaks by:

Adding noise to gradients before aggregation.

Ensuring individual updates cannot be inferred.

18
Q

What are the security concerns specific to generative AI models?

A

Memorization of sensitive data.

Output leakage via prompts.

Vulnerability to adversarial prompts.

19
Q

How does Google secure generative AI inference pipelines?

A

Encrypts inputs and outputs in transit.
Stores tuned weights securely.
Provides CMEK for customer-managed encryption.

20
Q

Summarize the privacy principles applied in AI/ML by Google.

A

Data Minimization: Collect only necessary data.

Transparency: Document usage and policies.

Security: Encrypt, monitor, and audit all interactions.

21
Q

What is the relationship between AI safety and Google’s AI principles?

A

AI safety is grounded in Google’s AI principles, specifically:

Principle 3: “Be built and tested for safety,” emphasizing robust testing to minimize risks.

Principle 2: Avoid creating or reinforcing unfair bias.

Principle 4: Be accountable to people, promoting transparency and oversight.

AI safety overlaps with fairness and accountability, ensuring ethical use.

22
Q

What makes safety more challenging in generative AI compared to discriminative AI models?

A

Unknown Output Space: Generative AI can produce unexpected and creative outputs, making prediction difficult.

Diverse Training Data: Models trained on large datasets might generate outputs significantly different from the input data.

Adversarial Inputs: Generative AI is more prone to malicious prompt exploitation.

Unlike discriminative models (e.g., classifiers), generative models require extensive safeguards to manage risks.

23
Q

What are the two primary approaches to AI safety?

A

Technical Approach: Implements engineering solutions, such as model safeguards, input-output filters, and adversarial testing.

Institutional Approach (AI Governance): Focuses on industry-wide policies, national regulations, and ethical guidelines to govern AI use.

Both approaches complement each other.

24
Q

What are input and output safeguards in generative AI systems?

A

Input Safeguards: Block or rewrite harmful prompts before processing.

Output Safeguards: Detect and mitigate unsafe outputs using classifiers, error messages, or response ranking based on safety scores.

These safeguards ensure compliance with safety standards.

25
Q

Explain adversarial testing and its significance in AI safety.

A

Adversarial testing evaluates how an AI system responds to malicious or harmful inputs by:

Creating test datasets with edge cases and adversarial examples.

Running model inference on the dataset to identify failures.

Annotating and analyzing outputs for policy violations.

Adversarial testing then guides model improvements and informs product launch decisions.

26
Q

Differentiate between malicious and inadvertently harmful inputs.

A

Malicious Inputs: Explicitly designed to elicit harmful responses (e.g., asking for hate speech).

Inadvertently Harmful Inputs: Benign inputs that result in harmful outputs due to biases or context sensitivity (e.g., stereotypes in descriptions).

Both require mitigation through testing and safeguards.

27
Q

What are some common ways a generative AI can fail to meet guidelines?

A

Generating harmful content (e.g., hate speech).

Revealing PII or SPII.

Producing biased or unethical outputs.

Misaligning with user contexts.

Avoiding these requires robust safety frameworks.

28
Q

How can safety classifiers mitigate harmful content in generative AI?

A

Safety classifiers evaluate inputs and outputs based on predefined harm categories (e.g., hate speech, explicit content) and suggest actions:

Block harmful inputs.
Rewrite risky prompts.
Rank outputs by safety scores.
Examples: Google’s Perspective API and OpenAI’s Moderation API.

29
Q

What is the role of human oversight in AI safety workflows?

A

Validate classifier predictions.

Annotate complex or subjective outputs (e.g., hate speech).

Correct errors in automated processes.

Human-in-the-loop mechanisms ensure accountability for high-risk applications.

30
Q

Describe instruction fine-tuning and its relevance to AI safety.

A

Instruction fine-tuning teaches models safety-related tasks using curated datasets with specific instructions:

Embed safety concepts (e.g., toxic language detection).
Reduce harmful outputs by training on safety-related scenarios.
This enhances model alignment with human values.

31
Q

What is RLHF, and how does it embed safety into AI systems?

A

Reinforcement Learning from Human Feedback (RLHF) involves:

Training a reward model using human preferences.
Iteratively fine-tuning models to align with the reward model.
Evaluating responses for safety and helpfulness.
RLHF integrates safety preferences into AI systems effectively.

32
Q

What is constitutional AI, and how does it enhance safety training?

A

Constitutional AI is a method for training AI systems to be helpful, honest, and harmless. It uses a written set of principles (a “constitution”) to guide AI behavior and self-improvement, greatly reducing reliance on human feedback labels.

CAI’s principles draw on legal and constitutional frameworks, including human rights, privacy protections, due process, and equality before the law.

Constitutional AI uses:

Self-Critique: AI revises its outputs to align with predefined principles.
RLAIF: Reinforcement Learning from AI Feedback. AI moderates and creates preference datasets for safety fine-tuning.
This reduces reliance on manual supervision.

33
Q

How do safety thresholds in Gemini API ensure content safety?

A

Gemini API provides adjustable thresholds:

Block Low: Blocks content with even a low probability of being unsafe (strictest).

Block Medium: Blocks medium- and high-probability unsafe content; the default for most use cases.

Block High: Blocks only high-probability unsafe content (most lenient).

These thresholds align with use-case-specific needs.
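
A sketch using the google.generativeai Python SDK; the model name and the exact category/threshold spellings are assumptions that vary by SDK version:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Map each harm category to a blocking threshold: stricter settings block
# content with a lower probability of being unsafe.
safety_settings = {
    "HARM_CATEGORY_HARASSMENT": "BLOCK_LOW_AND_ABOVE",      # strictest
    "HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE",  # common default
    "HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_ONLY_HIGH",   # most lenient
}

model = genai.GenerativeModel("gemini-1.5-flash",
                              safety_settings=safety_settings)
response = model.generate_content("Explain ML model monitoring.")
print(response.text)
```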

34
Q

How does Google Cloud’s Natural Language API support AI safety?

A

It provides text moderation capabilities by:

Classifying content based on safety attributes.
Assigning confidence scores for each category.
Allowing customizable thresholds for moderation decisions.

35
Q

Explain the trade-offs between safety and fairness in training AI models.

A

Enhanced Safety: Filtering toxic data reduces harmful outputs but risks over-correcting for sensitive topics.

Fairness Impact: Filtering can suppress representation of marginalized groups, limiting diversity in outputs.

Balancing these requires nuanced dataset curation and tuning.

36
Q

How do lexical and semantic diversity impact adversarial datasets?

A

Lexical Diversity: Ensures varied vocabulary for better robustness testing.

Semantic Diversity: Covers a broad range of meanings and contexts.

Both dimensions enhance the effectiveness of adversarial testing.

37
Q

What role does safety evaluation play and how does this affect product launch decisions?

A

Safety evaluation identifies unmitigated risks, such as:

Likelihood of policy violations.
Potential harm to users.
Findings guide safeguards and launch readiness.

38
Q

How does prompt engineering support safety in generative AI?

A

Prompt engineering:

Shapes inputs to reduce risky outputs.

Uses control tokens or style transfers to steer model behavior.

Works alongside tuned models for maximum safety.

39
Q

What are semi-scripted outputs, and when are they useful?

A

Semi-scripted outputs:

Combine AI generation with pre-defined messages.

Explain safety restrictions to users effectively.

They enhance transparency while mitigating harmful responses.

40
Q

What are the safety categories and confidence levels used in Gemini?

A

Categories include harassment, hate speech, sexually explicit, and dangerous content.
Confidence levels: Negligible, Low, Medium, and High.
Thresholds determine whether content is blocked or allowed.

41
Q

What are Google’s AI principles related to fairness, and why is it important in machine learning?

A

Google’s second AI principle is to avoid creating or reinforcing unfair bias. Fairness in AI ensures equity, inclusion, and ethical decision-making across diverse applications, including high-stakes domains like healthcare, hiring, and lending. Achieving fairness mitigates negative societal impacts and fosters trust in AI systems.

42
Q

Define bias in the context of AI, and provide examples of five common biases. (data collection biases)

A

Bias refers to stereotyping or favouritism towards certain groups or perspectives, often due to data or model design.

Examples:

Reporting Bias: Over-representation of unusual events in datasets.

Automation Bias: Over-reliance on AI outputs, even if incorrect.

Selection Bias: Non-representative data sampling.

Group Attribution Bias: Generalizing traits from individuals to groups.

Implicit Bias: Hidden assumptions based on personal experience.

43
Q

What is selection bias, and what are its three subtypes?

A

Selection bias occurs when a dataset does not reflect real-world distributions. Subtypes:

Coverage Bias: Incomplete representation of groups.

Non-Response Bias: Gaps due to lack of participation.

Sampling Bias: Non-randomized data collection.

44
Q

What causes bias during the ML lifecycle, and how can it be mitigated?

A

Bias can arise during:

Data Collection: Sampling and reporting errors.

Model Training: Amplification of biases in training data.

Evaluation and Deployment: Feedback loops introducing new biases.

Mitigation includes careful dataset curation, bias-aware training, and post-deployment monitoring.

45
Q

How is fairness defined, and why is it difficult to standardize?

A

Fairness is context-dependent, encompassing equity and inclusion across sensitive variables like gender and ethnicity. Standardization is challenging because:

Fairness criteria vary across cultural, legal, and social contexts.

Metrics can be incompatible (e.g., demographic parity vs. equality of opportunity).

46
Q

Explain TensorFlow Data Validation (TFDV) and its role in identifying data bias.

A

TFDV supports:

Data Exploration: Provides statistical summaries (e.g., mean, std dev).

Data Slicing: Analyzes subsets (e.g., location-based distributions).

Schema Inference: Automates validation criteria.

Anomaly Detection: Flags issues like missing values or skewed distributions.
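
A typical TFDV flow covering those four capabilities; the CSV paths are hypothetical:

```python
import tensorflow_data_validation as tfdv

# Exploration: summary statistics for the training data.
train_stats = tfdv.generate_statistics_from_csv(data_location="train.csv")

# Schema inference: automated validation criteria from the statistics.
schema = tfdv.infer_schema(statistics=train_stats)

# Anomaly detection: compare new (e.g., serving) data against the schema.
serving_stats = tfdv.generate_statistics_from_csv(data_location="serving.csv")
anomalies = tfdv.validate_statistics(statistics=serving_stats, schema=schema)
tfdv.display_anomalies(anomalies)  # flags missing values, skew, new values
```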

47
Q

What is the What-If Tool, and how does it facilitate fairness analysis?

A

The What-If Tool allows:

Visualization of dataset interactions and model predictions.

Counterfactual Analysis: Tests sensitivity to feature changes.

Flip rate metrics: Quantifies prediction changes when sensitive features vary.

Slicing: Evaluates performance across demographic groups.

48
Q

How does TensorFlow Model Analysis (TFMA) assist in fairness evaluation?

A

TFMA:

Analyzes model performance using fairness metrics.

Slices data by sensitive features (e.g., racial group) to detect gaps.

Automates validation in MLOps pipelines.

Links to fairness indicators for deeper insights.

49
Q

What techniques can mitigate bias during data preparation?

A

Diversify data sources (e.g., new data collection).

Balance datasets via upsampling or downsampling.

Use synthetic data to augment underrepresented groups.

Relabel data to correct harmful or outdated labels.

50
Q

Describe the Monk Skin Tone (MST) scale and its purpose in fairness.

A

The MST scale, developed in partnership with Google, provides a 10-shade range for evaluating skin tone representation in datasets. It ensures inclusivity and mitigates biases in facial recognition or image-based systems.

51
Q

How does threshold calibration address fairness issues in ML systems?

A

Threshold calibration adjusts classification cutoffs for fairness.
Example: In loan approvals, thresholds can be tuned separately for groups (e.g., based on demographic parity or equality of opportunity) to address systemic disparities.
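
A sketch of per-group threshold calibration targeting equality of opportunity; the data is synthetic and the 0.8 target true positive rate is an assumption:

```python
import numpy as np

def equal_opportunity_thresholds(scores, labels, groups, target_tpr=0.8):
    """Pick a per-group score cutoff so each group's true positive rate
    is approximately target_tpr."""
    cutoffs = {}
    for g in np.unique(groups):
        positive_scores = scores[(groups == g) & (labels == 1)]
        # Scores above this quantile cover target_tpr of true positives.
        cutoffs[g] = np.quantile(positive_scores, 1 - target_tpr)
    return cutoffs

rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)
labels = (scores + rng.normal(0, 0.2, 1000) > 0.5).astype(int)
groups = rng.choice(["A", "B"], size=1000)
print(equal_opportunity_thresholds(scores, labels, groups))
```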

52
Q

What are demographic parity and equality of opportunity?

A

Demographic Parity: Equal prediction rates across groups.

Equality of Opportunity: Equal true positive rates for eligible groups.

Each aligns fairness goals with specific use cases (e.g., access vs. success rates).

53
Q

How do MinDiff and Counterfactual Logit Pairing (CLP) improve fairness during model training?

A

MinDiff: Minimizes prediction distribution gaps across sensitive subgroups.

CLP: Reduces sensitivity to changes in counterfactual examples by penalizing inconsistent logits during training.

54
Q

What is flip rate, and why is it important in fairness evaluation?

A

Flip rate measures how frequently predictions change when sensitive features are altered (e.g., gender). A lower flip rate indicates higher robustness and fairness.
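
A sketch of computing flip rate for a binary sensitive feature; model_predict and the column index are placeholders for your own model and data:

```python
import numpy as np

def flip_rate(model_predict, X, sensitive_col):
    """Fraction of examples whose predicted class changes when a binary
    sensitive feature is flipped (a counterfactual fairness probe)."""
    X_flipped = X.copy()
    X_flipped[:, sensitive_col] = 1 - X_flipped[:, sensitive_col]  # 0 <-> 1
    return float(np.mean(model_predict(X) != model_predict(X_flipped)))
```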

55
Q

How can fairness trade-offs be addressed in ML systems?

A

Fairness trade-offs require prioritization based on context:

Define fairness metrics relevant to stakeholders.

Use tools like the Aequitas Fairness Tree for guidance.

Balance conflicting goals through iterative evaluation.

56
Q

How does relabeling data mitigate bias in models?

A

Relabeling corrects harmful annotations and updates labels to current standards.

Example: Sentiment analysis for movie reviews may remove stereotypical labels to prevent biased associations.

57
Q

What challenges arise when training models on synthetic data?

A

Models may overfit synthetic patterns, leading to performance issues.

Domain gaps can complicate adaptation to real-world data.

Synthetic examples may unintentionally introduce biases.

58
Q

Describe the fairness factors tested in threshold calibration.

A

Fairness constraints include:

Demographic Parity: Equal outcomes across groups.

Equality of Odds: Equal error rates (false positives/negatives) across groups.

Equality of Opportunity: Equal true positive rates.

59
Q

What is counterfactual fairness, and how does CLP enforce it?

A

Counterfactual fairness ensures predictions are unaffected by sensitive attribute changes. CLP enforces it by minimizing prediction differences in counterfactual scenarios using added loss terms.

60
Q

How can fairness indicators in TFMA guide decision-making?

A

Fairness indicators in TFMA evaluate model performance using multiple fairness metrics, identifying trade-offs and guiding actions like threshold adjustments or retraining with MinDiff or CLP.

61
Q

What is Responsible AI, and why is it necessary?

A

Responsible AI refers to the ethical development and deployment of AI systems by understanding and mitigating issues, limitations, or unintended consequences. It ensures that AI is socially beneficial, trustworthy, and accountable. Without Responsible AI practices, even well-intentioned systems can cause ethical issues, reduce user trust, or fail to achieve their intended benefits.

62
Q

What are Google’s AI principles, and how do they guide AI development?

A

Google’s AI principles provide a framework for developing ethical AI:

Be socially beneficial.

Avoid creating or reinforcing unfair bias.

Be built and tested for safety.

Be accountable to people.

Incorporate privacy design principles.

Uphold high standards of scientific excellence.

Be made available for beneficial uses aligned with these principles.

They guide AI projects by setting boundaries on what is acceptable, ensuring safety, fairness, and accountability.

63
Q

What are the four areas in which Google will not pursue AI applications?

A

Google will not pursue AI applications in the following areas:

Technologies that cause or are likely to cause harm.
Weapons or technologies designed to facilitate injury.
Technologies for surveillance that violate internationally accepted norms.
Technologies contravening widely accepted principles of international law and human rights.

64
Q

How does responsible AI differ from legal compliance?

A

Responsible AI extends beyond legal compliance:

Ethics: Focuses on what ought to be done, even if laws don’t mandate it.

Law: Codified rules derived from ethical principles.

Responsible AI incorporates ethical considerations, such as fairness and accountability, that may not yet be codified in regulations.

65
Q

Why is fairness a central theme in Responsible AI?

A

Fairness ensures AI systems do not create or reinforce biases related to sensitive characteristics like race, gender, or ability. It is context-dependent and requires continuous evaluation to prevent harm or inequity, especially in high-stakes applications like hiring or criminal justice.

66
Q

What role do humans play in Responsible AI?

A

Humans are central to Responsible AI:

Design datasets and models.
Make deployment decisions.
Evaluate and monitor performance.

Human decisions reflect personal values, which underscores the need for diverse perspectives and ethical considerations throughout the AI lifecycle.

67
Q

What are the six recommended practices for Responsible AI development?

A

Use a human-centered design approach.

Define and assess multiple metrics during training and monitoring.

Directly examine raw data.

Be aware of dataset and model limitations.

Test the system thoroughly to ensure proper functioning.

Continuously monitor and update the system post-deployment.

68
Q

What is human-centered design, and why is it important for Responsible AI?

A

Human-centered design focuses on understanding how users interact with AI systems:

Involves diverse user groups to ensure inclusivity.

Models adverse feedback early in the design process.

Ensures clarity, control, and actionable outputs for users.

69
Q

How does Google Flights incorporate Responsible AI practices?

A

Google Flights employs:

Transparency: Explaining predictions and data sources.

Actionable Insights: Providing clear indicators like “high,” “typical,” or “low” prices.

Iterative User Research: Adapting design based on user trust and understanding.

70
Q

Why is transparency critical in Responsible AI?

A

Transparency builds trust by:

Allowing users to understand how decisions are made.

Offering explanations for predictions and recommendations.

Ensuring ethical practices and accountability.

71
Q

How does monitoring improve Responsible AI systems post-deployment?

A

Monitoring ensures models remain effective in dynamic real-world conditions by:

Detecting input drift.
Gathering user feedback.
Updating models based on new data and behaviours.

72
Q

What are the risks of failing to build trust in AI systems?

A

Reduced adoption by users or organizations.

Ethical controversies or public backlash.

Potential harm to stakeholders affected by AI decisions.

73
Q

How can metrics ensure Responsible AI development?

A

Metrics provide quantitative benchmarks for:

User feedback.

System performance.

Equity across demographic subgroups.

Metrics like recall and precision ensure models align with their intended goals.

74
Q

What is the significance of explainability in Responsible AI?

A

Explainability allows:

Stakeholders to understand and trust AI outputs.

Identification of biases or errors in decision-making.

Users to appeal or challenge AI-based decisions.

75
Q

How can raw data examination improve Responsible AI outcomes?

A

Analyzing raw data ensures:

Data accuracy and completeness.

Representation of all user groups.

Mitigation of training-serving skew and sampling bias.

76
Q

What is training-serving skew, and how can it be mitigated?

A

Training-serving skew occurs when data used in training differs from real-world serving data.
Mitigation involves:

Adjusting training objectives.

Ensuring representative evaluation datasets.

77
Q

What role does the “poka-yoke” principle play in Responsible AI testing?

A

The poka-yoke principle builds quality checks into systems to:

Prevent failures (e.g., missing features triggering system alerts).
Ensure AI outputs only when conditions are met.

78
Q

Why is iterative user testing crucial for Responsible AI?

A

Iterative testing:

Captures diverse user needs and perspectives.

Identifies unintended consequences.

Improves system usability and trustworthiness.

79
Q

What are Google’s design principles for price intelligence in Google Flights?

A

The design principles are:

Honest: Provide clear and truthful insights.

Actionable: Help users make informed decisions.

Concise yet explorable: Deliver useful summaries with deeper details available.

80
Q

How does Responsible AI contribute to innovation?

A

Ethical development fosters:

Increased trust in AI systems.

Better adoption rates in enterprises.

Encouragement of creative, user-focused solutions that align with societal values.

81
Q

Explain the core architectural concept of TensorFlow’s computation model and how it enables language and hardware portability.

A

TensorFlow uses a directed acyclic graph (DAG) to represent computations. This graph is a language-independent representation that allows the same model to be:

Built in Python
Stored in a SavedModel
Restored and executed in different languages (e.g., C++)
Run on multiple hardware platforms (CPUs, GPUs, TPUs)

This approach is analogous to Java’s bytecode and JVM, providing a universal representation that can be efficiently executed across different environments. The TensorFlow execution engine, written in C++, optimizes the graph for specific hardware capabilities, enabling flexible model deployment from cloud training to edge device inference.

82
Q

Describe the TensorFlow API hierarchy and explain the significance of each layer of abstraction.

A

TensorFlow’s API hierarchy consists of:

1) Hardware Implementation Layer: Low-level platform-specific implementations

2) C++ API: For creating custom TensorFlow operations

3) Core Python API: Numeric processing (add, subtract, matrix multiply)

4) Python Modules: High-level neural network components (layers, metrics, losses)

5) High-Level APIs (Keras, Estimators):

Simplified model definition
Distributed training
Data preprocessing
Model compilation and training
Checkpointing and serving

The hierarchy allows developers to choose the appropriate level of abstraction, from low-level hardware manipulation to high-level model creation with minimal code.

83
Q

What are tensors in TensorFlow, and how do they differ from traditional arrays?

A

Tensors are n-dimensional arrays of data in TensorFlow, characterized by:

Scalars (0D): Single numbers
Vectors (1D): Arrays of numbers
Matrices (2D): Rectangular arrays
3D/4D Tensors: Stacked matrices with increasing dimensions

Key differences from traditional arrays:

Can be created as constants (tf.constant) or variables (tf.Variable)
Variables allow modifiable values, critical for updating model weights
Support automatic differentiation
Designed for efficient numerical computation across different hardware

84
Q

Explain the concept of automatic differentiation in TensorFlow using GradientTape.

A

Automatic differentiation in TensorFlow allows automatic calculation of partial derivatives through:

Forward Pass: TensorFlow records operations in order
Backward Pass: Uses GradientTape to:

Track operations executed within its context
Compute gradients using reverse-mode differentiation
Enable automatic calculation of derivatives for loss functions

The process involves:

Tracking computational graph operations
Storing operation sequence
Reversing the graph to compute gradients
Supporting custom gradient calculations for numerical stability or optimization
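
A minimal example of the forward/backward pattern described above:

```python
import tensorflow as tf

x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    # Forward pass: operations on watched variables are recorded.
    y = x ** 2 + 2.0 * x

# Backward pass: reverse-mode differentiation gives dy/dx = 2x + 2 = 8.
print(tape.gradient(y, x).numpy())  # 8.0
```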

85
Q

How does TensorFlow enable model portability between cloud and edge devices?

A

TensorFlow facilitates model portability through:

Training models on powerful cloud infrastructure
Exporting trained models to edge devices (mobile phones, embedded systems)
Reducing model complexity for edge deployment
Enabling offline inference

Practical example: Google Translate app

Full translation model trained in the cloud
Reduced, optimized model stored on the phone
Allows offline translation
Trades some model complexity for:

Faster response times
Reduced computational requirements
Enhanced privacy
Improved user experience

86
Q

What is the significance of tf.Variable in TensorFlow model training?

A

tf.Variable is crucial for machine learning because it:

Represents trainable parameters (weights, biases)
Allows modification during training
Supports assignment methods (assign, assign_add, assign_sub)
Fixes type and shape after initial construction
Enables automatic gradient computation
Tracks parameters that change during optimization processes

Key characteristics:

Mutable tensor type
Essential for updating neural network weights
Integral to gradient-based learning algorithms
Supports efficient parameter updates

87
Q

Describe the shape manipulation techniques in TensorFlow for tensor transformations.

A

TensorFlow provides several tensor shape manipulation methods:

Stacking: Combining tensors along new dimensions

Increases tensor rank
Creates higher-dimensional representations

Slicing: Extracting specific tensor segments

Zero-indexed access
Can extract rows, columns, or specific elements

Reshaping (tf.reshape):

Changes tensor dimensions while preserving total element count
Rearranges elements systematically
Maintains data integrity across transformations

Example: A 2x3 matrix can be reshaped to 3x2 by redistributing its elements row-wise.

These techniques enable flexible data preprocessing and feature engineering in machine learning workflows.
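
A short illustration of all three operations:

```python
import tensorflow as tf

a = tf.constant([1, 2, 3])
b = tf.constant([4, 5, 6])

stacked = tf.stack([a, b])              # shape (2, 3): rank increased by 1
row = stacked[0]                        # slicing: first row -> [1 2 3]
col = stacked[:, 1]                     # second column -> [2 5]
reshaped = tf.reshape(stacked, [3, 2])  # same 6 elements, now 3 rows of 2

print(reshaped.numpy())  # [[1 2] [3 4] [5 6]]
```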

88
Q

Explain how TensorFlow supports distributed machine learning training.

A

TensorFlow supports distributed machine learning through:

High-level APIs handling distributed training complexities
Automatic device placement
Memory management across multiple devices/machines
Seamless scaling of training processes

Key distributed training capabilities:

Parallel computing across GPUs/TPUs
Synchronization of model parameters
Efficient gradient aggregation
Abstraction of low-level distributed computing details
Support for various distribution strategies

Recommended approach: Use high-level APIs like Estimators to manage distributed training complexity.

89
Q

What are the key differences between tf.constant and tf.Variable?

A

Comparison of tf.constant and tf.Variable:

tf.constant:

Immutable values
Fixed throughout computation
Suitable for static data
No modification after creation

tf.Variable:

Mutable tensor
Can be modified during training
Critical for updating model weights
Supports assignment methods
Enables gradient computation
Fixed type and shape after initial construction
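
A quick demonstration of the mutability difference:

```python
import tensorflow as tf

c = tf.constant([1.0, 2.0])
v = tf.Variable([1.0, 2.0])

v.assign([3.0, 4.0])      # in-place update: how training changes weights
v.assign_add([0.5, 0.5])
print(v.numpy())          # [3.5 4.5]

# c has no assign method: constants are immutable after creation.
```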

90
Q

Describe TensorFlow’s approach to gradient computation and its importance in machine learning.

A

TensorFlow’s gradient computation involves:

Automatic differentiation mechanism
Computational graph tracking
Reverse-mode differentiation

Key components:

Forward Pass: Record computational operations
Backward Pass:

Traverse operations in reverse
Compute partial derivatives
Calculate gradients for each variable

Significance:

Automates complex derivative calculations
Enables efficient optimization
Supports various machine learning algorithms
Reduces manual gradient computation complexity

Mechanism: GradientTape records operations, allowing efficient gradient calculation.

91
Q

Discuss the role of Cloud AI Platform (CAIP) in the TensorFlow ecosystem.

A

Cloud AI Platform (CAIP) provides:

Fully hosted TensorFlow environment
Managed service across API abstraction levels
Cluster-based TensorFlow execution
No software installation required
Serverless machine learning infrastructure
Seamless scaling of computational resources

92
Q

How does TensorFlow enable hardware-agnostic machine learning development?

A

TensorFlow achieves hardware agnosticism through:

Directed acyclic graph (DAG) representation
Language-independent computation model
Execution engine optimized for specific hardware
Support for multiple platforms (CPUs, GPUs, TPUs)
Portable model deployment across different environments

93
Q

Explain the concept of tensor dimensionality in TensorFlow.

A

Tensor dimensionality progression:

Scalar (0D): Single value
Vector (1D): Single row/column of values
Matrix (2D): Rectangular array of values
3D Tensor: Stack of matrices
4D Tensor: Collection of 3D tensors

Each dimension represents:

Increased data complexity
More sophisticated representation
Enhanced computational capabilities

94
Q

What are the primary considerations when designing custom TensorFlow operations?

A

Custom TensorFlow operation design considerations:

Implement in C++ API
Register operation with TensorFlow
Provide Python wrapper
Ensure numerical stability
Optimize computational efficiency
Support automatic differentiation
Consider hardware compatibility

95
Q

Describe the mechanism of automatic differentiation in machine learning training.

A

Automatic differentiation mechanism:

Tracks computational graph operations
Records forward pass sequence
Computes gradients during backward pass
Enables efficient parameter updates
Supports complex, multi-layer neural networks
Eliminates manual gradient calculation
Facilitates optimization algorithms

96
Q

What model optimization strategies does TensorFlow support that make it compatible across different deployment environments?

A

TensorFlow model optimization strategies:

Cloud training on high-performance infrastructure

Model compression for edge devices

Reduced computational complexity

Offline inference capabilities

Platform-independent model representation

Adaptive model scaling

Performance-accuracy trade-offs

97
Q

Explain the significance of partial derivative computation in machine learning.

A

Partial derivative computation:

Determines model parameter sensitivity

Guides weight updates during training

Measures individual feature contributions

Enables gradient-based optimization

Supports complex loss function navigation

Facilitates model convergence

Provides granular parameter adjustment mechanism

98
Q

What are the implications of TensorFlow’s multi-layered API architecture?

A

TensorFlow API architecture implications:

Flexible development approach

Scalable complexity management

Supports various expertise levels

Enables low-level hardware optimization

Provides high-level model creation abstractions

Facilitates custom model development

Supports diverse machine learning workflows

99
Q

Discuss the role of GradientTape in TensorFlow’s automatic differentiation process. What features does it offer?

A

GradientTape functionality:

Context manager for gradient computation

Tracks computational operations

Enables reverse-mode differentiation

Supports custom gradient calculations

Manages computational graph traversal

Facilitates efficient derivative computation

Handles numerical stability considerations

100
Q

What features does TensorFlow offer to enable efficient numerical computation beyond machine learning?

A

TensorFlow numerical computation capabilities:

High-performance tensor operations

Hardware-optimized computation

Support for complex mathematical transformations

Efficient array manipulation

Cross-platform computational consistency

Scalable numeric processing

Generalized scientific computing framework

101
Q

What is the distinction between interpretability and transparency in the context of machine learning systems?

A

Interpretability refers to understanding the behaviour of a machine learning model, such as how inputs lead to outputs, and often involves technical and algorithmic methods. Transparency, on the other hand, is broader and focuses on clear, accessible explanations about the system’s purpose, behaviour, and decision-making process. Transparency involves documenting system components, processes, and ensuring stakeholder trust, while interpretability focuses on the inner workings and outputs of models.

102
Q

What are intrinsic and post-hoc interpretability methods? Give examples of each.

A

Intrinsic interpretability refers to models that are inherently understandable due to their structure, such as linear regression, decision trees, or Bayesian networks. Examples include analyzing the coefficients of a linear regression model or the splits in a decision tree.

Post-hoc interpretability involves techniques applied after a model is trained to explain its behavior, particularly for complex models like deep neural networks. Examples include SHAP values, LIME, integrated gradients, and permutation feature importance.

103
Q

Describe the trade-off between model complexity and interpretability, including examples of when a simpler model might be preferable.

A

As model complexity increases, interpretability generally decreases. Complex models, like deep neural networks, can capture intricate patterns in data but are harder to explain. Simpler models, such as linear regression, may offer lower accuracy but are easier to understand and debug. A simpler model is preferable when the performance gain from a complex model is marginal, or when interpretability is crucial, such as in healthcare or regulatory compliance scenarios.

104
Q

What are the three primary stakeholder groups affected by interpretability and transparency, and what is their focus?

A

Engineers: Focus on model debugging, understanding, and improving performance using interpretable methods.

Users: Focus on trust, wanting reliable and equitable model predictions without delving into technical details.

Regulators: Focus on legal and ethical compliance, using interpretability to trace predictions and ensure fairness.

105
Q

What is permutation feature importance, and how is it calculated?

A

Permutation feature importance measures the effect of each feature on model performance. It involves:

Randomly shuffling the values of one feature while keeping others unchanged.

Observing the resulting change in model error.

Features that cause a significant increase in error are deemed important.

However, this method can be misleading if shuffling introduces artificial patterns or if importance scores vary across trials due to randomness.
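
A scikit-learn sketch of the procedure; the dataset and model choices here are arbitrary:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature n_repeats times and record the score drop;
# repetition dampens the trial-to-trial randomness noted above.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
print(result.importances_mean)  # one importance score per feature
```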

106
Q

What is the purpose of partial dependence plots (PDPs), and how are they generated?

A

PDPs visualize the relationship between a feature and model predictions by showing how predictions change when varying one feature while keeping others constant. To generate a PDP:

Choose a feature and vary its values systematically.

Compute the average predicted outcome across all data points for each feature value.

Plot these values to reveal trends like non-linearity or feature importance.

107
Q

Explain how LIME approximates a model locally to create explanations for individual predictions.

A

LIME (Local Interpretable Model-agnostic Explanations):

Selects an instance of interest (e.g., an image).

Generates perturbations of the input (e.g., hiding image segments).

Predicts outputs for perturbed inputs using the complex model.

Trains a simpler interpretable model (e.g., linear regression) on the perturbations and predictions.

Uses the simpler model to approximate the local behavior of the complex model, identifying which features (e.g., image regions) contributed most to the prediction.
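
A sketch of the tabular variant using the lime package (assumed installed); the dataset and feature names are illustrative:

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["sepal_len", "sepal_wid", "petal_len", "petal_wid"],
    class_names=["setosa", "versicolor", "virginica"],
)

# Perturb the instance, query the model on the perturbations, and fit a
# local linear surrogate whose weights are the explanation.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=2)
print(exp.as_list())  # e.g. [('petal_len <= 1.6', 0.4), ...]
```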

108
Q

What are Shapley values, and why are they computationally expensive?

A

Shapley values, from cooperative game theory, quantify each feature’s contribution to a prediction by averaging its marginal impact across all possible feature combinations. The computational expense arises because calculating exact Shapley values for n features involves evaluating 2^n feature subsets. Approximations like Kernel SHAP or Tree SHAP reduce complexity while retaining utility.

109
Q

How do integrated gradients overcome gradient saturation in deep neural networks?

A

Integrated gradients calculate feature attributions by integrating gradients along a path from a baseline (e.g., all zeros) to the original input. This method captures the cumulative influence of each feature, avoiding saturation by considering gradients at multiple interpolation points, which ensures that small but significant gradients are not overlooked.

110
Q

What improvement does XRAI offer over integrated gradients?

A

XRAI improves upon integrated gradients by:

Using both black and white baselines to mitigate baseline dependency.

Ranking regions based on integrated gradient scores to highlight the most important areas of an image.

Generating region-based explanations instead of pixel-level attributions, making results more interpretable for natural images.

111
Q

What are concept-based explanations, and how does TCAV implement them?

A

Concept-based explanations use high-level human-understandable concepts (e.g., “stripedness”) instead of individual features. TCAV (Testing with Concept Activation Vectors):

Trains a classifier in the model’s intermediate feature space to distinguish between concept examples and random examples.

Uses directional derivatives to quantify the model’s sensitivity to the concept.
This helps understand how much a concept contributes to predictions.

112
Q

What are example-based explanations, and how can they improve a model’s training process?

A

Example-based explanations identify similar training examples to the input being predicted, helping users understand model decisions. For instance, if a bird is misclassified as a plane, finding similar dark silhouettes in the training data might reveal a lack of diverse bird images, prompting the collection of more varied data to improve the model.

113
Q

What are data cards and model cards, and how do they promote AI transparency?

A

Data Cards: Summarize datasets, including sources, annotation methods, intended uses, and limitations. They help stakeholders understand data provenance and suitability.

Model Cards: Describe model purposes, performance metrics, limitations, and ethical considerations. They ensure users understand model applications and boundaries, fostering trust and accountability.

114
Q

How does the SHAP library facilitate feature-based interpretability?

A

The SHAP library provides efficient approximations for Shapley values, such as Tree SHAP and Kernel SHAP, offering explanations for individual predictions and aggregating them for global insights. It supports tabular, text, and image data, but its high computational cost can limit applicability in scenarios with extensive feature sets.
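
A minimal Tree SHAP sketch; using XGBoost here is an assumption (Tree SHAP also covers other tree ensembles):

```python
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgboost.XGBClassifier().fit(X, y)

explainer = shap.TreeExplainer(model)   # polynomial-time for tree models
shap_values = explainer.shap_values(X)

print(shap_values[0])              # local: per-feature attributions, row 0
shap.summary_plot(shap_values, X)  # global: aggregated across the dataset
```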

115
Q

What are two workspaces provided by the Learning Interpretability Tool (LIT), and what functionalities do they offer?

A

LIT provides:

Main Workspace: Displays visualizations and interactive modules for understanding model behaviour.

Group-based Workspace: Allows comparative analysis across groups of data. Features include embedding projectors, salience maps, counterfactual analysis, and customizable metrics to evaluate model robustness and fairness.

116
Q

What interpretability techniques does Vertex Explainable AI support, and for which types of data?

A

Vertex Explainable AI supports feature-based (e.g., SHAP, integrated gradients, XRAI) and example-based explanations. It works with tabular, image, video, and text data across models like TensorFlow, AutoML, and scikit-learn, providing local and global insights into model predictions.

117
Q

Why is transparency crucial in mitigating biases in machine learning models?

A

Transparency ensures stakeholders can trace model decisions, verify compliance, and detect biases. By documenting data sources, preprocessing, and evaluation methods, teams can identify and address biases in training data or model predictions, leading to fairer AI systems.

118
Q

What challenges do regulators face when auditing complex AI models, and how can interpretability help?

A

Challenges include understanding model decisions in the context of laws, ensuring fairness, and identifying biases. Interpretability provides metadata to trace decisions to their inputs, enabling corrective actions and demonstrating compliance with regulatory standards.

119
Q

Explain the baseline selection problem in integrated gradients and how XRAI addresses it.

A

Baseline selection affects attribution results; for example, black baselines might ignore black features critical to a prediction. XRAI resolves this by combining black and white baselines and generating region-based attributions, enhancing clarity and reducing baseline dependency.

120
Q

Describe the roles of the people involved in creating data cards and model cards.

A

Data Cards: Involve producers (data creators), consumers (model builders), and end-users (decision-makers) to ensure comprehensive and actionable documentation.

Model Cards: Engage model developers, researchers, and ethical advisors to cover technical, practical, and societal implications, ensuring broad input and balanced perspectives.

121
Q

What are the key stages in an ML pipeline, and why is the design of data pipelines critical?

A

Key stages include data extraction, analysis, preparation, model training, evaluation, validation, serving, and monitoring. Data pipeline design is critical as it ensures efficient, scalable, and accurate preprocessing, which directly impacts model training and inference. Effective pipelines manage large datasets, maintain reproducibility, and optimize resource utilization.

122
Q

What is the difference between tf.data.Dataset.from_tensors and tf.data.Dataset.from_tensor_slices?

A

from_tensors: Combines the entire input into a dataset with a single element.

from_tensor_slices: Creates a dataset where each row of the input tensor forms a separate element.

For instance, from_tensor_slices is used when each example (e.g., a row in a CSV) needs to be processed independently.
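
A short demonstration of the difference:

```python
import tensorflow as tf

t = tf.constant([[1, 2], [3, 4], [5, 6]])

whole = tf.data.Dataset.from_tensors(t)       # 1 element of shape (3, 2)
rows = tf.data.Dataset.from_tensor_slices(t)  # 3 elements of shape (2,)

print(len(list(whole)), len(list(rows)))  # 1 3
```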

123
Q

Explain the process of prefetching in TensorFlow data pipelines and its advantages.

A

Prefetching allows subsequent data batches to be prepared asynchronously while the current batch is being processed by the model. It minimizes idle times for GPUs or CPUs, ensuring better resource utilization and reduced training time. Using tf.data.AUTOTUNE optimizes this process by dynamically adjusting the buffer size.
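
A typical pipeline with both optimizations applied:

```python
import tensorflow as tf

dataset = (
    tf.data.Dataset.range(10_000)
    .map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)  # prepare next batches while the model trains
)
```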

124
Q

What are the advantages of using TFRecordDataset in TensorFlow pipelines?

A

TFRecordDataset efficiently handles large datasets by:

Allowing progressive loading of data from TFRecord files.
Supporting binary storage, reducing storage and memory overhead compared to text formats like CSV.
Being compatible with distributed training setups.
It supports operations like shuffling, mapping, and batching seamlessly.

125
Q

Describe the trade-offs between one-hot encoding and embeddings for categorical features.

A

One-hot encoding: Simple and interpretable but creates sparse, high-dimensional vectors. Memory and computation requirements grow with the number of categories.

Embeddings: Provide dense, low-dimensional representations that capture relationships between categories. While efficient, they require additional training and can overfit if the embedding size is too large.

126
Q

How does TensorFlow’s feature column API assist in feature engineering for structured data?

A

Feature columns transform raw input features into formats suitable for model training. Examples include:

numeric_column: For continuous features.

categorical_column_with_vocabulary_list: For categorical features with a known set of values.

bucketized_column: Discretizes continuous data into ranges.

embedding_column: Converts sparse categorical data into dense vectors.

Feature columns enable one-hot encoding, bucketing, and embedding seamlessly.
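
A sketch of those four column types (note that tf.feature_column is a legacy API in recent TensorFlow releases; the Keras preprocessing layers covered later in this deck are its successor):

```python
import tensorflow as tf

price = tf.feature_column.numeric_column("price")
city = tf.feature_column.categorical_column_with_vocabulary_list(
    "city", ["london", "paris", "tokyo"]
)
# Two boundaries -> three buckets: <100, [100, 500), >=500.
price_buckets = tf.feature_column.bucketized_column(
    price, boundaries=[100.0, 500.0]
)
city_embedding = tf.feature_column.embedding_column(city, dimension=2)
```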

127
Q

What is the purpose of bucketized_column, and when should it be used?

A

bucketized_column discretizes continuous numeric features into a set of ranges or buckets, making it easier to capture non-linear relationships. It’s used when raw numeric features (e.g., latitude or longitude) are too granular and need to be grouped into meaningful ranges for better model training.

128
Q

Explain the role of embeddings in recommendation systems with an example.

A

Embeddings map high-dimensional sparse data (e.g., user IDs or movie IDs) into dense low-dimensional spaces. In a recommendation system, embeddings for users and items (e.g., movies) help capture similarities. For example, embeddings for “Star Wars” and “The Dark Knight” might be closer in the embedding space due to shared audience preferences.

129
Q

Why is embedding dimensionality considered a hyperparameter, and how is it typically chosen?

A

Embedding dimensionality determines the representation’s expressiveness and efficiency. It is chosen based on the trade-off between:

Accuracy (higher dimensions capture finer relationships).

Overfitting risk and computational cost (higher dimensions increase complexity).

A common heuristic is to start with the fourth root of the number of categories (e.g., 10,000 categories → 10,000^0.25 = 10 dimensions).

130
Q

How does feature crossing work, and why is it useful?

A

Feature crossing combines multiple features into a single synthetic feature to capture interactions between them. For example, crossing “property type” (house/apartment) with “location” can allow a model to learn separate weights for houses in urban vs. rural areas. This is implemented using hashed columns to manage memory efficiently.

131
Q

What preprocessing layers are available in Keras, and what tasks do they perform?

A

Keras preprocessing layers include:

TextVectorization: Tokenizes and encodes text.

Normalization: Standardizes numeric features (mean 0, std 1).

Discretization: Buckets numeric features into ranges.

CategoryEncoding: Encodes categorical features as one-hot or multi-hot vectors.

StringLookup and IntegerLookup: Maps string/integer features to indices.

These layers simplify preprocessing and ensure consistency between training and inference.

132
Q

What is the difference between placing preprocessing layers inside the model vs. outside in the data pipeline?

A

Inside the model: Ensures preprocessing is part of the model’s computation graph, making it portable and ensuring consistency during inference. Suitable for operations like normalization and augmentation that benefit from GPU acceleration.

Outside the model: Using dataset.map for preprocessing offloads computation to the CPU asynchronously. It is efficient for tasks requiring extensive parallelism.

133
Q

How does the adapt method in Keras preprocessing layers work?

A

adapt analyzes a dataset to compute necessary statistics (e.g., mean and variance for normalization, vocabulary for TextVectorization). These statistics are then stored in the layer’s state and applied to new data during training or inference, ensuring consistency.
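
For example, with TextVectorization on a toy corpus:

    import tensorflow as tf

    texts = tf.constant(["the cat sat", "the dog ran"])    # toy corpus
    vectorizer = tf.keras.layers.TextVectorization(output_mode="int")
    vectorizer.adapt(texts)          # builds the vocabulary (layer state)

    vectorizer(tf.constant(["the cat ran"]))   # same mapping at inference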

134
Q

Describe the difference between map and flat_map in TensorFlow datasets.

A

map: Applies a one-to-one transformation to dataset elements (e.g., parsing CSV rows into features).
flat_map: Applies a one-to-many transformation, generating multiple elements from a single input (e.g., splitting a file into individual records).
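
A toy illustration:

    import tensorflow as tf

    ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])

    # map: one element in, one element out.
    doubled = ds.map(lambda x: x * 2)                  # -> 2, 4, 6

    # flat_map: each element yields a dataset, then results are flattened.
    repeated = ds.flat_map(
        lambda x: tf.data.Dataset.from_tensors(x).repeat(2))  # -> 1,1,2,2,3,3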

135
Q

What is the role of AUTOTUNE in TensorFlow data pipelines?

A

AUTOTUNE dynamically adjusts parallelism and prefetching in data pipelines to optimize throughput. It reduces bottlenecks by allocating appropriate resources for data loading and preprocessing based on system capacity.
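
Typical usage (a sketch):

    import tensorflow as tf

    ds = (tf.data.Dataset.range(1000)
          .map(lambda x: x + 1, num_parallel_calls=tf.data.AUTOTUNE)
          .batch(32)
          .prefetch(tf.data.AUTOTUNE))   # overlap input prep with training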

136
Q

How do embeddings assist in clustering and visualization tasks?

A

Embeddings project high-dimensional data into a dense, lower-dimensional space. This enables clustering of similar data points (e.g., handwritten digits) and visualization of relationships between categories. Tools like TensorBoard can visualize embeddings, revealing patterns or misclassifications.

137
Q

Why is it important to separate training and inference pipelines in production systems?

A

Separate pipelines prevent training/serving skew, where differences in preprocessing lead to inconsistent results. An inference pipeline ensures preprocessing logic is consistent with training, making models portable and reliable.

138
Q

What are the benefits of using feature-based normalization layers during model training?

A

Normalization centers features around zero with unit variance, improving convergence during training. It reduces the risk of exploding or vanishing gradients in neural networks and ensures consistent scaling across features.

139
Q

How do stateful preprocessing layers ensure consistency across training and inference?

A

Stateful layers like Normalization and TextVectorization store computed statistics (e.g., mean, variance, vocabulary). By adapting these layers on training data, the same transformation is applied during inference, ensuring consistency.

140
Q

What strategies can be used to handle large datasets that do not fit into memory during training?

A

Use tf.data API for progressive loading.

Store data in TFRecord format for efficient access.

Use sharded datasets and Dataset.list_files for distributed loading.

Prefetch and cache data to optimize GPU/CPU utilization.

Employ batching and parallel processing to manage memory efficiently.
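
A sketch combining these ideas; the bucket path and file pattern are hypothetical:

    import tensorflow as tf

    files = tf.data.Dataset.list_files("gs://my-bucket/train-*.tfrecord")
    ds = (files.interleave(tf.data.TFRecordDataset,
                           num_parallel_calls=tf.data.AUTOTUNE)
          .shuffle(10_000)
          .batch(256)
          .prefetch(tf.data.AUTOTUNE))   # streams shards progressively;
                                         # nothing is loaded wholesale into memory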

141
Q

What is the purpose of activation functions in neural networks, and why is nonlinearity essential?

A

Activation functions introduce nonlinearity into neural networks, allowing them to learn and model complex relationships in data. Without nonlinear activation functions, a network with multiple layers would collapse into an equivalent single-layer linear model, because a composition of linear transformations is itself linear. Nonlinearity enables deep networks to approximate intricate patterns and represent a wide variety of functions.

142
Q

Compare the advantages and disadvantages of ReLU and its variants like Leaky ReLU, ELU, and GELU.

A

ReLU: Simple and efficient, avoids vanishing gradient in positive domain, but suffers from the “dying ReLU” problem where neurons can become inactive in the negative domain.

Leaky ReLU: Allows a small gradient for negative inputs, preventing inactive neurons.

ELU: Pushes activations closer to zero mean for faster convergence but is computationally more expensive.

GELU: Combines properties of ReLU and stochastic regularization for smoother gradients and better performance on specific tasks.

143
Q

Why is it recommended to use a softmax activation in the final layer for classification tasks?

A

Softmax converts logits into probabilities by normalizing them across all classes. It ensures the output values sum to one, making them interpretable as class probabilities. This is particularly useful for multi-class classification tasks where mutual exclusivity among classes is assumed.

144
Q

Describe the differences between the Keras Sequential API, Functional API, and Model Subclassing.

A

Sequential API: Simplest, for models with a single input-output stack of layers. Limited to straightforward architectures.

Functional API: More flexible, supports multi-input, multi-output models, layer sharing, and nonlinear topologies like residual connections.

Model Subclassing: Offers complete flexibility for custom architectures by subclassing tf.keras.Model. Requires manual implementation of the forward pass in the call method.
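
The same two-layer classifier in all three styles (a sketch):

    import tensorflow as tf

    # Sequential: a single stack of layers.
    seq = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # Functional: an explicit input->output graph; also supports
    # branches, merges, and shared layers.
    inputs = tf.keras.Input(shape=(20,))
    x = tf.keras.layers.Dense(64, activation="relu")(inputs)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
    func = tf.keras.Model(inputs, outputs)

    # Subclassing: full control over the forward pass in call().
    class MyModel(tf.keras.Model):
        def __init__(self):
            super().__init__()
            self.hidden = tf.keras.layers.Dense(64, activation="relu")
            self.out = tf.keras.layers.Dense(10, activation="softmax")

        def call(self, x):
            return self.out(self.hidden(x))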

145
Q

Explain the role of optimizers in neural networks and compare SGD with Adam.

A

Optimizers adjust model weights based on the loss function to minimize error.

SGD: Simple and interpretable, but struggles with convergence in complex, non-convex spaces.

Adam: Combines the benefits of momentum and adaptive learning rates for efficient and robust convergence. It’s well-suited for large datasets and noisy gradients.

146
Q

What are callbacks in Keras, and how can they improve the training process?

A

Callbacks are utilities executed at specific stages of training (e.g., after each epoch). Examples include:

EarlyStopping: Stops training when validation performance stops improving, preventing overfitting.

TensorBoard: Visualizes metrics and model graphs.

ModelCheckpoint: Saves the model at specified intervals for later use.
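
A typical callback list (paths are placeholders), passed as model.fit(..., callbacks=callbacks):

    import tensorflow as tf

    callbacks = [
        tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                         restore_best_weights=True),
        tf.keras.callbacks.ModelCheckpoint("checkpoints/best.keras",
                                           save_best_only=True),
        tf.keras.callbacks.TensorBoard(log_dir="logs"),
    ]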

147
Q

How does the Functional API handle models with shared layers or multiple inputs/outputs?

A

The Functional API uses a directed acyclic graph (DAG) of layers. Shared layers are reused by calling the same layer instance on multiple inputs. Multiple inputs and outputs are connected to the graph, specifying each input and output explicitly. This structure allows flexible and reusable architectures.

148
Q

What is overfitting in neural networks, and how does regularization help mitigate it?

A

Overfitting occurs when a model learns patterns specific to the training data, reducing generalization to unseen data. Regularization techniques like L1 (lasso) and L2 (ridge) penalties add constraints to weight magnitudes, reducing complexity and encouraging simpler models. Other methods include dropout, early stopping, and data augmentation.

149
Q

Compare L1 and L2 regularization in terms of their effects on model weights and sparsity.

A

L1 Regularization: Encourages sparsity by pushing weights to zero, making it useful for feature selection.

L2 Regularization: Penalizes large weights, promoting smoothness without necessarily driving weights to zero.

L1’s “diamond-shaped” constraint region tends to produce sparse models, while L2’s “circular” region maintains smaller but non-zero weights.

150
Q

What is the dying ReLU problem, and how can it be mitigated?

A

The dying ReLU problem occurs when neurons output zero for all inputs in the negative domain, leading to zero gradients and no weight updates. Mitigation strategies include:

Using variants like Leaky ReLU or ELU.

Ensuring proper initialization and learning rates.

151
Q

What components are connected when compiling a model in Keras, and what parameters are involved?

A

Compiling connects the model, optimizer, and loss function. Parameters include:

Optimizer: (e.g., Adam, SGD) adjusts weights.

Loss Function: Guides the optimization (e.g., categorical_crossentropy for classification).

Metrics: (e.g., accuracy) monitors performance.

152
Q

What is the significance of the fit method in Keras, and what key arguments does it accept?

A

The fit method trains the model using labeled data. Key arguments:

epochs: Number of complete passes over the dataset.

batch_size: Number of samples per gradient update.

validation_data: For evaluating performance during training.

callbacks: For monitoring and modifying training behavior.
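
A toy sketch covering both compile and fit from the previous two cards:

    import numpy as np
    import tensorflow as tf

    x = np.random.rand(500, 20).astype("float32")      # toy features
    y = np.random.randint(0, 10, size=(500,))          # toy labels

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # compile: connect optimizer, loss, and metrics.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # fit: run the training loop with the key arguments above.
    model.fit(x, y, epochs=5, batch_size=32, validation_split=0.2)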

153
Q

Why is dropout an effective regularization technique, and how does it work?

A

Dropout prevents overfitting by randomly disabling neurons during training, forcing the network to learn redundant representations. At inference, all neurons are active, and activations are rescaled so that expected outputs stay consistent (modern “inverted dropout” implementations, including Keras, apply the scaling during training instead).

154
Q

Explain the differences between weight initialization techniques and their impact on training.

A

Random Initialization: Can lead to poor convergence.

Xavier Initialization: Scales weights based on layer size for balanced gradients.

He Initialization: Optimized for ReLU-based activations, preventing exploding or vanishing gradients.

155
Q

What are wide and deep learning models, and where are they typically applied?

A

Wide and deep models combine linear and neural network components.

Wide Component: Memorizes rules and relationships for feature interactions.

Deep Component: Generalizes from raw features.

Applications include recommendation systems and ranking problems.

156
Q

How do you save and load TensorFlow models, and what are the advantages of the SavedModel format?

A

Models are saved using the model.save() method and restored with tf.keras.models.load_model(). The SavedModel format supports portability, language neutrality, and compatibility with TensorFlow Serving for deployment.
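
A two-line sketch, assuming a built model and TF 2.x defaults (where a directory path yields SavedModel format):

    model.save("export/my_model")                          # SavedModel directory
    restored = tf.keras.models.load_model("export/my_model")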

157
Q

What is early stopping, and how does it prevent overfitting?

A

Early stopping halts training when the validation metric stops improving for a predefined number of epochs. This avoids overfitting by preventing excessive iterations that lead to memorizing training data.

158
Q

What is the purpose of batch normalization, and how does it benefit training?

A

Batch normalization normalizes activations to have zero mean and unit variance, stabilizing training. Benefits:

Faster convergence.

Reduced sensitivity to initialization.

Mitigation of internal covariate shift.

159
Q

How does the Sequential API handle dropout, and how does it differ from the Functional API implementation?

A

In the Sequential API, dropout is added layer-wise (e.g., model.add(Dropout(rate))). In the Functional API, dropout layers are explicitly connected to specific inputs and outputs, offering greater flexibility.

160
Q

Describe the process of serving a trained model using the Cloud AI Platform.

A

Steps include:

Save the model in SavedModel format.
Upload to the AI Platform.
Create a model and version using gcloud ai-platform commands.
Use the gcloud ai-platform predict command to make predictions with the deployed model.
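
A sketch of the legacy command surface; model names, bucket paths, and version values below are placeholders:

    gcloud ai-platform models create my_model --regions=us-central1
    gcloud ai-platform versions create v1 \
      --model=my_model --origin=gs://my-bucket/export/ \
      --runtime-version=2.11 --framework=tensorflow
    gcloud ai-platform predict --model=my_model --version=v1 \
      --json-instances=instances.json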

161
Q

What is the purpose of the softplus activation function, and how does it differ from ReLU?

A

The softplus activation function is a smooth approximation of ReLU. Unlike ReLU, which has a sharp zero cutoff for negative inputs, softplus provides small, non-zero gradients for negative inputs, avoiding the “dying ReLU” problem. Its output is defined as softplus(x) = ln(1 + e^x), which is smooth and differentiable everywhere.

162
Q

Why are weights initialized close to zero but not exactly zero in neural networks?

A

Initializing weights close to zero helps prevent the vanishing or exploding gradient problem during backpropagation. However, initializing all weights exactly to zero can lead to symmetry, causing all neurons in the same layer to learn the same features and rendering the network ineffective.

163
Q

What are “dying neurons,” and which activation functions are most prone to this issue?

A

Dying neurons occur when an activation function produces zero output for all inputs, leading to no updates during backpropagation. ReLU is most prone to this issue due to its zero output for negative inputs. Variants like Leaky ReLU and ELU address this by allowing small negative outputs.

164
Q

How does the Gaussian Error Linear Unit (GELU) activation function work, and where is it used?

A

GELU combines ReLU’s nonlinearity with smooth stochastic properties. It is defined as x · Φ(x), where Φ(x) is the Gaussian cumulative distribution function (implementations often use a fast tanh-based approximation). This allows smoother transitions and better performance in NLP tasks, particularly in transformer models like BERT.

165
Q

What are some use cases where the Keras Functional API is preferred over the Sequential API?

A

The Functional API is preferred for:

Models with multiple inputs or outputs.

Architectures requiring shared layers (e.g., Siamese networks).

Nonlinear topologies like residual or multi-branch networks.

Complex structures such as wide and deep learning models.

166
Q

What are the key differences between Dropout and Batch Normalization in terms of purpose and implementation?

A

Dropout: Prevents overfitting by randomly disabling neurons during training. It is used as a regularization technique.

Batch Normalization: Normalizes activations within a mini-batch to stabilize and accelerate training. It primarily addresses internal covariate shift and is not inherently a regularization method.

167
Q

What is the role of hyperparameter lambda in regularization, and how is it tuned?

A

Lambda controls the trade-off between minimizing the loss function and penalizing model complexity. A larger lambda emphasizes simplicity, reducing overfitting, while a smaller lambda focuses on fitting the training data. Lambda is tuned through methods like grid search, random search, or Bayesian optimization.

168
Q

Explain the concept of weight decay in the context of L2 regularization.

A

Weight decay refers to the process of adding an L2 penalty to the loss function, which discourages large weights by minimizing their squared magnitude. This helps in simplifying the model and improving generalization. In neural networks, it is implemented by adding λ·Σw² (the sum of squared weights, scaled by λ) to the loss function.
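
In Keras this is typically a per-layer regularizer; the 0.01 below is a hypothetical lambda that would normally be tuned:

    import tensorflow as tf

    layer = tf.keras.layers.Dense(
        64, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(0.01))  # adds λ·Σw² to the loss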

169
Q

How do you address imbalanced datasets during neural network training?

A

Resampling: Use oversampling (e.g., SMOTE) or undersampling to balance classes.

Class weights: Adjust the loss function to penalize misclassifications of minority classes more heavily.

Data augmentation: Increase minority class samples by generating synthetic data.

Focal loss: Focuses training on hard-to-classify examples.
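
For instance, class weights plug directly into fit. The sketch below assumes a compiled binary classifier named model, training arrays x and y, and a hypothetical 1:10 class imbalance:

    # Penalize minority-class (label 1) errors ten times more heavily.
    class_weight = {0: 1.0, 1: 10.0}
    model.fit(x, y, epochs=5, class_weight=class_weight)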

170
Q

What is the difference between data augmentation and data preprocessing?

A

Data Preprocessing: Involves cleaning and transforming data to a standardized format before training (e.g., normalization, tokenization).

Data Augmentation: Expands the training dataset by applying transformations like rotations, flips, or noise to create new, diverse examples, improving generalization.

171
Q

What are the advantages of using SavedModel format for deploying TensorFlow models?

A

SavedModel is language-neutral, portable, and compatible with TensorFlow Serving. It enables:

Seamless deployment across platforms.
Preservation of both the model architecture and weights.
Integration with cloud services like GCP AI Platform.

172
Q

How does the concept of residual connections help in training very deep neural networks?

A

Residual connections allow the output of a layer to bypass one or more subsequent layers, addressing vanishing gradients and training instability in deep networks. By learning residual mappings, these connections simplify the optimization process and enable very deep architectures like ResNet.
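
A minimal residual block in the Functional API:

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(64,))
    x = tf.keras.layers.Dense(64, activation="relu")(inputs)
    x = tf.keras.layers.Dense(64)(x)
    x = tf.keras.layers.Add()([x, inputs])        # the skip connection
    outputs = tf.keras.layers.Activation("relu")(x)
    block = tf.keras.Model(inputs, outputs)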

173
Q

What are custom training loops in Keras, and when should they be used?

A

Custom training loops allow complete control over the training process by manually defining forward and backward passes. They are used for:

Implementing non-standard optimization techniques.

Debugging complex models.

Handling dynamic behaviors not supported by the default fit method.
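
A minimal training-step sketch with tf.GradientTape, assuming a model that outputs raw logits:

    import tensorflow as tf

    optimizer = tf.keras.optimizers.Adam()
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    @tf.function
    def train_step(model, x, y):
        with tf.GradientTape() as tape:
            logits = model(x, training=True)      # forward pass
            loss = loss_fn(y, logits)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss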

174
Q

What is the difference between a training step and an epoch in neural network training?

A

Training step: A single update to model weights after processing one mini-batch of data.

Epoch: A complete pass over the entire training dataset. An epoch comprises multiple training steps.

175
Q

How does feature scaling impact neural network performance?

A

Feature scaling ensures that all features have comparable ranges, preventing dominance of features with larger scales. This accelerates convergence and avoids instability in gradient-based optimizers. Techniques include normalization (mean = 0, std = 1) and min-max scaling (range [0, 1]).

176
Q

What is TensorFlow Serving, and how does it facilitate model deployment?

A

TensorFlow Serving is a high-performance, flexible system for serving ML models in production. It supports model versioning, monitoring, and scalability, allowing seamless deployment of SavedModel artifacts for real-time inference.

177
Q

What is the importance of callbacks like ReduceLROnPlateau during training?

A

ReduceLROnPlateau reduces the learning rate when a metric (e.g., validation loss) stops improving. This prevents overtraining and ensures that the model fine-tunes its weights when close to convergence.

178
Q

Explain the difference between validation loss and training loss. Why is it important to monitor both?

A

Training Loss: Measures error on the training dataset.

Validation Loss: Measures error on unseen validation data.

Monitoring both ensures the model generalizes well; divergence indicates overfitting.

179
Q

What is the primary difference between embedding layers and one-hot encoding for categorical data?

A

One-hot encoding: Creates sparse, high-dimensional vectors; memory- and compute-inefficient for large vocabularies.

Embedding layers: Create dense, low-dimensional representations that capture semantic relationships between categories.

180
Q

What are the advantages of using the TensorFlow Playground for understanding neural network behavior?

A

TensorFlow Playground provides an interactive visualization tool to:

Understand how models learn decision boundaries.

Experiment with architectures, activation functions, and regularization.

Observe overfitting and generalization in real-time with visual feedback.

181
Q

What are the prerequisite steps for training a machine learning model at scale using Vertex AI?

A

Before training at scale with Vertex AI:

Gather and prepare training data: Ensure data is clean and structured.

Upload data to an accessible online source: Use Google Cloud Storage for efficient access.

Structure training code properly: Split logic into modular files (e.g., task.py for orchestration and model.py for core ML logic).

Package training code: Use Python packaging standards (setup.py) to ensure compatibility.

182
Q

Describe the role of the task.py and model.py files in training with Vertex AI.

A

task.py: Acts as the entry point for Vertex AI, handling job-level details like parsing command-line arguments, interfacing with hyperparameter tuning, and managing output paths.

model.py: Contains the core machine learning logic, including model definition, training, and evaluation. It is invoked by task.py.
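
A minimal sketch of the pattern; the argument names and the train_and_evaluate helper are illustrative, not a fixed Vertex AI contract:

    # trainer/task.py
    import argparse
    from trainer import model   # core ML logic lives in model.py

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--data_path", required=True)
        parser.add_argument("--output_dir", required=True)
        parser.add_argument("--batch_size", type=int, default=32)
        args = parser.parse_args()
        model.train_and_evaluate(args)   # hypothetical entry in model.py

    if __name__ == "__main__":
        main()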

183
Q

What are the two main configurations for running jobs on Vertex AI, and how do they differ?

A

Prebuilt Container: Uses predefined Docker images with TensorFlow and other dependencies. Simplifies the setup process and is recommended for standard use cases.

Custom Container: Allows full control over the runtime environment by specifying a custom Docker image. Suitable for complex or non-standard dependencies.

184
Q

List the key fields required in a Vertex AI job specification for training.

A

Key fields include:

Region: Location to run the job.

Display name: A human-readable identifier for the job.

Python package URIs: GCS URIs of training code packages.

Worker pool spec: Machine type, replica count, and Docker image URI.

Python module: Specifies the entry point module (e.g., trainer.task).

Arguments: Training parameters like data paths, batch size, or output directory.
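
These fields map onto flags of gcloud ai custom-jobs create; bucket names, the image tag, and the training arguments below are placeholders:

    gcloud ai custom-jobs create \
      --region=us-central1 \
      --display-name=my-training-job \
      --python-package-uris=gs://my-bucket/trainer-0.1.tar.gz \
      --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,executor-image-uri=us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-11:latest,python-module=trainer.task \
      --args=--data_path=gs://my-bucket/data,--output_dir=gs://my-bucket/output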

185
Q

How can you monitor and debug Vertex AI training jobs effectively?

A

Google Cloud Console: Provides a UI to monitor job status, logs, and resource utilization.

TensorBoard: Use for visualizing ML-specific metrics like loss, accuracy, and performance trends. Ensure summary data is saved to GCS and point TensorBoard to the relevant directory.

186
Q

What are the benefits of using a single-region bucket in Google Cloud Storage for ML training?

A

Single-region buckets offer lower latency and higher throughput for training jobs compared to multi-region buckets. They are optimized for high-performance access, critical for large-scale ML tasks.

187
Q

Explain the role of the config.yaml file in Vertex AI training.

A

The config.yaml file specifies custom job configurations, such as machine types and resource allocations. It provides flexibility for advanced setups and is overridden by command-line arguments if both specify the same field.

188
Q

How does Vertex AI facilitate distributed training, and what adjustments are required in the code?

A

Vertex AI supports distributed training by allowing multiple worker pools. Adjustments include:

Implementing distributed strategies (e.g., tf.distribute.MultiWorkerMirroredStrategy).

Synchronizing data loading and model updates across workers.

Specifying multiple worker pool specs in the job configuration.
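
A sketch of the code-side change; Vertex AI populates the TF_CONFIG environment variable for each replica, so the strategy discovers its peers automatically:

    import tensorflow as tf

    strategy = tf.distribute.MultiWorkerMirroredStrategy()
    with strategy.scope():                 # build the model under the strategy
        model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
        model.compile(optimizer="adam", loss="mse")
    # model.fit(...) then trains synchronously across all workers.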

189
Q

What is the purpose of the replica_count field in the worker pool specification?

A

The replica_count field specifies the number of replicas (machines) for a worker pool. It enables horizontal scaling by distributing workloads across multiple machines, crucial for large datasets or models.

190
Q

Why is it necessary to package training code as a Python package for Vertex AI, and how is this done?

A

Packaging ensures code is portable and can be distributed across machines. Steps include:

Write a setup.py file to define the package.

Use the python setup.py sdist command to create a source distribution.

Upload the package to GCS for Vertex AI to access.
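
A minimal setup.py sketch; the package name, version, and dependency list are illustrative:

    from setuptools import find_packages, setup

    setup(
        name="trainer",
        version="0.1",
        packages=find_packages(),
        install_requires=["tensorflow"],   # plus any other dependencies
    )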

191
Q

What is the difference between single-node and distributed training on Vertex AI?

A

Single-node training: Runs on one machine and is suitable for small-scale tasks.

Distributed training: Spreads computation across multiple machines (or GPUs) for scalability, faster training, and handling large datasets/models.

192
Q

How does TensorBoard enhance the debugging and analysis of ML training jobs on Vertex AI?

A

TensorBoard provides visualizations for metrics like loss, accuracy, and learning rate trends. It aids in understanding model performance, identifying bottlenecks, and fine-tuning hyperparameters. It integrates seamlessly with GCS for accessing logs and summary data.

193
Q

What types of predictions can be served using Vertex AI after training?

A

Online Predictions: Real-time, low-latency inference via REST APIs, ideal for applications requiring instant responses.

Batch Predictions: Process large datasets asynchronously, suitable for scenarios like bulk data analysis.

194
Q

What is the purpose of the Python module name in the Vertex AI job configuration?

A

The Python module name specifies the entry point for the training code (e.g., trainer.task). Vertex AI runs this module after installing the provided Python package, ensuring the correct workflow is executed.

195
Q

Explain the importance of specifying a machine type in the worker pool spec.

A

The machine type determines the compute resources (CPU, GPU, memory) for the training job. Selecting an appropriate type balances cost and performance based on the complexity of the model and size of the dataset.

196
Q

What is the role of the executor image URI in Vertex AI training jobs?

A

The executor image URI specifies the Docker container image that runs the training code. It ensures the environment includes the necessary dependencies (e.g., TensorFlow, Python libraries) for seamless execution.

197
Q

Why is logging insufficient for investigating ML performance, and what tools are better suited?

A

Logging captures system-level details like exceptions and resource usage but lacks insights into ML-specific metrics (e.g., loss, accuracy). Tools like TensorBoard are better suited for monitoring and analyzing model training and performance.

198
Q

How do REST APIs enable scalable predictions with Vertex AI?

A

REST APIs standardize prediction interfaces, allowing applications in any language to interact with the trained model. This scalability ensures efficient handling of large volumes of prediction requests in real-time or batch mode.

199
Q

What are the advantages of using prebuilt containers for Vertex AI training jobs?

A

Prebuilt containers simplify setup by providing a ready-to-use environment with TensorFlow and common dependencies. They eliminate the need for custom Docker images, reducing configuration complexity for standard ML tasks.

200
Q

Describe a typical workflow for training a TensorFlow model at scale with Vertex AI.

A

Prepare and upload training data to GCS.

Modularize training code (task.py and model.py) and package it with setup.py.

Submit a training job via gcloud ai custom-jobs create, specifying machine type, region, and other configurations.

Monitor progress using the GCP Console and TensorBoard.

Deploy the trained model for predictions via Vertex AI’s REST APIs.

201
Q

What are the advantages of using the Google Cloud Console for monitoring Vertex AI training jobs?

A

The Google Cloud Console provides:

Real-time monitoring of job status and resource usage (CPU, GPU, memory).

Access to logs for debugging issues.

A user-friendly interface to visualize job parameters and configurations.

Integration with other GCP tools for workflow management.

202
Q

What is the significance of using YAML configuration files in Vertex AI jobs?

A

YAML configuration files provide a structured way to define complex job specifications, including:

Worker pool specs.

Machine types and replica counts.

Environment variables.

These files ensure reproducibility and ease of modifying configurations for future jobs.

203
Q

How does Vertex AI handle hyperparameter tuning, and what role does task.py play in this process?

A

Vertex AI’s hyperparameter tuning service iterates over different parameter combinations to optimize model performance. The task.py script interfaces with the hyperparameter service, parses the assigned parameters, and adjusts the training process accordingly.

204
Q

What is the purpose of the Cloud Storage URI in the python-package-uris field?

A

The Cloud Storage URI specifies the location of the packaged training code and dependencies. Vertex AI retrieves this package to execute the training job, ensuring the code is accessible across all worker nodes.

205
Q

What are worker pool specs, and how are they configured for distributed training?

A

Worker pool specs define the compute resources for each role in a distributed training setup. Configurations include:

Machine type (e.g., n1-standard-8) and optional accelerators (e.g., A100 GPUs).

Replica count.

Executor image URI.

Each pool can be tailored for roles like parameter servers, chief workers, or evaluation tasks.

206
Q

How can you ensure data locality when training models with Vertex AI?

A

To ensure data locality:

Use single-region Cloud Storage buckets near the training region.

Match the region of the training job with the region of the data.

Utilize regional endpoints for low-latency access.

207
Q

What is the role of tf.distribute.Strategy in distributed training, and which strategies are supported?

A

tf.distribute.Strategy simplifies distributed training by abstracting the complexities of synchronization and parallelism. Supported strategies include:

MultiWorkerMirroredStrategy: For synchronous training across multiple workers.

TPUStrategy: For Tensor Processing Unit (TPU) training.

ParameterServerStrategy: For asynchronous training with parameter servers.

208
Q

What is the difference between synchronous and asynchronous distributed training?

A

Synchronous Training: All workers process a mini-batch and synchronize updates to the model after each step. Ensures consistency but can be slower if workers have imbalanced workloads.

Asynchronous Training: Workers update the model independently. This improves speed but risks stale updates and inconsistency.

209
Q

How can you debug slow training performance on Vertex AI?

A

Debugging slow training involves:

Monitoring resource utilization in the Cloud Console.
Ensuring proper prefetching and sharding of data.
Verifying balanced workload distribution in distributed setups.
Using TensorBoard to identify bottlenecks in data loading or gradient computation.

210
Q

What are the benefits of using TensorBoard with Vertex AI, and how do you set it up?

A

Benefits include:

Visualizing metrics (loss, accuracy) over epochs.

Tracking resource utilization and profiling.

Comparing results across multiple training jobs.

Setup involves saving summary data to a GCS directory during training and pointing TensorBoard to this location.
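
For example (hypothetical bucket, and assuming model, x, and y as in earlier cards), write summaries to GCS during training and point TensorBoard at the same path:

    import tensorflow as tf

    tb = tf.keras.callbacks.TensorBoard(log_dir="gs://my-bucket/logs")
    model.fit(x, y, epochs=5, callbacks=[tb])
    # Then: tensorboard --logdir gs://my-bucket/logs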

211
Q

Explain the use of preemptible VMs in Vertex AI training jobs.

A

Preemptible VMs are cost-effective instances that can be terminated by Google Cloud when resources are needed elsewhere. They are suitable for non-critical or checkpointed workloads, reducing costs while leveraging large-scale compute.

212
Q

What are the typical challenges of distributed training, and how does Vertex AI mitigate them?

A

Challenges include:

Synchronization overhead.

Data sharding and transfer inefficiencies.

Model consistency issues in asynchronous setups.

Vertex AI mitigates these by providing prebuilt strategies (tf.distribute.Strategy), optimized hardware configurations, and seamless integration with GCS for data sharing.

213
Q

What is the purpose of the output-dir argument in Vertex AI training jobs?

A

The output-dir specifies where to store training artifacts like logs, checkpoints, and models. Typically, this is a GCS path that ensures outputs are accessible for subsequent evaluation or deployment.

214
Q

How do you handle dependency management for Python packages in Vertex AI?

A

Dependencies are managed by:

Including them in the setup.py file of the training package.

Using a requirements.txt file for pip installations.

Building custom Docker images with pre-installed libraries for advanced needs.

215
Q

What is the difference between Vertex AI online and batch predictions, and when should each be used?

A

Online Predictions: For real-time, low-latency inference (e.g., user-facing applications).

Batch Predictions: For processing large datasets asynchronously (e.g., monthly reports, bulk image analysis).

216
Q

How does the gcloud ai custom-jobs create command help in submitting training jobs?

A

This command allows users to define and submit training jobs by specifying:

Job configurations (region, machine type, package URIs).

Python module to execute.

Additional arguments like batch_size or learning_rate.

217
Q

What strategies can be used to optimize the cost of Vertex AI training jobs?

A

Use preemptible VMs.

Select optimal machine types based on workload.

Utilize single-region storage buckets.

Monitor and adjust replica counts to balance speed and cost.

218
Q

Why is the replica count often set to 1 for single-node training?

A

In single-node training, only one machine processes the entire workload, so additional replicas are unnecessary. This minimizes cost and avoids resource contention.

219
Q

What considerations should be made when deploying a model trained on Vertex AI?

A

Exporting the model in a compatible format (e.g., SavedModel).

Choosing an appropriate endpoint for online or batch predictions.

Ensuring the deployment region aligns with the training data region.

Using monitoring tools to track prediction performance.

220
Q

How does Vertex AI ensure scalability and reliability during training and serving?

A

Vertex AI achieves scalability and reliability through:

Distributed training capabilities.

Flexible resource provisioning (e.g., GPU, TPU support).

Managed REST APIs for serving.

Built-in monitoring and logging for continuous evaluation.