Guest Lectures on AI Development Flashcards
What percentage of AI projects fail to reach production in 2024?
Roughly 80% of AI projects fail to reach production, about twice the failure rate of comparable non-AI projects.
Why do AI systems fail to get into production?
AI applications are whole systems, not just pieces of software.
They rely on statistical techniques with inherent uncertainty.
They require a broader concept of quality.
What are the two main components of an AI application?
AI Portion: Includes the knowledge base and inference engine.
Non-AI Portion: Supports and integrates the AI components.
What are the main roles in AI system development?
Data Scientists: Develop the knowledge base and inference engine.
Developers: Create the non-AI portion and integrate components.
Why are AI development teams interdisciplinary?
They require expertise from multiple domains, including software development, data science, and domain-specific knowledge.
What are common challenges in interdisciplinary teams?
Communication barriers: Different terminologies and styles.
Cultural clashes: Conflicting norms and values.
Power struggles: Dominant disciplines asserting control.
Resistance to change: Hesitancy to adopt new methods.
How can interdisciplinary challenges be mitigated?
Education and training in unfamiliar disciplines.
Learning vocabulary and concepts from other fields.
Time investment to build and mature teams (Tuckman’s Model: Forming, Storming, Norming, Performing).
What are the two primary types of AI models?
Narrow Machine Learning (ML) Models – Task-specific models.
Foundation Models (FMs) – General-purpose models trained on extensive, unlabeled data.
What are narrow ML models used for?
Classification: Assigning categories (e.g., spam detection).
Regression: Predicting continuous values (e.g., time estimation).
Clustering: Grouping similar data (e.g., customer segmentation).
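A minimal sketch of these three task types using scikit-learn on synthetic data; the datasets and model choices are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Classification: predict a discrete label (e.g., spam vs. not spam).
X_cls = rng.normal(size=(100, 3))
y_cls = (X_cls[:, 0] + X_cls[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X_cls, y_cls)
print("class prediction:", clf.predict(X_cls[:1]))

# Regression: predict a continuous value (e.g., a time estimate).
X_reg = rng.normal(size=(100, 3))
y_reg = 2.0 * X_reg[:, 0] + rng.normal(scale=0.1, size=100)
reg = LinearRegression().fit(X_reg, y_reg)
print("regression prediction:", reg.predict(X_reg[:1]))

# Clustering: group similar rows (e.g., customer segments) without labels.
X_clu = rng.normal(size=(100, 2))
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X_clu)
print("cluster assignments:", labels[:5])
```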
What are common challenges with narrow ML models?
Ethical concerns & bias.
Interpretability & explainability.
Generalization & overfitting.
Robustness against adversarial attacks.
How can challenges with narrow ML models be mitigated?
Bias mitigation: Diverse datasets, ethical review boards.
Explainability: XAI techniques (LIME, SHAP, visualizations).
Overfitting reduction: Regularization, cross-validation, data augmentation.
Adversarial defense: Adversarial training, input validation, feature noise injection.
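A small sketch of three of these mitigations with scikit-learn: regularization, cross-validation, and feature noise injection. The data and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)

# Regularization: the alpha penalty shrinks weights and curbs overfitting.
model = Ridge(alpha=1.0)

# Cross-validation: estimate generalization instead of trusting training error.
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("mean CV R^2:", scores.mean())

# Feature noise injection (a simple robustness heuristic):
# train on inputs perturbed with small random noise.
X_noisy = X + rng.normal(scale=0.05, size=X.shape)
model.fit(X_noisy, y)
```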
What are Foundation Models (FMs)?
Trained on massive, diverse, unlabeled datasets.
General-purpose but customizable for specific tasks.
Large Language Models (LLMs) are a subset of FMs.
What are common use cases for FMs?
Natural Language Processing (e.g., text summarization, translation).
Image generation & classification.
Code generation.
What are the key components of FM architecture?
Vector Spaces: Text is tokenized, and each token is represented as a high-dimensional vector (embedding).
Attention Mechanism: Determines the importance of different tokens for extracting meaning.
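A minimal sketch of scaled dot-product attention over toy token vectors, illustrating how attention weights the importance of different tokens. Shapes and values are illustrative; real foundation models use learned projections and many stacked layers.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # token-to-token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over tokens
    return weights @ V, weights                       # weighted mix of value vectors

# Four "tokens", each embedded as an 8-dimensional vector in the model's vector space.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 8))
output, attn = scaled_dot_product_attention(embeddings, embeddings, embeddings)
print("attention weights for token 0:", attn[0])
```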
How can Foundation Models be customized?
Prompt Engineering: Crafting and refining the input prompt to steer the model's behavior.
Retrieval-Augmented Generation (RAG): Retrieving relevant external data and adding it to the prompt as context (see the sketch below).
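A minimal RAG sketch: embed a small document store, retrieve the most similar document for a query, and build an augmented prompt. The embed() function is a hash-based stand-in for a real embedding model, and the final LLM call is omitted; both are illustrative assumptions.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Stand-in for a real embedding model: deterministic pseudo-random unit vector.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

documents = [
    "Our refund policy allows returns within 30 days.",
    "Support is available Monday to Friday, 9am to 5pm.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    sims = doc_vectors @ embed(query)                 # cosine similarity (unit vectors)
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Can I return a product after two weeks?"))
# The resulting prompt would then be sent to the foundation model.
```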
What are major risks of Foundation Models?
Data privacy & security.
Misuse & misinformation.
Deepfakes & fake content.
How can risks of Foundation Models be mitigated?
Implementing guardrails that monitor inputs and outputs and reject problematic content.
Refusing requests that involve sensitive data or ask for misinformation (see the sketch below).
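A toy guardrail sketch: screen the prompt and the model output against simple rules before passing them on. Production guardrails use classifiers and policy engines; the patterns and messages here are illustrative assumptions.

```python
import re

BLOCKED_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",       # looks like a social security number
    r"(?i)how to make a weapon",    # disallowed request
]

def violates_policy(text: str) -> bool:
    return any(re.search(p, text) for p in BLOCKED_PATTERNS)

def guarded_call(prompt: str, model_fn) -> str:
    if violates_policy(prompt):
        return "Request rejected: it appears to contain sensitive or disallowed content."
    output = model_fn(prompt)
    if violates_policy(output):
        return "Response withheld: the generated text violated the output policy."
    return output

# Usage with a stand-in model function:
print(guarded_call("My SSN is 123-45-6789, summarize my file.", lambda p: "..."))
```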
Why is achieving quality in AI systems harder than in traditional software?
AI introduces data quality issues that affect performance.
AI models require additional preparation steps.
Quality is impacted by both software engineering and model training.
How does AI system quality differ from traditional software quality?
Traditional Software Quality: Determined by software architecture, code quality, and development processes.
AI System Quality: Adds model quality and data quality as crucial factors.
What are key AI quality attributes?
Performance: Accuracy, latency, throughput.
Security: Defense against data poisoning and adversarial attacks.
Reliability: Stability despite data/environmental shifts.
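An illustrative sketch of measuring two performance attributes, accuracy and latency, for an arbitrary predict() function; the model and data are stand-ins.

```python
import time
import numpy as np

def predict(x: np.ndarray) -> int:
    return int(x.sum() > 0)                     # stand-in model

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 10))
y_true = (X.sum(axis=1) > 0).astype(int)

start = time.perf_counter()
y_pred = np.array([predict(x) for x in X])
elapsed = time.perf_counter() - start

accuracy = (y_pred == y_true).mean()
print(f"accuracy={accuracy:.3f}, "
      f"latency per request={elapsed / len(X) * 1e3:.3f} ms, "
      f"throughput={len(X) / elapsed:.0f} req/s")
```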
How can data challenges be mitigated?
Data drift & environmental drift: Continuous monitoring and retraining.
Regulatory changes: Organizational unit to track legal developments.
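A small data-drift check: compare a feature's live distribution against its training distribution with a two-sample Kolmogorov-Smirnov test from scipy. The significance threshold and the retraining trigger are illustrative assumptions.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
live_feature = rng.normal(loc=0.4, scale=1.0, size=1_000)   # shifted in production

stat, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.01:
    print(f"Drift detected (p={p_value:.4f}): schedule retraining / investigation.")
else:
    print("No significant drift detected.")
```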
What additional development practices impact AI quality?
Data preparation: Cleaning, resolving missing values, handling outliers.
Model training: Selecting features, hyperparameter tuning.
Testing: Checking for bias and data distribution shifts.
Tool Support: Data lineage tracking, model packaging, and deployment tools.
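A compact sketch of two of these practices: data preparation (missing values, outliers) with pandas, and hyperparameter tuning with scikit-learn's GridSearchCV. Column names and the parameter grid are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 90, size=200).astype(float),
    "income": rng.normal(50_000, 15_000, size=200),
    "label": rng.integers(0, 2, size=200),
})
df.loc[::20, "income"] = np.nan                     # simulate missing values

# Data preparation: fill missing values and clip extreme outliers.
df["income"] = df["income"].fillna(df["income"].median())
df["income"] = df["income"].clip(lower=df["income"].quantile(0.01),
                                 upper=df["income"].quantile(0.99))

# Model training: tune hyperparameters with cross-validated grid search.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, 5]},
    cv=3,
)
grid.fit(df[["age", "income"]], df["label"])
print("best params:", grid.best_params_)
```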
What is the role of software architecture in AI systems?
It isolates model changes behind API layers, shielding the rest of the system from them.
It keeps the overall system robust as AI models are retrained or replaced.
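A minimal sketch of isolating the model behind an API layer: callers depend on a stable interface, so the underlying model can be swapped or retrained without touching the rest of the system. The class and method names are illustrative assumptions.

```python
from typing import Protocol

class SentimentModel(Protocol):
    def predict(self, text: str) -> str: ...

class RuleBasedModel:
    def predict(self, text: str) -> str:
        return "positive" if "good" in text.lower() else "negative"

class SentimentService:
    """The stable API the rest of the system talks to."""
    def __init__(self, model: SentimentModel):
        self._model = model          # swap implementations here; callers stay unchanged

    def classify(self, text: str) -> dict:
        return {"label": self._model.predict(text), "model_version": "v1"}

service = SentimentService(RuleBasedModel())
print(service.classify("This product is good."))
```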
What are the three main contributors to AI deployment failures?
Achieving AI system quality is difficult.
AI development requires interdisciplinary collaboration.
AI models are based on statistical methods, introducing inherent uncertainty.
How can AI deployment success rates improve?
Recognizing and mitigating common challenges in AI development.
Improving data quality, model robustness, and team collaboration.
Leveraging best practices in software and AI engineering.