AWS AI Practitioner Confusions Flashcards
ROUGE
Vs
BLEU
Vs
BERTScore
Vs
Perplexity
Compares n-gram overlap with the reference (recall-oriented)
Vs
Evaluates quality via n-gram precision, penalizing overly short output
Vs
Semantic similarity (compares embeddings)
Vs
How confidently the model predicts the next token (lower is better)
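Sketch of the perplexity card above, assuming the usual definition (exponential of the average negative log-probability per token). The probability lists are made-up illustration values, not real model output.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(average negative log-probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical probabilities a model assigned to each correct next token.
confident = [0.9, 0.8, 0.95]
unsure = [0.2, 0.1, 0.25]

# The confident model gets the lower (better) perplexity.
print(perplexity(confident) < perplexity(unsure))
```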
ROUGE-N
Vs
ROUGE-L
ROUGE-N - Primarily assesses the fluency of the text and the extent to which it includes key ideas from the reference. Compares n-gram matches between the reference and the generated output.
ROUGE-L - Good at evaluating the coherence and order of the narrative in the output. Compares the longest common subsequence of words between the reference and the generated output.
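Sketch of both ROUGE variants, recall only, with whitespace tokenization. It reads the "longest sequence" in ROUGE-L as the longest common subsequence (in order, not necessarily contiguous). Function names and sentences are illustrative; real evaluations normally use an established ROUGE package.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(reference, candidate, n=1):
    """Share of reference n-grams that also appear in the candidate."""
    ref = Counter(ngrams(reference.split(), n))
    cand = Counter(ngrams(candidate.split(), n))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(reference, candidate):
    """LCS length divided by the reference length."""
    ref, cand = reference.split(), candidate.split()
    return lcs_length(ref, cand) / len(ref) if ref else 0.0


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(reference, candidate, n=1))  # 5 of 6 unigrams match
print(rouge_n_recall(reference, candidate, n=2))  # 3 of 5 bigrams match
print(rouge_l_recall(reference, candidate))       # LCS has 5 of 6 words
```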
Fine-tuned models vs Self-trained models
Fine-tuning an existing model using your data vs training a model from scratch using your data
Retrieval-augmented generation (RAG)
Vs
Instruction fine-tuning
Supplies domain-relevant data as context to produce responses based on that data.
Vs
Trains the model on labeled examples in the form of prompt-response pairs
Regression
Vs
Classification
Predicting continuous or numerical values based on one or more input variables
Vs
Predicting a discrete category; the supervised learning technique used for tasks such as diagnostics
RealToxicityPrompts
Vs
BOLD
Vs
T-REx
Vs
WikiText-2
RealToxicityPrompts is a dataset for measuring the degree to which racist, sexist, or otherwise toxic language is present in pretrained neural language models (LMs).
(Text Generation-Toxicity)
Vs
Bias in Open-ended Language Generation Dataset (BOLD) is a dataset to evaluate fairness in open-ended language generation in the English language.
(Text Generation-Toxicity)
Vs
Used for Relation Extraction and Natural Language Generation.
(Text Generation-Accuracy and Robustness)
Vs
Part of the WikiText collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia; WikiText-2 is the small variant (about 2 million tokens)
(Text Generation-Robustness)
Gigaword
Vs
Women’s Ecommerce Clothing Reviews
Gigaword provides headline-generation on a corpus of article pairs consisting of around 4 million articles.
(Text Summarization)
Vs
Dataset revolves around the reviews written by customers
(Text Classification)
(Question and answer)
BoolQ
Vs
Natural Questions
Vs
Trivia QA
BoolQ is a question answering dataset for yes/no questions containing 15942 examples.
Vs
NaturalQuestions (NQ) contains real user questions issued to Google search, and answers found from Wikipedia by annotators.
Vs
TriviaQA is a reading comprehension dataset containing over 650K question-answer-evidence triples.
Model tuning method comparison
Prompt Engineering
RAG
Instruction-based fine-tuning
Domain Adaptation fine-tuning
Transfer Learning
Prompt Engineering - No model training needed
RAG - Uses external knowledge with no FM changes or retraining. Incurs cost for vector databases
Instruction-based fine-tuning - FM is fine-tuned with instructions, which can change the tone of the model. Uses labeled data in the form of prompt-response pairs
Domain Adaptation fine-tuning - Domain-specific model training. Uses unlabeled data
Transfer Learning - Widely used for image classification
Temperature
Vs
Top K
Vs
Top P
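Creativity (randomness) of model output; higher values give more varied responses
Vs
Samples only from the K most probable tokens (an integer count)
Vs
Samples only from the most likely tokens whose cumulative probability reaches P (a probability value)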
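Sketch of how the three settings reshape a next-token distribution. The logits, token names, and function names are hypothetical; real inference APIs expose these only as request parameters and may apply them in a different order.

```python
import math
import random


def next_token_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature: divide logits before softmax. Low values sharpen the
    # distribution, high values flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    max_logit = max(scaled.values())
    exps = {tok: math.exp(v - max_logit) for tok, v in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda pair: pair[1], reverse=True)
    # Top K: keep only the K most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]
    # Top P: keep the smallest prefix whose cumulative probability reaches P.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept
    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(prob for _, prob in ranked)
    return {tok: prob / norm for tok, prob in ranked}


def sample(distribution, rng=random):
    """Draw one token according to the filtered distribution."""
    tokens = list(distribution)
    weights = [distribution[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


logits = {"cat": 2.0, "dog": 1.0, "fish": 0.0, "car": -1.0}
print(next_token_distribution(logits, temperature=0.5))
print(next_token_distribution(logits, top_k=2))
print(next_token_distribution(logits, top_p=0.5))
print(sample(next_token_distribution(logits, top_k=2), random.Random(0)))
```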
Amazon Q Business
Vs
Amazon Q Apps
Vs
Amazon Q Developer
Built on Amazon Bedrock, with no control over which FM is used
Vs
Create GenAI apps using natural language, with no coding
Vs
Generate code and commands related to AWS. Scan code for vulnerabilities. Debugging and optimization improvements
Amazon Q Business Lite
Vs
Amazon Q Business Pro
Access to the Q&A feature in Amazon Q Business
Vs
Full Amazon Q Business capabilities to help you solve problems, generate content, and find insights in data, plus Amazon Q in QuickSight, a generative BI assistant that helps consume insights.
GPT
vs
BERT
Vs
RNN
Vs
ResNet
Vs
SVM
Vs
WaveNet
Vs
GAN
Vs
XGBoost
Generate human-like text or code
Vs
Bidirectional language understanding (e.g., translation)
Vs
Speech recognition
Vs
Image recognition
Vs
Classification & Regression
Vs
Speech Synthesis
Vs
Data augmentation
Vs
Gradient boosting
KNN
Vs
K-Means
Classification (or regression) technique used in supervised learning; predicts from the K nearest labeled neighbors
Vs
Clustering technique used in unsupervised learning; groups unlabeled data into K clusters
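Sketch contrasting the two algorithms on the same toy points, using the standard definitions: KNN needs labels and predicts by majority vote, K-Means ignores labels and finds groups. The points, labels, and starting centroids are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: majority vote among the k labeled points nearest the query."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def kmeans(points, initial_centroids, iterations=10):
    """Unsupervised: assign points to the nearest centroid, then move centroids."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, query=(2, 2)))        # uses the labels
print(kmeans(points, initial_centroids=[(0, 0), (10, 10)])[1])  # no labels used
```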
Underfitting
Vs
Overfitting
High bias and low variance
Vs
Low bias and high variance
Lexicons
Vs
SSML
Vs
Voice engine
Vs
Speech Mark
Customize how words are pronounced, e.g., abbreviations
Vs
Adding tags such as <break> and <amazon:effect name="whispered">
Vs
Different synthesis engines for different voice styles (e.g., standard, neural)
Vs
Helpful for lip-syncing or highlighting words
Custom Labels
Vs
Content Moderation
Vs
Amazon A2I
Identify your logo on social media using Amazon Rekognition
Vs
Remove inappropriate content using Amazon Rekognition
Vs
Incorporate human review of ML predictions, e.g., from Amazon Rekognition
SageMaker Real-time deployment
Vs
SageMaker Serverless deployment
Responsible AI using various AWS tools?
Amazon Bedrock, SageMaker Clarify, SageMaker Data Wrangler, SageMaker Model Monitor & A2I
Amazon Bedrock - Guardrails for redacting PII and blocking undesirable content. Human or automatic model evaluation
SageMaker Clarify - FM evaluation for accuracy, robustness, toxicity and bias detection
SageMaker Data Wrangler - To fix bias and augment the data
SageMaker Model Monitor - Quality analysis in production
A2I - Human review of ML predictions
Governance using SageMaker Role Manager, Model Cards and Model Dashboard
Interpretability
Vs
Explainability
Degree to which a human can understand the cause of the decision
Vs
Understand the nature and behaviour of the model
Responsible AI
Vs
AI Governance
Vs
AI Compliance
Fairness, explainability, interpretability, transparency, controllability, privacy, safety, robustness
Vs
Managing, optimizing and scaling the organization's AI activities with policies, guidelines and risk management, and building public trust
Vs
Compliance with various industry standards for AI workloads
Data Lifecycles
Vs
Data Logging
Vs
Data Residency
Vs
Data Monitoring
Vs
Data Analysis
Vs
Data Retention
Vs
Data Lineage
Collecting, processing, storage, consumption and archival
Vs
Inputs, outputs, performance metrics and system events
Vs
Where the data is processed and stored
Vs
Data quality, identifying anomalies and data drift
Vs
Statistical analysis, visualization and exploration
Vs
Regulatory requirements, historical data for training, cost
Vs
Sources of data, licenses and terms of usage or permissions
Threat detection
Vs
Vulnerability Mgmt
Vs
Infrastructure Mgmt
Detecting threats such as generated fake content
Vs
Identify software bugs
Vs
Secure cloud computing platform
Accuracy
Vs
Precision
Vs
Recall
Vs
F1-score
Vs
Latency
Ratio of correct predictions to all predictions
Vs
Ratio of correct +ve predictions to all +ve predictions: TP / (TP + FP)
Vs
Ratio of correct +ve predictions to all actual +ves: TP / (TP + FN)
Vs
Harmonic mean of precision and recall
Vs
Time taken by the model to predict
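Sketch of the four quality metrics computed from the confusion-matrix counts, using the standard definitions (TP, FP, FN, TN). The label lists are made-up illustration data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean, so it is pulled toward the lower of the two.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
# TP=3, FN=1, FP=2, TN=2
print(classification_metrics(y_true, y_pred))
```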
Poisoning
Vs
Jailbreaking
Vs
Prompt Leaking
Vs
Exposure
Vs
Hijacking
Introduction of malicious or biased data into the training data
Vs
Gain access to offensive, harmful content which is otherwise prevented
Vs
Leaking of prompts and inputs
Vs
Leaking of sensitive data from training corpus
Vs
Influencing the model's output through injected instructions
Logistic Regression
Vs
Support Vector Machines (SVMs)
Primarily designed for binary classification problems
Vs
SVMs are effective for classification tasks, especially in high-dimensional spaces
Pretraining
Vs
Fine Tuning
Uses unlabeled data
Vs
Uses labeled data
Data drift
Vs
Hallucination
Input data changes over time, which degrades the output
Vs
Output appears factual but is misleading or incorrect
Techniques to prevent overfitting
Early Stopping
Vs
Pruning
Vs
Regularization
Vs
Ensembling
Vs
Data augmentation
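Stop the training phase before the model starts learning the noise in the data
Vs
Identify the most important features and drop the irrelevant ones
Vs
Apply a penalty value to features with minimal impact
Vs
Combine the predictions of different ML models
Vs
Slightly change the sample data each time the model processes it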
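Sketch of early stopping from the list above, assuming a simple patience rule: stop once validation loss has failed to improve for `patience` epochs in a row and keep the best epoch seen. The loss values and the function name are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights would be kept."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss stopped improving: stop training here
    return best_epoch


# Validation loss falls, then rises as the model starts fitting noise.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.75]
print(early_stopping_epoch(val_losses))  # epoch index 2 has the lowest loss
```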
Shapley values
Vs
Partial dependence plots (PDP)
Shapley values are a local interpretability method
Vs
Provide a global view of the model’s behavior
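Sketch of why Shapley values are local: they split one prediction among its features. It computes exact values by averaging each feature's marginal contribution over every feature ordering, with "absent" features set to a baseline (an assumption of this sketch). The toy model and inputs are invented, and production tools approximate this rather than enumerating orderings.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance relative to a baseline input."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]       # "switch on" feature i
            updated = predict(current)
            totals[i] += updated - previous  # its marginal contribution
            previous = updated
    return [total / len(orderings) for total in totals]


def toy_model(x):
    """Hypothetical model with an interaction between the two features."""
    return 3 * x[0] + 2 * x[1] + x[0] * x[1]


values = shapley_values(toy_model, instance=[1, 2], baseline=[0, 0])
print(values)
# The contributions add up to prediction(instance) - prediction(baseline).
print(sum(values), toy_model([1, 2]) - toy_model([0, 0]))
```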
Sampling bias
Vs
Measurement bias
Vs
Observer bias
Vs
Confirmation bias
Data used to train the model does not accurately reflect the diversity of the real-world population
Vs
Inaccuracies in data collection, such as faulty equipment or inconsistent measurement processes
Vs
Human errors or subjectivity during data analysis or observation
Vs
Selectively searching for or interpreting information to confirm existing beliefs
Linear regression
Vs
Document classification
Vs
Neural networks
Vs
Decision tree
Vs
Association rule learning
Vs
Clustering
Which learning techniques?
Supervised Learning
Vs
Semi-supervised learning
Vs
Supervised Learning
Vs
Supervised Learning
Vs
Unsupervised learning
Vs
Unsupervised learning
Embedding models
Principal component analysis
Vs
Singular value decomposition
Vs
Word2Vec
Vs
BERT
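Dimensionality reduction technique
Vs
Factorizes a matrix into three matrices (U, Σ, Vᵀ) built from its singular values and vectors
Vs
Associates words using continuous bag-of-words (CBOW) or skip-gram
Vs
Semantic similarity using contextual embeddings (the same word gets a different vector depending on context)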
AWS Trainium
Vs
AWS Inferentia
ML chip that AWS purpose-built for deep learning (DL) training
Vs
ML chip purpose-built by AWS to deliver high-performance inference at a low cost
GenAI
Vs
ML
Gets features from labels
Vs
Gets labels from features
Model parallelism
Vs
Data parallelism
Splitting a model up between multiple instances or nodes
Vs
Splitting the training set into mini-batches evenly distributed across nodes
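Sketch of the data-parallel side only: the training set is cut into mini-batches that are dealt out round-robin so each node gets an even share. The node count, batch size, and function name are hypothetical; real frameworks also synchronize gradients between nodes, which is not shown.

```python
def shard_batches(dataset, num_nodes, batch_size):
    """Split a dataset into mini-batches and deal them out evenly to nodes."""
    batches = [dataset[i:i + batch_size]
               for i in range(0, len(dataset), batch_size)]
    # Node k receives batches k, k + num_nodes, k + 2 * num_nodes, ...
    return {node: batches[node::num_nodes] for node in range(num_nodes)}


dataset = list(range(12))
shards = shard_batches(dataset, num_nodes=3, batch_size=2)
for node, node_batches in shards.items():
    print(node, node_batches)
```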
Model Parameters
Vs
Hyperparameters
Internal variables of the model
Vs
External configurations set before the training process
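Sketch of the distinction on a toy linear model: `w` and `b` are model parameters learned from the data, while `learning_rate` and `epochs` are hyperparameters chosen before training. The data and default values are made up for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # model parameters: internal, updated during training
    n = len(xs)
    for _ in range(epochs):  # epochs and learning_rate: hyperparameters
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```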
Multi-modal generative model
Vs
Multi-modal embedding model
Generate new output
Vs
Produces embeddings for context-based tasks such as search (cheaper than a generative model)
Training set
Vs
Validation set
Vs
Test set
Used to train an algorithm or ML model. The model iteratively uses the data and learns to provide the desired result.
Vs
Introduces new data to the trained model. You can use a validation set to periodically measure model performance as training is happening, and also tune any hyperparameters of the model. However, validation datasets are optional.
Vs
Used on the final trained model to assess its performance on unseen data. This helps determine how well the model generalizes.
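Sketch of carving one dataset into the three sets. The 80/10/10 ratio, the fixed seed, and the function name are assumptions for illustration, not a recommended split.

```python
import random


def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once, then slice into training, validation and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # held out until the model is final
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```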
SHapley Additive exPlanations
Vs
Differential privacy
Vs
Adversarial debiasing
Vs
Fairness-aware preprocessing
Explain model predictions and identify feature importance
Vs
Protects individual privacy
Vs
Bias mitigation method, typically applied during or after training
Vs
Proactive responsible AI strategy that helps reduce bias before the model is trained
Diffusion Model
Vs
GAN
Diffusion models have gained popularity over GANs due to their ability to generate high-quality images with superior fine-grained control. They are slower than GANs.
Recurrent Neural Network (RNN)
Vs
Generative Adversarial Network (GAN)
Vs
Transformer-based vision-language model
Sequential data processing, such as text or time-series analysis
Vs
Image generation, style transfer, and upscaling
Vs
Generating text descriptions from images