Optimizing Foundation Models Flashcards
What are vector embeddings? How do they play a part in RAG?
Embedding is the process by which text, images, and audio are given numerical representation in a vector space. Embedding is usually performed by a machine learning (ML) model.
In RAG, enterprise datasets, such as documents, images, and audio, are tokenized, passed to ML models, and vectorized. The resulting vectors in an n-dimensional space, along with metadata about them, are stored in purpose-built vector databases for fast retrieval.
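A minimal sketch of the idea above: text is mapped to a vector, and similarity between vectors stands in for semantic similarity. The `toy_embed` function here is a hypothetical stand-in (word counts hashed into buckets); real embeddings come from an ML model.

```python
import math

def toy_embed(text: str, dim: int = 8) -> list[float]:
    # Toy stand-in for an embedding model: hash each word into one of
    # `dim` buckets, count occurrences, then L2-normalize.
    # A real system would call a trained embedding model instead.
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

doc_vec = toy_embed("reset my account password")
query_vec = toy_embed("how do I reset a password")

# Cosine similarity of the two unit vectors; shared words ("reset",
# "password") land in the same buckets and push the score above zero.
similarity = sum(a * b for a, b in zip(doc_vec, query_vec))
```

In RAG, the query vector is compared against the stored document vectors this way, and the closest documents are fed to the model as context.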
What is a vector database?
- Compactly stores billions of high-dimensional vectors representing words and entities.
- Provides ultra-fast similarity searches across these billions of vectors in real time.
- Uses k-nearest neighbors (k-NN) search to find similar vectors.
- AWS services for vector databases: OpenSearch, the pgvector extension in RDS, Kendra.
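The k-NN search in the bullets above can be sketched as a brute-force scan; the document IDs and vectors below are made up for illustration. Production vector databases use approximate-NN index structures to avoid scanning billions of vectors.

```python
import heapq
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def knn_search(query: list[float], index: list[tuple], k: int = 2) -> list[tuple]:
    # Return the k entries most similar to the query.
    # Brute force here; real vector DBs use approximate indexes.
    return heapq.nlargest(k, index, key=lambda item: cosine(query, item[1]))

index = [
    ("doc-a", [0.9, 0.1, 0.0]),
    ("doc-b", [0.0, 1.0, 0.0]),
    ("doc-c", [0.7, 0.3, 0.1]),
]
top = knn_search([1.0, 0.0, 0.0], index, k=2)
# doc-a points closest to the query direction, then doc-c.
```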
What are Agents in an AI system?
Agents interact with the environment to perform intermediary operations and coordinate multi-step tasks.
Examples of agents:
A chatbot may have an agent to modify/reset a customer’s password or phone plan
Another agent may send a CSAT survey to the customer when the conversation ends.
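A minimal sketch of the chatbot agents described above, assuming a simple intent-to-tool dispatch; the tool names and the intent string are hypothetical. Real systems typically let the model choose a tool via a function/tool-calling API.

```python
# Hypothetical tools an agent can invoke on behalf of the chatbot.
def reset_password(customer_id: str) -> str:
    return f"Password reset link sent to customer {customer_id}"

def send_csat_survey(customer_id: str) -> str:
    return f"CSAT survey sent to customer {customer_id}"

TOOLS = {
    "reset_password": reset_password,
    "send_csat_survey": send_csat_survey,
}

def run_agent(intent: str, customer_id: str) -> str:
    # The agent coordinates the action the model decided on; if no
    # tool matches, the chatbot answers conversationally instead.
    tool = TOOLS.get(intent)
    if tool is None:
        return "No tool available; answer conversationally instead."
    return tool(customer_id)
```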
How do you evaluate a Gen AI system?
- Human Evaluation - evaluates user experience, contextual appropriateness, creativity, and flexibility.
- Benchmark datasets - a quantitative way to evaluate generative AI models (e.g. Accuracy, Speed, Scalability).
What is involved in creating a benchmark dataset?
SMEs have to do this manually.
They create intelligent questions.
Then they craft answers for them.
These datasets are then used to judge the performance of the model.
A “judge model” can be used to automate this process: it takes the output of the model under evaluation, compares it to the benchmark dataset created by the SMEs, and issues a grading score.
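The judge-model loop above can be sketched as follows. The `judge` function here is a trivial token-overlap grader standing in for a real judge model (which would itself be an LLM prompted to compare answers); the benchmark item is made up.

```python
# Hypothetical SME-written benchmark: questions with reference answers.
benchmark = [
    {"question": "What does RAG stand for?",
     "reference": "retrieval augmented generation"},
]

def judge(model_answer: str, reference_answer: str) -> float:
    # Toy grader: fraction of reference tokens present in the answer.
    # A real judge model would score semantic agreement, not overlap.
    model_tokens = set(model_answer.lower().split())
    ref_tokens = set(reference_answer.lower().split())
    if not ref_tokens:
        return 0.0
    return len(model_tokens & ref_tokens) / len(ref_tokens)

def evaluate(model_fn, benchmark) -> float:
    # Run the model under evaluation over the benchmark, average scores.
    scores = [judge(model_fn(item["question"]), item["reference"])
              for item in benchmark]
    return sum(scores) / len(scores)
```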
What are the benefits of fine tuning?
- Increases specificity
- Improves accuracy
- Reduces biases
- Boosts efficiency
What are the different types of FT?
- Instruction tuning - involves retraining the model on a new dataset that consists of prompts followed by the desired outputs.
- Reinforcement learning from human feedback (RLHF): uses a reward model based on human feedback.
- Adapting models for specific domains - e.g. legal or healthcare
- Continuous pre-training - the initial training phase is extended to keep the model current.
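To make the instruction-tuning bullet concrete: the training data consists of prompt/desired-output pairs, often stored one JSON object per line (JSONL). The record below is a made-up example of that layout.

```python
import json

# Hypothetical instruction-tuning record: a prompt paired with the
# desired output, as described above.
record = {
    "prompt": "Summarize the customer's billing complaint in one sentence.",
    "completion": "The customer was charged twice for the March invoice.",
}

# One JSON object per line (JSONL) is a common dataset layout.
line = json.dumps(record)
```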
What are the key steps in data preparation for fine tuning?
- Data curation - more rigorous than for the base FM: high-impact, highly relevant, labeled data.
- Labeling - accurate labeling is essential
- Governance and compliance - specialized data to be handled with care
- Bias checking - ensure data is balanced and does not introduce any new bias.
- Feedback integration - integrating user feedback back into the training process.
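As a concrete instance of the bias-checking step, a simple class-balance check over the curated labels can flag underrepresented classes; the labels and threshold below are made up for illustration.

```python
from collections import Counter

def label_balance(examples: list[dict], threshold: float = 0.2):
    # Compute each label's share of the dataset and flag any label
    # whose share falls below the threshold (possible imbalance).
    counts = Counter(ex["label"] for ex in examples)
    total = sum(counts.values())
    ratios = {label: n / total for label, n in counts.items()}
    flagged = [label for label, r in ratios.items() if r < threshold]
    return ratios, flagged

# Toy dataset: 9 "approve" examples vs 1 "deny" example.
data = [{"label": "approve"}] * 9 + [{"label": "deny"}]
ratios, flagged = label_balance(data)
# "deny" makes up only 10% of the data, so it gets flagged.
```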
What are a few standard metrics for evaluating LLMs?
- ROUGE - evaluates automatic summarization of texts, as well as machine translation quality in NLP; measures the overlap of unigrams, bigrams, and longer n-grams between machine-generated text and human-written reference text (effectively ensures completeness of information).
- BLEU - evaluates the quality of text that has been machine-translated from one natural language to another (accurate inclusion of critical features).
- BERTScore - evaluates the quality of text-generation tasks; measures the cosine similarity between embeddings of the generated and reference texts, effectively estimating semantic appropriateness.
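As a worked example of the n-gram overlap that ROUGE measures, here is a minimal sketch of ROUGE-1 recall (the unigram case): the fraction of reference tokens that also appear in the generated text. Full implementations (e.g. the `rouge-score` library) compute the whole metric family.

```python
from collections import Counter

def rouge1_recall(generated: str, reference: str) -> float:
    # ROUGE-1 recall: clipped unigram overlap divided by the number
    # of tokens in the reference text.
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(gen[t], ref[t]) for t in ref)
    total = sum(ref.values())
    return overlap / total if total else 0.0

score = rouge1_recall("the cat sat on the mat",
                      "the cat lay on the mat")
# 5 of the 6 reference tokens appear in the generation ("lay" is missing).
```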