Retrieval Augmented Generation Flashcards
GPT: Define temperature
Higher temperature -> more randomness
Temperature 0 -> deterministic output
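A minimal sketch of how temperature reshapes the next-token distribution (the function name is my own, not any library's API):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide the logits by the temperature before the softmax.
    T -> 0 concentrates all mass on the argmax (deterministic);
    large T flattens the distribution (more randomness)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With logits [2.0, 1.0, 0.5], T=0.01 yields an almost one-hot distribution, while T=100 is nearly uniform.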
The idea behind RAG
4
- More precise with respect to private data
- More up-to-date information
- Fewer hallucinations
- Cheaper than fine-tuning
GPT: Define top_p
Nucleus sampling: consider only the smallest set of highest-probability tokens whose cumulative probability reaches top_p when predicting the next token
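The top_p cutoff can be sketched as a toy filter over an already-computed probability distribution (`top_p_filter` is a hypothetical helper name):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize; the next token is sampled from
    this reduced set only."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}
```

For probs [0.5, 0.3, 0.15, 0.05] and top_p=0.9, only the first three tokens survive.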
3 building blocks of the RAG architecture
- Indexing
- Retrieval
- Generation
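The three blocks can be sketched end-to-end with toy stand-ins (word overlap instead of embedding similarity, a prompt string instead of a real LLM call; all names are hypothetical):

```python
def retrieve(query, chunks, k=2):
    """Retrieval: rank the indexed chunks by word overlap with the query
    (a toy stand-in for embedding similarity over a vector index)."""
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)[:k]

def build_prompt(query, chunks):
    """Generation input: ground the LLM prompt in the retrieved chunks."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Indexing happens offline (chunking plus embedding into a vector DB); here the "index" is just the chunk list.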
Challenges building RAG systems
5
- Prompt Design (Experimentation)
- Grounding & Accuracy (Evaluation)
- Privacy, Security & Compliance (Data may be sensitive)
- Performance (Quality vs Quantity, Inference Time)
- Integration & Adoption (UX needs feedback)
Define chunking/splitting, name problems, name techniques
Split a document for better indexing of the individual segments
-> Larger texts are harder to compare than small texts
Problems:
Which splitting method?
Which chunk size, separator, or overlap?
Techniques:
TokenSplitting
FixedSplitting
DocumentSpecificSplitting (HTML, JSON, Markup)
Chunking - Query Extension (extend the chunk by the rest of the document)
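FixedSplitting with overlap can be sketched as a toy character-window splitter (real splitters usually work on tokens or separators):

```python
def fixed_split(text, chunk_size, overlap):
    """FixedSplitting sketch: windows of length chunk_size, with
    consecutive windows sharing `overlap` characters so that context
    at chunk boundaries is not lost."""
    step = chunk_size - overlap  # assumes overlap < chunk_size
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

For "abcdefghij" with chunk_size=4 and overlap=2, this yields "abcd", "cdef", "efgh", ... with two shared characters per boundary.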
Problem and approach for finding an embedding model
Problems:
- Different goals for “semantic encoding”
- Unclear which model is best
- Embedding is time-consuming
Approach:
- Benchmarks (HF MTEB)
- Evaluations
Problem and approach for finding a vector DB
Problems:
- Difference in Functionality
- Difference in Storage Efficiency
- Which one is best suited?
Approach:
- Pre-select by required functionality and ease of implementation
- Benchmark them
- Consider Scaling and Metadata Storage
What are key capabilities of vector DBs?
4
Vector Indexing: Pre-processing of vectors to speed up distance computations
Inverted Indexing: Fast full-text and keyword search on raw text data (by mapping contents to their locations in the database)
Vector Quantization: Compress the original vectors into a more compact representation
Search Techniques: Dense, Sparse, Hybrid
Name the two types of quantization and describe them
Scalar Quantization
- Float64 to Float32, f16, or int8
Product Quantization
- Split each vector into M subvectors
- Perform clustering on each subvector space
- Assign each subvector to its nearest centroid
- Replace the centroid values with an ID
- Store only the IDs of the subvectors' centroids
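The product-quantization steps can be sketched as follows (assuming the per-subvector centroids were already trained, e.g. by k-means; names are illustrative):

```python
def product_quantize(vector, centroids_per_sub):
    """PQ sketch: split the vector into M subvectors, assign each to its
    nearest centroid, and store only the M centroid IDs.
    `centroids_per_sub[m]` lists the candidate centroids for subvector m;
    in practice these come from k-means on training data."""
    M = len(centroids_per_sub)
    d = len(vector) // M  # subvector dimension
    ids = []
    for m in range(M):
        sub = vector[m * d:(m + 1) * d]
        # nearest-centroid assignment by squared Euclidean distance
        best = min(range(len(centroids_per_sub[m])),
                   key=lambda c: sum((a - b) ** 2
                                     for a, b in zip(sub, centroids_per_sub[m][c])))
        ids.append(best)
    return ids  # the compressed code: M small integers instead of the full vector
```

A 4-dimensional float vector compressed with M=2 is stored as just two centroid IDs.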
Describe Approximate Nearest Neighbor (ANN)
KNN does not scale
ANN:
- Pre-compute distances between vectors to organize and store similar vectors in clusters
- Search only within a cluster
-> Hierarchical Navigable Small World (HNSW)
What is document composition? Name the underlying problem and the solutions within document composition
8
Problem: Too many documents result in irrelevant retrievals
Solutions:
- As much metadata as possible
- Remove duplicates and irrelevant text
- Standardize text (e.g. language)
- Separate indices for different topics
- Reranking
- Dynamic thresholding
- Text summarization
- Diversity Ranker
What are the 3 problems with positioning documents in the query?
Problem 1:
Positioning matters: the middle part of a query is less relevant to the LLM
Adding random documents improved accuracy by 36%
Problem 2:
Retrieved nodes are very similar: redundant and barely relevant
Problem 3:
A top_k cutoff truncates relevant documents; there is no fixed good value for it
What is a Diversity Ranker?
Try to keep diverse statements for better information coverage
- Compute similarity scores of the retrieved docs
- Diversity score: the (negated) sum of a document's pairwise similarities with all other documents
- Keep only the documents with the highest diversity scores
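The diversity scoring described above can be sketched as follows (assuming cosine similarity over the retrieved docs' embeddings; function names are my own):

```python
def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def diversity_rank(embs, keep):
    """Score each doc by the negated sum of its pairwise similarities to
    all other retrieved docs (less similar to the rest = more diverse),
    then keep the `keep` most diverse document indices."""
    scores = []
    for i, e in enumerate(embs):
        sim_sum = sum(cosine(e, other) for j, other in enumerate(embs) if j != i)
        scores.append((-sim_sum, i))
    return [i for _, i in sorted(scores, reverse=True)[:keep]]
```

With two near-duplicate embeddings and one dissimilar one, the dissimilar document ranks first.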
Name and describe 7 reranker strategies
LM-based agents: Use LLMs to score the relevance of the documents according to the user query
Ensemble models: Use multiple language models or algorithms and combine their strengths.
Contextual reranking: Include contextual information, such as preferences and interaction history for reranking
Query expansion: Modify or extend the user query to better capture its intent (e.g., using synonyms, paraphrases, etc.)
Feature-based reranking: Use features, such as term frequency, document length, and entity overlap to score the docs
Learning to rerank: Train a model to predict the most relevant documents given the user query
User feedback: Use user feedback (e.g., likes and ratings) to consider their preferences during reranking
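As one example, feature-based reranking can be sketched with two toy features (term overlap with the query and a document-length penalty; both feature choices are illustrative):

```python
def feature_rerank(query, docs):
    """Feature-based reranking sketch: score each doc by hand-crafted
    features and sort the documents by descending score."""
    q_terms = set(query.lower().split())
    def score(doc):
        terms = doc.lower().split()
        overlap = sum(1 for t in terms if t in q_terms)  # term-overlap feature
        length_penalty = 1.0 / (1.0 + len(terms) / 100)  # mildly prefer shorter docs
        return overlap * length_penalty
    return sorted(docs, key=score, reverse=True)
```

A learning-to-rerank model would replace these hand-crafted scores with weights trained on relevance labels.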