Context Grounding, RAG, Vector Databases Flashcards

1
Q

What is the definition of Context Grounding?

A

Context Grounding refers to the process of connecting AI-generated outputs to authoritative data sources, ensuring responses are factually accurate and contextually relevant.

Grounded AI systems reference external databases to validate information, minimizing hallucinations.

2
Q

What are the three key stages involved in the grounding process?

A
  • Data Ingestion and Indexing
  • Retrieval
  • Augmentation and Validation

Each stage plays a critical role in ensuring the AI’s outputs are based on verified information.

3
Q

What is Retrieval-Augmented Generation (RAG)?

A

RAG enhances large language models (LLMs) by integrating real-time data retrieval into the generation process.

This allows RAG systems to access external knowledge bases dynamically.

4
Q

How does the RAG architecture improve upon conventional LLMs?

A

RAG systems dynamically access external knowledge bases instead of being limited by their training cutoff dates.

This provides access to the latest information during generation.

5
Q

What are some enterprise applications of RAG?

A
  • Customer Support (e.g., Salesforce’s Einstein GPT)
  • Financial Analysis (e.g., JPMorgan’s COiN platform)
  • Healthcare Diagnostics (e.g., IBM Watson Health)

These are cited as examples of enterprise systems where retrieval over domain documents is meant to improve efficiency and accuracy.

6
Q

What are vector databases specialized in?

A

Vector Databases enable efficient retrieval of information from unstructured data using similarity searches.

They differ from relational databases, which excel at exact matches.
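
A similarity search can be illustrated with a brute-force version: score every stored vector against the query and keep the best few. This is only a minimal sketch; the item names and the 3-dimensional vectors are made up for illustration, and real vector databases use indexes rather than scanning everything.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, items, k=2):
    """Return the ids of the k items most similar to the query vector."""
    ranked = sorted(
        items,
        key=lambda item_id: cosine_similarity(query, items[item_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings; real ones have hundreds of dimensions.
items = {
    "cat_photo": [0.9, 0.1, 0.0],
    "kitten_article": [0.8, 0.2, 0.1],
    "tractor_manual": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
nearest = top_k(query, items, k=2)
print(nearest)
```

Unlike an exact-match lookup, nothing here has to equal the query; the items are ranked by how close they are.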

7
Q

What algorithms do vector databases use to find approximate nearest neighbors?

A

HNSW (Hierarchical Navigable Small World)

  • Layer 2 (Top level) → Very few nodes, loosely connected, serving as the entry point to the system.
  • Layer 1 (Middle level) → More nodes, slightly denser connections, helping refine the search.
  • Layer 0 (Bottom level) → The most detailed level with dense connections, covering all data points.

Layer 2 (Entry Layer)
    O
    |
    O – O

Layer 1
    O – O – O
      \ |
    O – O – O

Layer 0 (Dense Connections)
    O – O – O – O – O
     \  | / | / |
    O – O – O – O – O – O
     \  | / \ | / |
    O – O – O – O

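IVF (Inverted File Index)

An inverted index maps each key to the list of items filed under it, so a lookup only has to read the matching list:
  • cat → “cat book” (file 1)
  • cat → “cat picture book” (file 2)
  • tractor → “tractor parts catalog” (file 1)

In a vector database the keys are cluster centroids rather than words: each vector is filed under its nearest centroid, and a query scans only the closest clusters.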

These algorithms facilitate fast and efficient data retrieval.
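
The IVF idea can be sketched in a few lines: file each vector under its nearest centroid, then answer a query by scanning only the closest cells. This is a simplified illustration, not any particular database's implementation. The centroids are fixed by hand here (real systems usually learn them, for example with k-means), and the names `build_ivf`, `ivf_search`, and `nprobe` are chosen for this example.

```python
import math

def build_ivf(vectors, centroids):
    """File each vector id under the cell of its nearest centroid."""
    cells = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(vec, centroids[i]))
        cells[nearest].append(vec_id)
    return cells

def ivf_search(query, vectors, centroids, cells, nprobe=1, k=1):
    """Scan only the nprobe cells whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda i: math.dist(query, centroids[i]))[:nprobe]
    candidates = [vec_id for i in probe for vec_id in cells[i]]
    candidates.sort(key=lambda vec_id: math.dist(query, vectors[vec_id]))
    return candidates[:k]

# Hand-picked centroids and toy 2-D vectors (assumptions for the demo).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {"a": [0.5, 0.2], "b": [1.0, 1.5], "c": [9.0, 9.5], "d": [10.5, 10.0]}

cells = build_ivf(vectors, centroids)
result = ivf_search([9.2, 9.4], vectors, centroids, cells, nprobe=1, k=1)
print(cells, result)
```

Because only some cells are scanned, the result is approximate: a true nearest neighbor filed in an unprobed cell would be missed, which is the trade-off for speed.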

8
Q

What is the significance of hybrid search in vector databases?

A

Hybrid search combines vector similarity with metadata filtering for precision. The term is also widely used for combining vector similarity with keyword (lexical) search.

This allows for more accurate search results based on additional criteria.
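
One way to picture the metadata-filtering form of hybrid search is "filter first, then rank by similarity". The sketch below assumes a hypothetical record layout (`id`, `vector`, `meta`) and uses a plain dot product as the similarity score; actual databases differ in how and when they apply filters.

```python
def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

def hybrid_search(query_vec, records, filters, k=2):
    """Keep records whose metadata matches every filter, then rank by similarity."""
    matching = [
        r for r in records
        if all(r["meta"].get(key) == value for key, value in filters.items())
    ]
    matching.sort(key=lambda r: dot(query_vec, r["vector"]), reverse=True)
    return [r["id"] for r in matching[:k]]

# Made-up records for illustration.
records = [
    {"id": "doc1", "vector": [0.9, 0.1], "meta": {"year": 2024, "type": "policy"}},
    {"id": "doc2", "vector": [0.8, 0.3], "meta": {"year": 2021, "type": "policy"}},
    {"id": "doc3", "vector": [0.2, 0.9], "meta": {"year": 2024, "type": "policy"}},
    {"id": "doc4", "vector": [0.95, 0.05], "meta": {"year": 2024, "type": "memo"}},
]
hits = hybrid_search([1.0, 0.0], records, {"year": 2024, "type": "policy"}, k=2)
print(hits)
```

Note that doc4 has the highest similarity but is excluded because its metadata does not match.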

9
Q

What challenges do grounding systems face?

A
  • Data Quality
  • Computational Overhead
  • Cross-Modal Alignment

These challenges can hinder the effectiveness and efficiency of grounding systems.

10
Q

What is Self Querying RAG?

A

Self-querying RAG (Retrieval-Augmented Generation) is an advanced technique that enhances the retrieval step in RAG systems by automatically analyzing and structuring user queries. It can:
  • convert natural language to structured queries
  • extract metadata from user input
  • combine metadata filtering with vector search for more precise retrieval

The goal is to improve the accuracy of information retrieval.
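
The structuring step can be illustrated with a small stand-in. In practice an LLM usually does this parsing; the sketch below uses regular expressions instead so that it runs on its own, and the metadata fields (`year`, `type`) and the `KNOWN_TYPES` list are assumptions made for the example.

```python
import re

KNOWN_TYPES = ("report", "memo", "policy")  # hypothetical metadata values

def self_query(user_query):
    """Split a natural-language query into a semantic part and a metadata filter."""
    filters = {}
    semantic = user_query

    # Extract a four-digit year, if present, as a metadata filter.
    year = re.search(r"\b(19|20)\d{2}\b", user_query)
    if year:
        filters["year"] = int(year.group())
        semantic = semantic.replace(year.group(), " ")

    # Extract a known document type (singular or plural).
    for doc_type in KNOWN_TYPES:
        pattern = rf"\b{doc_type}s?\b"
        if re.search(pattern, semantic, re.IGNORECASE):
            filters["type"] = doc_type
            semantic = re.sub(pattern, " ", semantic, flags=re.IGNORECASE)
            break

    # Whatever is left is sent to the vector search.
    return {"query": " ".join(semantic.split()), "filter": filters}

structured = self_query("2023 reports about supply chain risk")
print(structured)
```

The `filter` part would go to metadata filtering and the `query` part to the vector search.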

11
Q

Fill in the blank: The integration of Context Grounding, RAG, and _______ has revolutionized AI systems.

A

Vector Databases

These components work together to enhance AI’s ability to generate accurate information.

12
Q

True or False: Vector databases only support textual data.

A

False

Vector databases store high-dimensional embeddings, which can represent images and audio as well as text.

13
Q

What is a key feature of vector databases that aids real-time applications?

A

Streaming Support, which allows immediate querying of newly ingested data.

This is critical for applications like fraud detection.

14
Q

In the context of AI, what does augmentation and validation involve?

A

Injecting retrieved evidence into the AI’s prompt, constraining its output to the provided context.

This step ensures that the AI’s responses are grounded in verified information.
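
The augmentation step can be sketched as simple prompt assembly. The instruction wording and the function name `build_grounded_prompt` below are illustrative choices, not a standard; the sample passages are made up.

```python
def build_grounded_prompt(question, passages):
    """Place retrieved passages in the prompt and tell the model to stay within them."""
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What is the refund window?",
    [
        "Refunds are accepted within 30 days of purchase.",
        "Gift cards are non-refundable.",
    ],
)
print(prompt)
```

Numbering the passages makes it possible to ask the model to cite which one supports its answer, which helps with the validation half of this stage.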

15
Q

What is the purpose of role-based access control in vector databases?

A

To ensure sensitive data remains compartmentalized.

This is particularly important for managing legal or confidential information.
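
As a rough illustration, compartmentalization can be pictured as a role check applied before any search result is returned. The record layout and the `allowed_roles` field are hypothetical; real products implement access control in their own ways.

```python
def allowed_records(records, user_roles):
    """Keep only records that share at least one role with the user."""
    return [r for r in records if r["allowed_roles"] & user_roles]

# Made-up records; each lists the roles permitted to retrieve it.
records = [
    {"id": "contract-17", "allowed_roles": {"legal"}},
    {"id": "faq-3", "allowed_roles": {"legal", "support"}},
]

visible = [r["id"] for r in allowed_records(records, {"support"})]
print(visible)
```

A support user never sees the legal-only contract, regardless of how similar it is to the query.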

16
Q

What is an example of how RAG is used in healthcare?

A

Healthcare chatbots can pull the latest treatment guidelines from indexed medical journals.

This ensures that recommendations reflect current standards.

17
Q

What does the term ‘multimodal retrieval’ refer to?

A

The ability of RAG systems to process queries that reference images, audio, or code snippets alongside text.

This expands the capabilities of AI systems significantly.

18
Q

What’s an embedding?

A

An embedding is a list of numbers (a vector) that represents the meaning of a piece of text, not just the words themselves. In RAG, your question is first converted into an embedding so it can be compared with the embeddings of stored text.
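
To make "text as numbers" concrete, here is a deliberately crude toy. It only counts words from a tiny hand-picked vocabulary, so unlike a real embedding model it does not capture meaning; it just shows how text becomes a vector that can be compared. `VOCAB` and `toy_embed` are invented for this example.

```python
import math

VOCAB = ["cat", "kitten", "tractor", "engine"]  # hand-picked toy vocabulary

def toy_embed(text):
    """Turn text into a vector of word counts over VOCAB (not a real embedding model)."""
    words = text.lower().split()
    return [words.count(term) for term in VOCAB]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

question = toy_embed("cat and kitten")
about_cats = toy_embed("a kitten is a young cat")
about_tractors = toy_embed("tractor engine repair")

print(cosine(question, about_cats), cosine(question, about_tractors))
```

The question's vector lies close to the cat text and far from the tractor text, which is the property retrieval relies on.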

19
Q

How does RAG work?

A

In Retrieval-Augmented Generation (RAG), picture the computer as a librarian who works in these steps:

Turns your question into numbers: It first converts your words into a list of numbers (called an “embedding”). These numbers represent the meaning of your question, not just the words themselves.

Looks for the closest matches: It searches a large database of information for pieces of text whose embeddings are close to your question’s (meaning similar ideas). Just as the Dewey Decimal system groups similar topics together, embeddings help the AI find information related to what you’re asking. But instead of a strict order like “Cookbooks are in 641.5”, it is more flexible: it finds things that “feel” close in meaning rather than following exact numbers.

Reads and summarizes the best information: After retrieving the most relevant information, the AI combines it with its existing knowledge to generate a clear and accurate answer.
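
The three steps can be strung together in a toy pipeline. Everything here is a stand-in: `embed` counts words instead of calling an embedding model, the two "documents" are sample sentences invented for the demo, and `generate` is a placeholder for whatever LLM call a real system would make.

```python
import math
from collections import Counter

def embed(text):
    """Step 1 (toy): turn text into word counts instead of a learned embedding."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return Counter(cleaned.split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Step 2: find the k documents closest to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: similarity(q, embed(d)), reverse=True)[:k]

def answer(question, documents, generate):
    """Step 3: hand the retrieved text plus the question to the generator."""
    passages = retrieve(question, documents)
    prompt = "Context:\n" + "\n".join(passages) + f"\n\nQuestion: {question}"
    return generate(prompt)

def echo_generate(prompt):
    """Placeholder for an LLM call: just returns the prompt it was given."""
    return prompt

documents = [
    "Cookbooks are shelved in section 641.5.",
    "Tractor manuals are shelved in section 629.2.",
]
result = answer("Where are cookbooks shelved?", documents, generate=echo_generate)
print(result)
```

Swapping `embed` for a real embedding model and `echo_generate` for a real LLM call would leave the overall shape of the pipeline unchanged.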