AIP Logic & LLMs (Ontologize) Flashcards

1
Q

Two ways of using LLMs in Python Transforms

A
  • API Endpoints
  • Palantir-provided models
2
Q

Palantir-provided models

A
  • part of AIP
  • Make working with LLMs from code more ergonomic
3
Q

Caveats of using LLMs in Python transforms

A
  • easier to rack up compute costs
  • subject to rate limits
4
Q

Without the AIP library

A
  • need to ensure you’re within token limits (see the sketch below)
  • more configuration to loop through datasets
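
For example, without the library you must count tokens yourself before each call. A minimal sketch of that bookkeeping using the tiktoken tokenizer; the model name and context window below are illustrative assumptions:

```python
# Manual token bookkeeping needed without the AIP library.
# The model name and context window are assumptions for illustration.
import tiktoken

MODEL = "gpt-4"        # assumed model
CONTEXT_LIMIT = 8192   # assumed context window, in tokens

def fits_in_context(prompt: str, completion_budget: int = 512) -> bool:
    """True if the prompt plus the reserved completion budget fits the window."""
    encoding = tiktoken.encoding_for_model(MODEL)
    prompt_tokens = len(encoding.encode(prompt))
    return prompt_tokens + completion_budget <= CONTEXT_LIMIT
```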
5
Q

With the transforms-aip Library

A
  • Model is an input from the model library
  • Processes datasets well: each row can serve as its own prompt (see the sketch below)
  • maximizes speed given rate limits
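
A sketch of that row-per-prompt pattern; this is illustrative only and does not reproduce the real transforms-aip API (complete() and the free_text column are hypothetical names):

```python
# Illustrative pattern only: each dataframe row becomes its own prompt.
# complete() is a hypothetical stand-in for the model input from the model
# library; in a real transform the library handles rate limits and retries.
import pandas as pd

def complete(prompt: str) -> str:
    raise NotImplementedError("wire this to the model-library input in Foundry")

def summarize_rows(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["summary"] = out["free_text"].map(
        lambda text: complete(f"Summarize in one sentence: {text}")
    )
    return out
```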
6
Q

Using LLMs in Pipeline Builder

A
  • Quick to implement, but less fine-grained control
  • Can do:
    • Classification
    • Sentiment Analysis
    • Summarization
    • Translation
    • Entity Extraction
  • Use “Empty Prompt” for more open-ended problems
7
Q

What is Retrieval-Augmented Generation (RAG)?

A

A technique that augments the capabilities of LLMs by allowing them to generate responses that incorporate information they were not trained on.

8
Q

Why is RAG useful?

A
  • a cheaper, faster, and less risky way to enable LLMs to do useful things with your data
  • no need to pre-train the model on your own data
  • avoids the risk of leaking data
9
Q

What are embeddings?

A

Embeddings are vector representations (lists of numbers) of text that capture semantic meaning.

10
Q

Why do we need embeddings?

A

They let us compute relevance: similar texts map to nearby vectors, so vector distance measures semantic similarity.

11
Q

If you ask an LLM a question, how does it know what data (the text you created embeddings for) is the most relevant for generating a response?

A

It creates an embedding of your question and then finds the data whose embeddings are closest (in high-dimensional vector space), and therefore most likely to be relevant. It uses the most relevant data to generate a response.
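
A minimal sketch of that lookup, using cosine similarity as the closeness measure; embed() is a hypothetical stand-in for the embedding model:

```python
# Rank stored texts by cosine similarity to the question's embedding.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError  # hypothetical: returns a fixed-length vector

def most_relevant(question: str, documents: list[str], k: int = 3) -> list[str]:
    q = embed(question)
    scores = []
    for doc in documents:
        d = embed(doc)
        # Cosine similarity: 1.0 means the vectors point the same way.
        scores.append(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return [documents[i] for i in top]
```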

12
Q

How do we create embeddings in Foundry?

A
  1. Ingest your Data
  2. Make your data machine-readable (if needed)
  3. Chunk the text (see the sketch below)
  4. Create embeddings
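
Step 3 can be as simple as splitting on blank lines, as in this sketch (the character cap is an arbitrary illustrative value, not a Foundry requirement):

```python
# Naive paragraph-based chunker: split on blank lines, then pack paragraphs
# into chunks no longer than max_chars (oversized paragraphs stay whole).
def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        if current and len(current) + len(paragraph) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += paragraph + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks
```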
13
Q

Use Media Sets

A

store PDFs, images, audio files and other non-tabular data

14
Q

Use Datasets

A

store tabular data (e.g., a field that stores free-text responses from customer interactions)

15
Q

Optical Character Recognition (OCR)

A

Perform OCR to extract text when it is stored as images (e.g., scanned documents).
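
One open-source way to do this outside Foundry is the Tesseract engine via pytesseract (the file name is illustrative):

```python
# OCR a page image with the open-source Tesseract engine.
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open("scanned_page.png"))
print(text)
```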

16
Q

If you have audio files

A

first transcribe the audio to text
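
A minimal sketch using the open-source openai-whisper package (the model size and file name are illustrative choices):

```python
# Transcribe an audio file to text before chunking and embedding it.
import whisper

model = whisper.load_model("base")              # illustrative model size
result = model.transcribe("customer_call.mp3")  # illustrative file name
print(result["text"])
```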

17
Q

Context Preservation

A

By dividing texts into logical chunks (such as paragraphs or sections),
the embeddings better capture the specific context of each part.

18
Q

Improved Retrieval

A

When queries are matched against smaller, more focused chunks,
the system is more likely to retrieve the most relevant text segments
rather than entire documents.

19
Q

Scalability

A

Chunking allows parallel processing of text chunks, speeding up the embedding process
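
A sketch of that parallelism with Python's standard library; embed() is again a hypothetical per-chunk embedding call:

```python
# Embed chunks in parallel; threads help when each call is network-bound.
from concurrent.futures import ThreadPoolExecutor

def embed(chunk: str) -> list[float]:
    raise NotImplementedError  # hypothetical: call your embedding model here

def embed_all(chunks: list[str]) -> list[list[float]]:
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(embed, chunks))
```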

20
Q

Actions

A
  • Can edit the Ontology and interact with external systems:
    • Modify objects/links in the Ontology
    • Notifications
    • Webhooks / API calls
21
Q

Functions

A
  • Can only accept inputs and return outputs
  • Can’t directly edit the Ontology
22
Q

4 components of the Use LLM block

A
  1. System Prompt
  2. Provided Tools (optional)
  3. Task Prompt
  4. Output + Model/Prompting Strategy Configuration
23
Q

The Use LLM Block: System Prompt

A
  • Tells the LLM what its “role” is
  • Provides high-level context into the “frame of mind” it should adopt
24
Q

The Use LLM Block: Tools

A
  • Explicitly provided to the LLM block to use during processing
  • Apply Actions – existing Action Types to apply Ontology Edits
  • Calculator Tool – LLMs’ capabilities with math are still developing; provide this to improve calculation reliability
  • Call function – existing functions published on Foundry
  • Current date
  • Query objects – provisions additional, but controlled, access to certain Object Types + Link Types
25
Q

The Use LLM Block: Task Prompt

A
  • Instructs the LLM with specific tasks to perform
  • Can reference input variables (including properties, if the variable is an Ontology object) and the outputs of previous blocks
26
Q

The Use LLM Block: Output Configuration

A
  • Tells the block what type of data it should return
  • Selects the model to use (e.g., OpenAI GPT-4, OpenAI GPT-4 Turbo, Anthropic Claude 3 Sonnet)
  • Prompting Strategy:
    • Chain of Thought
    • Single Completion
27
Q

3 Main Components of AIP Logic's interface

A
  1. Inputs, blocks, and outputs configuration
  2. Debugger
  3. Run panel