AIP Logic & LLMs (Ontologize) Flashcards
Two ways of using LLMs in Python Transforms
- External API endpoints
- Palantir-provided models
Palantir-provided models
- Part of AIP
- Make working with LLMs from code more ergonomic
Considerations when using LLMs in Python transforms
- Easier to rack up compute costs
- Subject to rate limits
Without the AIP library
- Need to ensure you stay within token limits (see the sketch below)
- More configuration is needed to loop through datasets
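A minimal sketch of the manual bookkeeping this implies, assuming a tiktoken tokenizer and an arbitrary token budget (neither is a Foundry-specific API); it groups row texts into batches that stay under a context-window limit:

```python
import tiktoken

MAX_PROMPT_TOKENS = 4000  # illustrative budget; the real limit depends on the model you call
enc = tiktoken.get_encoding("cl100k_base")

def batch_rows_by_token_budget(row_texts, budget=MAX_PROMPT_TOKENS):
    """Group row texts into batches whose combined token count stays under the budget."""
    batch, used = [], 0
    for text in row_texts:
        n_tokens = len(enc.encode(text))
        if batch and used + n_tokens > budget:
            yield batch
            batch, used = [], 0
        batch.append(text)
        used += n_tokens
    if batch:
        yield batch
```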
With the transforms-aip Library
- The model is an input from the model library
- Processes datasets well: each row can serve as its own prompt (see the sketch below)
- Maximizes throughput given rate limits
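Conceptually, the row-per-prompt pattern looks like the sketch below; `call_model`, the prompt text, and the rate limit are hypothetical placeholders, not the actual transforms-aip interface (which takes the model as a transform input and manages rate limiting for you).

```python
import time

REQUESTS_PER_MINUTE = 60  # illustrative rate limit

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for the Palantir-provided model supplied to the transform."""
    raise NotImplementedError("Replace with the platform-provided model client")

def run_row_prompts(row_texts):
    """Send each row as its own prompt, naively pacing requests to respect the rate limit."""
    delay = 60.0 / REQUESTS_PER_MINUTE
    responses = []
    for text in row_texts:
        responses.append(call_model(f"Classify the sentiment of this record:\n{text}"))
        time.sleep(delay)
    return responses
```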
Using LLMs in Pipeline Builder
- Quick to implement, but less fine-grained control
- Can do:
- Classification
- Sentiment Analysis
- Summarization
- Translation
- Entity Extraction
- Use “Empty Prompt” for more open-ended problems
What is Retrieval-Augmented Generation (RAG)?
Used to augment the capabilities of LLMs by allowing them to generate responses that incorporate information they were not trained on.
Why is RAG useful?
- A cheaper, faster, and less risky way to enable LLMs to do useful things with your data
- No need to fine-tune or retrain the model on your own data
- Avoids the risk of leaking your data
What are embeddings?
Embeddings are vector representations (arrays of numbers) of text that capture semantic meaning.
Why do we need embeddings?
They let us compute relevance.
If you ask an LLM a question, how does it know what data (the text you created embeddings for) is the most relevant for generating a response?
It creates an embedding from your question and then finds the data that is closest in high-dimensional vector space, and therefore most likely to be relevant. It uses the most relevant data to generate a response (see the sketch below).
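A minimal sketch of that retrieval step, assuming the question and the text chunks have already been embedded (the embedding model itself is not shown):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two embedding vectors: closer to 1.0 means closer in vector space."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_relevant_chunks(question_embedding, chunk_embeddings, chunks, k=3):
    """Return the k chunks whose embeddings are closest to the question embedding."""
    scores = [cosine_similarity(question_embedding, e) for e in chunk_embeddings]
    ranked = sorted(zip(scores, chunks), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:k]]

# The retrieved chunks are then inserted into the prompt as context, e.g.:
# prompt = "Answer using only this context:\n" + "\n".join(relevant) + "\nQuestion: " + question
```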
How do we create embeddings in Foundry?
- Ingest your data
- Make your data machine-readable (if needed)
- Chunk the text
- Create embeddings (see the sketch below)
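In Foundry, the embedding model for the final step would typically be a Palantir-provided model passed into the transform; the open-source sentence-transformers call below is only an illustrative stand-in for "create embeddings".

```python
from sentence_transformers import SentenceTransformer

# Illustrative stand-in for a platform-provided embedding model.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Refund policy: customers may return items within 30 days.",
    "Standard shipping usually takes 3 to 5 business days.",
]

# Each chunk becomes a fixed-length vector that captures its semantic meaning.
embeddings = model.encode(chunks)  # shape: (len(chunks), embedding_dim)
```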
Use Media Sets
To store PDFs, images, audio files, and other non-tabular data
Use Datasets
To store tabular data (e.g., a field that stores free-text responses from customer interactions)
Optical Character Recognition (OCR)
Perform OCR to extract the text, e.g., when text is stored as images
If you have audio files
First transcribe them to text (see the sketch below)
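A sketch of this preprocessing using common open-source tools (pytesseract for OCR, openai-whisper for transcription); the file paths are placeholders, and inside Foundry these steps would read from a media set within a transform.

```python
from PIL import Image
import pytesseract
import whisper

# OCR: extract text from a page stored as an image (placeholder path).
page_text = pytesseract.image_to_string(Image.open("scanned_page.png"))

# Transcription: convert an audio recording to text (placeholder path).
speech_model = whisper.load_model("base")
call_text = speech_model.transcribe("customer_call.mp3")["text"]
```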
Context Preservation
By dividing texts into logical chunks (such as paragraphs or sections), the embeddings better capture the specific context of each part.
Improved Retrieval
When queries are matched against smaller, more focused chunks, the system is more likely to retrieve the most relevant text segments rather than entire documents.
Scalability
Chunking allows parallel processing of text chunks, speeding up the embedding process
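A minimal sketch of paragraph-based chunking; the character limit is an arbitrary illustrative choice (real pipelines often chunk by token count and add overlap between chunks).

```python
def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text on blank lines into paragraphs, then merge paragraphs up to max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```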
Actions
- Can edit the Ontology and interact with external systems
- Modify objects/links in the Ontology
- Notifications
- Webhooks / API calls
Functions
- Can only accept inputs and return outputs
- Can’t directly edit the Ontology
4 components of the Use LLM block
- System Prompt
- Provided Tools (optional)
- Task Prompt
- Output + Model/Prompting Strategy Configuration
The Use LLM Block: System Prompt
- Tells the LLM what its “role” is
- Provides high-level context on the “frame of mind” it should adopt (example below)
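An illustrative (made-up) example of how the two prompt types divide the work:
- System Prompt: “You are a support analyst. You classify customer tickets by urgency and always answer with a single word: Low, Medium, or High.”
- Task Prompt: “Classify the urgency of the following ticket: {ticket_text}”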
The Use LLM Block: Tools
- Explicitly provided to the LLM block to use during processing
- Apply Actions – existing Action Types to apply Ontology edits
- Calculator Tool – LLMs’ capabilities with math are still developing; provide this to improve calculation reliability
- Call function – existing functions published on Foundry
- Current date
- Query objects – provision additional, but controlled, access to certain Object Types + Link Types