Vector Databases - Pinecone Flashcards
what is a vector
mathematical representation of features or attributes
can have 3 dimensions or thousands
devs: an array containing numerical values
they are the ideal data structure for ML algos
help you understand whether there is similarity between concepts
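A quick sketch of the idea: a vector is just an array of numbers, one per feature. The feature values below are made up for illustration (any real system learns them from data), but they show how similar concepts score closer together.

```python
# Toy illustration (hypothetical feature values): a vector is just an
# array of numbers, one per feature/dimension.
dog = [0.9, 0.8, 0.1]    # e.g. [furriness, domesticity, sweetness]
cat = [0.8, 0.9, 0.1]
apple = [0.0, 0.1, 0.9]

def dot(a, b):
    """Dot product: a rough similarity signal for same-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Similar concepts score higher against each other than dissimilar ones.
print(dot(dog, cat) > dot(dog, apple))  # True
```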
what is a vector embedding
long list of numbers, each describing a feature of the data object
ML algos need numbers to work with – embeddings are a way to numerically represent unstructured data (word, sentence, doc, image, video, audio, etc.) without losing its semantic meaning
embedding model is what takes raw data and converts into a vector embedding
- model can be domain specific (takes a lot of time)
- use pre-trained models (words - GloVe, sentences - USE, images - ResNet)
use case: an app would use vector embedding as its query and produce other vector embeddings which are similar to it
Semantic search
image search (image similarity)
audio search (audio similarity)
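The embedding step can be sketched as follows. `toy_embed` below is a hash-based stand-in I made up for illustration only: unlike a real pre-trained model (GloVe, USE, ResNet), it captures no semantics at all, but it shows the contract an embedding model fulfills – arbitrary text in, fixed-length numeric vector out.

```python
import hashlib

def toy_embed(text, dims=8):
    """Toy stand-in for a real embedding model: hashes each word into a
    bucket of a fixed-length vector. A real model LEARNS these features;
    this just demonstrates the input/output shape of an embedder."""
    vec = [0.0] * dims
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    return vec

query_vec = toy_embed("small fluffy dog")
print(len(query_vec))  # 8 — every input maps to the same dimensionality
```

An app would embed its query this way, then look for stored embeddings nearby in the vector space.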
Vectors vs Embedding
While embeddings and vectors can be used interchangeably in the context of vector embeddings,
“embeddings” emphasizes the notion of representing data in a meaningful and structured way,
while “vectors” refers to the numerical representation itself.
The most popular use case for vector databases is…
search (specifically semantic search)
- any areas that require semantic understanding or matching of data
- NLP, Comp Vision, Recomm Systems
apps that want to find similar products, movies, books, songs, etc
recommendation systems
don’t need predefined criteria or exact matches like traditional dbs, can find based on contextual meaning
ex: doc based on topic and sentiment
What is Vector Search (Similarity) - how can it be done
ability to find and retrieve similar objects by searching for objects close in the vector space
comparing vector embedding and determining similarity is essential for semantic search, recomm systems, anomaly detection
several types of measurements:
1. squared euclidean (L2) - straight-line distance, squared
2. manhattan (L1) - sum of absolute differences along each dimension
3. cosine similarity (angle of two vectors)
4. dot product (angle and magnitude)
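The four measurements above can be written out directly with the standard library; this is a minimal sketch of each formula, not a production implementation.

```python
import math

def squared_euclidean(a, b):   # L2 squared: straight-line distance, squared
    return sum((x - y) ** 2 for x, y in zip(a, b))

def manhattan(a, b):           # L1: sum of absolute differences
    return sum(abs(x - y) for x, y in zip(a, b))

def dot_product(a, b):         # captures both angle and magnitude
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):   # angle only: dot product of the unit vectors
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

a, b = [1.0, 2.0, 3.0], [2.0, 2.0, 2.0]
print(squared_euclidean(a, b))   # 2.0
print(manhattan(a, b))           # 2.0
print(dot_product(a, b))         # 12.0
print(cosine_similarity(a, b))
```

Note the split: the first two are distances (smaller = more similar), the last two are similarities (larger = more similar).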
What is Vector indexing - why is it important
is the process of organizing vector embeddings in a way that data can be retrieved efficiently - this is what makes querying faster
ANN (approximate nearest neighbor) - pre-calculate distances between vector embeddings, then organize and store similar vectors close to each other
ex: “dog” and “wolf” sit close to one another bc the dog is a descendant of the wolf, and also near “cat” bc cats and dogs are both common household pets
these would be far from fruits like “apple” or “banana” given the low similarity
What are types of ANN algos (how you index)
Hierarchical Navigable Small World (HNSW) - top performing index for sim search
Cluster based
proximity based
tree based
hash based
compression based
What do ANN algos do compared to kNN?
using kNN is computationally expensive and time consuming
ANN = Small change in accuracy for huge gain in speed
recall tradeoff (accuracy)
latency (milliseconds)
throughput (queries per sec)
import time (time to build the index)
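To see why exact kNN is expensive, here is the brute-force baseline that ANN indexes approximate – a sketch, not how any particular library implements it:

```python
import math

def knn(query, vectors, k=2):
    """Exact (brute-force) kNN: compares the query against EVERY stored
    vector — O(n) distance computations per query, which is what makes
    it slow at scale. ANN indexes (e.g. HNSW) skip most comparisons,
    trading a little recall for a large gain in speed."""
    return sorted(vectors, key=lambda v: math.dist(query, v))[:k]

data = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.2]]
print(knn([0.0, 0.1], data))  # the two vectors nearest the query
```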
What are LLMs and what are they doing?
LLMs can summarize, paraphrase and compress source knowledge into realistic language - they are voracious readers
LLMs are trying to build a statistical model of the language they are reading
- which combo of words or phrases are common and which aren’t
- can learn higher concepts such as relationships between words
- can even abstract away and quantify that you’re not actually referring to feather and caps (3rd level of comprehension)
LLMs should be used for the reasoning ability, not for the knowledge it has
The right way to think of the models that we create is a reasoning engine, not a fact database
How do Vector DBs enhance LLMs?
- Provide LLMs with long-term (LT) memory
- reduce hallucinations - store domain specific context
- Retrieval-augmented generation (RAG) - lets the LLM reason about new (or deleted) data and cite its sources; a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources (e.g. question-answering apps)
What is RAG?
Retrieval-Augmented Generation (RAG)
the process of augmenting inputs to a LLM with context retrieved from a vector database
enhance accuracy and reliability of gen ai models
commonly used for chatbots and question-answering systems
benefits:
- scalability (reduce model size and training cost)
- accuracy - reduce hallucination
- controllability - updating and customizing knowledge base
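The RAG loop above can be sketched end-to-end. Everything here is a toy stand-in: `embed` is a made-up keyword embedder and `llm` is a stub – a real system would use a trained embedding model, a vector database such as Pinecone for `retrieve`, and an actual generative model call.

```python
def embed(text):
    """Toy embedder (hypothetical): real RAG uses a trained model."""
    vocab = ["paris", "france", "capital", "banana"]
    words = text.lower().replace("?", "").replace(".", "").split()
    return [float(w in words) for w in vocab]

def retrieve(query, docs, k=1):
    """Rank stored docs by dot-product similarity to the query vector
    (the job a vector database does, with an ANN index, at scale)."""
    q = embed(query)
    score = lambda d: sum(a * b for a, b in zip(q, embed(d)))
    return sorted(docs, key=score, reverse=True)[:k]

def llm(prompt):
    """Stub standing in for a generative model call."""
    return f"[model output for prompt of {len(prompt)} chars]"

def rag_answer(query, docs):
    # Augment the prompt with retrieved context before calling the LLM.
    context = "\n".join(retrieve(query, docs))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)

docs = ["Paris is the capital of France.", "Bananas are yellow."]
print(rag_answer("What is the capital of France?", docs))
```

Updating the knowledge base is then just adding or deleting entries in `docs` (i.e. upserting/deleting vectors) – no retraining of the model.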
Vector DB vs Traditional DB vs Vector Capable DB vs Vector Indexing Library
Traditional DB
- optimized to store structured data (in columns)
- leverage traditional keyword search
Vector DB
- also optimized for unstructured data and vector embeddings
- enables semantic search
Vector Capable DB (SQL, NoSQL)
- usually don’t index embeddings, which slows the vector search process
Vector Indexing Library
- no real-time updates
- scalability issues (for any app that imports millions or billions of objects)
Precision & Recall
Precision is a measure of quality
- high precision means the algo returns more relevant results than irrelevant ones
Recall is a measure of quantity
- high recall means the algo returned most of the relevant results, regardless of how many irrelevant results were also returned
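The two metrics reduce to simple set arithmetic; a minimal sketch with made-up result IDs:

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved results that are relevant.
    Recall: fraction of all relevant results that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

# 4 results returned, 3 of them relevant; 5 relevant items exist overall.
p, r = precision_recall(retrieved=["a", "b", "c", "x"],
                        relevant=["a", "b", "c", "d", "e"])
print(p, r)  # 0.75 0.6
```

This is the recall that ANN algorithms trade away: their candidate set may miss a few true nearest neighbors in exchange for speed.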