llama-index Flashcards

1
Q

The basic way llama-index works is

A

You have some text files in a local directory.
You create an index using:
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
Note: You will need to have the OPENAI_API_KEY in env to create the index
You create an artifact out of it using
index.storage_context.persist()
Then you check that into git, and load it into memory using
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

Then in the view function you can call it
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
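Putting those pieces together, a minimal end-to-end sketch (assuming a llama_index ~0.6/0.7-era API and OPENAI_API_KEY set in the environment; names are taken from the snippets above):

from llama_index import SimpleDirectoryReader, VectorStoreIndex, StorageContext, load_index_from_storage

# Build the index once (this is the step that needs OPENAI_API_KEY)
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir='./storage')

# Later (e.g. at app startup), load the persisted index instead of rebuilding it
storage_context = StorageContext.from_defaults(persist_dir='./storage')
index = load_index_from_storage(storage_context)

# In the view function
query_engine = index.as_query_engine()
response = query_engine.query('What did the author do growing up?')
print(response)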

2
Q

By using their token predictor for creating an index, it seems the default VectorStoreIndex

A

does not use any LLM tokens in its creation (the MockLLMPredictor reports zero; embedding tokens are a separate cost it does not count)

from llama_index import TreeIndex, VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage, MockLLMPredictor, ServiceContext

# MockLLMPredictor counts the tokens the LLM would have used, without actually calling it
llm_predictor = MockLLMPredictor(max_tokens=256)
documents = SimpleDirectoryReader('data').load_data()
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
#index = TreeIndex.from_documents(documents, service_context=service_context)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
print(llm_predictor.last_token_usage)  # stays at 0 for VectorStoreIndex; TreeIndex would spend LLM tokens

3
Q

The difference between Top K Retrieval vs Summarize is

A

Top K Retrieval injects only the K most relevant indexed chunks into the prompt. Summarize injects all of the indexed data: it adds as much as fits into the prompt, takes the result, and carries it into another iteration with the next batch of indexed data, until everything has been queried.
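A rough sketch of how the two modes show up in the query engine API (parameter names assumed from the llama_index version used above; response mode names have shifted between releases):

# Top K retrieval: only the 2 most similar chunks get injected into the prompt
top_k_engine = index.as_query_engine(similarity_top_k=2)

# Summarize-style: fold all the retrieved data through successive LLM calls
summarize_engine = index.as_query_engine(response_mode='tree_summarize')

print(top_k_engine.query('What did the author do growing up?'))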

4
Q

In order to ensure a user's private data is not added to anyone else's context, I probably need to

A

Run the SQL query in real time rather than create a stored index of everyone's personal data.

5
Q

One benefit of Pinecone over a local index is that it allows you to

A

feed it some filters inferred by the LLM so the retrieval is more relevant.
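A hedged sketch of what that looks like with the llama_index Pinecone integration (class and parameter names from the 0.6/0.7-era API; the index name, environment, and user_id key are made up for illustration, and the filters are hard-coded here rather than inferred by the LLM):

import pinecone
from llama_index import VectorStoreIndex, StorageContext, SimpleDirectoryReader
from llama_index.vector_stores import PineconeVectorStore
from llama_index.vector_stores.types import MetadataFilters, ExactMatchFilter

pinecone.init(api_key='...', environment='us-west1-gcp')
vector_store = PineconeVectorStore(pinecone_index=pinecone.Index('my-index'))
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Restrict retrieval to one user's documents via metadata filters
filters = MetadataFilters(filters=[ExactMatchFilter(key='user_id', value='user-123')])
query_engine = index.as_query_engine(filters=filters)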

6
Q

In order to be able to use OpenAI functions and llama-index at the same time, I might need to

A

Send the user's query to OpenAI first to check if I need to call a function, and if not, send it to llama-index.
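A minimal sketch of that routing, using the pre-1.0 openai client (the function schema and query_engine are assumed to already exist; dispatching the actual function call is left out):

import openai

def handle_message(user_message, query_engine, functions):
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo-0613',
        messages=[{'role': 'user', 'content': user_message}],
        functions=functions,
        function_call='auto',
    )
    message = response['choices'][0]['message']
    if message.get('function_call'):
        return message['function_call']  # the caller dispatches the function
    # No function needed, so fall back to retrieval over the index
    return query_engine.query(user_message)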

7
Q

I need a good solution for being able to call functions. It might be the case that it's wrong to insert a bunch of context into a prompt if it's really just a function call.

What's a realistic input? Probably when I want to get information from them? So maybe I just need my bot to be aware of whether I am in data-gathering mode or retrieval mode. So when I am in data-gathering mode, I send directly to OpenAI instead of llama-index, and after I have all my data, I can go back to data-retrieval mode.

Or maybe I can just call it a survey bot, so the user knows it's in survey mode.

I wonder if I need to use the SQL system to

I just realized that llama-index assumes my database only has my own data in it, so it doesn't restrict permissions. In the meantime I might have to create a new database for every user.

Realistically, I don't need to run aggregations across all users' data, just one user's data at a time, so giving everyone their own table isn't that bad.

Since SQLStructStoreIndex asks for a table name, maybe I can just make a new table for every user and data type combination. Then I'll need to create the index on the fly when they log in and cache it.

sql_index = SQLStructStoreIndex.from_documents(
    [],
    sql_database=sql_database,
    table_name="city_stats",
)

To add their unstructured data, I might have to grab it from Postgres, then add those nodes to the existing index on the fly, and then keep their index cached and namespaced just for them.
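A sketch of that insert-on-login step, assuming the rows have already been pulled from Postgres and that the cached index supports incremental inserts via index.insert (the row field name is made up):

from llama_index import Document

for row in rows:
    # Wrap each unstructured Postgres row as a Document and add it to the user's cached index
    index.insert(Document(text=row['body']))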

It's not possible to make a table on the fly. Maybe I can manage scope somewhere else.

Seems like I might need to go in here and, before allowing it to make a query, make sure the query has a user-column restriction in it.

https://github.com/jerryjliu/llama_index/blob/aaac129580397cc4f35d11eb70f1c43e51e6d77a/llama_index/indices/struct_store/sql_query.py#LL60C16-L60C16

Seems like this library (sqlglot) will allow me to parse the SQL statement and check if it has a particular WHERE column = ID.

sql = "SELECT * FROM table WHERE column = 'value'"
parsed = sqlglot.parse(sql)
parsed[0].find(sqlglot.expressions.Where).sql()

Maybe the first check is to make sure the parsed array has length one, and then I know I only need to check for a correct WHERE once.
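A sketch of those two checks combined into one gate (the user_id column name and the exact string match are assumptions; the WHERE clause could also be walked more structurally with sqlglot):

import sqlglot
from sqlglot import expressions as exp

def is_scoped_to_user(sql, user_id):
    parsed = sqlglot.parse(sql)
    if len(parsed) != 1:   # first check: exactly one statement
        return False
    where = parsed[0].find(exp.Where)
    if where is None:      # second check: there has to be a WHERE at all
        return False
    # crude check that the WHERE pins the query to the current user
    return f"user_id = '{user_id}'" in where.sql()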

To add a WHERE:
from sqlglot import select, condition

where = condition("x=1").and_("y=1")
select("*").from_("y").where(where).sql()

If someone asks for a query that wants data from other users, we need the LLM to tell them that they can't run queries that involve other people's data.

This is how to use the OpenAI agent combined with query tools, so use SQL as a query tool.

Build an index, convert that index into a tool, plug those tools into a parent agent:

https://gpt-index.readthedocs.io/en/latest/examples/agent/openai_agent_with_query_engine.html
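Roughly, following that notebook (names hedged; vector_query_engine and sql_query_engine are assumed to exist from the earlier steps):

from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.agent import OpenAIAgent

vector_tool = QueryEngineTool(
    query_engine=vector_query_engine,
    metadata=ToolMetadata(name='user_docs', description="Answers questions over this user's unstructured documents."),
)
sql_tool = QueryEngineTool(
    query_engine=sql_query_engine,
    metadata=ToolMetadata(name='user_table', description="Runs scoped SQL queries over this user's table."),
)
agent = OpenAIAgent.from_tools([vector_tool, sql_tool], verbose=True)
response = agent.chat('Summarize what you know about me.')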

Seems like I take the vector database and the structured database, turn them into tools, then plug them into this planner tool and pass that into agent creation:

https://gpt-index.readthedocs.io/en/latest/examples/agent/openai_agent_query_plan.html

The LLM should probably know the sex of the user.

A
8
Q

Prompting

A

The LLM can follow a protocol really well, but when there are parts of the protocol that are hard and you need to add the string "Think step by step to be sure you get the right answer" or "Did you satisfy the requirements?", those need to be appended to the user message rather than described in the first prompt. One potential solution is to ask the LLM to declare what step it is on, and then when I detect that step, intercept the user message and add the text. How would I do that? For all answers after a step is declared, ask a fresh LLM if the user answered in the affirmative before asking the real LLM?
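A rough sketch of that interception with the pre-1.0 openai client (the step detection, the nudge string, and the model choice are placeholders; this is only the "ask a fresh LLM first" idea):

import openai

def maybe_nudge(user_message, current_step):
    if current_step is None:
        return user_message
    # cheap classifier call before the real LLM sees the message
    check = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content':
            f"Did this reply answer step '{current_step}' in the affirmative? Answer yes or no.\n\n{user_message}"}],
    )
    answer = check['choices'][0]['message']['content'].strip().lower()
    if answer.startswith('yes'):
        return user_message + '\n\nThink step by step to be sure you get the right answer.'
    return user_message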

Another potential learning is that once you need a concrete calculation, you should show the user the input to validate first, and then push it into a function rather than have the LLM do it.
