LLM Modeling Flashcards
What model did we use in class?
Flan-T5
RNN
recurrent neural networks (previous generation) –> each prediction can only look at the words before it, processed sequentially
LLM
large language models = every word attends to every other word, with learned attention weights capturing how much each word influences the others
Tokenize
Convert each word (or sub-word) into a numeric ID, which is then looked up as a vector (embedding)
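A minimal tokenization sketch, assuming the Hugging Face transformers package and the google/flan-t5-base checkpoint used in class (any tokenizer works the same way):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

    text = "Summarize the following conversation."
    ids = tokenizer(text).input_ids                  # words/sub-words -> integer IDs
    print(ids)                                       # list of token IDs
    print(tokenizer.convert_ids_to_tokens(ids))      # IDs back to sub-word strings
    print(tokenizer.decode(ids))                     # IDs back to text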
Self Attention
computes the relationships between all the tokens in the sequence, weighting how strongly each token should attend to every other token
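A toy scaled dot-product self-attention sketch in NumPy (illustrative only; the weight matrices here are random stand-ins for learned projections):

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        Q, K, V = X @ Wq, X @ Wk, X @ Wv               # project tokens to queries, keys, values
        scores = Q @ K.T / np.sqrt(K.shape[-1])        # similarity of every token to every other token
        scores -= scores.max(axis=-1, keepdims=True)   # numerical stability for softmax
        weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # attention weights
        return weights @ V                             # each token becomes a weighted mix of all tokens

    X = np.random.randn(4, 8)                          # 4 tokens, 8-dim embeddings
    Wq, Wk, Wv = (np.random.randn(8, 8) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)         # (4, 8): one context vector per token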
Encoder
takes the input prompt, builds a contextual understanding of it, and outputs a vector representation for each token
Decoder
accepts input tokens and generates output tokens one at a time
Sequence to Sequence
encoder-to-decoder model; translation, text summarization, and question answering are sequence-to-sequence tasks (T5, BART)
Decoder-only model
good at generating text (GPT)
Zero Shot Inference
pass no labeled examples (e.g., review + sentiment) in the prompt (a type of in-context learning (ICL))
One shot Inference
pass one labeled example in the prompt (a type of in-context learning (ICL))
Few shot inference
pass a few labeled examples in the prompt (a type of in-context learning (ICL)); see the prompt sketch below
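A sketch of how the prompt changes across zero-/one-/few-shot inference (the review text is made up; only the prompt changes, no model weights):

    zero_shot = """Classify this review: I loved this movie!
    Sentiment:"""

    one_shot = """Classify this review: The plot was dull and predictable.
    Sentiment: Negative

    Classify this review: I loved this movie!
    Sentiment:"""

    # Few-shot: same pattern, with several labeled examples before the new input.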
Greedy
always take the most probable next word, so the same prompt produces the same output over and over again
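Greedy vs. sampled decoding sketch with Hugging Face generate (assumes the flan-t5-base checkpoint from class):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

    inputs = tokenizer("Translate to German: Good morning", return_tensors="pt")

    greedy = model.generate(**inputs, do_sample=False)    # always the top token -> same output every run
    sampled = model.generate(**inputs, do_sample=True,    # sample from the distribution -> varies run to run
                             top_k=50, temperature=0.8)

    print(tokenizer.decode(greedy[0], skip_special_tokens=True))
    print(tokenizer.decode(sampled[0], skip_special_tokens=True))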
Big data
When the LLM is too big to train on a single GPU
DDP
Distributed Data Parallel –> each GPU holds a full copy of the model and processes a different slice of the data
Fully Sharded Data Parallel (FSDP)
BIGGER SCALE: reduces memory by sharding the model parameters (plus gradients and optimizer state) across GPUs; see the sketch below
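A rough PyTorch sketch of the two wrappers (assumes torch.distributed.init_process_group has already been called with one process per GPU; the Linear layer is just a stand-in for a real LLM):

    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(4096, 4096).cuda()   # stand-in for a real model

    # DDP: every GPU keeps a full copy of the model; the data batches are split across GPUs.
    # model = DDP(model)

    # FSDP: parameters, gradients, and optimizer state are sharded across GPUs,
    # so much larger models fit in memory.
    model = FSDP(model)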
Three main variables for scale
1) Constraints (GPUs, time, cost) 2) Dataset size (number of tokens) 3) Model size (number of parameters)
Chinchilla
very large models may be over-parameterized and under-trained –> better to use fewer parameters and feed the model more data than to keep making it bigger and bigger
Fine Tuning an existing model
the output is a new model
Specific Use Case Training
~500 examples (prompt-completion pairs) –> updates all parameters –> could lead to catastrophic forgetting (which may not matter for a single-use-case implementation) –> very compute intensive
Multiple Use Case Training
1000's of examples across multiple tasks –> updates all parameters –> less likely to have catastrophic forgetting since it's trained across multiple tasks
PEFT (Parameter Efficient Fine Tuning)
only a small number of trainable parameters; the rest are frozen –> MUCH MORE EFFICIENT
Reparameterize model weights (LoRA)
Freezes the original model weights
Injects small low-rank matrices alongside them; only those injected matrices are trained
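A LoRA sketch using the Hugging Face peft library (the target_modules names below match T5-style attention layers; adjust for other architectures):

    from transformers import AutoModelForSeq2SeqLM
    from peft import LoraConfig, get_peft_model, TaskType

    base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

    lora_config = LoraConfig(
        r=8,                        # rank of the injected low-rank matrices
        lora_alpha=32,              # scaling factor
        target_modules=["q", "v"],  # which weight matrices get LoRA adapters
        lora_dropout=0.05,
        bias="none",
        task_type=TaskType.SEQ_2_SEQ_LM,
    )

    peft_model = get_peft_model(base_model, lora_config)
    peft_model.print_trainable_parameters()   # only a tiny fraction of weights is trainable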
Additive
add trainable layers or parameters to the model –> KEEPS THE ENTIRE EXISTING MODEL (frozen)
PROMPT ENGINEERING
crafting the prompt itself (zero/one/few-shot inference, etc.); no weights are changed
PROMPT TUNING
Prompt tuning trains a small set of "soft prompt" vectors (virtual tokens) that are prepended to the input; the LLM's own weights stay frozen (an additive PEFT technique). Unlike prompt engineering, the prompt is learned rather than hand-written; unlike fine-tuning, the model weights themselves are not updated.
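A prompt-tuning sketch with the peft library; only the soft-prompt (virtual token) embeddings are trained:

    from transformers import AutoModelForSeq2SeqLM
    from peft import PromptTuningConfig, TaskType, get_peft_model

    base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

    config = PromptTuningConfig(
        task_type=TaskType.SEQ_2_SEQ_LM,
        num_virtual_tokens=20,      # length of the learned soft prompt
    )

    model = get_peft_model(base_model, config)
    model.print_trainable_parameters()   # just the virtual-token embeddings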
FLAN
Fine-tuned Language Net - a specific set of instructions/datasets used to perform instruction fine-tuning
FLAN-T5 and FLAN-PaLM are the instruction-tuned versions of the T5 and PaLM models
ROUGE
used for text summarization; compares the output to one or more reference summaries
Recall
how much of the reference appears in the output; a very long response can have a recall of 100% but be too wordy
Precision
how much of the output matches the reference (extra words in the output lower precision)
F1
the harmonic mean of recall and precision
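A ROUGE sketch using the rouge_score package (the reference/candidate strings are made up):

    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    reference = "It is cold outside."
    candidate = "It is very cold outside today."

    scores = scorer.score(reference, candidate)
    print(scores["rouge1"].recall)     # matching the whole reference raises recall
    print(scores["rouge1"].precision)  # extra words in the candidate lower precision
    print(scores["rouge1"].fmeasure)   # harmonic mean of the two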
BLEU
used for text translation; compares the output to human-generated reference translations
RLHF
Reinforcement Learning from Human Feedback - tuning a model to be helpful, honest, and harmless (the three H's: HHH)
Have humans rank how 'good' responses are by comparing completions (e.g., 3 options) against a criterion (how helpful? how harmless? how honest?)
PPO
Proximal Policy Optimization - a popular algorithm for solving reinforcement learning problems. It updates the LLM within a very small (proximal) region over many iterations to better handle HHH
Reward model
a model trained with supervised learning on the human-ranked prompt/response pairs; it then scores new completions (e.g., comparing class probabilities such as hate vs. not-hate), and that score becomes the reward
Reward Hacking
where the model tries to optimize its score in unintended ways, e.g., by making answers that are long and wordy
Avoid reward hacking by comparing the tuned model to a frozen reference model via a KL-divergence shift penalty (see the sketch below)
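A toy illustration of the idea (not the actual PPO loss): penalize the reward when the tuned policy's next-token distribution drifts far from the frozen reference model's (the probabilities below are made-up numbers):

    import numpy as np

    def kl_divergence(p, q):
        return float(np.sum(p * np.log(p / q)))

    reference = np.array([0.50, 0.30, 0.20])   # frozen reference model's next-token probs
    policy    = np.array([0.20, 0.30, 0.50])   # tuned policy drifting toward reward-hacked outputs

    reward = 0.9                               # raw score from the reward model
    beta = 0.2                                 # penalty strength
    penalized_reward = reward - beta * kl_divergence(policy, reference)
    print(penalized_reward)                    # drifting too far from the reference lowers the reward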
Constitutional AI
allows you to scale Reinforcement Learning without human intervention. Constitutional AI (CAI) is similar to RLHF except instead of human feedback, it learns through AI feedback.
LLM Optimization Techniques
Distillation - train a smaller student model from a larger teacher model
Post-Training Quantization (PTQ) - reduce the precision of model weights (e.g., from 32-bit to 8-bit); see the sketch after this card
Pruning - remove model weights with values at or near zero (makes sense in theory, but in practice there may not be many weights that are zero or close to zero)
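A post-training dynamic quantization sketch in PyTorch (a tiny stand-in network; Linear layers go from 32-bit floats to 8-bit integers with no retraining):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )
    print(quantized)   # Linear layers replaced by dynamically quantized versions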
3 types of issues with models
1) Out of date
2) Bad at math (can't actually do calculations)
3) Hallucinations - guessing answers it doesn't know
How to mitigate issues with models
RAG (retrieval-augmented generation): get the details directly from a DB/API, then pass them to the model as extra context (see the sketch after this card)
Chain of Thought –> Provide hints of how to break the problem into smaller parts [good for simple problems]
Program-Aided Language Models (PAL) - have the model write Python code for the math, run the code, and feed the result back to the model
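A minimal RAG sketch; retrieve_docs and the prompt template are hypothetical stand-ins (a real system would query a vector DB or API here):

    def retrieve_docs(question: str) -> list[str]:
        # hypothetical retriever, e.g. a vector-store similarity search
        return ["Policy doc: refunds are allowed within 30 days of purchase."]

    def build_rag_prompt(question: str) -> str:
        context = "\n".join(retrieve_docs(question))
        return (
            "Answer the question using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}\n"
            "Answer:"
        )

    # The assembled prompt (retrieved context + question) is what gets passed to the LLM.
    print(build_rag_prompt("How long do customers have to request a refund?"))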
Responsible AI - how to mitigate:
Toxicity [curate training data, train guardrail models, use a diverse group of human annotators], Hallucination [educate users/add disclaimers]
Intellectual property [not easy; machine 'unlearning', filtering/blocking]
Existing metrics to measure hallucination
1) ROUGE –> compare to expected results
2) Ask ChatGPT to grade the output
3) Probability checks
A new way to score LLM hallucinations (Galileo)
ChainPoll
Pass the results to a chain-of-thought model, which returns both a score and the reasoning path behind it
What does GPT stand for?
Generative Pre-trained Transformers, commonly known as GPT, are a family of neural network models that use the transformer architecture; they are a key advancement in artificial intelligence (AI), powering generative AI applications such as ChatGPT.