Large Language Models Flashcards
How do neural networks aim to emulate the brain’s manner of computation?
They use many simple processors, analogous to neurons, that communicate with one another.
what do Classifier Networks do?
They learn to map inputs onto categories by generalizing from training examples.
how do we build a classifier?
we train it on many training examples
supervised learning
each training example is an input paired with the correct category
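The cards above can be sketched in code. Below is a minimal, illustrative supervised-learning loop: a single perceptron-style unit (not a full neural network) adjusted on labelled input/category pairs. The data, learning rate, and number of passes are invented for the example.

```python
# Each training example is an input paired with the correct category (0 or 1).
# Data is a made-up, linearly separable toy set.
examples = [((0.0, 1.0), 1), ((1.0, 0.0), 0), ((0.2, 0.9), 1), ((0.9, 0.1), 0)]
w = [0.0, 0.0]  # weights, one per input feature
b = 0.0         # bias

for _ in range(20):                       # repeated passes over the training set
    for (x1, x2), label in examples:
        pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        err = label - pred                # 0 when the guess is already correct
        w[0] += 0.1 * err * x1            # nudge weights toward the right answer
        w[1] += 0.1 * err * x2
        b += 0.1 * err

print(w, b)
```

After training, the unit classifies every example correctly; real networks do the same kind of error-driven weight adjustment, just with many units and layers.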
what are neural networks similar to?
A probabilistic model: their outputs can be interpreted as a probability distribution over the possible categories.
when training a network inputs cause what to happen?
An input generates activity in all of the output units; training adjusts the weights so that more of that activity goes to the correct output unit, increasing the chance of the right answer.
How could you interpret the outputs of a neural network?
As a probability distribution, since the activities over all possible outputs can be normalized to sum to 1.
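The normalization mentioned in this card is usually done with a softmax function, which turns raw output activities into probabilities that sum to 1. A minimal sketch (the example activities are invented):

```python
import math

def softmax(logits):
    """Turn raw output activities into probabilities that sum to 1."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # illustrative output-unit activities
print(probs)       # the unit with the highest activity gets the highest probability
print(sum(probs))  # sums to 1, so the outputs read as a distribution
```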
How does a network for language processing work?
By taking a sequence of words (the prompt) as its input and learning to predict the next word.
what is self-supervised learning?
The network pretends it doesn't know the next word, predicts it, and then checks whether its guess matches the actual next word; the text itself supplies the correct answers, so no human labelling is needed.
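The self-supervised setup on this card amounts to slicing raw text into (prompt, next-word) training pairs. A tiny sketch, with an invented example sentence:

```python
# Turn unlabelled text into (prompt, next-word) training pairs:
# the text itself provides the "correct answer" for each prediction.
text = "the cat sat on the mat".split()

pairs = []
for i in range(1, len(text)):
    prompt, target = text[:i], text[i]  # everything so far, and the word to guess
    pairs.append((prompt, target))

for prompt, target in pairs:
    print(prompt, "->", target)
```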
what does a language processing network need to predict a good distribution?
Good word representations, and the ability to handle long prompts.
How should we represent words in a neural network?
One-Hot Encoding
A sparse vector where each word in the vocabulary is represented by a unique vector with a single high (1) value and all other values low (0).
“cat” 1 0 0 0 0
“dog” 0 1 0 0 0
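One-hot encoding is easy to sketch in code. The toy vocabulary below is invented; real vocabularies contain tens of thousands of words, so these vectors are very long and very sparse.

```python
# Hypothetical toy vocabulary for illustration.
vocab = ["cat", "dog", "fish", "bird", "mouse"]

def one_hot(word):
    """Return a sparse vector: 1 at the word's index, 0 everywhere else."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("cat"))  # [1, 0, 0, 0, 0]
print(one_hot("dog"))  # [0, 1, 0, 0, 0]
```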
Contextualized Word Embeddings
Word representations that take the surrounding context into account, providing different embeddings for the same word in different contexts.
what are embeddings?
Distributed word representations: each word is 'stuck' at a learned point in an n-dimensional space, so that words with related meanings end up near one another.
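To make the "points in space" idea concrete, here is a sketch with hand-made 2-dimensional embeddings; real embeddings are learned, have hundreds or thousands of dimensions, and these particular values are invented.

```python
import math

# Invented 2-D embeddings: related words sit at nearby points.
embeddings = {
    "cat": (0.9, 0.8),
    "dog": (0.85, 0.75),  # close to "cat": similar meaning, nearby point
    "car": (-0.7, 0.1),   # far from both: unrelated meaning
}

def distance(a, b):
    """Euclidean distance between two words' embedding points."""
    return math.dist(embeddings[a], embeddings[b])

print(distance("cat", "dog"))  # small: semantically similar
print(distance("cat", "car"))  # large: semantically unrelated
```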
how are LLMs’ input word sequences processed?
by an encoder network
how are LLMs’ output word sequences processed?
by a decoder network