Lecture 1 Flashcards
Norbert Wiener:
what distinguishes human communication from the communication of most other animals?
- the delicacy and complexity of the code used
- the high degree of arbitrariness of this code
syntax
rules and principles that govern the structure of sentences in a language
part of speech (POS)
labels that indicate the grammatical category of a word
morphology
form and structure of words
- how words are created, how they can change form, and how these forms contribute to their meaning
- ‘is’ is a form of ‘be’ in the 3rd person singular, present tense
semantics
meaning of words and the sentence as a whole
discourse
considers the sentence within the context of larger communication
natural language processing
definition + process
giving computers the ability to understand text and spoken words in much the same way humans can
- representing input
- generating output
- computational modeling
NLP: representing input
represent language in a way that a computer can process it
e.g., tokenization, stemming, lemmatization
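A minimal sketch of those three input-representation steps. These are toy stand-ins written for illustration, not a real NLP library; the suffix rules and the lemma table are invented examples.

```python
# Toy illustration of representing text so a computer can process it.

def tokenize(text):
    """Split text into lowercase word tokens (whitespace only, for simplicity)."""
    return text.lower().split()

def stem(token):
    """Crude suffix-stripping stemmer; real stemmers (e.g. Porter) use many more rules."""
    for suffix in ("ing", "ies", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

# A tiny invented lookup table standing in for dictionary-based lemmatization.
LEMMAS = {"is": "be", "was": "be", "running": "run", "mice": "mouse"}

def lemmatize(token):
    """Map a word form to its dictionary form (lemma)."""
    return LEMMAS.get(token, token)

tokens = tokenize("The mice was running")
print(tokens)                          # ['the', 'mice', 'was', 'running']
print([stem(t) for t in tokens])       # ['the', 'mice', 'was', 'runn']
print([lemmatize(t) for t in tokens])  # ['the', 'mouse', 'be', 'run']
```

Note the contrast the card implies: stemming just chops suffixes (so "running" becomes the non-word "runn"), while lemmatization maps to a real dictionary form ("running" to "run", "is" to "be").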
NLP: generating output
produce language output in a way that is useful for humans
NLP: computational modeling
understanding language structure and language use
gap of 4-5 orders of magnitude between the amount of language input LLMs receive and the amount humans receive
LLMs are trained on datasets that are much larger than the amount of language input humans receive over a similar timespan
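The size of that gap can be checked with quick arithmetic. The specific figures below are assumptions for illustration (roughly 1e8 words of input for a human growing up, versus 1e12-1e13 training tokens for a large LLM); only the resulting 4-5 order-of-magnitude gap is from the card.

```python
import math

# Assumed illustrative figures (not from the lecture):
# a human hears/reads on the order of 1e8 words while growing up,
# while a large LLM is trained on roughly 1e12-1e13 tokens.
human_words = 1e8
llm_tokens_low, llm_tokens_high = 1e12, 1e13

gap_low = math.log10(llm_tokens_low / human_words)
gap_high = math.log10(llm_tokens_high / human_words)
print(f"gap: {gap_low:.0f}-{gap_high:.0f} orders of magnitude")  # gap: 4-5 orders of magnitude
```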
different scenarios for understanding and developing human-scale and LLM-scale intelligence
- innate (nativist) - humans
- grounding (multimodal) - both
- active/social (constructivist) - LLMs
- evaluation (humans and LLMs are fundamentally misaligned) - LLMs
stochastic parrot
The term “stochastic parrot” is used metaphorically to describe a model that generates text by probabilistically selecting words based on learned patterns, much like how a parrot might repeat phrases without understanding their meaning.
In NLP, probabilistic models use statistical methods to predict the likelihood of word sequences. They rely on large amounts of data to learn these probabilities and generate coherent text.
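A minimal sketch of such a probabilistic model: a bigram model that learns, from a tiny corpus, which words tend to follow which, then generates text by sampling a likely next word at each step. The corpus and the starting word are invented for illustration; the point is that the model pattern-matches word sequences without any grasp of their meaning.

```python
import random
from collections import defaultdict

# Tiny invented corpus; a real model would be trained on vastly more data.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Learn bigram statistics: which words follow each word, with repetition
# preserving their relative frequency.
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start, length, seed=0):
    """Generate text by repeatedly sampling a likely next word --
    probabilistic pattern-matching, like a parrot repeating phrases."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        options = follows.get(words[-1])
        if not options:
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(generate("the", 8))
```

Every word the model emits is locally plausible because it was seen following the previous word in training, yet the model has no representation of what a cat or a mat is, which is exactly the "stochastic parrot" critique.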