lecture 2 Flashcards
areas of linguistics
- phonetics
- phonology
- morphology
- syntax
- semantics
- pragmatics
phonetics
sounds of human language
phonology
sound systems in human languages
morphology
formation and internal structure of words
syntax
formation and internal structure of sentences
semantics
meaning of sentences
pragmatics
study of the way sentences with their semantic meanings are used for particular communicative goals
NLP and text
much of NLP focuses on text only, leaving out many layers of natural language
e.g., phonetics/phonology
natural language is
- compositional
- arbitrary
- creative
- displaced
compositional
the meaning of a sentence is built from the meanings of the individual words (semantics) and the way they are combined (syntax)
[set of rules that define grammaticality] + [lexicon of words that relate to the world we want to talk about]
meaning of an expression = semantics + syntax
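a minimal sketch of this idea (the lexicon and the SVO rule below are invented for illustration, not the lecture's formalism):

```python
# semantics: a lexicon mapping word forms to meanings
lexicon = {
    "alice": "ALICE", "bob": "BOB",   # entities
    "sees": "SEE", "likes": "LIKE",   # relations
}

# syntax: an SVO combination rule that says how the word meanings compose
def interpret_svo(sentence):
    """Interpret 'subject verb object' as predicate(subject, object)."""
    subj, verb, obj = sentence.lower().split()
    return (lexicon[verb], lexicon[subj], lexicon[obj])

print(interpret_svo("Alice sees Bob"))   # ('SEE', 'ALICE', 'BOB')
print(interpret_svo("Bob sees Alice"))   # same words, different meaning: the rule combines them differently
```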
arbitrary
the link between form and meaning is arbitrary
creative
every language can create an infinite number of possible new words and sentences
displaced
we can talk about things that are not immediately present
human natural language
- there is a critical period for acquiring language
- children need to receive real input to acquire language
- language is interconnected with other cognitive abilities
structure & grammar
- structure dictates how we can use language
- we implicitly know complex rules about structure.
- a community of speakers shares a rough consensus on these implicit rules; a grammar attempts to describe them
descriptive linguistics
how language is studied
focuses on describing how language is used in practice, without making judgments about correctness.
aims to objectively analyze and document rules that speakers naturally follow
prescriptive linguistics
how language is taught
prescribes rules about how language should be used
often involves enforcing traditional rules and norms, which may not reflect actual usage
language rules in education (grammar)
the rules taught as part of language education often serve purposes beyond describing the language
they often reflect social, cultural, and political influences
grammaticality
a community of speakers shares a rough consensus on their implicit rules.
- all utterances we can generate from these rules are grammatical
- if we cannot produce an utterance using these rules, it's ungrammatical
- SVO order
- subject & object pronouns
- sentences can be grammatically correct without any meaning
- idiolects
grammaticality rules can accept meaningless utterances and block out utterances that would still communicate (see the toy grammar sketch below).
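a toy grammar along these lines, written with nltk's CFG as one possible encoding (the rules and vocabulary are invented for illustration):

```python
import nltk

# a few implicit rules made explicit: SVO order, distinct subject/object pronouns
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> ProSubj | Adj NP | N
    VP -> V ObjNP | V
    ObjNP -> ProObj | N
    ProSubj -> 'she'
    ProObj  -> 'her'
    Adj -> 'colorless' | 'green'
    N   -> 'ideas'
    V   -> 'sees' | 'sleep'
""")
parser = nltk.ChartParser(grammar)

def grammatical(sentence):
    """True if the rules can generate the sentence at all."""
    return any(parser.parse(sentence.split()))

print(grammatical("she sees her"))                 # True: SVO with correct pronoun cases
print(grammatical("her sees she"))                 # False: pronoun cases swapped
print(grammatical("colorless green ideas sleep"))  # True: grammatical yet meaningless
```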
why do we need rules?
- if we ignore rules because we know what was probably intended, we actually limit possibilities
- rules give us expressivity
NLP before self-supervised learning
the way to approach NLP was through understanding the human language system, and trying to imitate it (rule-based)
- probing
- reverse engineering
probing
small supervised classifiers (probes) trained to extract linguistic information from another model's internal representations
this helps understand how well different layers of an LLM capture various linguistic features
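a minimal probing sketch with scikit-learn; the arrays here are random placeholders standing in for real LM activations and gold POS labels (an assumption, so the sketch runs on its own):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(500, 64))   # placeholder: one layer's representation per token
pos_tags = rng.integers(0, 5, size=500)      # placeholder: gold POS label per token

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, pos_tags, random_state=0)

probe = LogisticRegression(max_iter=1000)    # deliberately small and simple
probe.fit(X_train, y_train)

# with real activations, high accuracy suggests the probed layer encodes POS information
print(probe.score(X_test, y_test))
```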
reverse engineering language
- syntax: parse the input to understand its grammatical structure
- semantics: interpret meaning of the parsed input
- discourse: understand the broader context and relationships between sentences
the process involves using theories to inform the design of NLP models, ensuring they can parse, understand, and generate human language
testing an LLM's understanding of syntax
- jabberwocky sentences
- learning to apply grammatical rules from vast amounts of text data
- word order
- lexical generalization
jabberwocky sentences
testing whether language models can understand and represent syntactic structures, even with nonsensical or novel words
to see if a model’s latent space encodes structural information
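a rough sketch of how such test items can be built: keep the syntactic frame and the function words, and fill the content-word slots with pseudowords (the word lists and frame are invented for illustration):

```python
import random

pseudo_nouns = ["tove", "borogove", "wabe"]
pseudo_verbs = ["gyred", "gimbled", "outgrabe"]
pseudo_adjs  = ["slithy", "mimsy", "frumious"]

def jabberwocky(frame):
    """Fill a POS-tagged frame with nonsense words, preserving the structure."""
    slots = {"N": pseudo_nouns, "V": pseudo_verbs, "ADJ": pseudo_adjs}
    return " ".join(random.choice(slots[w]) if w in slots else w
                    for w in frame.split())

# the frame keeps SVO structure while every content word is novel
print(jabberwocky("the ADJ N V the N"))   # e.g. 'the slithy tove gyred the wabe'
```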
word order
word order determines which word is the subject and which is the object
a model’s latent space can represent syntactic roles based on the positions of words in the space
LLMs are often insensitive to word order, which affects their ability to grasp syntactic roles
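a made-up toy example of what is at stake: a representation that throws away word order (here a bag-of-words average over invented vectors) cannot tell subject from object:

```python
import numpy as np

vec = {"dog":   np.array([1.0, 0.0]),
       "bites": np.array([0.0, 1.0]),
       "man":   np.array([0.5, 0.5])}

def order_free(sentence):
    """Average the word vectors; positional information is discarded."""
    return np.mean([vec[w] for w in sentence.split()], axis=0)

print(np.allclose(order_free("dog bites man"),
                  order_free("man bites dog")))   # True: who bit whom is lost
```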
lexical generalization
task: semantic interpretation
testing how models handle novel words within familiar syntactic structures
–> i.e., whether models can generalize learned structures
COGS measure
generalization is hard for seq2seq models, not as hard for models with structure built in
–> structure-aware models are therefore better
COGS measure
tests models on their ability to generalize compositions of words and structures, by ensuring words/structures are different between the training and test set
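a toy COGS-style split (sentences and logical forms invented for illustration): a word is seen only as a subject during training and only as an object at test time, so a model must generalize the word to a new structural position:

```python
train = [
    ("the hedgehog ate the cake", "eat(hedgehog, cake)"),
    ("the girl saw the cake",     "see(girl, cake)"),
]
test = [
    ("the girl saw the hedgehog", "see(girl, hedgehog)"),   # unseen word/position pairing
]

# sanity check: 'hedgehog' never occurs in object position in the training data
assert all(not form.rstrip(")").endswith("hedgehog") for _, form in train)
```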
why is NLP hard
- ambiguity
- sparse data due to Zipf's law
- variation
- expressivity
- context dependence
- unknown representation
- spoken & grounded
ambiguity
language has ambiguity at many levels
–> word senses, POS, syntactic structure, quantifier scope, multiple meanings
solution:
1. non-probabilistic methods: return all possible analyses
2. probabilistic models: return the most probable analysis
–> ‘best’ is only good if our probabilities are accurate
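a toy illustration of the two strategies, using the classic PP-attachment ambiguity (the parses and probabilities are made up):

```python
analyses = {
    "[I [saw [the man [with the telescope]]]]": 0.3,   # the man has the telescope
    "[I [saw [the man]] [with the telescope]]": 0.7,   # I used the telescope to see him
}

all_parses = list(analyses)                    # non-probabilistic: return every analysis
best_parse = max(analyses, key=analyses.get)   # probabilistic: return the most probable one
print(best_parse)                              # only as good as the learned probabilities
```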
statistical NLP
- typically more robust than earlier rule-based methods
- relevant statistics/probabilities are learned from data
- normally requires lots of data about any particular phenomenon
sparse data due to Zipf's law
- rank-frequency distribution is an inverse relation
- f * r = k (frequency times rank is roughly constant)
- a small number of words are very common, while the majority of words are rare, making it difficult for models to learn effectively from limited instances
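a quick way to see this on any plain-text corpus ('corpus.txt' is a placeholder filename):

```python
from collections import Counter

words = open("corpus.txt", encoding="utf-8").read().lower().split()
counts = Counter(words).most_common()

# Zipf's law: frequency times rank stays roughly constant (f * r = k)
for rank in (1, 10, 100, 1000):
    if rank <= len(counts):
        word, f = counts[rank - 1]
        print(f"rank {rank:>4}  {word:<15}  f*r = {f * rank}")

# the long tail behind the sparse-data problem: words that occur only once
print(sum(1 for _, f in counts if f == 1), "hapax legomena out of", len(counts), "word types")
```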
variation
- POS taggers are trained on formal language, which makes them hard to use on informal language such as social media text
- different contexts, vocabulary, grammatical structures, and varieties in language can reduce the tagger’s effectiveness
expressivity
- one form can have different meanings
–> e.g., bank
- the same meaning can have different forms
–> e.g., ‘some kids popped by’ vs ‘some children visited’
context dependence + unknown representation
correct interpretation is context-dependent and often requires world knowledge
–> e.g., interpretation of ‘he dropped the ball’
the role of meaning in linguistic structure
the meaning of words contributes to the overall structure and coherence of language
how models are trained
- input with rich semantic embedding
- add positions to embeddings
- masked self-attention
- feed-forward layers
- linear transformations
- softmax
- probabilities
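a stripped-down numpy sketch of this pipeline (single head, random weights, no layer norm; the sizes are arbitrary assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

vocab, seq_len, d = 10, 4, 8
rng = np.random.default_rng(0)
E   = rng.normal(size=(vocab, d))              # token embedding matrix
pos = rng.normal(size=(seq_len, d))            # position embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
W1, W2 = rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))
W_out = rng.normal(size=(d, vocab))            # final linear projection to the vocabulary

tokens = np.array([1, 5, 2, 7])

# input with rich semantic embedding + add positions to embeddings
x = E[tokens] + pos

# masked self-attention: each position attends only to itself and the past
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d)
scores[np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)] = -np.inf
x = x + softmax(scores) @ v                    # residual connection

# feed-forward layers
x = x + np.maximum(0, x @ W1) @ W2

# linear transformation, softmax, probabilities
probs = softmax(x @ W_out)
print(probs.shape, probs[-1].sum())            # (4, 10), next-token distribution sums to 1
```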
differential object marking
different languages handle syntactic representation of objects within a sentence in different ways
understanding differential object marking is crucial for NLP systems to accurately process and generate language representations across diverse languages
LMs are aware of these gradations; animacy influences this grammatical distinction
–> animate entities are more likely to be subjects (agents)
maybe not all structure-word combinations are possible
making less plausible situations more explicit is a common feature of grammatical structure
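one common way to probe this is with minimal pairs: compare how a language model scores a sentence with and without the object marker, for animate vs inanimate objects. the sketch below is only an illustration; the model name is a placeholder and the Spanish examples (with the personal ‘a’) are invented:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "some-spanish-causal-lm"   # placeholder: substitute any Spanish-capable causal LM
tok = AutoTokenizer.from_pretrained(model_name)
lm = AutoModelForCausalLM.from_pretrained(model_name)

def avg_logprob(sentence):
    """Average per-token log-probability of the sentence under the LM."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)       # loss = mean negative log-likelihood
    return -out.loss.item()

# animate object: the marked form should score higher if the LM tracks the gradation
print(avg_logprob("Ayer vi a la doctora."), avg_logprob("Ayer vi la doctora."))
# inanimate object: the unmarked form should score higher
print(avg_logprob("Ayer vi la película."), avg_logprob("Ayer vi a la película."))
```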
meaning is not always compositional
- idioms and metaphors: meaning cannot be directly inferred from the words themselves
- we’re constantly using constructions that we couldn’t get from just a syntactic + semantic parse
i.e., language understanding requires more than just analyzing individual words and their immediate syntax
balancing surface-level memorization and deeper understanding within NLP models
high-dimensional spaces are much better at capturing specific subtleties than any rules we could come up with