Lecture 4 Flashcards
What is Syntax in NLP?
Syntax is the set of rules that dictates the structure of sentences in a language, determining how words and phrases are organized.
Define a Morpheme.
A morpheme is the smallest unit of language that carries meaning; it does not necessarily stand alone, as with prefixes and suffixes.
What is a Lexeme?
A lexeme is an abstract unit of meaning representing a group of words with the same root and meaning, like “run,” “running,” “ran.”
What is the difference between Stemming and Lemmatization?
Stemming crudely reduces a word to a stem by stripping affixes (e.g., “running” to “run”), sometimes producing non-words; lemmatization reduces it to its dictionary form (lemma), taking context and part of speech into account.
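The contrast can be sketched with toy code (not a real NLP library): a crude suffix-stripping stemmer next to a small hand-built lemma dictionary standing in for a real lemmatizer.

```python
def toy_stem(word):
    # Naive stemming: strip common suffixes; may yield non-words.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# A lookup table stands in for a lemmatizer's vocabulary; real
# lemmatizers also use POS context to pick the right lemma.
LEMMAS = {"running": "run", "ran": "run", "studies": "study"}

def toy_lemmatize(word):
    return LEMMAS.get(word, word)

print(toy_stem("studies"))       # "studie" -- a non-word stem
print(toy_lemmatize("studies"))  # "study"  -- the dictionary form
```

Note how the stemmer happily emits “studie”, while lemmatization always returns an actual dictionary word.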
What are Compound Words?
Compound words are formed by combining two or more roots, such as “blackbird” (closed compound) or “credit card” (open compound).
Name three Lexical Categories.
Nouns, Verbs, and Adjectives are three major lexical categories or word classes.
What is Part-of-Speech (POS) Tagging?
POS tagging assigns a lexical category to each word in a sentence, helping identify the role of each word (e.g., noun, verb).
Why is POS Tagging challenging?
POS tagging is challenging due to ambiguity, as some words can serve different roles (e.g., “run” as a verb or noun), requiring context to determine the correct tag.
What is a Constituent in syntax?
A constituent is a group of words that function together as a single unit within a hierarchical sentence structure, such as a noun phrase or verb phrase.
Name the five types of Phrases in syntax.
Noun Phrase (NP), Verb Phrase (VP), Adjective Phrase (AdjP), Adverb Phrase (AdvP), and Prepositional Phrase (PP).
What are Phrase Structure Rules?
Phrase structure rules specify how phrases combine to form larger structures, like S → NP VP for sentence structure.
Describe a Syntax Tree.
A syntax tree visually represents the hierarchical structure of a sentence, showing how words and phrases relate to each other.
What is Dependency Parsing?
Dependency parsing analyzes the syntactic structure of a sentence by identifying relationships between words, represented as directed arcs.
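One simple way to represent such arcs is a head-index array, shown here with a hand-assigned parse for one sentence (the heads and labels are illustrative, not the output of a parser):

```python
tokens = ["She", "reads", "books"]
# heads[i] is the index of token i's head; -1 marks the root.
heads = [1, -1, 1]
labels = ["nsubj", "root", "obj"]

for tok, head, label in zip(tokens, heads, labels):
    governor = "ROOT" if head == -1 else tokens[head]
    print(f"{governor} --{label}--> {tok}")
```

Each printed line is one directed arc from a head word to its dependent, with the grammatical relation as the arc label.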
What is a Context-Free Grammar (CFG)?
A CFG is a set of recursive rules used to generate sentences, where each rule expands a symbol regardless of its context.
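Such rules can be run directly as a generator; a minimal sketch with a toy grammar and lexicon (both invented for illustration):

```python
import random

GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["DET", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "DET": [["the"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["chased"], ["slept"]],
}

def generate(symbol):
    # A terminal is any symbol with no expansion rule.
    if symbol not in GRAMMAR:
        return [symbol]
    # Expand the symbol with a randomly chosen rule, regardless of
    # context -- the defining property of a context-free grammar.
    words = []
    for sym in random.choice(GRAMMAR[symbol]):
        words.extend(generate(sym))
    return words

print(" ".join(generate("S")))  # e.g. "the dog chased the cat"
```

Because VP has two expansions, generated sentences are either three words ("the N V") or five ("the N V the N").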
List three types of verbs based on Grammar Rules.
Intransitive (no object), Transitive (one object), and Ditransitive (two objects).
What is Construction Grammar?
Construction Grammar views syntax and the lexicon as interconnected, treating both words and phrases as carriers of meaning without strict hierarchical structure.
How do Language Models handle word sequences?
Language models assign probabilities to sequences, predicting likely words based on context and training data.
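A maximum-likelihood bigram model makes this concrete; a sketch over a toy corpus (no smoothing, so unseen bigrams would get probability zero):

```python
from collections import Counter

corpus = "the dog runs . the cat runs . the dog sleeps .".split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def bigram_prob(prev, word):
    # P(word | prev) estimated by relative frequency.
    return bigrams[(prev, word)] / unigrams[prev]

def sequence_prob(words):
    # Chain the conditional probabilities across the sequence.
    p = 1.0
    for prev, word in zip(words, words[1:]):
        p *= bigram_prob(prev, word)
    return p

print(sequence_prob(["the", "dog", "runs"]))
```

In this corpus P(dog | the) = 2/3 and P(runs | dog) = 1/2, so the sequence scores 1/3.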
Define Entropy in the context of language models.
Entropy measures the average amount of information or uncertainty in language, quantifying how much information each word conveys.
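The standard formula is H = −Σ p·log₂(p); a sketch computing it for a made-up unigram distribution:

```python
import math

# Toy word distribution (probabilities chosen for illustration).
probs = {"the": 0.5, "dog": 0.25, "cat": 0.25}

# Shannon entropy in bits: average information per word.
entropy = -sum(p * math.log2(p) for p in probs.values())
print(entropy)  # 1.5
```

Intuitively, 1.5 bits means a word drawn from this distribution carries, on average, as much information as 1.5 fair coin flips.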
What is Perplexity in evaluating language models?
Perplexity measures a model’s uncertainty in predicting a sequence; lower perplexity indicates better predictive accuracy.
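Perplexity is 2 raised to the average per-word negative log₂ probability (i.e., 2 to the cross-entropy); a sketch on hand-picked toy probabilities rather than a real model's output:

```python
import math

# Probability the (hypothetical) model assigned to each test word.
word_probs = [0.5, 0.25, 0.25, 0.5]

# Average negative log2 probability = cross-entropy in bits.
avg_neg_log = -sum(math.log2(p) for p in word_probs) / len(word_probs)
perplexity = 2 ** avg_neg_log
print(perplexity)
```

Here the cross-entropy is 1.5 bits, giving a perplexity of 2^1.5 ≈ 2.83: roughly, the model is as uncertain as if choosing uniformly among about 2.8 words at each step.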
What is Structural Priming in language models?
Structural priming tests if exposure to specific syntactic structures affects a model’s predictions, indicating learned structural patterns.