Topic 4: Grammar & Parsing Flashcards
What is syntax?
Refers to the way words are arranged together
Constituency
a group of words that behaves as a single unit, or constituent
a fundamental idea in developing a grammar
example
noun phrase, a sequence of words surrounding at least one noun
Evidence for constituency
the words of a constituent can all appear in similar syntactic environments
give example
noun phrase can occur before verbs
preposed or postposed constructions
example: the prepositional phrase “on September seventeenth” can be placed in ….
Context Free Grammar
formal system for modeling constituent structure in English and other natural languages
also known as phrase-structure grammars
consists of a set of rules or productions
each rule expresses a way that symbols of the language can be grouped and ordered together; the grammar also includes a lexicon of words and symbols
Some rules for noun phrase
NP -> Det Nominal
NP -> ProperNoun
Nominal -> Noun | Nominal Noun
example derivation
NP -> Det Nominal -> the Nominal -> the Nominal Noun -> the Noun Noun -> the morning Noun -> the morning flight
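The derivation of "the morning flight" can be replayed step by step. This is a minimal sketch: the three NP rules come from the card, while the tiny lexicon (`Det -> the`, `Noun -> morning | flight`) and the helper name `derive` are assumptions added for illustration.

```python
# Toy grammar: the NP rules from the card plus a small, assumed lexicon.
RULES = {
    "NP": [["Det", "Nominal"], ["ProperNoun"]],
    "Nominal": [["Noun"], ["Nominal", "Noun"]],
    "Det": [["the"]],
    "Noun": [["morning"], ["flight"]],
}

def derive(start, choices):
    """Leftmost derivation: choices[i] is the index of the rule applied at step i."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # The leftmost non-terminal is the first symbol that has rules.
        i = next(k for k, sym in enumerate(form) if sym in RULES)
        form = form[:i] + RULES[form[i]][choice] + form[i + 1:]
        steps.append(" ".join(form))
    return steps

steps = derive("NP", [0, 0, 1, 0, 0, 1])
print(" -> ".join(steps))
```

Each entry in `steps` is one sentential form, so the last entry is the derived string of words.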
2 Classes of symbols
terminal - corresponds to words in the language
non-terminal - express abstractions over the terminals
the item on the left of the arrow is a single non-terminal symbol
the right side is an ordered list of one or more terminals and non-terminals
Function of CFG
- a device for generating sentences and their structures
- a device for assigning a structure to a given sentence
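The generating role can be sketched by enumerating every sentence a small grammar derives. The grammar and lexicon below are assumptions chosen to be non-recursive, so that the enumeration terminates; the function name `generate` is hypothetical.

```python
from itertools import product

# Assumed toy grammar with no recursion, so the language is finite.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"]],
    "Det": [["the"], ["a"]],
    "Noun": [["flight"], ["pilot"]],
    "Verb": [["booked"]],
}

def generate(symbol):
    """Yield every word sequence that symbol can derive."""
    if symbol not in GRAMMAR:  # a terminal derives only itself
        yield [symbol]
        return
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol, in order.
        for parts in product(*(list(generate(s)) for s in rhs)):
            yield [word for part in parts for word in part]

sentences = [" ".join(words) for words in generate("S")]
print(len(sentences), sentences[0])
```

With a recursive rule such as `Nominal -> Nominal Noun` the language is infinite, so this exhaustive enumeration would not stop; a depth limit would be needed.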
Derivation
the sequence of rule expansions is called a derivation of the string of words
Parse Tree
a derivation can be represented as a parse tree
More rules for an English CFG
Give examples
S -> NP VP
VP -> VERB NP
VP -> VERB PP
PP -> PREPOSITION NP
Sample Lexicon and Sample Grammar
examples …
Bracketed Notation
used to represent a parse tree in a more compact format
give example
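A minimal sketch of bracketed notation, assuming a parse tree is stored as nested tuples of the form `(label, child, ...)` with plain strings as the words. The tuple encoding and the name `to_brackets` are illustrative choices, not a standard format.

```python
def to_brackets(tree):
    """Render a (label, child, ...) tuple tree as a bracketed string."""
    if isinstance(tree, str):  # a leaf is just the word itself
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"

tree = ("NP",
        ("Det", "the"),
        ("Nominal",
         ("Nominal", ("Noun", "morning")),
         ("Noun", "flight")))
print(to_brackets(tree))
# [NP [Det the] [Nominal [Nominal [Noun morning]] [Noun flight]]]
```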
Grammatical Sentence
sentences that can be derived by a grammar; they belong to the formal language defined by that grammar
Ungrammatical sentences
sentences that cannot be derived by a given formal grammar
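Whether a sentence is grammatical with respect to a grammar can be checked mechanically. The sketch below is a simple top-down recognizer, not a production parser. The phrase rules follow the S, VP and PP rules in these cards, while `NP -> Det Noun | ProperNoun` and the lexicon are assumptions. It only terminates because this grammar has no left recursion.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["ProperNoun"]],
    "VP": [["Verb", "NP"], ["Verb", "PP"]],
    "PP": [["Preposition", "NP"]],
}
# Assumed lexicon: part-of-speech symbol -> words it can rewrite to.
LEXICON = {
    "Det": {"the", "a"},
    "Noun": {"flight", "morning"},
    "ProperNoun": {"Denver"},
    "Verb": {"leaves", "booked"},
    "Preposition": {"from"},
}

def ends(symbol, words, i):
    """Return every position j such that symbol can derive words[i:j]."""
    if symbol in LEXICON:
        return {i + 1} if i < len(words) and words[i] in LEXICON[symbol] else set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        positions = {i}
        for part in rhs:  # thread the end positions through the right-hand side
            positions = {j for p in positions for j in ends(part, words, p)}
        result |= positions
    return result

def is_grammatical(sentence):
    words = sentence.split()
    return len(words) in ends("S", words, 0)

print(is_grammatical("the flight leaves from Denver"))  # True
print(is_grammatical("flight the leaves"))              # False
```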
Four parameters in CFG
non-terminal symbols
terminal symbols
rules
start symbol
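The four parameters map naturally onto a small record type. This is a sketch under assumptions: the class name `CFG`, its field names, and the `is_well_formed` check are hypothetical, and the toy grammar is invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset
    terminals: frozenset
    rules: tuple  # each rule is (lhs, rhs), with rhs a tuple of symbols
    start: str

    def is_well_formed(self):
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)  # the two sets are disjoint
            and all(
                lhs in self.nonterminals          # left side: one non-terminal
                and len(rhs) >= 1                 # right side: one or more symbols
                and all(s in symbols for s in rhs)
                for lhs, rhs in self.rules
            )
        )

toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "Verb", "ProperNoun"}),
    terminals=frozenset({"Denver", "sleeps"}),
    rules=(
        ("S", ("NP", "VP")),
        ("NP", ("ProperNoun",)),
        ("VP", ("Verb",)),
        ("ProperNoun", ("Denver",)),
        ("Verb", ("sleeps",)),
    ),
    start="S",
)
print(toy.is_well_formed())
```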
Sentence-Level construction
Declarative - S -> NP VP
Imperative - S -> VP (a verb phrase with no subject)
Yes-no question - S -> Aux NP VP
Wh-subject question - same as the declarative except that the first noun phrase contains a wh-word: S -> Wh-NP VP
Wh-non-subject question - the auxiliary appears before the subject NP, just as in the yes-no question: S -> Wh-NP Aux NP VP
Noun Phrase
the role of the determiner can be filled by a possessive expression
Det -> NP 's
before head noun
relative pronoun
prepositional phrase
Verb Phrase
consists of the verb and a number of other constituents
VP -> Verb
VP -> Verb NP
VP -> Verb NP PP
VP -> Verb PP
VP -> Verb S
Example possible constituents in VP
not every verb is compatible with every VP
More VP Rules
VP -> Verb-with-no-complement
VP -> Verb-with-NP-complement NP
VP -> Verb-with-S-complement S
Coordination
The major phrase types discussed here can be conjoined with conjunctions to form larger constructions of the same type.
example
NP -> NP and NP
VP -> VP and VP
S -> S and S
Treebank
a syntactically annotated corpus
plays an important role in parsing
usually built by using a parser to parse each sentence automatically, after which humans correct the parses
a sufficiently robust grammar consisting of CFG rules can be used to assign a parse tree to any sentence
Constituency Parsing
syntactic parsing is the task of recognizing a sentence and assigning a syntactic structure to it
goal is to produce the correct tree
Importance of parse tree
used in grammar checking
used as an intermediate representation for semantic analysis
Ambiguity
structural ambiguity occurs when the grammar can assign more than one parse to a sentence
2 types of ambiguity
attachment ambiguity - a constituent can be attached to the parse tree at more than one place
coordination ambiguity - different sets of phrases can be conjoined by a conjunction
example: [old [men and women]] vs. [old men] and [women]
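Structural ambiguity can be made concrete by counting how many parse trees a grammar assigns to a phrase. The sketch below counts parses of "old men and women" with memoized recursion over word spans. The three NP rules and the lexicon are assumptions chosen to reproduce the two bracketings, and the code assumes a grammar without empty or cyclic unary rules.

```python
from functools import lru_cache

# Assumed grammar: just enough rules to produce both readings.
RULES = {"NP": [("Adj", "NP"), ("NP", "Conj", "NP"), ("Noun",)]}
LEXICON = {"Adj": {"old"}, "Noun": {"men", "women"}, "Conj": {"and"}}

def count_parses(words, start="NP"):
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        """Number of distinct trees in which symbol derives words[i:j]."""
        if symbol in LEXICON:
            return int(j == i + 1 and words[i] in LEXICON[symbol])
        return sum(count_seq(rhs, i, j) for rhs in RULES[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        """Number of ways the symbol sequence rhs derives words[i:j]."""
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        # The first symbol takes words[i:k]; the rest take words[k:j].
        # Every symbol must cover at least one word, so spans always shrink.
        return sum(count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return count(start, 0, len(words))

print(count_parses("old men and women".split()))  # 2
```

The two parses counted are exactly the two bracketings: "old" modifying the whole conjunction, or only "men".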
Syntactic disambiguation
choosing a single correct parse from the multitude of possible parses
disambiguation algorithms require statistical, semantic, and contextual knowledge sources
Statistical parsing
compute the probability of each interpretation and choose the most probable one
Probabilistic CFG
a CFG in which each rule is assigned a probability
also known as stochastic context free grammar
used for disambiguation
a PCFG differs from a standard CFG by augmenting each rule in R with a conditional probability
A -> b [p]
p = P(A -> b | A)
the probabilities of all possible expansions of a non-terminal must sum to one
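The sum-to-one constraint is easy to check in code. In this sketch each rule is a `(lhs, rhs, p)` triple; the rules and their probabilities are invented for illustration, not estimated from any treebank.

```python
import math
from collections import defaultdict

# Hypothetical PCFG: (left-hand side, right-hand side, probability).
PCFG = [
    ("S", ("NP", "VP"), 1.0),
    ("NP", ("Det", "Noun"), 0.5),
    ("NP", ("NP", "PP"), 0.2),
    ("NP", ("ProperNoun",), 0.3),
    ("VP", ("Verb", "NP"), 0.7),
    ("VP", ("Verb", "NP", "PP"), 0.3),
    ("PP", ("Prep", "NP"), 1.0),
]

def is_normalized(rules):
    """True if, for every non-terminal A, the sum of P(A -> b | A) is 1."""
    totals = defaultdict(float)
    for lhs, _rhs, p in rules:
        totals[lhs] += p
    # Compare with a tolerance because the probabilities are floats.
    return all(math.isclose(total, 1.0) for total in totals.values())

print(is_normalized(PCFG))
```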
exercise PCFG
give probabilities of rules
give multiple parse tree
compute the probability of each parse tree from the rule probabilities and select the parse with the highest probability
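A sketch of this exercise, assuming the probability of a parse tree is the product of the probabilities of all rules used in it. The rule probabilities, the two candidate trees for "booked the flight from Denver", and the tuple encoding of trees are all invented for illustration.

```python
import math

# Hypothetical rule probabilities, keyed by (lhs, rhs); lexical rules are 1.0.
PROBS = {
    ("VP", ("Verb", "NP")): 0.7,
    ("VP", ("Verb", "NP", "PP")): 0.3,
    ("NP", ("Det", "Noun")): 0.5,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("ProperNoun",)): 0.3,
    ("PP", ("Prep", "NP")): 1.0,
    ("Verb", ("booked",)): 1.0,
    ("Det", ("the",)): 1.0,
    ("Noun", ("flight",)): 1.0,
    ("Prep", ("from",)): 1.0,
    ("ProperNoun", ("Denver",)): 1.0,
}

def tree_prob(tree):
    """P(tree) = product of the probabilities of every rule used in the tree."""
    if isinstance(tree, str):  # a word contributes no rule of its own
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return PROBS[(label, rhs)] * math.prod(tree_prob(c) for c in children)

flight = ("NP", ("Det", "the"), ("Noun", "flight"))
from_denver = ("PP", ("Prep", "from"), ("NP", ("ProperNoun", "Denver")))

vp_attach = ("VP", ("Verb", "booked"), flight, from_denver)          # PP attached to the VP
np_attach = ("VP", ("Verb", "booked"), ("NP", flight, from_denver))  # PP attached to the NP

best = max([vp_attach, np_attach], key=tree_prob)
print(tree_prob(vp_attach), tree_prob(np_attach))
```

With these made-up numbers the VP attachment scores 0.3 × 0.5 × 0.3 = 0.045 and the NP attachment scores 0.7 × 0.2 × 0.5 × 0.3 = 0.021, so `best` is the VP-attachment tree.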
Dependency Parsing
Phrasal constituents and phrase-structure rules do not play a direct role
Dependency Style Analysis
relations among words are illustrated with directed, labelled arcs from head to dependents.
known as a typed dependency structure because the labels are drawn from a fixed inventory of grammatical relations
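A typed dependency structure can be stored as a list of labelled arcs from head to dependent. This is a sketch under assumptions: the sentence, the small relation inventory (labels in the style of Universal Dependencies), and the names `Arc`, `is_valid` and `dependents` are illustrative. The check covers single-headedness and label membership only, not acyclicity.

```python
from collections import namedtuple

Arc = namedtuple("Arc", "head relation dependent")  # word positions, 0 = ROOT

# Assumed fixed inventory of grammatical relations.
RELATIONS = {"root", "nsubj", "obj", "det", "compound"}

words = ["ROOT", "I", "prefer", "the", "morning", "flight"]
arcs = [
    Arc(0, "root", 2),      # ROOT   -> prefer
    Arc(2, "nsubj", 1),     # prefer -> I
    Arc(2, "obj", 5),       # prefer -> flight
    Arc(5, "det", 3),       # flight -> the
    Arc(5, "compound", 4),  # flight -> morning
]

def is_valid(arcs, n_words):
    """Every real word has exactly one head and every label is in the inventory."""
    has_head = sorted(a.dependent for a in arcs)
    return (has_head == list(range(1, n_words))
            and all(a.relation in RELATIONS for a in arcs))

def dependents(head):
    """List (relation, word) pairs for all dependents of the word at position head."""
    return [(a.relation, words[a.dependent]) for a in arcs if a.head == head]

print(is_valid(arcs, len(words)))
print(dependents(2))
```

Note that the arcs carry no information about constituents or linear order beyond word positions, which is what makes the representation attractive for free word order.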
Advantages of dependency grammars
ability to deal with languages that are morphologically rich and have relatively free word order
the approach abstracts away from word-order information
Core relations
clausal relations - syntactic roles with respect to a predicate
modifier relations - categorize the ways that words can modify their heads
give examples. . .