Lecture 4 Flashcards
Grammar and Parsing
Syntactic Level Analysis
To analyze how words are put together
to make valid sentences
Grammar:
the kind of implicit
knowledge of your native language that
you had mastered by the time you were
3 or 4 years old without explicit
instruction
Chomsky:
syntactic structure can be independent of the meaning of the sentence
Grammars (and parsing) are key
components in many applications:
- Grammar checkers
- Dialogue management
- Question answering
- Information extraction
- Machine translation
Two types of Grammars
*Context Free Grammar (CFG), also known
as Phrase Structure Grammar
* Dependency Grammar
Context Free Grammar (CFG)
a set of recursive rewriting rules (or productions) used to generate patterns of strings.
CFGs describe the structure of language by capturing
constituency and ordering
Constituency
How we group words into units and what we say about how the various kinds of units behave
Ordering
Rules that govern the ordering of words and bigger units in the language
Notations of CFG
Non-terminal
symbols represent the phrases, the categories of phrases, or the constituents,
e.g., NP, VP, etc.
Notations of CFG
Terminal
symbols are the words,
e.g., car. They
often come from words in a lexicon
Notations of CFG
Rewrite rules / productions
rules for replacing nonterminal symbols (on the left
side) with other nonterminal or terminal symbols (on the right side)
Notations of CFG
Start symbol:
a special nonterminal symbol that appears in the initial string generated by the grammar, e.g., S -> NP VP | VP
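The notations above can be sketched as plain data: a CFG is just a mapping from non-terminals to right-hand sides, and rewriting from the start symbol generates strings. This is a minimal sketch with a hypothetical mini-grammar and lexicon (the, a, car, driver, stops, sees are illustrative, not from the lecture):

```python
import random

# Toy CFG: non-terminal -> list of productions (each a list of symbols).
GRAMMAR = {
    "S":   [["NP", "VP"], ["VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N":   [["car"], ["driver"]],
    "V":   [["stops"], ["sees"]],
}

def generate(symbol="S", rng=random):
    """Rewrite `symbol` until only terminals (words) remain."""
    if symbol not in GRAMMAR:               # terminal: a word from the lexicon
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])  # pick one rewrite rule
    words = []
    for sym in production:
        words.extend(generate(sym, rng))
    return words

print(" ".join(generate("S", random.Random(0))))
```

Each call performs one derivation: a sequence of rule applications from S down to terminals.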
derivation
a sequence of rules applied to a string that accounts for that string
* Covers all the elements in the string
* Covers only the elements in the string
Parsing
is the process of finding a derivation (i.e., a sequence of productions) leading from the START symbol to the
TERMINAL symbols (or from the TERMINALS back to the START symbol)
Challenges for CFG
Agreement
In English, subjects and verbs have to agree in person and number; determiners and nouns have to agree in
number. A bare rule like S -> NP VP does not enforce this.
Challenges for CFG
Subcategorization
expresses the constraints that a particular verb (sometimes called the predicate) places on the number and
syntactic types of arguments it wants to take (occur with)
Dependency Grammar
- Dependency grammars offer a different way to represent syntactic structure
- CFGs represent constituents in a parse tree that can derive the words of a sentence
- Dependency grammars represent syntactic dependency relations between words that show the syntactic structure
- Syntactic structure is the set of relations between a word (aka the head word) and its dependents.
Top-down parser
- Starts from the rules (expanding the START symbol downward)
- Only searches for trees that can be answers
- But also suggests trees that are not consistent with any of the words
Bottom-up parser
- Starts from the input token list
- Only builds trees consistent with the words
- But suggests trees that make no sense globally
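The top-down search strategy can be seen in a tiny recognizer: it proposes expansions from the rules and backtracks when they fail to match the words. This is a sketch over a hypothetical toy grammar, and its naive backtracking is exactly the performance problem chart parsers address:

```python
# Toy grammar for the recognizer (hypothetical, not from the lecture).
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["sees"]],
}

def recognize(symbols, words):
    """True if the symbol list can rewrite to exactly `words` (top-down)."""
    if not symbols:
        return not words                  # both exhausted -> success
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:               # terminal: must match the next word
        return bool(words) and words[0] == head and recognize(rest, words[1:])
    # Non-terminal: try each production in turn; this blind re-trying is
    # the backtracking that chart parsers eliminate.
    return any(recognize(prod + rest, words) for prod in GRAMMAR[head])

print(recognize(["S"], "the dog sees the cat".split()))  # True
```

Note how the recognizer expands VP -> V NP first, fails if no words remain, then falls back to VP -> V: trees are suggested before the words are consulted.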
Solutions to parsing problems (1)
- Solve the performance problem with chart parsers, which use a
special data structure (the chart) to store partial results and avoid repeated backtracking
Solutions to parsing problems (2)
- Solve the problems of predefining CFG or other grammars by using Treebanks and statistical parsing. The main use of the Treebank is to provide the probabilities to inform the statistical parsers
Solutions to parsing problems (3)
Partially solve the problems of correctly choosing the best parse trees
by using lexicalization (information about words from the Treebank)
Probabilistic CFG (PCFG)
The parsing task is to generate the parse tree with the highest probability (or the top n parse trees)
Attach probabilities to grammar rules
The expansions
for a given non-terminal sum to 1
VP -> Verb .55
VP -> Verb NP .40
VP -> Verb NP PP .05
The probability of a parse tree:
the product of the
probabilities of the rules used in the derivation
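The product rule above can be computed directly. The VP probabilities are the ones from the card above; the dictionary representation and helper name are assumptions for illustration:

```python
# Rule probabilities for one non-terminal; note the expansions of VP sum to 1.
RULE_PROBS = {
    ("VP", ("Verb",)):             0.55,
    ("VP", ("Verb", "NP")):        0.40,
    ("VP", ("Verb", "NP", "PP")):  0.05,
}

def tree_probability(rules_used):
    """Probability of a parse tree = product of the rule probabilities
    used in its derivation."""
    p = 1.0
    for rule in rules_used:
        p *= RULE_PROBS[rule]
    return p

print(tree_probability([("VP", ("Verb", "NP"))]))  # 0.4
```

A statistical parser then returns the tree (or top n trees) maximizing this product.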
Word Sense
the meaning of a word -
We say that a word has more than
one sense (meaning) if it has
more than one definition.
Word senses may be
Coarse-grained, if not many distinctions are
made
Fine-grained, if there are many distinctions
of meanings
Polysemy:
a word with two or more
related meanings
Homonymy:
Words spelled (or
pronounced) the same way but
with different meanings
Hypernymy:
a more general term
that encompasses a word
(e.g., tree is a hypernym of pine)
Hyponymy:
a more specific term
that is contained within a word
(e.g., pine is a hyponym of tree)
How Humans Disambiguate
- local context (e.g., book in a sentence that has flight, travel, etc.)
  - the sentence or other surrounding text containing the ambiguous word restricts the interpretation of the ambiguous word
- domain knowledge (e.g., plant in a biology article)
  - the fact that a text is concerned with a particular domain activates only the sense appropriate to that domain
- frequency data
  - the frequency of each sense in general usage
How Machines Disambiguate
Algorithm for simplified Lesk:
1. Retrieve from a machine-readable dictionary
all sense definitions of the word to be
disambiguated
2. Determine the overlap between each
sense definition and the current context
3. Choose the sense that leads to highest
overlap
Example: disambiguate
PINE
“Pine cones hanging in a tree”
* PINE
1. kinds of evergreen tree with needle-shaped leaves
2. waste away through sorrow or illness
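The three Lesk steps can be sketched directly on the PINE example. The glosses are the two definitions above; the stopword list and whitespace tokenization are simplifying assumptions:

```python
# Simplified Lesk: choose the sense whose dictionary gloss overlaps
# most with the context words.
SENSES = {
    "evergreen": "kinds of evergreen tree with needle-shaped leaves",
    "sorrow":    "waste away through sorrow or illness",
}

STOPWORDS = {"of", "with", "in", "a", "the", "through", "or"}

def simplified_lesk(context, senses=SENSES):
    ctx = set(context.lower().split()) - STOPWORDS
    best, best_overlap = None, -1
    for sense, gloss in senses.items():
        overlap = len(ctx & (set(gloss.lower().split()) - STOPWORDS))
        if overlap > best_overlap:          # step 3: keep the highest overlap
            best, best_overlap = sense, overlap
    return best

print(simplified_lesk("Pine cones hanging in a tree"))  # evergreen
```

Here "tree" appears in both the context and the first gloss, so the evergreen sense wins with overlap 1 versus 0.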
WSD
Word Sense Disambiguation
Classifier Approach to WSD -1
Train a classification algorithm that
can label each (open-class) word with
the correct sense, given the context of
the word
Classifier Approach to WSD -2
Training set is the hand-labeled corpus
of senses
Classifier Approach to WSD -3
Result of training is a model that is
used by the classification algorithm to
label words in the test set, and
ultimately, in new text examples
Word Similarity Features:
- For each word in the context, compute a similarity measure between that word and the words in the definitions to be disambiguated
- Similarity measures can be defined from a semantic relation lexicon, such as WordNet (hypernym, hyponym)
Syntactic features (relationship between the word and the other parts of the sentence)
Predicate-argument relations: Verb-object,
subject-verb
Heads of Noun and Verb Phrases
Collocational features:
Information about words in specific positions (i.e., previous word)
Associated words features -1
For each word to be disambiguated, collect a small number of frequently-used
context words.
Associated words features -2
Represent these words as a set-of-words feature
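A minimal sketch of collecting associated context words as a set-of-words feature; the window size, tokenization, and function name are assumptions for illustration:

```python
def associated_words(tokens, target, window=2, vocab=None):
    """Collect words appearing within `window` positions of `target`,
    represented as a set-of-words feature."""
    feats = set()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            feats |= {t for t in tokens[lo:hi] if t != target}
    if vocab is not None:
        feats &= vocab   # keep only the chosen frequently-used context words
    return feats

tokens = "pine cones hang in the pine tree".split()
print(associated_words(tokens, "pine"))
```

Restricting to a small `vocab` of frequently-used context words keeps the feature set compact, as the cards suggest.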