W4 L1 syntactic analysis Flashcards
how do PoS act as equivalence classes for words
we can think of PoS as equivalence classes for words
words in the same class (e.g. nouns) can be swapped for one another without breaking the grammar/form of the sentence
here's an example of a group of words that represent an equivalence class
he =
Einstein =
Albert Einstein =
the scientist =
the famous scientist =
the famous scientist Albert Einstein
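A minimal sketch of the swap idea, assuming a made-up sentence frame (`FRAME` and `fill` are illustrative names, not from the course): every member of the class fits the same slot and the sentence stays grammatical.

```python
# Members of one equivalence class (taken from the card above).
NOUN_PHRASES = [
    "he",
    "Einstein",
    "Albert Einstein",
    "the scientist",
    "the famous scientist",
    "the famous scientist Albert Einstein",
]

# Hypothetical sentence frame with a single noun-phrase slot.
FRAME = "{} won the prize"


def fill(frame, phrases):
    """Return one sentence per phrase, with the slot filled in."""
    return [frame.format(p) for p in phrases]


sentences = fill(FRAME, NOUN_PHRASES)
for s in sentences:
    print(s)
```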
what are noun phrases
groups of words that contain a noun and the surrounding words that support the noun further
what is a constituent
a group of words that behaves as a single unit
each noun phrase forms a constituent
what are the head and dependents in a constituent
head is the noun
the surrounding support words are the dependents
what is syntactic constituency
syntactic constituency is the idea that groups of words can behave as single units (constituents)
three parties from Brooklyn arrive….
the girls from Mamma Mia love….
what is syntactic constituency used for
syntactic constituency is used to develop grammar/ grammar structures
what does it mean that constituents behave similarly
they can be swapped around to create grammatically correct sentences
but the meaning of the sentence will change
what is a subject of the sentence
the main participant of the action
e.g. Albert, he, the bus
person place thing
what is the object
the participant to whom/which the action is applied
e.g. the Nobel Prize team chose Albert (Albert is the object)
what is an indirect object
the participant that is affected by the action indirectly (typically the recipient)
e.g. the team gave the prize to Albert
(Albert is the indirect object; the prize is the direct object)
can we look for constituents recursively
yes, if we start with the full sentence, break it down into constituents, and then break those down further, we get a tree of constituents
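The recursive breakdown can be sketched with nested tuples. This is an illustrative encoding, not a standard one: each constituent is `(label, child, child, ...)` and a leaf is a plain word. The example tree and the labels are assumptions.

```python
# A constituent is (label, children...); a leaf is a plain word string.
tree = (
    "S",
    ("NP", ("Det", "the"), ("Adj", "famous"), ("N", "scientist")),
    ("VP", ("V", "won"), ("NP", ("Det", "the"), ("N", "prize"))),
)


def words(node):
    """Return the words covered by a constituent, left to right."""
    if isinstance(node, str):
        return [node]
    _label, *children = node
    covered = []
    for child in children:
        covered.extend(words(child))
    return covered


def constituents(node):
    """Recursively list every (label, phrase) pair in the tree."""
    if isinstance(node, str):
        return []
    label, *children = node
    found = [(label, " ".join(words(node)))]
    for child in children:
        found.extend(constituents(child))
    return found


for label, phrase in constituents(tree):
    print(label, "->", phrase)
```

The first pair is the whole sentence under S; each later pair is a smaller constituent found by descending one more level.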
how are declarative sentences such as "I read books" normally formed
they are formed from a noun phrase followed by a verb phrase
what are context free grammars
mathematical systems for modeling constituent structure (e.g. noun phrases) in language
if we can apply rules like
a sentence -> noun phrase + verb phrase
then a rule can rewrite a symbol no matter what surrounds it, so we can use a formal system to work out the grammar without looking at context
what are rules/productions in context free grammars (CFGs)
cfgs consist of a set of rules/productions
these rules express the ways that symbols of the language (e.g. PoS) can be grouped together
what are lexicons in context free grammars
CFGs have lexicons of words + symbols
there are two types of symbols
terminals: the actual words of the language, which sit at the leaves of the parse tree, e.g. the, Einstein
non-terminals: symbols that stand for groupings of terminals, e.g. NP, VP, Det, etc.
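A small sketch of rules plus a lexicon, assuming a toy grammar invented for illustration (`RULES`, `LEXICON`, and `expand` are hypothetical names). Non-terminals are rewritten by rules, and lexicon entries map PoS symbols to terminals (words).

```python
from itertools import product

# Rules/productions: non-terminal -> list of right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["PropN"]],
    "VP": [["V", "NP"]],
}

# Lexicon: PoS symbol -> terminals (actual words).
LEXICON = {
    "Det": ["the"],
    "N": ["scientist", "prize"],
    "PropN": ["Einstein"],
    "V": ["won"],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol`.

    Only safe for non-recursive toy grammars like this one.
    """
    if symbol in LEXICON:
        return [[word] for word in LEXICON[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every derivation of each right-hand-side symbol.
        for combo in product(*(expand(s) for s in rhs)):
            results.append([w for part in combo for w in part])
    return results


for sentence in expand("S"):
    print(" ".join(sentence))
```

Note that no rule checks what surrounds a symbol before rewriting it, which is the "context free" part.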
what are treebanks
corpora in which every sentence is annotated with its parse tree
used to study common structures for context free grammars
we can parse through the trees
what are some issues with context free grammar
universality
language equality
ambiguity
what is the CFG universality problem
does the grammar generate every possible string that can be made using its alphabet (the set of terminal symbols)?
what is the CFG language equality problem
do two different CFGs generate the same language (the same set of strings)
the problem is undecidable in general: no algorithm can answer it for every pair of CFGs
what is the CFG limit of parsing
ambiguity - multiple valid parses can exist
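To make ambiguity concrete, here is a sketch that counts the parses of a sentence. It uses a CKY-style chart, which is not named in these cards and is swapped in only because it makes counting easy. The grammar (binary rules only) and the lexicon are assumptions for the classic prepositional-phrase attachment example.

```python
from collections import defaultdict

# Hypothetical toy grammar: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICAL = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count distinct parse trees rooted in `start` that cover all words."""
    n = len(words)
    # chart[(i, j, label)] = number of trees with this label over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for label in LEXICAL.get(word, []):
            chart[(i, i + 1, label)] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    chart[(i, j, parent)] += (
                        chart[(i, k, left)] * chart[(k, j, right)]
                    )
    return chart[(0, n, start)]


print(count_parses("I saw the man with the telescope".split()))
```

A count above 1 means the sentence is ambiguous under this grammar: "with the telescope" can attach to the verb phrase (the seeing) or to the noun phrase (the man).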
parsing vs chunking
parsing is concerned with taking an input and producing the full linguistic structure for it
-> partial/shallow parses may be sufficient for certain tasks
chunking is the process of identifying and classifying the non-overlapping segments of a sentence that make up the basic non-recursive phrases corresponding to the major PoS, e.g. noun phrases, verb phrases, PPs
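A minimal chunking sketch, assuming the input is already PoS-tagged and that a noun-phrase chunk follows the pattern optional Det, any number of Adj, one or more N. The tag names and the function `np_chunks` are illustrative, not from a real library.

```python
def np_chunks(tagged):
    """Group (word, tag) pairs into flat, non-overlapping NP chunks.

    Pattern assumed here: Det? Adj* N+
    """
    chunks = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        if j < n and tagged[j][1] == "Det":
            j += 1
        while j < n and tagged[j][1] == "Adj":
            j += 1
        k = j
        while k < n and tagged[k][1] == "N":
            k += 1
        if k > j:  # at least one noun was found, so this is a chunk
            chunks.append(" ".join(word for word, _tag in tagged[i:k]))
            i = k
        else:  # no chunk starts here; move on by one token
            i += 1
    return chunks


tagged = [
    ("the", "Det"), ("famous", "Adj"), ("scientist", "N"),
    ("won", "V"),
    ("the", "Det"), ("prize", "N"),
]
print(np_chunks(tagged))
```

Unlike a full parse, the result is a flat list of segments with no nesting, and the verb is simply left outside any chunk.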
what are the two different ways to parse sentence trees
bottom up, top down
what is bottom up parsing, when is it successfully complete
start from the bottom and assign PoS tags to the words
continue combining words and non-terminals into larger constituents
the parse is successful when S (the full sentence) is reached and all constituents are absorbed
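One way to sketch the bottom-up idea is a shift-reduce recognizer with backtracking. This is an illustrative choice, not necessarily the procedure used in the lecture, and the grammar and lexicon are assumptions. "Shift" tags the next word and pushes the tag; "reduce" replaces the right-hand side of a rule on top of the stack with its left-hand side.

```python
# Hypothetical toy grammar: (left-hand side, right-hand side).
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("NP", ("PropN",)),
    ("VP", ("V", "NP")),
]
LEXICON = {
    "the": "Det",
    "scientist": "N",
    "prize": "N",
    "Einstein": "PropN",
    "won": "V",
}


def bottom_up(words):
    """Return True if the words can be reduced all the way up to S."""

    def search(stack, rest):
        # Success: only S is left and every word has been absorbed.
        if stack == ("S",) and not rest:
            return True
        # Reduce: try every rule whose right-hand side tops the stack.
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                if search(stack[:-len(rhs)] + (lhs,), rest):
                    return True
        # Shift: tag the next word and push its PoS onto the stack.
        if rest and rest[0] in LEXICON:
            return search(stack + (LEXICON[rest[0]],), rest[1:])
        return False

    return search((), tuple(words))


print(bottom_up("the scientist won the prize".split()))
```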
what are the pros and cons with bottom up parsing
advantage: only need to consider constituents that are compatible with the input
con: need to track all possible rules and subtrees even if they don't result in S
what is top down parsing, when is it successfully complete
start from S at the top of the tree and expand rules downward toward the words
it's successful when all the terminal symbols/words are reached and all of them are covered by the parse
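A recursive-descent recognizer is one simple way to sketch the top-down idea. The grammar and names are assumptions, and this naive version would loop forever on left-recursive rules such as NP -> NP PP.

```python
# Hypothetical toy grammar and lexicon.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["PropN"]],
    "VP": [["V", "NP"]],
}
LEXICON = {
    "Det": {"the"},
    "N": {"scientist", "prize"},
    "PropN": {"Einstein"},
    "V": {"won"},
}


def expand(symbol, words, pos):
    """Yield each position reachable after deriving `symbol` from words[pos:]."""
    if symbol in LEXICON:
        # PoS symbol: succeed only if the next word belongs to it.
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield pos + 1
        return
    for rhs in RULES[symbol]:
        positions = [pos]
        for sym in rhs:
            # Expand each right-hand-side symbol in order, left to right.
            positions = [q for p in positions for q in expand(sym, words, p)]
        yield from positions


def top_down(words):
    """Return True if S can be expanded to cover exactly all the words."""
    return any(end == len(words) for end in expand("S", words, 0))


print(top_down("the scientist won the prize".split()))
```

The NP -> Det N rule gets tried even when the input starts with a proper noun, which illustrates how top-down search can consider rules that do not fit the input.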
what are the pros and cons of top down parsing
advantage: only considers rules that are compatible with a well-formed sentence rooted in S
cons: may consider many rules along the way that are not compatible with the input
practice the top down Earley parsing algorithm
yuh
what is the stopping criterion for the top down Earley parsing algorithm
S is fully processed
or there is no input left
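For practice, here is a compact Earley recognizer sketch under several assumptions: the grammar and lexicon are invented, every word has exactly one PoS tag, there are no empty rules, and the scanner advances the dot directly over a PoS symbol (a simplification of the textbook version). Each state is `(lhs, rhs, dot, origin)` and is stored in the chart column where it ends.

```python
# Hypothetical toy grammar and lexicon.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("PropN",)],
    "VP": [("V", "NP"), ("V",)],
}
LEXICON = {
    "the": "Det",
    "scientist": "N",
    "prize": "N",
    "Einstein": "PropN",
    "won": "V",
}
POS_TAGS = set(LEXICON.values())


def earley(words, start="S"):
    """Return True if the grammar accepts the word sequence."""
    n = len(words)
    chart = [[] for _ in range(n + 1)]  # chart[k]: states ending at position k

    def add(k, state):
        if state not in chart[k]:
            chart[k].append(state)

    add(0, ("GAMMA", (start,), 0, 0))  # dummy start state: GAMMA -> . S
    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):  # chart[k] can grow while it is processed
            lhs, rhs, dot, origin = chart[k][i]
            i += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in POS_TAGS:
                    # Scanner: the next word must carry the expected tag.
                    if k < n and LEXICON.get(words[k]) == nxt:
                        add(k + 1, (lhs, rhs, dot + 1, origin))
                else:
                    # Predictor: expand the expected non-terminal here.
                    for production in GRAMMAR[nxt]:
                        add(k, (nxt, production, 0, k))
            else:
                # Completer: advance every state that was waiting for lhs.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(k, (l2, r2, d2 + 1, o2))
    # Accept only if S was fully processed across the whole input.
    return ("GAMMA", (start,), 1, 0) in chart[n]


print(earley("the scientist won the prize".split()))
```

The loop ends once every chart column has been processed. In this sketch the input is accepted only when the completed start state sits in the final column, meaning S spans all the words.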
what is dependency parsing
dependency parsing is parsing that works directly with the input words and establishes directed binary grammatical relations that hold among the words of the input
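A dependency parse can be sketched as a list of directed arcs between words. The relation labels below (`nsubj`, `obj`, `det`) follow Universal Dependencies naming, which is an assumption about the label set. The arcs are written by hand for illustration, not produced by a parser.

```python
# Each arc is (head, relation, dependent) for "the team chose Albert".
arcs = [
    ("chose", "nsubj", "team"),
    ("team", "det", "the"),
    ("chose", "obj", "Albert"),
]


def dependents(head, arcs):
    """Return the (relation, dependent) pairs attached to `head`."""
    return [(rel, dep) for h, rel, dep in arcs if h == head]


def root(arcs):
    """Return the one word that is a head but never a dependent."""
    heads = {h for h, _rel, _dep in arcs}
    deps = {d for _h, _rel, d in arcs}
    (only,) = heads - deps  # raises ValueError unless there is exactly one
    return only


print(root(arcs))
print(dependents("chose", arcs))
```

Every relation links exactly two words (binary) and points from head to dependent (directed), with no phrase-level nodes such as NP or VP.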
what are the things you need to keep track of in Earley parsing
step id, rule, start-end, history, word id