POS-Parsing 5 Flashcards
What are the 8 parts of speech taught in grammar school?
noun, verb, adjective, adverb, preposition, conjunction, pronoun, interjection
The collection of POS tags used is called?
tagset
What is a tagset?
A tagset contains all part-of-speech tags used for a specific corpus, together with what each tag means (e.g., VBD = verb in past tense)
• Tags are usually uppercase (DT, ADJ, VBD)
• Similar tags often share a prefix (e.g., V… = related to verbs)
• Tagsets are language-specific and corpus-specific
(e.g., social media corpora have a tag for emoticons)
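To make the idea of a tagset concrete, the following is a minimal sketch that stores a few Penn Treebank tags with their meanings and looks up tags by shared prefix. The selection of tags and the helper name `tags_with_prefix` are assumptions made for illustration, not a complete tagset.

```python
# A small excerpt of Penn Treebank tags with their meanings.
# This is an illustrative subset, not the full tagset.
TAGSET = {
    "DT": "determiner",
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBZ": "verb, 3rd person singular present",
    "JJ": "adjective",
}


def tags_with_prefix(tagset, prefix):
    """Return the tags that share a prefix, e.g. all verb tags for 'VB'."""
    return sorted(tag for tag in tagset if tag.startswith(prefix))


print(tags_with_prefix(TAGSET, "VB"))  # ['VB', 'VBD', 'VBZ']
print(TAGSET["VBD"])                   # verb, past tense
```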
Name two tagsets.
Penn Treebank Tagset and Universal Tagset
Name three difficulties of POS tagging.
A word can have multiple possible POS tags
Most of these ambiguous words are common words
Tagging can be difficult even for experienced human labellers
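The ambiguity problem can be illustrated with a small sketch that counts the tags seen for each word and applies a most-frequent-tag baseline. The toy data in `TAGGED` and all function names are hypothetical; real counts would come from a labelled corpus.

```python
from collections import Counter, defaultdict

# Toy hand-labelled data (hypothetical), using Penn Treebank style tags.
TAGGED = [
    ("the", "DT"), ("book", "NN"), ("is", "VBZ"), ("good", "JJ"),
    ("book", "VB"), ("the", "DT"), ("flight", "NN"),
    ("a", "DT"), ("book", "NN"),
]


def tag_counts(tagged):
    """Count how often each word occurs with each tag."""
    counts = defaultdict(Counter)
    for word, tag in tagged:
        counts[word][tag] += 1
    return counts


def ambiguous_words(counts):
    """Return the words that were seen with more than one tag."""
    return sorted(word for word, tags in counts.items() if len(tags) > 1)


def most_frequent_tag(counts, word, default="NN"):
    """Baseline tagger: pick the tag seen most often for this word."""
    if word not in counts:
        return default  # assumption: unknown words are guessed to be nouns
    return counts[word].most_common(1)[0][0]


counts = tag_counts(TAGGED)
print(ambiguous_words(counts))            # ['book']
print(most_frequent_tag(counts, "book"))  # NN
```

Note that this baseline ignores context, so it would label "book" as a noun even in "book the flight", where it is a verb.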
What are homonyms?
Two distinct words that have the same spelling are called homonyms.
Sentences that can be derived by a grammar are in the formal language defined by that grammar, and are called?
grammatical sentences
Sentences that cannot be derived by a given formal grammar are not in the language defined by that grammar and are referred to as?
ungrammatical
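Whether a sentence is grammatical with respect to a formal grammar can be checked mechanically. The sketch below uses the CYK algorithm on a tiny hand-written grammar in Chomsky normal form; the grammar, the vocabulary, and the function name `is_grammatical` are assumptions chosen for illustration.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical).
# Binary rules: (left child, right child) -> possible parents.
BINARY = {
    ("NP", "VP"): {"S"},
    ("DT", "NN"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules: word -> possible pre-terminals.
LEXICAL = {
    "the": {"DT"}, "a": {"DT"},
    "dog": {"NN"}, "cat": {"NN"},
    "chased": {"V"}, "saw": {"V"},
}


def is_grammatical(words, start="S"):
    """CYK recognition: True if the grammar can derive the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the non-terminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICAL.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between the two children
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY.get((left, right), set())
    return start in table[0][n - 1]


print(is_grammatical("the dog chased a cat".split()))  # True
print(is_grammatical("dog the chased cat a".split()))  # False
```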
In linguistics, the use of formal languages to model natural languages is called?
generative grammar
since the language is defined by the set of possible sentences “generated” by the grammar
What is syntactic parsing?
the task of recognizing a sentence and assigning a syntactic structure to it.
Name two types of syntactic parsing.
Constituency Parsing
Dependency Parsing
What is constituency parsing?
Constituency parsing aims to extract, from a sentence, a constituency-based parse tree that represents the sentence's syntactic structure according to a phrase structure grammar
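A constituency parse tree can be sketched as nested tuples, where each node is `(label, child, ...)` and each leaf is a word. The example tree and the helper names below are hypothetical; a real parser would produce the tree from the sentence.

```python
# Hand-built constituency tree for "the dog chased a cat" (illustrative).
TREE = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("DT", "a"), ("NN", "cat"))))


def to_brackets(node):
    """Render the tree in bracketed notation."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"


def leaves(node):
    """Return the words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]


def phrases(node, label):
    """Return the text of every constituent with the given label."""
    if isinstance(node, str):
        return []
    found = [" ".join(leaves(node))] if node[0] == label else []
    for child in node[1:]:
        found.extend(phrases(child, label))
    return found


print(to_brackets(TREE))
# (S (NP (DT the) (NN dog)) (VP (V chased) (NP (DT a) (NN cat))))
print(phrases(TREE, "NP"))  # ['the dog', 'a cat']
```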
What is dependency parsing?
Dependency parsing is based on dependency grammars, which focus on how words relate to other words
Dependency is a binary relation between a head (or: governor) and its dependents.
• The head of a sentence is usually the finite verb.
• Every other word in the sentence depends on it either directly or through a path of dependencies
Caveat: there are multiple theories for dependency parsing that may yield different results!
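The head/dependent relation can be sketched by storing, for each word, the index of its head. The analysis below is hand-written and follows one common convention (determiners attach to their noun); as the caveat says, other theories may assign heads differently. All names are hypothetical.

```python
WORDS = ["the", "dog", "chased", "a", "cat"]
# HEADS[i] is the index of the head of WORDS[i]; -1 marks the sentence head.
HEADS = [1, 2, -1, 4, 2]


def root(heads):
    """Index of the word that has no head (here: the finite verb)."""
    return heads.index(-1)


def dependents(heads, head):
    """Indices of the words that depend directly on the given head."""
    return [i for i, h in enumerate(heads) if h == head]


def path_to_root(heads, index):
    """Follow head links from a word up to the sentence head."""
    path = [index]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


print(WORDS[root(HEADS)])                            # chased
print([WORDS[i] for i in dependents(HEADS, 2)])      # ['dog', 'cat']
print([WORDS[i] for i in path_to_root(HEADS, 3)])    # ['a', 'cat', 'chased']
```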
Why is syntactic parsing important? Give 2 reasons.
• Grammar checking
• Understand the subject/main verb/object of a sentence; useful in downstream tasks, e.g. question answering, information extraction
What is Chunking?
Chunking is the process of extracting phrases from unstructured text.
• E.g., instead of extracting only single tokens, which may not capture the actual meaning of the text, it is advisable to treat a phrase such as “South Africa” as a single unit rather than as the separate words “South” and “Africa”.
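A very simple form of chunking can be sketched by merging consecutive proper nouns in a POS-tagged sentence into one unit. The tagged input is hand-written with Penn Treebank style tags, and the rule (merge runs of NNP) is an assumption for illustration; practical chunkers use richer patterns or trained models.

```python
def chunk_proper_nouns(tagged):
    """Merge runs of consecutive proper nouns (NNP) into single chunks."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag == "NNP":
            current.append(word)  # extend the current proper-noun run
            continue
        if current:
            chunks.append(" ".join(current))  # close the finished run
            current = []
        chunks.append(word)
    if current:  # a run that ends the sentence
        chunks.append(" ".join(current))
    return chunks


TAGGED = [("She", "PRP"), ("visited", "VBD"), ("South", "NNP"),
          ("Africa", "NNP"), ("last", "JJ"), ("year", "NN")]
print(chunk_proper_nouns(TAGGED))
# ['She', 'visited', 'South Africa', 'last', 'year']
```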