Week 4 - POS Tagging Flashcards
POS
Parts of Speech
A linguistic category of words, which is generally defined by their syntactic or morphological behaviour
Word class, lexical class & lexical category refer to the same thing
Explains not what the word is, but how it is used
Typical grammar classifications
verb - Open
noun - Open
adjective - Open
adverb - Open
interjection - Open
pronoun - Closed
preposition - Closed
conjunction - Closed
Open word classes
Classes that constantly acquire new words
nouns: Internet, Blog, Covid
verbs: To google, To tweet, To self-isolate
Closed word classes
Classes that generally do not acquire new members
prepositions: to, from, in
pronouns: I, you, he/she/it, we, they
Content Words
Words that carry the meaning of a sentence
Function words (grammatical words)
Words that have little lexical meaning, but instead serve to express grammatical relationships with other words within a sentence or the speaker’s mood
Articles (the) or conjunctions (and) can be found in almost any utterance, no matter what it is about
POS Tagging
Assigning a part-of-speech tag to each word in a corpus
Main issue with POS tagging
The same word (form) can have different POS tags depending on the context
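A minimal sketch of why this is a problem, assuming a tiny hand-made lexicon. The words, the tag names (DET, NOUN, VERB, ADP) and the helper functions are hypothetical illustrations, not a real tagset or library API. A plain lookup returns more than one tag for some word forms, so the lexicon alone cannot decide.

```python
# Hypothetical toy lexicon: each word form maps to every tag it can take.
LEXICON = {
    "the": ["DET"],
    "book": ["NOUN", "VERB"],   # "a book" vs. "to book a room"
    "tweet": ["NOUN", "VERB"],  # "a tweet" vs. "to tweet"
    "from": ["ADP"],
}


def possible_tags(word):
    """Return every tag the toy lexicon lists for this word form."""
    return LEXICON.get(word.lower(), [])


def is_ambiguous(word):
    """A word form is ambiguous if the lexicon lists more than one tag."""
    return len(possible_tags(word)) > 1


print(possible_tags("book"))  # ['NOUN', 'VERB']
print(is_ambiguous("the"))    # False
```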
Internal cues for POS tagging
Morphology (in particular, affixes) can be used to guess the POS of unknown words
bootylicious -> -ous is (often) used for adjectives
ODed -> -ed is common for verbs (past)
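The suffix cues above can be sketched as a small guesser for unknown words. This is an illustration only: the suffix table, the tag names and the fallback to NOUN are assumptions, and real suffixes are far less reliable than this suggests.

```python
# Hypothetical suffix table built from the two cues above.
SUFFIX_TAGS = [
    ("ous", "ADJ"),   # -ous is often used for adjectives
    ("ed", "VERB"),   # -ed is common for past-tense verbs
]


def guess_pos(word, default="NOUN"):
    """Guess a tag for an unknown word from its suffix.

    Falls back to `default` (an assumption: open-class nouns are a
    common choice for unknown words) when no suffix matches.
    """
    lowered = word.lower()
    for suffix, tag in SUFFIX_TAGS:
        # Require some stem before the suffix so "ed" alone does not match.
        if lowered.endswith(suffix) and len(lowered) > len(suffix):
            return tag
    return default


print(guess_pos("bootylicious"))  # ADJ
print(guess_pos("ODed"))          # VERB
print(guess_pos("blog"))          # NOUN
```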
External cues for POS tagging
POS tags are predicted using a limited amount of surrounding context
A verb is more likely after the token “will”
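A minimal sketch of an external cue, assuming a context window of just the previous token. The function name, the candidate tag lists and the single "will" rule are hypothetical. A real tagger would estimate such preferences from a tagged corpus rather than hard-code them.

```python
def pick_tag(previous_token, candidate_tags):
    """Choose among candidate tags using only the previous token.

    Single hand-written rule: after "will", prefer VERB if it is a
    candidate. Otherwise return the first candidate as a default.
    """
    if (
        previous_token is not None
        and previous_token.lower() == "will"
        and "VERB" in candidate_tags
    ):
        return "VERB"
    return candidate_tags[0]


# "book" can be a noun or a verb; the previous token decides.
print(pick_tag("will", ["NOUN", "VERB"]))  # VERB  ("they will book a room")
print(pick_tag("a", ["NOUN", "VERB"]))     # NOUN  ("I read a book")
```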