#1 Flashcards
Computational Linguistics
the field concerned with using automatic computational methods to analyze and synthesize natural language (text, speech, gesture)
Phonology
the study of sounds of a language
ex.
pat vs bat
the sounds /p/ and /b/ distinguish the two words
corpus
a computer-readable collection of linguistic production
plural form of corpus
corpora!
key factor of corpora
representativeness
kinds of corpora
balanced/imbalanced
written/spoken
lexicon
set of all words and phrases in a language
ex. “dog”, “cat”
“I’m going to the gym”
synsets
sets of near-synonyms (WordNet)
organized into a tree-like hierarchy
Resnik similarity
two words are more similar the higher the information content of their lowest common subsumer
ex. finch, penguin –> bird
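A minimal sketch of the idea, assuming a tiny hand-made taxonomy and made-up counts (neither is real WordNet or corpus data, and all function names are illustrative). Information content is taken as -log2 of the probability of a concept, where a concept's probability covers itself and everything below it.

```python
import math

# Hypothetical toy taxonomy: child -> parent (not real WordNet data).
PARENT = {
    "finch": "bird",
    "penguin": "bird",
    "bird": "animal",
    "dog": "animal",
    "animal": None,
}

# Made-up counts of how often each concept itself occurs in a corpus.
COUNTS = {"finch": 2, "penguin": 2, "bird": 4, "dog": 8, "animal": 0}


def ancestors(concept):
    """Return the concept followed by its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def probability(concept):
    """Share of all counts that fall at the concept or anywhere below it."""
    total = sum(COUNTS.values())
    covered = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return covered / total


def information_content(concept):
    """Rarer (more specific) concepts carry more information."""
    return -math.log2(probability(concept))


def lowest_common_subsumer(a, b):
    """First ancestor of a (nearest first) that is also an ancestor of b."""
    b_chain = set(ancestors(b))
    for concept in ancestors(a):
        if concept in b_chain:
            return concept
    return None


def resnik_similarity(a, b):
    """Information content of the lowest common subsumer of a and b."""
    return information_content(lowest_common_subsumer(a, b))
```

With these toy numbers, finch and penguin meet at "bird", while finch and dog only meet at the root "animal", so the first pair scores higher.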
Lesk similarity
overlap of glosses: two words are more similar the higher the lexical overlap of their definitions
ex. Cat: feline mammal usually having thick
Lion: large gregarious predatory feline
Cat and Lion have higher semantic similarity because their glosses share the word “feline”
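The gloss overlap can be sketched in a few lines. This is a simplified illustration, not a full Lesk implementation: it splits on whitespace, uses a small stopword list chosen here as an assumption, and reuses the two (truncated) glosses exactly as they appear on the card. The function and variable names are illustrative.

```python
# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = frozenset({"a", "an", "the", "of", "and", "usually", "having"})

# Glosses copied from the card, truncated as they are there.
CAT_GLOSS = "feline mammal usually having thick"
LION_GLOSS = "large gregarious predatory feline"


def gloss_overlap(gloss_a, gloss_b):
    """Return the set of non-stopword tokens that both glosses share."""
    words_a = set(gloss_a.lower().split()) - STOPWORDS
    words_b = set(gloss_b.lower().split()) - STOPWORDS
    return words_a & words_b


def lesk_similarity(gloss_a, gloss_b):
    """Score two glosses by the number of content words they share."""
    return len(gloss_overlap(gloss_a, gloss_b))
```

Calling `gloss_overlap(CAT_GLOSS, LION_GLOSS)` returns the shared words, and `lesk_similarity` turns that into a count that can be compared across word pairs.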
characteristic of natural languages
ambiguous
valid hyponym of Dog
spitz
hypernym of Melon
Fruit
compositional semantics
studies how the meanings of words and phrases combine to form the meaning of larger expressions
pragmatics
the study of how context influences meaning