Lecture 10 Flashcards
predicate
usually the verb or verb phrase that expresses the action or state
(in dictionary form)
thematic role: agent
volitional causer of an event
thematic role: experiencer
experiencer of an event
thematic role: force
non-volitional causer of the event
thematic role: theme
participant most directly affected by the event
thematic role: result
the end product of an event
thematic role: content
the proposition or content of a propositional event
thematic role: instrument
an instrument used in an event
thematic role: beneficiary
the beneficiary of an event
thematic role: source
the origin of the object of a transfer event
thematic role: goal
the destination of the object of a transfer event
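The roles above can be attached to a predicate as labelled arguments. The following is a minimal sketch, assuming a hypothetical `Frame` class and an illustrative sentence ("John broke the window with a rock") chosen for this example rather than taken from the lecture:

```python
from dataclasses import dataclass, field

# Role inventory copied from the cards above.
THEMATIC_ROLES = {
    "agent", "experiencer", "force", "theme", "result",
    "content", "instrument", "beneficiary", "source", "goal",
}


@dataclass
class Frame:
    """Hypothetical container: one predicate plus its role-labelled arguments."""
    predicate: str                              # verb in dictionary form
    roles: dict = field(default_factory=dict)   # role name -> filler phrase

    def add(self, role, filler):
        # Reject labels that are not in the inventory above.
        if role not in THEMATIC_ROLES:
            raise ValueError(f"unknown thematic role: {role}")
        self.roles[role] = filler
        return self  # returning self allows chained calls


# "John broke the window with a rock."
frame = (
    Frame("break")
    .add("agent", "John")          # volitional causer
    .add("theme", "the window")    # participant most directly affected
    .add("instrument", "a rock")   # instrument used in the event
)
```

The predicate is stored in dictionary form ("break", not "broke"), matching the predicate card.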
idiom
expressions whose meanings are not predictable from the meanings of their individual words
- noncompositional
- usually cannot be translated word-for-word into another language
- literal translations miss the intended meaning, so translating idioms accurately requires cultural and contextual knowledge
IBM models 1-5
- series of word-based statistical models whose parameters (translation and alignment probability distributions) are induced from parallel data
- data-driven
- laid groundwork for modern statistical machine translation
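To make "induced from parallel data" concrete, here is a rough sketch of EM training for the lexical translation table of IBM Model 1 only (no NULL word, nothing from Models 2-5). The function name, the three-sentence toy corpus, and the iteration count are assumptions made for this example:

```python
from collections import defaultdict

# Toy German-English parallel corpus (illustrative, not from the lecture).
CORPUS = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(f | e) with EM; returns a dict keyed by (f, e)."""
    src_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(src_vocab)
    t = defaultdict(lambda: uniform)          # uniform initialisation

    for _ in range(iterations):
        count = defaultdict(float)            # expected count of (f, e) links
        total = defaultdict(float)            # expected count of e
        # E-step: distribute each source word over the target words
        # of its sentence pair, in proportion to the current t(f | e).
        for fs, es in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise the expected counts per target word.
        t = defaultdict(float, {fe: count[fe] / total[fe[1]] for fe in count})
    return t


t = train_ibm_model1(CORPUS)
```

On this corpus "das" co-occurs with "the" in two sentence pairs, so probability mass for "the" shifts toward "das" and away from "Haus" and "Buch" as the iterations proceed.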
phrase-based statistical machine translation (SMT)
- unlike word-based models, which translate words in isolation, phrase-based SMT translates contiguous sequences of words (phrases)
- improved translation significantly over earlier word-based models
- handles phrases and idioms better, captures linguistic context better
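The difference between word-based and phrase-based lookup can be sketched with a toy phrase table. This is not a real SMT decoder (no probabilities, no reordering, no language model); it is a greedy, left-to-right longest-match lookup, and the function name and the English-German table entries are hypothetical:

```python
# Hypothetical English -> German phrase table; keys are token tuples.
PHRASE_TABLE = {
    ("he",): "er",
    ("kicked",): "trat",
    ("the",): "den",
    ("bucket",): "Eimer",
    ("kicked", "the", "bucket"): "ist gestorben",  # idiom stored as one unit
}


def translate_greedy(tokens, phrase_table, max_len=3):
    """Translate left to right, always taking the longest matching phrase."""
    out, i = [], 0
    while i < len(tokens):
        # Try the longest span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            src = tuple(tokens[i:i + n])
            if src in phrase_table:
                out.append(phrase_table[src])
                i += n
                break
        else:
            out.append(tokens[i])  # unknown word: copy it through unchanged
            i += 1
    return " ".join(out)


sentence = "he kicked the bucket".split()
word_by_word = translate_greedy(sentence, PHRASE_TABLE, max_len=1)
phrase_based = translate_greedy(sentence, PHRASE_TABLE, max_len=3)
```

With `max_len=1` the idiom is translated literally, one word at a time; with `max_len=3` the three-word entry matches and the idiom is translated as a unit.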
neural machine translation
- quickly became the state of the art
- relies on deep learning models, specifically neural networks, to perform translations
- encoder-decoder architecture
central problem of machine translation
language divergence: structural differences in word order between languages
why is machine translation difficult
- ambiguity
–> same word can have multiple meanings
–> same meaning can be described by multiple words (or word forms)
- word order
–> underlying deeper syntactic structure
–> computationally intensive
- morphological richness
–> identifying basic units of words (morphemes)
correspondences
- one-to-one: simple sentence translation maintaining word order and meaning
- one-to-many (and reordering): a single word in one language may require multiple words in another, and may need reordering
- many-to-one (and elision): multiple words in one language combine to form a single word in another
- many-to-many: entire phrases or idiomatic expressions may need to be translated into completely different phrases in another language
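These four correspondence types can be read off a word alignment. The sketch below assumes an alignment is given as a set of `(source_index, target_index)` links; the function names and the example (English "to the bank" aligned to German "zur Bank") are illustrative and not from the lecture:

```python
def alignment_groups(links):
    """Group links that share a source or target word into connected units."""
    groups = []  # list of (set of source indices, set of target indices)
    for s, t in sorted(links):
        merged_src, merged_tgt = {s}, {t}
        rest = []
        for gs, gt in groups:
            if s in gs or t in gt:      # this link touches an existing group
                merged_src |= gs
                merged_tgt |= gt
            else:
                rest.append((gs, gt))
        rest.append((merged_src, merged_tgt))
        groups = rest
    return sorted(groups, key=lambda g: min(g[0]))


def correspondence_type(src, tgt):
    """Name the correspondence from the number of words on each side."""
    if len(src) == 1 and len(tgt) == 1:
        return "one-to-one"
    if len(src) == 1:
        return "one-to-many"
    if len(tgt) == 1:
        return "many-to-one"
    return "many-to-many"


# English: to(0) the(1) bank(2)   German: zur(0) Bank(1)
links = {(0, 0), (1, 0), (2, 1)}
types = [correspondence_type(s, t) for s, t in alignment_groups(links)]
```

Here "to the" maps onto the single contracted form "zur" (many-to-one), while "bank" maps onto "Bank" (one-to-one).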
lexical divergences: lexical specificity
a word in one language has multiple specific translations in another language
–> English brother = Mandarin gege (older) or didi (younger)
lexical divergences: homonyms and polysemous words
the different senses of homonymous words generally have different translations
–> (river) bank = German Ufer
–> (money) bank = German Bank
the different senses of a polysemous word may also have different translations
–> I know that he bought the book, I know Peter, I know math
–> French: sais que, connais, m’y connais en
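One consequence of these divergences is that a translation lexicon has to be keyed by sense, not only by word form. The following minimal sketch uses only the examples from the cards above; the dictionary layout, the language codes, and the function names are assumptions made for illustration:

```python
# (target language, English word, sense) -> translation.
# Entries are the examples from the cards; "de" = German, "zh" = Mandarin.
LEXICON = {
    ("de", "bank", "river"): "Ufer",
    ("de", "bank", "money"): "Bank",
    ("zh", "brother", "older"): "gege",
    ("zh", "brother", "younger"): "didi",
}


def required_senses(word, lang):
    """List the distinctions the target language forces us to make."""
    return sorted(s for (l, w, s) in LEXICON if l == lang and w == word)


def translate_word(word, sense, lang):
    """Look up a translation; the sense must be resolved beforehand."""
    try:
        return LEXICON[(lang, word, sense)]
    except KeyError:
        options = required_senses(word, lang)
        raise KeyError(f"no entry for {word!r}/{sense!r}; known senses: {options}")


river_bank = translate_word("bank", "river", "de")
brother_senses = required_senses("brother", "zh")
```

The lookup fails when the sense is not supplied correctly, which mirrors why a system must disambiguate the source word before it can pick a translation.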