lecture 10 Flashcards

1
Q

predicate

A

usually the verb or verb phrase that expresses the action or state
(in dictionary form)

2
Q

thematic role: agent

A

volitional causer of an event

3
Q

thematic role: experiencer

A

experiencer of an event

4
Q

thematic role: force

A

non-volitional causer of the event

5
Q

thematic role: theme

A

participant most directly affected by the event

6
Q

thematic role: result

A

the end product of an event

7
Q

thematic role: content

A

the proposition or content of a propositional event

8
Q

thematic role: instrument

A

an instrument used in an event

9
Q

thematic role: beneficiary

A

the beneficiary of an event

10
Q

thematic role: source

A

the origin of the object of a transfer event

11
Q

thematic role: goal

A

the destination of the object of a transfer event

12
Q

idiom

A

expressions whose meanings are not predictable from the meanings of their individual words

  • noncompositional
  • usually cannot be translated word-for-word into another language
  • literal translation fails to capture the intended meaning, so cultural and contextual knowledge is needed for accurate idiomatic translation
13
Q

IBM models 1-5

A
  • series of word-based statistical models that are induced from parallel data (alignment probability distributions)
  • data-driven
  • laid groundwork for modern statistical machine translation
14
Q

phrase-based statistical machine translation (SMT)

A
  • unlike word-based models that translate words in isolation, phrase-based SMT considers contiguous sequences of words (phrases)
  • improved translation significantly over earlier word-based models
  • handle phrases and idioms better, capture linguistic context better
15
Q

neural machine translation

A
  • quickly became the state of the art
  • relies on deep learning models, specifically neural networks, to perform translations
  • encoder-decoder architecture
16
Q

central problem of machine translation

A

language divergence: structural differences between languages, most notably in word order

17
Q

why is machine translation difficult

A
  1. ambiguity
    –> the same word can have multiple meanings
    –> the same meaning can be expressed by multiple words (word forms)
  2. word order
    –> reflects a deeper underlying syntactic structure
    –> computationally intensive to model
  3. morphological richness
    –> identifying the basic units of words (morphemes)
18
Q

correspondences

A
  1. one-to-one: simple sentence translation maintaining word order and meaning
  2. one-to-many (and reordering): single words in one language may require multiple words in another, and may need reordering
  3. many-to-one (and elision): multiple words in one language combine to form a single word in another
  4. many-to-many: entire phrases or idiomatic expressions may need to be translated into completely different phrases in another language
19
Q

lexical divergences: lexical specificity

A

a word in one language has multiple specific translations in another language
–> brother = gege (older) or didi (younger)

20
Q

lexical divergences: homonyms and polysemous words

A

the different senses of homonymous words generally have different translations
–> (river) bank = Ufer
–> (money) bank = Bank

the different senses of a polysemous word may also have different translations
–> I know that he bought the book, I know Peter, I know math
–> sais que, connais, m’y connais en

21
Q

lexical divergences: morphological differences

A

different languages exhibit varied inflections and morpheme structures

–> new = nouveau/nouvelle

22
Q

lexical divergences

A
  1. homonymous words
  2. polysemous words
  3. lexical specificity
  4. morphological divergences
23
Q

syntactic divergences

A
  1. word order
  2. head-marking vs dependent-marking
  3. pro-drop languages
  4. negation
24
Q

syntactic divergences: word order

A
  • word order can be fixed or free
  • languages with a fixed word order have sentences that follow a specific structure (e.g., SVO)
25
Q

syntactic divergences: head-marking vs dependent-marking

A
  • head-marking languages: grammatical relationships are indicated on the head of a phrase
    –> the man house-his
  • dependent-marking languages: grammatical relationships are indicated on the dependents of the head
    –> the man’s house
26
Q

syntactic divergences: pro-drop languages

A
  • these languages can omit pronouns

–> e.g., Spanish: I eat = como

27
Q

syntactic divergences: negation

A

negation operates differently across languages

28
Q

semantic differences

A
  1. aspect
  2. motion events
29
Q

semantic differences: aspect

A

languages differ in how they convey ongoing actions

  • English: progressive aspect: (is) swimming
  • German: expression with an adverb: schwimmt gerade
30
Q

semantic differences: motion events

A

have two properties
  1. manner of motion (swimming)
  2. direction of motion (across the lake)

languages either express the manner with a verb and the direction with a ‘satellite’, or vice versa

31
Q

why model translation with a probabilistic model

A
  1. we would like to have a measure of confidence for the translations we learn
  2. we would like to model uncertainty in translation
32
Q

model

A

a simplified and idealized understanding of a physical process

33
Q

translation explained with the Noisy Channel Model

A
  • general framework for many NLP problems
  1. generate the target sentence
  2. a channel corrupts the target
  3. the source sentence is a corruption of the target sentence

–> translation is then the process of recovering the original signal (e) given the corrupted signal (f)
–> P(e|f) ∝ P(e) * P(f|e), so the best translation is the e that maximizes P(e) * P(f|e) (see the sketch below)
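
A minimal sketch of the resulting decision rule, with toy candidate sentences and invented probabilities (none of these values or sentences come from the lecture):

# Toy noisy-channel ranking: choose the target sentence e that maximizes P(e) * P(f|e).
# Candidates, the source sentence f, and all probabilities are invented for illustration.
candidates = ["the house is small", "the house is little", "small is the house"]

lm_prob = {  # fluency: P(e) from a language model
    "the house is small": 0.20,
    "the house is little": 0.10,
    "small is the house": 0.01,
}
tm_prob = {  # fidelity: P(f|e) from a translation model, e.g., f = "das Haus ist klein"
    "the house is small": 0.30,
    "the house is little": 0.35,
    "small is the house": 0.30,
}

best_e = max(candidates, key=lambda e: lm_prob[e] * tm_prob[e])
print(best_e)  # "the house is small": fluency and fidelity are traded off jointly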

34
Q

why use the noisy channel model

A
  1. makes it easier to mathematically represent translation and learn probabilities
  2. fidelity (accuracy of content) and fluency (naturalness of language) can be modeled separately
35
Q

word alignment

A

to learn sentence translation probabilities, we first need to learn word-level translation probabilities

  1. start with a parallel sentence pair
    –> a sentence in one language paired with its translation in another language
  2. since each sentence pair allows multiple possible alignments, we use many parallel sentence pairs
    –> multiple possible word alignments
  3. key idea: look at the co-occurrence of translated words; words that occur together across parallel sentences are likely to be translations of each other
  4. calculate P(f|e)
    –> the probability of a word in language 1 (f) given another word (e)
36
Q

problem with word alignment

A

we can only find the best alignment if we know the word translation probabilities, but we can only estimate those probabilities if we know the alignments
–> this is a chicken-and-egg problem

37
Q

solution to word alignment problem

A

iterative process: Expectation-Maximization (EM) algorithm

  1. estimate alignment probabilities using word translation probabilities
  2. re-estimate word translation probabilities
  • since we don’t know the best alignment initially, we consider all possible alignments when estimating the word translation probabilities and weight each of them by its corresponding alignment probability
  • P(f|e) is computed as the ratio of the expected number of times the pair (f, e) occurs to the expected number of times any word pairs with e (see the sketch below)
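
A minimal IBM-Model-1-style sketch of this iteration on a tiny invented parallel corpus (the corpus is made up, and the NULL word is omitted for brevity, so this is a simplification rather than the lecture's exact formulation):

from collections import defaultdict

# Tiny invented parallel corpus of (source f, target e) sentence pairs.
corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]

# Initialize P(f|e) uniformly over the source vocabulary.
f_vocab = {f for fs, _ in corpus for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))  # t[(f, e)] approximates P(f|e)

for _ in range(10):  # EM iterations
    count = defaultdict(float)  # expected number of times (f, e) are aligned
    total = defaultdict(float)  # expected number of times e aligns with any word
    for fs, es in corpus:
        for f in fs:
            # E-step: spread each f over all words in es, weighted by current P(f|e).
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                delta = t[(f, e)] / norm
                count[(f, e)] += delta
                total[e] += delta
    # M-step: re-estimate P(f|e) as the ratio of expected counts.
    for (f, e), c in count.items():
        t[(f, e)] = c / total[e]

print(t[("haus", "house")])  # converges towards 1.0: "haus" gets pinned to "house"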
38
Q

phrase-based SMT

A

use phrases (sequences of words) as the basic translation unit

–> Instead of aligning single words between the source and target languages, we align entire phrases.

39
Q

benefits of phrase-based SMT

A
  1. local reordering: PB-SMT allows intra-phrase reordering, meaning that within a single phrase or sequence of words the order can be adjusted and memorized to better match the target language’s structure
    –> the ordering of words is adapted to fit the syntactic rules of the other language
  2. sense disambiguation: PB-SMT uses the context provided by neighboring words within a phrase to disambiguate meaning
  3. handling institutionalized expressions: idioms can be learned as a single unit
  4. improved fluency: incorporating entire phrases, which can be of any length, enhances the natural flow of translations
40
Q

learning the phrase translation model

A
  1. learn the phrase table (the central data structure in PB-SMT; see the sketch below)
  2. learn the phrase translation probabilities
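
A minimal sketch of what a phrase table conceptually holds; the entries and probabilities below are invented for illustration only (a real table is extracted from word-aligned parallel data):

# Toy phrase table: source phrase -> list of (target phrase, phrase translation probability).
phrase_table = {
    "natuerlich": [("of course", 0.5), ("naturally", 0.3), ("course", 0.2)],
    "das haus": [("the house", 0.8), ("the building", 0.2)],
    "den loeffel abgeben": [("kick the bucket", 0.6), ("die", 0.4)],  # idiom stored as one unit
}

def translation_options(source_phrase):
    """Return candidate target phrases for a source phrase, best-scoring first."""
    return sorted(phrase_table.get(source_phrase, []), key=lambda pair: -pair[1])

print(translation_options("das haus"))  # [('the house', 0.8), ('the building', 0.2)]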
41
Q

SMT pipeline

A
  1. word alignment
  2. phrase extraction + distortion modelling + feature extraction + language modeling
  3. tuning
  4. decoder
42
Q

getting word order right in PB-SMT

A

preprocessing the input by changing the order of words in the input sentence to match the order of the words in the target language

  1. parse the sentence to understand its syntactic structure
  2. apply rules to transform the tree
43
Q

addressing rich morphology

A
  1. break words into their component morphemes
  2. learn translations for the morphemes
44
Q

transliteration

A

rendering words from one script into another based on their pronunciation; used to handle names and OOVs (out-of-vocabulary words)

45
Q

evaluation of MT output

A

with respect to
1. adequacy: how well the output preserves the content of the source text
2. fluency: how good the output is as well-formed text in the target language

types:
1. human evaluation
2. automatic evaluation
3. BLEU: compares n-grams of the candidate translation with reference translations
4. TER: measures the number of edits required to turn the output into the reference
5. METEOR

46
Q

precision

A

#words in candidate that are in ref / #words in candidate

(with repetition)

47
Q

modified precision

A

#words in candidate that are in ref / #words in candidate

clip the number of matching words to their max count in the reference sentence
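
A minimal sketch of modified unigram precision with clipping, using a toy degenerate candidate/reference pair (invented for illustration) to show why clipping matters:

from collections import Counter

def modified_precision(candidate, reference):
    """Unigram precision where each candidate word's matches are clipped to its count in the reference."""
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    return clipped / sum(cand_counts.values())

# Plain precision would score this candidate 7/7 = 1.0;
# clipping "the" to its reference count of 2 gives 2/7 instead.
print(modified_precision("the the the the the the the",
                         "the cat is on the mat"))  # 0.2857...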

48
Q

recall

A

cannot be used for PB-SMT

49
Q

greedy decoding

A

selects the word with the highest probability at each step

–> risks running into local optima

50
Q

sampling decoding

A

randomly selects the next word based on the probability distribution

–> introduces randomness, potentially capturing more diverse translations but at the risk of inconsistencies
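
A minimal sketch contrasting the two strategies for a single decoding step, with an invented next-word distribution (the words and probabilities are made up for illustration):

import random

# Toy next-word distribution at one decoding step.
next_word_probs = {"house": 0.55, "home": 0.30, "building": 0.10, "hut": 0.05}

# Greedy: always take the single most probable word; locally optimal choices
# can lead the whole translation into a local optimum.
greedy_choice = max(next_word_probs, key=next_word_probs.get)

# Sampling: draw the next word according to the distribution; more diverse
# output, but at the risk of inconsistent translations.
words, probs = zip(*next_word_probs.items())
sampled_choice = random.choices(words, weights=probs, k=1)[0]

print(greedy_choice, sampled_choice)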
