Lecture 9 Translating II: Neural methods Flashcards

1
Q

Attention

A

Attention assigns a weight to each context word before combining them into the context vector (a minimal sketch follows below)
* Without attention, a fixed-length context vector has to squeeze all the context words into a single vector
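
A minimal NumPy sketch of dot-product attention, assuming one encoder state per source word and a decoder query of the same dimension; all names and numbers are illustrative, not from the lecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention(query, context):
    """Weight each context word's vector by its similarity to the query,
    then return the weighted sum as the context vector."""
    scores = context @ query      # one dot-product score per context word
    weights = softmax(scores)     # normalize scores into attention weights
    return weights @ context      # weighted combination of context vectors

# Toy example: 3 context words with 4-dimensional vectors (made-up numbers).
rng = np.random.default_rng(0)
enc_states = rng.normal(size=(3, 4))
dec_state = rng.normal(size=4)
print(attention(dec_state, enc_states))
```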

2
Q

RNN encoder

A
  • Often bidirectional, so each word is contextualized by both its left and right neighbours (see the sketch below)

Transformers:
* More efficient: compute T(x1, ..., xn) in one step via attention, rather than word by word
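
A toy NumPy sketch of the bidirectional idea: run one simple RNN left-to-right and another right-to-left, and concatenate their states per word. The tanh cell and all parameters here are illustrative assumptions:

```python
import numpy as np

def rnn_pass(inputs, W, U, b):
    """Run a simple tanh RNN over the inputs in the given order."""
    h = np.zeros(U.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W @ x + U @ h + b)
        states.append(h)
    return states

def bidirectional_encode(inputs, fwd_params, bwd_params):
    """Concatenate a left-to-right and a right-to-left pass, so each
    word's state sees both its left and right context."""
    fwd = rnn_pass(inputs, *fwd_params)
    bwd = rnn_pass(inputs[::-1], *bwd_params)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# Toy usage: 5 words, input dim 3, hidden dim 4, random parameters.
rng = np.random.default_rng(0)
def make_params():
    return rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
words = [rng.normal(size=3) for _ in range(5)]
states = bidirectional_encode(words, make_params(), make_params())
print(states[0].shape)  # (8,): forward 4 + backward 4
```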

3
Q

Translating with transformers

A

The encoder uses standard transformer blocks; the decoder uses more powerful blocks with an extra encoder-decoder attention layer, and its self-attention is made unidirectional via a MASK (a position cannot attend to future positions)
* Each row of the self-attention score matrix is normalized with a softmax (sketch below)
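
A minimal sketch of causally masked self-attention in NumPy, omitting the learned Q/K/V projections and multi-head structure; only the masking and the row-wise softmax from this card are shown:

```python
import numpy as np

def masked_self_attention(X, causal=True):
    """Score every position against every other position (using X itself
    as queries, keys, and values, i.e. no learned projections), mask out
    future positions, and softmax-normalize each row of scores."""
    scores = X @ X.T / np.sqrt(X.shape[1])
    if causal:
        keep = np.tri(len(X), dtype=bool)           # lower triangle: no peeking ahead
        scores = np.where(keep, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # each row now sums to 1
    return weights @ X

X = np.random.default_rng(1).normal(size=(4, 8))    # 4 tokens, dimension 8
print(masked_self_attention(X).shape)               # (4, 8)
```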

4
Q

Neural generation

A

Needs as many output vectors D as there are vocabulary words; the decoder state is scored against each of them and the highest-scoring word is looked up and emitted (sketch below)
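
A toy greedy generation step in NumPy; the vocabulary, matrix name D, and all numbers are illustrative assumptions:

```python
import numpy as np

vocab = ["the", "cat", "sat", "<eos>"]                     # toy vocabulary
D = np.random.default_rng(2).normal(size=(len(vocab), 6))  # one output vector per word

def generate_step(decoder_state):
    """Score the decoder state against every word's output vector,
    softmax into probabilities, and look up the highest-scoring word."""
    scores = D @ decoder_state
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return vocab[int(np.argmax(probs))]

print(generate_step(np.random.default_rng(3).normal(size=6)))
```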

5
Q

Out of vocabulary

A

Approaches:
* Replace the word with a special <unk> token (see the sketch below)
* Copying the word unchanged, because it is probably a name
* Subword segmentation, such as BPE
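
A tiny sketch of the <unk> fallback, with a hypothetical vocabulary:

```python
def tokens_to_ids(tokens, vocab):
    """Map each token to its vocabulary id, falling back to <unk>."""
    unk_id = vocab["<unk>"]
    return [vocab.get(token, unk_id) for token in tokens]

vocab = {"<unk>": 0, "the": 1, "cat": 2}
print(tokens_to_ids(["the", "Felix", "cat"], vocab))  # [1, 0, 2]
```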

6
Q

BPE (Byte pair encoding)

A

We segment words into subwords using BPE and translate those.

BPE finds the most frequent pair of adjacent symbols co-occurring in a corpus, treats that pair as one new item, and repeats (a toy implementation follows).
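
A toy BPE merge loop; the corpus and frequencies are made up for illustration:

```python
from collections import Counter

def most_common_pair(corpus):
    """Count adjacent symbol pairs over the whole corpus and return the
    most frequent one."""
    pairs = Counter()
    for word, freq in corpus.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(corpus, pair):
    """Treat the chosen pair as one new symbol everywhere it occurs."""
    new_corpus = {}
    for word, freq in corpus.items():
        symbols, merged, i = word.split(), [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        new_corpus[" ".join(merged)] = freq
    return new_corpus

# Words as space-separated symbols, with corpus frequencies.
corpus = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
for _ in range(3):
    pair = most_common_pair(corpus)
    corpus = merge_pair(corpus, pair)
    print(pair, "->", "".join(pair))
# merges e+s, then es+t, then l+o (exact ties may break differently)
```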

OPUS: a massive open collection of parallel texts (the same text in multiple languages)

7
Q

Sentence alignment

A

Finding the correspondence between source sentences and their equivalent translations in the target text (a toy length-based scorer follows)
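
A toy length-based scorer in the spirit of length-based aligners such as Gale & Church (which use a proper statistical length model, unlike this sketch); it simply prefers candidates of similar length:

```python
def length_score(src, tgt):
    """Toy alignment score: translations tend to have similar lengths,
    so prefer candidates whose length ratio is close to 1."""
    return min(len(src), len(tgt)) / max(len(src), len(tgt))

src = ["A short one.", "This sentence is considerably longer than the first one."]
tgt = ["Een korte.", "Deze zin is aanzienlijk langer dan de eerste."]
for s in src:
    best = max(tgt, key=lambda t: length_score(s, t))
    print(s, "->", best)
```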

8
Q

Backtranslation

A

If there is only a small parallel corpus for language x to y: train a reverse model (y to x) on it, use that model to translate monolingual y text back into x, add these synthetic pairs to the corpus, and train the x-to-y model again (see the sketch below)

  • You can also align the word vector spaces of the two languages, translate word by word, and use those translations to train the model as well
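
A schematic sketch of the back-translation loop; `train` and `translate` are hypothetical placeholders for a real trainer and decoder, not actual APIs:

```python
def backtranslation(parallel_xy, monolingual_y, train, translate):
    """Schematic back-translation loop.

    parallel_xy   : small list of (x, y) sentence pairs
    monolingual_y : large list of target-language sentences
    train         : hypothetical trainer, list of pairs -> model
    translate     : hypothetical decoder, (model, sentence) -> sentence
    """
    # 1. Train a reverse model y -> x on the small parallel corpus.
    model_yx = train([(y, x) for x, y in parallel_xy])
    # 2. Back-translate monolingual y text into synthetic x sentences,
    #    giving extra (x, y) training pairs.
    synthetic_xy = [(translate(model_yx, y), y) for y in monolingual_y]
    # 3. Retrain the forward model x -> y on real plus synthetic data.
    return train(parallel_xy + synthetic_xy)
```
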
9
Q

MT evaluation

A
  • precision: % of the candidate translation's n-grams that appear in the reference
  • recall: % of the reference's n-grams that appear in the candidate translation

chrFβ = (1 + β²) · (chrP · chrR) / (β² · chrP + chrR)
an F-score over character n-grams, where chrP and chrR are character n-gram precision and recall, and β weights recall β times as much as precision (a sketch follows)
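
A toy chrF computation; note the real chrF averages over n-gram orders 1 through 6, while this sketch uses a single order n:

```python
from collections import Counter

def char_ngrams(text, n):
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(candidate, reference, n=3, beta=2.0):
    """Single-order character n-gram F-score (real chrF averages n = 1..6).
    beta weights recall beta times as much as precision."""
    cand = char_ngrams(candidate, n)
    ref = char_ngrams(reference, n)
    overlap = sum((cand & ref).values())   # clipped n-gram matches
    if overlap == 0:
        return 0.0
    chr_p = overlap / sum(cand.values())   # precision
    chr_r = overlap / sum(ref.values())    # recall
    return (1 + beta**2) * chr_p * chr_r / (beta**2 * chr_p + chr_r)

print(round(chrf("the cat sat on the mat", "the cat sits on the mat"), 3))
```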
