lesson_12_flashcards

1
Q

What is a language model?

A

A model that estimates the probability of sequences of words and enables applications like predictive typing, text completion, and speech recognition.

2
Q

What is the chain rule of probability in language modeling?

A

A decomposition that computes the probability of a sequence as a product of conditional probabilities of each word given its history: P(w1, …, wn) = P(w1) · P(w2 | w1) · … · P(wn | w1, …, wn-1).
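A minimal sketch of the chain rule on a toy sequence; the conditional probabilities below are invented for illustration, not taken from a real model:

```python
# Chain rule: P(w1..wn) = product of P(wi | history).
# These conditional probabilities are made up for the example.
cond_probs = {
    ("<s>",): {"the": 0.5},
    ("<s>", "the"): {"cat": 0.2},
    ("<s>", "the", "cat"): {"sat": 0.3},
}

def sequence_probability(words):
    """Multiply each word's probability given its full history."""
    p = 1.0
    history = ("<s>",)
    for w in words:
        p *= cond_probs[history][w]
        history = history + (w,)
    return p

print(sequence_probability(["the", "cat", "sat"]))  # 0.5 * 0.2 * 0.3 ≈ 0.03
```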

3
Q

What is perplexity in language modeling?

A

A metric that evaluates a language model’s performance by measuring how well it predicts a sample, with lower values indicating better performance.
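A small sketch of the perplexity computation from per-token probabilities; the sample probabilities are illustrative:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token.
    Lower is better; a uniform model over V options has perplexity V."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model assigning 0.25 to every token in a 4-token sample behaves
# like a uniform guess over 4 options:
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # ≈ 4.0
```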

4
Q

What is teacher forcing in RNN training?

A

A training method where the actual next word from the dataset, not the model’s prediction, is used as the input at each time step.
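A sketch of how teacher forcing arranges the training data: inputs are the true previous tokens, targets are the true next tokens, so the model's own predictions never feed back in during training:

```python
# Teacher forcing: at every step the input is the *true* previous token
# from the dataset, never the model's own prediction.
sentence = ["<s>", "the", "cat", "sat", "</s>"]

inputs = sentence[:-1]    # what the RNN sees at each time step
targets = sentence[1:]    # what it is trained to predict

for x, y in zip(inputs, targets):
    print(f"input={x!r} -> target={y!r}")
```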

5
Q

What are recurrent neural networks (RNNs)?

A

A family of neural architectures for sequence modeling, processing inputs sequentially and maintaining a state vector to represent past inputs.
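A minimal Elman-style recurrence with a scalar state, just to show the shape of the computation; the weights are arbitrary, not trained:

```python
import math

# One recurrent step: h_t = tanh(w_h * h_{t-1} + w_x * x_t).
# The weights below are arbitrary illustration values.
w_h, w_x = 0.5, 1.0

def rnn_step(h_prev, x):
    return math.tanh(w_h * h_prev + w_x * x)

h = 0.0                      # initial state
for x in [1.0, 0.0, -1.0]:   # input sequence, processed left to right
    h = rnn_step(h, x)       # the state summarizes everything seen so far
print(h)
```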

6
Q

What is the vanishing gradient problem in RNNs?

A

Gradients become too small during backpropagation through time, making it difficult to learn long-term dependencies.
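A numeric sketch of why this happens: backpropagation through time multiplies one Jacobian factor per step, and if each factor is below 1 the product decays exponentially with sequence length (0.9 is an arbitrary example factor):

```python
# Each backprop-through-time step contributes one multiplicative factor.
# A factor below 1, compounded over many steps, shrinks the gradient
# toward zero exponentially fast.
factor = 0.9
for steps in (10, 100, 1000):
    print(steps, factor ** steps)
```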

7
Q

What are LSTMs and GRUs?

A

Variants of RNNs designed to address the vanishing gradient problem by incorporating gating mechanisms for better long-term memory retention.
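A scalar sketch of a GRU-style update gate, the core idea behind these gating mechanisms; the gate input value is chosen for illustration:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# GRU-style update: z near 1 keeps the old state, z near 0 overwrites it.
# Keeping the old state lets information survive many time steps.
def gated_update(h_prev, candidate, gate_input):
    z = sigmoid(gate_input)              # update gate, in (0, 1)
    return z * h_prev + (1 - z) * candidate

h = gated_update(h_prev=1.0, candidate=0.0, gate_input=5.0)
print(h)  # close to 1.0: the gate preserved the old state
```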

8
Q

What is masked language modeling?

A

A pretraining task where certain words in a sequence are masked and the model predicts them, improving performance on downstream NLP tasks.
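A sketch of the masked-LM setup on a toy sentence; the masked position is chosen by hand here, whereas BERT-style pretraining masks roughly 15% of tokens at random:

```python
# Masked language modeling: hide some tokens and train the model
# to predict them from the surrounding context.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
mask_positions = {2}  # chosen by hand for illustration

inputs = [("[MASK]" if i in mask_positions else t) for i, t in enumerate(tokens)]
targets = {i: tokens[i] for i in mask_positions}

print(inputs)   # ['the', 'cat', '[MASK]', 'on', 'the', 'mat']
print(targets)  # {2: 'sat'} — what the model must recover
```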

9
Q

What is cross-lingual transfer in masked language models?

A

The ability of a model trained on one language (e.g., English) to perform well on another language (e.g., French) without additional training in that language.

10
Q

What is knowledge distillation in NLP?

A

A technique where a smaller model (student) learns to replicate the predictions of a larger model (teacher), reducing computation costs while retaining accuracy.
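A sketch of the distillation objective: the student is trained against the teacher's soft output distribution rather than only the hard label. Both distributions below are invented for illustration:

```python
import math

# Distillation target: cross-entropy between the teacher's soft
# distribution and the student's prediction. The numbers are made up.
teacher = [0.7, 0.2, 0.1]   # teacher's predicted distribution
student = [0.6, 0.3, 0.1]   # student's current prediction

def cross_entropy(p_teacher, p_student):
    """H(teacher, student): lower when the student mimics the teacher."""
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

print(cross_entropy(teacher, student))
```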

11
Q

What is conditional language modeling?

A

Language modeling conditioned on additional information (e.g., a topic, an image, or another language) for tasks like translation or image captioning.

12
Q

What is the role of cross-entropy in language modeling?

A

A loss function that measures the difference between the model's predicted probability distribution over the next word and the true (typically one-hot) distribution; minimizing it is the standard training objective for language models.
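A worked one-step example: with a one-hot true distribution, cross-entropy reduces to the negative log-probability the model assigned to the correct word (the probabilities are illustrative):

```python
import math

# With a one-hot target, cross-entropy = -log P(correct word).
predicted = {"cat": 0.7, "dog": 0.2, "sat": 0.1}  # illustrative model output
true_word = "cat"

loss = -math.log(predicted[true_word])
print(loss)  # -ln(0.7) ≈ 0.357
```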

13
Q

What are attention mechanisms in RNNs?

A

Mechanisms that allow models to focus on specific parts of a sequence dynamically, improving the representation of long-range dependencies.
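A minimal dot-product attention sketch over toy 2-d state vectors; the states and query are invented for illustration:

```python
import math

# Dot-product attention: score each state against the query, softmax the
# scores into weights, then take a weighted sum of the states.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # encoder states (illustrative)
query = [1.0, 0.0]                              # current decoder query

scores = [sum(q * s for q, s in zip(query, h)) for h in states]
exps = [math.exp(x) for x in scores]
weights = [e / sum(exps) for e in exps]         # softmax: weights sum to 1

# Context vector: a weighted sum dominated by the best-matching states.
context = [sum(w * h[d] for w, h in zip(weights, states)) for d in range(2)]
print(weights, context)
```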

14
Q

What is sequence-to-sequence modeling?

A

Mapping an input sequence to an output sequence, used in tasks like machine translation, summarization, and speech recognition.

15
Q

What is the importance of embeddings in language models?

A

Word embeddings represent words as dense vectors, capturing semantic relationships and improving the input representation for NLP tasks.
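A sketch of how dense vectors capture relatedness via cosine similarity; the 3-d vectors are invented for illustration, not trained embeddings:

```python
import math

# Toy word vectors; real embeddings are learned and much higher-dimensional.
emb = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1 for identical directions, near 0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(emb["cat"], emb["dog"]))  # high: semantically related
print(cosine(emb["cat"], emb["car"]))  # low: unrelated
```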
