lesson_12_flashcards
What is a language model?
A model that estimates the probability of sequences of words and enables applications like predictive typing, text completion, and speech recognition.
What is the chain rule of probability in language modeling?
An identity that factors the probability of a whole sequence into a product of conditional probabilities, one for each word given the words that precede it.
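In symbols (standard notation, not specific to this lesson):

$$P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})$$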
What is perplexity in language modeling?
A metric for evaluating language models, defined as the exponentiated average negative log-likelihood per token of a test sample; lower values indicate the model predicts the sample better.
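A minimal sketch of the computation, using made-up per-token probabilities rather than anything from the lesson:

```python
import math

# Perplexity = exp of the average negative log-probability per token.
token_probs = [0.2, 0.1, 0.4, 0.25]   # dummy model probabilities for a 4-token sample
nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(nll)
print(perplexity)  # ~4.7: roughly as uncertain as a uniform choice among ~4.7 words
```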
What is teacher forcing in RNN training?
A training method where the actual next word from the dataset, not the model’s prediction, is used as the input at each time step.
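A minimal PyTorch sketch (the model sizes and dummy batch are illustrative assumptions, not from the lesson): the inputs are the ground-truth tokens shifted by one position, so the network never consumes its own predictions during training.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 50, 32, 64
embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
head = nn.Linear(hidden_dim, vocab_size)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (8, 20))    # dummy batch of token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # teacher forcing: feed ground truth, predict next token

hidden_states, _ = rnn(embed(inputs))
logits = head(hidden_states)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
```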
What are recurrent neural networks (RNNs)?
A family of neural architectures for sequence modeling that process inputs one step at a time while maintaining a hidden state vector summarizing the inputs seen so far.
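The core recurrence, sketched in NumPy with arbitrary sizes (an illustration, not the lesson's code): the same weights are applied at every step, and the state vector h carries information forward.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 4, 8, 5
W_x = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                   # state vector summarizing past inputs
for x_t in rng.normal(size=(seq_len, input_dim)):
    h = np.tanh(W_x @ x_t + W_h @ h + b)   # h_t = tanh(W_x x_t + W_h h_{t-1} + b)
```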
What is the vanishing gradient problem in RNNs?
Gradients shrink exponentially as they are backpropagated through many time steps, making it difficult for plain RNNs to learn long-range dependencies.
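A toy illustration (the sizes and weight scale are assumptions): the gradient through t steps involves a product of t Jacobians, so with small recurrent weights it decays roughly exponentially in t.

```python
import numpy as np

rng = np.random.default_rng(0)
W_h = rng.normal(size=(16, 16)) * 0.1   # recurrent weights with spectral norm < 1
grad = np.ones(16)
for _ in range(50):
    grad = W_h.T @ grad                 # the tanh derivative (<= 1) would only shrink this further
print(np.linalg.norm(grad))             # vanishingly small: long-range learning signal is lost
```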
What are LSTMs and GRUs?
Gated variants of RNNs (Long Short-Term Memory and Gated Recurrent Unit networks) designed to mitigate the vanishing gradient problem, using gating mechanisms to control what the state retains and forgets over long spans.
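In PyTorch, both are drop-in replacements for a plain RNN layer (the layer sizes here are illustrative):

```python
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)  # gates plus a separate cell state
gru = nn.GRU(input_size=32, hidden_size=64, batch_first=True)    # gates with a single state vector
```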
What is masked language modeling?
A pretraining task where certain words in a sequence are masked and the model predicts them, improving performance on downstream NLP tasks.
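A minimal sketch of just the masking step (the 15% rate follows BERT's convention; the sentence is a made-up example):

```python
import random

tokens = "the cat sat on the mat".split()
masked, targets = [], {}
for i, tok in enumerate(tokens):
    if random.random() < 0.15:
        targets[i] = tok          # the model is trained to predict the original token here
        masked.append("[MASK]")
    else:
        masked.append(tok)
print(masked, targets)
```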
What is cross-lingual transfer in masked language models?
The ability of a multilingual masked language model fine-tuned on task data in one language (e.g., English) to perform the same task in another language (e.g., French) without task-specific training in that language.
What is knowledge distillation in NLP?
A technique where a smaller model (student) learns to replicate the predictions of a larger model (teacher), reducing computation costs while retaining accuracy.
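A minimal sketch of the soft-label distillation loss (dummy logits; the temperature T and the T^2 scaling follow the common Hinton-style recipe, which may differ from the lesson's exact setup):

```python
import torch
import torch.nn.functional as F

T = 2.0
teacher_logits = torch.randn(8, 100)                 # from a larger, frozen teacher
student_logits = torch.randn(8, 100, requires_grad=True)

loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),       # student's softened distribution
    F.softmax(teacher_logits / T, dim=-1),           # teacher's softened distribution
    reduction="batchmean",
) * T * T
loss.backward()
```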
What is conditional language modeling?
Language modeling conditioned on additional information (e.g., a topic, an image, or another language) for tasks like translation or image captioning.
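In symbols, with x denoting the conditioning input (standard notation):

$$P(y_1, \dots, y_m \mid x) = \prod_{i=1}^{m} P(y_i \mid y_1, \dots, y_{i-1}, x)$$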
What is the role of cross-entropy in language modeling?
A loss function that measures the mismatch between the model's predicted distribution over the vocabulary and the true next-token distribution; minimizing it is the standard training objective for language models.
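A minimal sketch showing that, for a one-hot target, cross-entropy reduces to the negative log-probability of the correct next token (the vocabulary size and logits are dummies):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 10)       # model scores over a 10-word vocabulary
target = torch.tensor([3])        # index of the true next token
loss = F.cross_entropy(logits, target)
manual = -F.log_softmax(logits, dim=-1)[0, 3]
assert torch.allclose(loss, manual)
```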
What are attention mechanisms in RNNs?
Mechanisms that allow models to focus on specific parts of a sequence dynamically, improving the representation of long-range dependencies.
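A minimal dot-product attention sketch (the dimensions are illustrative; real systems often add learned projections and scaling):

```python
import torch
import torch.nn.functional as F

encoder_states = torch.randn(7, 64)      # one vector per source position
decoder_state = torch.randn(64)          # current decoder state (the query)

scores = encoder_states @ decoder_state  # similarity with each source position
weights = F.softmax(scores, dim=0)       # attention distribution over positions
context = weights @ encoder_states       # weighted summary passed to the decoder
```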
What is sequence-to-sequence modeling?
Mapping an input sequence to an output sequence, used in tasks like machine translation, summarization, and speech recognition.
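A skeleton of an encoder-decoder model in PyTorch (the GRU layers and sizes are assumptions for illustration, not the lesson's architecture):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_embed(src))       # compress the source sequence
        out, _ = self.decoder(self.tgt_embed(tgt), state)  # decode conditioned on it
        return self.head(out)                              # next-token logits

model = Seq2Seq(src_vocab=100, tgt_vocab=120)
logits = model(torch.randint(0, 100, (2, 9)), torch.randint(0, 120, (2, 7)))
```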
What is the importance of embeddings in language models?
Word embeddings represent words as dense vectors, capturing semantic relationships and improving the input representation for NLP tasks.
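A minimal lookup sketch (table and batch sizes are arbitrary):

```python
import torch
import torch.nn as nn

embed = nn.Embedding(num_embeddings=10_000, embedding_dim=128)  # one dense vector per vocabulary id
token_ids = torch.tensor([[12, 7, 431]])  # a batch with one 3-token sequence
vectors = embed(token_ids)                # shape: (1, 3, 128)
```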