Learning from Data: Text Flashcards
Explain the basic concept of Bag of Words representation. What are its main limitations when dealing with large vocabularies?
Bag of Words represents a text as a vector of word counts over a fixed vocabulary, ignoring word order and grammar. With large vocabularies the vectors become very sparse and high-dimensional, and the raw counts carry no notion of semantic similarity between words.
How would you represent the sentence ‘John likes to watch movies. Mary likes movies too’ using a Bag of Words representation?
For the vocabulary {John, likes, to, watch, movies, Mary, too}
The vector is: [1, 2, 1, 1, 2, 1, 1].
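A minimal sketch of how that count vector can be computed in plain Python, using the vocabulary ordering shown above (the tokenization here is deliberately naive):

```python
from collections import Counter

# Fixed vocabulary, in the order used above.
vocab = ["John", "likes", "to", "watch", "movies", "Mary", "too"]

sentence = "John likes to watch movies. Mary likes movies too"
tokens = sentence.replace(".", "").split()  # naive tokenization

counts = Counter(tokens)
bow_vector = [counts[word] for word in vocab]
print(bow_vector)  # [1, 2, 1, 1, 2, 1, 1]
```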
How do word embeddings improve upon the Bag of Words approach? What advantages do they offer?
Word embeddings are dense, low-dimensional vectors learned from data. They capture word meaning and relationships, so semantically similar words get similar vectors, and they keep the representation compact even for large vocabularies.
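A small sketch of the idea using made-up 3-dimensional vectors (real embeddings such as word2vec or GloVe typically have 100–300 dimensions and are learned from large corpora):

```python
import numpy as np

# Toy, hand-made embeddings purely for illustration; real vectors are learned.
embeddings = {
    "movie":  np.array([0.9, 0.1, 0.0]),
    "film":   np.array([0.85, 0.15, 0.05]),
    "banana": np.array([0.0, 0.2, 0.95]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["movie"], embeddings["film"]))    # high: similar meaning
print(cosine(embeddings["movie"], embeddings["banana"]))  # low: unrelated words
```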
Explain the concept of the hidden state in RNNs. What is its purpose?
The hidden state is a vector that the RNN updates at every time step. It summarizes the inputs seen so far, giving the network memory and context when processing sequential data.
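A minimal sketch of the recurrence for a vanilla (Elman) RNN with made-up sizes and random weights; the hidden state h is carried from one step to the next:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3

# Randomly initialized weights, just to show the shapes involved.
W_xh = rng.normal(size=(hidden_size, input_size))
W_hh = rng.normal(size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous hidden state.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)                    # initial hidden state
sequence = rng.normal(size=(5, input_size))  # 5 time steps of toy input
for x_t in sequence:
    h = rnn_step(x_t, h)                     # h now summarizes everything seen so far
print(h)
```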
What is encoder-decoder architecture, and why is it important for NLP tasks?
An encoder-decoder uses one network to encode the input sequence into a representation (classically a single context vector) and a second network to decode that representation into an output sequence. This enables sequence-to-sequence tasks such as translation and summarization, where input and output lengths differ.
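A rough structural sketch (no training, toy numpy RNN cells) showing the two halves: the encoder compresses the input sequence into a context vector, and the decoder unrolls from that context:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4  # toy hidden/input size shared by both networks for simplicity

def make_cell():
    # One tanh RNN cell with its own randomly initialized weights.
    W_x, W_h = rng.normal(size=(d, d)), rng.normal(size=(d, d))
    return lambda x, h: np.tanh(W_x @ x + W_h @ h)

encoder_cell, decoder_cell = make_cell(), make_cell()

def encode(inputs):
    h = np.zeros(d)
    for x in inputs:          # read the whole input sequence
        h = encoder_cell(x, h)
    return h                  # final hidden state = context vector

def decode(context, steps):
    h, y, outputs = context, np.zeros(d), []
    for _ in range(steps):    # generate one output vector per step
        h = decoder_cell(y, h)
        y = h                 # (a real decoder would project h to output tokens)
        outputs.append(y)
    return outputs

source = rng.normal(size=(6, d))        # e.g. an embedded source sentence
print(decode(encode(source), steps=3))  # 3 decoder outputs
```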
What is the basic idea behind attention mechanisms in neural networks?
Attention focuses on the most relevant parts of the input sequence for better predictions.
How does the attention mechanism help address the limitations of traditional RNNs?
Attention lets the decoder directly access specific parts of the input at each step, removing the bottleneck of squeezing the whole sequence into a single fixed-size context vector.
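A minimal sketch of dot-product attention over a set of encoder hidden states; the softmax weights say how much each input position contributes to the context at the current decoding step:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 4, 6                               # hidden size, input length

encoder_states = rng.normal(size=(n, d))  # one vector per input position
decoder_state = rng.normal(size=d)        # current decoder hidden state (the query)

scores = encoder_states @ decoder_state          # one score per input position
weights = np.exp(scores) / np.exp(scores).sum()  # softmax -> attention weights
context = weights @ encoder_states               # weighted sum of encoder states

print(weights.round(2))  # which input positions the decoder is "looking at"
print(context)           # context vector for this decoding step
```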
Compare and contrast the numerical representations created by Bag of Words versus word embeddings.
Bag of Words is sparse and high-dimensional. Word embeddings are dense, low-dimensional, and capture semantic relationships.
How do word vectors represent semantic relationships between words?
Word vectors place similar words close in space, capturing relationships like ‘king - man + woman ≈ queen.’
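A toy illustration of that analogy arithmetic with hand-made 2D vectors (real analogies only emerge in properly trained embeddings):

```python
import numpy as np

# Toy vectors: one axis loosely "royalty", one axis loosely "gender".
vectors = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.9, 0.1]),
    "man":   np.array([0.1, 0.9]),
    "woman": np.array([0.1, 0.1]),
}

target = vectors["king"] - vectors["man"] + vectors["woman"]

# The nearest word (excluding the query words) should be "queen".
best = min(
    (w for w in vectors if w not in {"king", "man", "woman"}),
    key=lambda w: np.linalg.norm(vectors[w] - target),
)
print(best)  # queen
```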
Explain why feed-forward (dense) layers aren’t well-suited for processing text data.
Dense layers expect fixed-size inputs and treat every position independently, so they ignore word order and context, losing the relationships between words; they also cannot naturally handle variable-length sequences.
Attention mechanisms allow models to ‘correctly associate certain words with other words in a sentence.’ Explain what this means and why it’s important.
Attention links related words, ensuring context-aware predictions, crucial for tasks like translation.
Explain how the vectors passed from the encoder to the decoders are created when using attention and when not using attention.
Without attention: A single context vector summarizes the input. With attention: Multiple weighted vectors highlight relevant parts dynamically.
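A side-by-side sketch in numpy (reusing the dot-product scoring from the earlier attention sketch): without attention the decoder only ever sees the encoder's final hidden state; with attention it builds a fresh weighted combination of all encoder states at every decoding step:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 6, 4
encoder_states = rng.normal(size=(n, d))  # hidden state at every input position

# Without attention: one fixed context vector for the whole output sequence.
context_fixed = encoder_states[-1]

# With attention: a new context per decoding step, weighted by relevance.
def attention_context(decoder_state):
    scores = encoder_states @ decoder_state
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ encoder_states

step1, step2 = rng.normal(size=d), rng.normal(size=d)  # two decoder states
print(context_fixed)                # same vector at every step
print(attention_context(step1))     # differs from...
print(attention_context(step2))     # ...this one, adapting to each step
```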