Learning from Data: Text Flashcards
Explain the basic concept of Bag of Words representation. What are its main limitations when dealing with large vocabularies?
Bag of Words represents a text as a vector of word counts over a fixed vocabulary, ignoring word order and grammar. With large vocabularies the vectors become very sparse and high-dimensional, and the raw counts carry no notion of semantic similarity between words.
How would you represent the sentence ‘John likes to watch movies. Mary likes movies too’ using a Bag of Words representation?
For the vocabulary {John, likes, to, watch, movies, Mary, too}
The vector is: [1, 2, 1, 1, 2, 1, 1].
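A minimal sketch of how that count vector can be computed in plain Python, using the vocabulary ordering shown above (the tokenization here is deliberately naive):

```python
from collections import Counter

# Fixed vocabulary, in the order used above.
vocab = ["John", "likes", "to", "watch", "movies", "Mary", "too"]

sentence = "John likes to watch movies. Mary likes movies too"
tokens = sentence.replace(".", "").split()  # naive tokenization

counts = Counter(tokens)
bow_vector = [counts[word] for word in vocab]
print(bow_vector)  # [1, 2, 1, 1, 2, 1, 1]
```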
How do word embeddings improve upon the Bag of Words approach? What advantages do they offer?
Word embeddings are dense, low-dimensional vectors learned from data. They capture word meaning and relationships, so semantically similar words get similar vectors, and they keep the representation compact even for large vocabularies.
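A small sketch of the idea using made-up 3-dimensional vectors (real embeddings such as word2vec or GloVe typically have 100–300 dimensions and are learned from large corpora):

```python
import numpy as np

# Toy, hand-made embeddings purely for illustration; real vectors are learned.
embeddings = {
    "movie":  np.array([0.9, 0.1, 0.0]),
    "film":   np.array([0.85, 0.15, 0.05]),
    "banana": np.array([0.0, 0.2, 0.95]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["movie"], embeddings["film"]))    # high: similar meaning
print(cosine(embeddings["movie"], embeddings["banana"]))  # low: unrelated words
```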
Explain the concept of the hidden state in RNNs. What is its purpose?
The hidden state is a vector that the RNN updates at every time step. It summarizes the inputs seen so far, giving the network memory and context when processing sequential data.
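A minimal sketch of the recurrence for a vanilla (Elman) RNN with made-up sizes and random weights; the hidden state h is carried from one step to the next:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3

# Randomly initialized weights, just to show the shapes involved.
W_xh = rng.normal(size=(hidden_size, input_size))
W_hh = rng.normal(size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous hidden state.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)                    # initial hidden state
sequence = rng.normal(size=(5, input_size))  # 5 time steps of toy input
for x_t in sequence:
    h = rnn_step(x_t, h)                     # h now summarizes everything seen so far
print(h)
```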
What is encoder-decoder architecture, and why is it important for NLP tasks?
An encoder-decoder uses one network to encode the input sequence into a representation (classically a single context vector) and a second network to decode that representation into an output sequence. This enables sequence-to-sequence tasks such as translation and summarization, where input and output lengths differ.
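A rough structural sketch (no training, toy numpy RNN cells) showing the two halves: the encoder compresses the input sequence into a context vector, and the decoder unrolls from that context:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4  # toy hidden/input size shared by both networks for simplicity

def make_cell():
    # One tanh RNN cell with its own randomly initialized weights.
    W_x, W_h = rng.normal(size=(d, d)), rng.normal(size=(d, d))
    return lambda x, h: np.tanh(W_x @ x + W_h @ h)

encoder_cell, decoder_cell = make_cell(), make_cell()

def encode(inputs):
    h = np.zeros(d)
    for x in inputs:          # read the whole input sequence
        h = encoder_cell(x, h)
    return h                  # final hidden state = context vector

def decode(context, steps):
    h, y, outputs = context, np.zeros(d), []
    for _ in range(steps):    # generate one output vector per step
        h = decoder_cell(y, h)
        y = h                 # (a real decoder would project h to output tokens)
        outputs.append(y)
    return outputs

source = rng.normal(size=(6, d))        # e.g. an embedded source sentence
print(decode(encode(source), steps=3))  # 3 decoder outputs
```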
What is the basic idea behind attention mechanisms in neural networks?
Attention focuses on the most relevant parts of the input sequence for better predictions.
How does the attention mechanism help address the limitations of traditional RNNs?
Attention lets the decoder directly access specific parts of the input at each step, removing the bottleneck of squeezing the whole sequence into a single fixed-size context vector.
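A minimal sketch of dot-product attention over a set of encoder hidden states; the softmax weights say how much each input position contributes to the context at the current decoding step:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 4, 6                               # hidden size, input length

encoder_states = rng.normal(size=(n, d))  # one vector per input position
decoder_state = rng.normal(size=d)        # current decoder hidden state (the query)

scores = encoder_states @ decoder_state          # one score per input position
weights = np.exp(scores) / np.exp(scores).sum()  # softmax -> attention weights
context = weights @ encoder_states               # weighted sum of encoder states

print(weights.round(2))  # which input positions the decoder is "looking at"
print(context)           # context vector for this decoding step
```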
Compare and contrast the numerical representations created by Bag of Words versus word embeddings.
Bag of Words is sparse and high-dimensional. Word embeddings are dense, low-dimensional, and capture semantic relationships.
How do word vectors represent semantic relationships between words?
Word vectors place similar words close in space, capturing relationships like ‘king - man + woman ≈ queen.’
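A toy illustration of that analogy arithmetic with hand-made 2D vectors (real analogies only emerge in properly trained embeddings):

```python
import numpy as np

# Toy vectors: one axis loosely "royalty", one axis loosely "gender".
vectors = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.9, 0.1]),
    "man":   np.array([0.1, 0.9]),
    "woman": np.array([0.1, 0.1]),
}

target = vectors["king"] - vectors["man"] + vectors["woman"]

# The nearest word (excluding the query words) should be "queen".
best = min(
    (w for w in vectors if w not in {"king", "man", "woman"}),
    key=lambda w: np.linalg.norm(vectors[w] - target),
)
print(best)  # queen
```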
Explain why feed-forward (dense) layers aren’t well-suited for processing text data.
Dense layers expect fixed-size inputs and treat every position independently, so they ignore word order and context, losing the relationships between words; they also cannot naturally handle variable-length sequences.
Attention mechanisms allow models to ‘correctly associate certain words with other words in a sentence.’ Explain what this means and why it’s important.
Attention links related words, ensuring context-aware predictions, crucial for tasks like translation.
Explain how the vectors passed from the encoder to the decoders are created when using attention and when not using attention.
Without attention: A single context vector summarizes the input. With attention: Multiple weighted vectors highlight relevant parts dynamically.
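A side-by-side sketch in numpy (reusing the dot-product scoring from the earlier attention sketch): without attention the decoder only ever sees the encoder's final hidden state; with attention it builds a fresh weighted combination of all encoder states at every decoding step:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 6, 4
encoder_states = rng.normal(size=(n, d))  # hidden state at every input position

# Without attention: one fixed context vector for the whole output sequence.
context_fixed = encoder_states[-1]

# With attention: a new context per decoding step, weighted by relevance.
def attention_context(decoder_state):
    scores = encoder_states @ decoder_state
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ encoder_states

step1, step2 = rng.normal(size=d), rng.normal(size=d)  # two decoder states
print(context_fixed)                # same vector at every step
print(attention_context(step1))     # differs from...
print(attention_context(step2))     # ...this one, adapting to each step
```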