NLP Flashcards

1. Q: What is vector representation in NLP?
   A: A method to represent words or phrases as numerical vectors in a continuous vector space.
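
A tiny illustration in plain Python; the three-dimensional vectors and the `cosine` helper are made up for the example, not taken from any trained model:

```python
import math

# Toy 3-dimensional embeddings (illustrative numbers, not trained values).
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction, 0.0 orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Semantically related words end up closer together in the space:
# cosine(cat, dog) is higher than cosine(cat, car).
```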

2. Q: What is the primary goal of the CBOW model?
   A: To predict a target word based on its context words.
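
A sketch of how CBOW training examples are built from raw text (the function name and window size are illustrative, not from any particular library):

```python
def cbow_pairs(tokens, window=2):
    """Build (context_words, target_word) pairs as used to train CBOW:
    the model sees the context and must predict the target."""
    pairs = []
    for i in range(len(tokens)):
        # Up to `window` words on each side of position i, excluding i itself.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, tokens[i]))
    return pairs

sentence = "the cat sat on the mat".split()
pairs = cbow_pairs(sentence)
# e.g. pairs[2] is (['the', 'cat', 'on', 'the'], 'sat')
```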

3. Q: Fill in the blank: GloVe stands for __________.
   A: Global Vectors for Word Representation

4. Q: How does GloVe differ from Word2Vec?
   A: GloVe uses global word co-occurrence statistics, while Word2Vec learns from local context windows.
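
The global statistic GloVe starts from can be sketched in plain Python (the function name, window size, and two-sentence corpus are made up for the example):

```python
from collections import Counter

def cooccurrence_counts(corpus, window=2):
    """Count how often each word pair appears within `window` tokens of
    each other, aggregated over the whole corpus -- the kind of global
    co-occurrence statistic GloVe is trained on."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if i != j:
                    counts[(word, tokens[j])] += 1
    return counts

corpus = ["ice is cold", "steam is hot"]
counts = cooccurrence_counts(corpus)
# counts[("ice", "is")] is 1; pairs that never co-occur count 0.
```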

5. Q: What is the main advantage of FastText over Word2Vec?
   A: FastText represents words as bags of character n-grams, allowing it to capture subword information.
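
The n-gram decomposition can be sketched as follows; the boundary markers `<` and `>` follow FastText's convention, though the function itself is an illustrative stand-in:

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams with word-boundary markers < and >.
    A word's vector is roughly the sum of its n-gram vectors, so even an
    unseen word can be embedded from n-grams shared with training words."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams

grams = char_ngrams("where")
# Includes "<wh", "whe", "her", "ere", "re>", "<whe", ...
```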

6. Q: What is Word2Vec primarily used for?
   A: To create word embeddings that capture semantic meaning based on context.

7. Q: True or False: Matrix factorization is a technique used in dimensionality reduction.
   A: True

8. Q: What are the two main architectures of Word2Vec?
   A: Skip-gram and Continuous Bag of Words (CBOW).

9. Q: Multiple Choice: Which model is designed to capture the meaning of out-of-vocabulary words?
   A: FastText

10. Q: What kind of data does GloVe use to create word vectors?
    A: Global word co-occurrence matrices.

11. Q: Fill in the blank: In Word2Vec, the Skip-gram model predicts __________ from a given word.
    A: context words
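
A sketch of how Skip-gram training pairs are built, the mirror image of CBOW: given the centre word, predict each context word (function name and window size are illustrative):

```python
def skipgram_pairs(tokens, window=2):
    """Build (center_word, context_word) training pairs for Skip-gram."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs("the cat sat".split())
# One pair per (center, context) combination, e.g. ("sat", "cat").
```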

12. Q: True or False: FastText can generate embeddings for words not seen in the training data.
    A: True

13. Q: What is the purpose of dimensionality reduction in NLP?
    A: To reduce the number of features while preserving important information.

14. Q: How does matrix factorization work in the context of NLP?
    A: It decomposes a large matrix into a product of smaller matrices to uncover latent patterns.
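
One way to make this concrete is a toy gradient-descent factorization in plain Python; the `factorize` function, learning rate, and matrix below are illustrative assumptions, not a production method:

```python
import random

def factorize(M, k=2, steps=3000, lr=0.02, seed=0):
    """Approximate M (n x m) as the product of W (n x k) and H (k x m) by
    gradient descent on squared reconstruction error. The k-dimensional
    rows of W act as latent embeddings of M's rows."""
    rng = random.Random(seed)
    n, m = len(M), len(M[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(k)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                err = M[i][j] - sum(W[i][r] * H[r][j] for r in range(k))
                for r in range(k):
                    w, h = W[i][r], H[r][j]
                    W[i][r] += lr * err * h  # nudge factors to shrink the error
                    H[r][j] += lr * err * w
    return W, H

M = [[5, 3], [4, 2], [1, 1]]
W, H = factorize(M)  # W @ H closely reconstructs M
```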

15. Q: What is a key benefit of using vector representations of words?
    A: They allow mathematical operations that reveal semantic relationships.
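
The classic example is analogy arithmetic. The toy vectors below are chosen by hand so that king − man + woman lands exactly on queen; real embeddings only approximate such offsets:

```python
# Hand-picked 3-dimensional vectors (illustrative, not trained).
vecs = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.5, 0.9, 0.0],
    "woman": [0.5, 0.2, 0.7],
    "cat":   [0.1, 0.5, 0.5],
    "car":   [0.2, 0.3, 0.2],
}

def analogy(a, b, c):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding inputs."""
    query = [x - y + z for x, y, z in zip(vecs[a], vecs[b], vecs[c])]
    def dist(w):
        return sum((p - q) ** 2 for p, q in zip(vecs[w], query))
    return min((w for w in vecs if w not in {a, b, c}), key=dist)

print(analogy("king", "man", "woman"))  # -> queen
```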

16. Q: Multiple Choice: Which of the following is NOT a method for word vector representation?
    A: Term Frequency-Inverse Document Frequency (TF-IDF)
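
For contrast: TF-IDF produces sparse per-document term weights rather than dense word embeddings learned from context, which is presumably why this card treats it as the odd one out. A minimal sketch, using the plain log-ratio IDF variant (function name and corpus are made up):

```python
import math

def tf_idf(docs):
    """Term frequency x inverse document frequency for each (doc, term)."""
    n = len(docs)
    tokenized = [d.split() for d in docs]
    # Document frequency: in how many documents does each term appear?
    df = {}
    for toks in tokenized:
        for t in set(toks):
            df[t] = df.get(t, 0) + 1
    scores = []
    for toks in tokenized:
        tf = {t: toks.count(t) / len(toks) for t in set(toks)}
        scores.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return scores

scores = tf_idf(["the cat sat", "the dog ran"])
# "the" appears in every document, so its weight is zero.
```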

17. Q: What determines the output dimension of a Word2Vec model?
    A: The embedding size, a hyperparameter specified when the model is trained.

18. Q: True or False: The primary goal of NLP vector representations is to replace traditional text processing methods.
    A: False. The goal is to encode text numerically so that models can capture semantic relationships.