Lecture 5 Flashcards

1
Q

What is Semantics in NLP?

A

Semantics is the study of meaning in language, focusing on what expressions refer to and on the truth of statements in text and speech.

2
Q

Why are Word Representations important in NLP?

A

Word representations allow us to find documents with similar meaning rather than exact word matches, improving search and retrieval.

3
Q

What does “You shall know a word by the company it keeps” mean?

A

It implies that words used in similar contexts tend to have similar meanings, a foundation for distributional semantics.

4
Q

What are Word Embeddings?

A

Word embeddings are vector representations of words, capturing semantic similarity by placing similar words close in vector space.

5
Q

Describe Word2Vec and its two main methods.

A

Word2Vec creates word embeddings through Skip-gram (predicting context from target word) and CBOW (predicting target word from context).
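
A minimal sketch of training both variants with the gensim library (the toy corpus and hyperparameters are illustrative only, not from the lecture):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (illustrative only).
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# Skip-gram: sg=1 predicts surrounding context words from the target word.
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# CBOW: sg=0 predicts the target word from its surrounding context words.
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

# Each trained model maps a word to a dense vector.
print(skipgram.wv["cat"].shape)              # (50,)
print(cbow.wv.most_similar("cat", topn=3))   # nearest neighbors in vector space
```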

6
Q

What is a limitation of Word2Vec?

A

Word2Vec generates a single representation per word, which doesn’t account for polysemy (words with multiple meanings).

7
Q

What is Cosine Similarity in the context of word embeddings?

A

Cosine similarity measures the similarity between two word vectors as the normalized dot product of those vectors.
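
As a worked sketch, the normalized dot product is a · b / (||a|| ||b||); a small NumPy version with made-up vectors:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Normalized dot product: a·b / (||a|| * ||b||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up embedding vectors for illustration.
v_king = np.array([0.8, 0.1, 0.3])
v_queen = np.array([0.7, 0.2, 0.35])
v_car = np.array([0.1, 0.9, 0.0])

print(cosine_similarity(v_king, v_queen))  # close to 1: similar directions
print(cosine_similarity(v_king, v_car))    # lower: dissimilar directions
```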

8
Q

What are Contextualized Embeddings?

A

Contextualized embeddings, like those from BERT, create word representations that vary depending on the word’s context within a sentence.
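
A minimal sketch with the Hugging Face transformers library (the checkpoint and the simple token-lookup are illustrative assumptions, not from the lecture), showing that the same word gets a different vector in each sentence:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence: str, word: str) -> torch.Tensor:
    # Encode the sentence and return the hidden state of `word`'s token.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

bank_river = embed_word("he sat on the bank of the river", "bank")
bank_money = embed_word("she deposited money at the bank", "bank")

# The two "bank" vectors differ because each reflects its sentence context.
print(torch.cosine_similarity(bank_river, bank_money, dim=0))
```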

9
Q

Name two examples of Contextualized Embedding Models.

A

ELMo and BERT are examples of models that create context-dependent embeddings.

10
Q

What is Sentence-BERT used for?

A

Sentence-BERT is used to create sentence embeddings, allowing for efficient comparison of sentence meanings.
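
A minimal sketch using the sentence-transformers library (the model name is just a commonly used checkpoint, not necessarily the one from the lecture):

```python
from sentence_transformers import SentenceTransformer, util

# Load a pretrained Sentence-BERT-style model (illustrative checkpoint).
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "A man is playing a guitar.",
    "Someone is strumming an instrument.",
    "The stock market fell sharply today.",
]

# Each sentence becomes a single fixed-size vector.
embeddings = model.encode(sentences)

# Cosine similarity between sentence embeddings reflects similarity in meaning.
print(util.cos_sim(embeddings[0], embeddings[1]))  # high: paraphrases
print(util.cos_sim(embeddings[0], embeddings[2]))  # low: unrelated topics
```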

11
Q

Define TF-IDF and its purpose.

A

TF-IDF (Term Frequency-Inverse Document Frequency) is a method to weight terms in a document based on their frequency and importance, improving information retrieval.

12
Q

How does TF-IDF work?

A

TF-IDF assigns higher weights to words that are frequent in a document but rare in the entire corpus, reducing the impact of common words.
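
A small worked sketch of one common TF-IDF variant (raw term frequency times log inverse document frequency; real toolkits such as scikit-learn use smoothed formulas), with a made-up three-document corpus:

```python
import math

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "stocks fell on the news".split(),
]
N = len(docs)

def tf_idf(term: str, doc: list[str]) -> float:
    tf = doc.count(term)                   # term frequency in this document
    df = sum(term in d for d in docs)      # number of documents containing the term
    idf = math.log(N / df) if df else 0.0  # rare terms get a higher weight
    return tf * idf

# "the" appears in every document, so its weight collapses to zero;
# "cat" is frequent in doc 0 but absent from doc 2, so it scores higher there.
print(tf_idf("the", docs[0]))  # 0.0
print(tf_idf("cat", docs[0]))  # > 0
```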

13
Q

What is Intrinsic Evaluation in evaluating embeddings?

A

Intrinsic evaluation assesses embeddings by comparing algorithm-generated word similarity scores to human-annotated scores.
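
A minimal sketch of the idea (the word pairs and scores are made up; real benchmarks such as WordSim-353 supply the human judgments), comparing the two score lists with Spearman rank correlation:

```python
from scipy.stats import spearmanr

# Human similarity judgments for word pairs (made-up scores, 0-10 scale).
human_scores = {("car", "automobile"): 9.5, ("cat", "dog"): 7.0, ("cat", "stock"): 1.0}

# Similarity scores produced by the embedding model for the same pairs (made up).
model_scores = {("car", "automobile"): 0.92, ("cat", "dog"): 0.70, ("cat", "stock"): 0.15}

pairs = list(human_scores)
rho, _ = spearmanr([human_scores[p] for p in pairs],
                   [model_scores[p] for p in pairs])

# A high rank correlation means the embeddings order word pairs the way humans do.
print(rho)
```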

14
Q

What is Extrinsic Evaluation in evaluating embeddings?

A

Extrinsic evaluation tests embeddings in real NLP tasks (e.g., information retrieval) to measure their practical effectiveness.

15
Q

What are Bilingual Embeddings?

A

Bilingual embeddings align words from two languages in the same vector space, enabling cross-lingual tasks.

16
Q

What is the goal of Cross-Lingual Embedding?

A

Cross-lingual embedding aims to position words with similar meanings from different languages close together in vector space.

17
Q

Describe the Direct Transfer approach in bilingual embeddings.

A

Direct Transfer trains a model in a resource-rich language (e.g., English) and applies it directly to a low-resource language by mapping both languages' embeddings into a shared space.

18
Q

What is the Naive Approach to bilingual embeddings?

A

The naive approach uses a bilingual dictionary to align words between languages, assigning similar embeddings to translation pairs.

19
Q

What is a Transformation Matrix in bilingual embeddings?

A

A transformation matrix aligns monolingual embeddings by learning a linear mapping between two languages based on word pairs.
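
A minimal sketch of learning such a mapping with ordinary least squares (the matrices are randomly made up; in practice the map is often constrained to be orthogonal, as in the Procrustes solution):

```python
import numpy as np

# Made-up monolingual embeddings for a seed dictionary of translation pairs:
# row i of X is a source-language word, row i of Y its translation.
X = np.random.rand(100, 50)   # e.g., 100 English words, 50-dim embeddings
Y = np.random.rand(100, 50)   # their translations in the target language

# Learn a linear map W that minimizes ||X W - Y||^2 over the dictionary pairs.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Map a new source-language vector into the target-language space,
# then find its nearest target-language neighbors (e.g., by cosine similarity).
source_vec = np.random.rand(50)
mapped_vec = source_vec @ W
print(mapped_vec.shape)  # (50,)
```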

20
Q

What are Multilingual Embeddings?

A

Multilingual embeddings map more than two languages into a shared vector space, supporting tasks across multiple languages.