Natural Language Processing Flashcards

1
Q

What is Natural Language Processing (NLP)?

A

NLP is the area of AI focused on enabling computers to understand, interpret, and generate human language.

2
Q

What are the two main areas of NLP?

A

Natural Language Understanding (NLU) and Natural Language Generation (NLG).

3
Q

What is the difference between NLU and NLG?

A

NLU focuses on interpreting and understanding human input, while NLG involves generating human-like language from data.

4
Q

Why is natural language difficult for machines to process?

A

Due to ambiguity, variability, context-dependence, and the complexity of human language.

5
Q

What is ambiguity in language?

A

When a sentence or phrase has multiple possible interpretations.

6
Q

What is an example of syntactic ambiguity?

A

“I saw the man with the telescope” – it’s unclear who has the telescope.

7
Q

What are the basic steps of an NLP pipeline?

A

Tokenization, POS tagging, parsing, named entity recognition, semantic analysis, etc.

8
Q

What is tokenization?

A

Splitting text into individual units such as words or sentences.
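A minimal word-tokenizer sketch in Python using only the standard library (the regex is illustrative; real tokenizers such as those in NLTK or spaCy handle punctuation, contractions, and Unicode far more carefully):

```python
import re

def tokenize(text):
    # Word tokenization: pull out runs of letters, digits, and apostrophes.
    # Deliberately simplified; production tokenizers cover many more edge cases.
    return re.findall(r"[A-Za-z0-9']+", text)

tokenize("NLP lets computers process human language.")
# → ['NLP', 'lets', 'computers', 'process', 'human', 'language']
```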

9
Q

What is POS tagging?

A

Part-of-speech tagging assigns word categories like noun, verb, etc., to each token.
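As an illustration only, a toy lookup-based tagger (the lexicon and the default tag are invented for this example; real POS taggers are statistical or neural):

```python
# Tiny hand-built lexicon mapping words to coarse POS tags (illustrative only).
TAGS = {"the": "DET", "cat": "NOUN", "sat": "VERB", "on": "ADP", "mat": "NOUN"}

def pos_tag(tokens):
    # Look each token up in the lexicon; default unknown words to NOUN,
    # a common naive baseline.
    return [(tok, TAGS.get(tok.lower(), "NOUN")) for tok in tokens]

pos_tag(["The", "cat", "sat", "on", "the", "mat"])
# → [('The', 'DET'), ('cat', 'NOUN'), ('sat', 'VERB'), ('on', 'ADP'), ('the', 'DET'), ('mat', 'NOUN')]
```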

10
Q

What is a language model?

A

A model that assigns probabilities to sequences of words.

11
Q

What is the purpose of a language model in NLP?

A

To predict the next word in a sentence or evaluate the likelihood of a sentence.

12
Q

What are common types of language models?

A

N-gram models, neural language models, transformer-based models.

13
Q

What is an N-gram in NLP?

A

A contiguous sequence of N words (or tokens) used to model language.

14
Q

Give examples of bigrams and trigrams.

A

Bigram: “I am”; trigram: “I am happy”.
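N-grams like the two examples above can be generated mechanically; a short Python sketch:

```python
def ngrams(tokens, n):
    # Slide a window of width n across the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "I am happy today".split()
ngrams(words, 2)  # → [('I', 'am'), ('am', 'happy'), ('happy', 'today')]
ngrams(words, 3)  # → [('I', 'am', 'happy'), ('am', 'happy', 'today')]
```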

15
Q

How is the probability of a sentence estimated in an N-gram model?

A

By applying the chain rule with a Markov assumption: the sentence probability is the product of each word's conditional probability given the preceding N−1 words.
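A bigram-model sketch in Python (the toy corpus is invented, and start/end-of-sentence markers are omitted for brevity):

```python
from collections import Counter

corpus = "i am happy i am sad i am happy".split()
unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def bigram_prob(sentence):
    # P(w1..wn) ≈ product of P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1}).
    # Any unseen bigram drives the product to zero, which motivates smoothing.
    words = sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigram_counts[(prev, cur)] / unigram_counts[prev]
    return p

bigram_prob("i am happy")  # → (3/3) * (2/3) ≈ 0.667
```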

16
Q

What are limitations of N-gram models?

A

They have limited context and suffer from data sparsity.

17
Q

What is smoothing in N-gram models?

A

A technique to handle unseen N-grams by adjusting probabilities.

18
Q

Name a common smoothing technique.

A

Add-one (Laplace) smoothing.
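Add-one smoothing for a bigram probability fits in a few lines (the counts in the example are invented):

```python
def laplace_prob(bigram_count, prev_count, vocab_size):
    # Add 1 to every bigram count and the vocabulary size V to the denominator,
    # so an unseen bigram gets a small nonzero probability instead of 0.
    return (bigram_count + 1) / (prev_count + vocab_size)

laplace_prob(0, 3, 4)  # unseen bigram: (0 + 1) / (3 + 4) = 1/7, not 0
```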

19
Q

How do neural language models improve over N-gram models?

A

They learn word embeddings and can model longer dependencies.

20
Q

What are word embeddings?

A

Vector representations of words capturing semantic similarity.
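Semantic similarity between embeddings is typically measured with cosine similarity; a sketch with invented 3-dimensional toy vectors (real embeddings have hundreds of learned dimensions):

```python
import math

def cosine(u, v):
    # Cosine similarity: dot(u, v) / (|u| * |v|); close to 1 for vectors
    # pointing in similar directions.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.12]
apple = [0.1, 0.2, 0.95]
cosine(king, queen) > cosine(king, apple)  # → True: "king" is closer to "queen"
```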

21
Q

What is Word2Vec?

A

A model that learns word embeddings by predicting word contexts (or vice versa).

22
Q

What are the two main architectures of Word2Vec?

A

CBOW (Continuous Bag of Words) and Skip-Gram.

23
Q

What does CBOW do?

A

Predicts a word from its surrounding context.

24
Q

What does Skip-Gram do?

A

Predicts context words from a target word.

25
Q

What metric is used to evaluate language models?

A

Perplexity.

26
Q

What does a lower perplexity indicate?

A

Better language model performance (more confident predictions).
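A sketch of how perplexity follows from the per-word probabilities a model assigns (the probabilities here are invented):

```python
import math

def perplexity(word_probs):
    # Perplexity = exp of the negative average log-probability; it can be read
    # as the effective number of choices the model hesitates between per word.
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

perplexity([0.5, 0.5, 0.5])  # → 2.0 (as uncertain as a fair coin at each step)
perplexity([0.9, 0.9, 0.9])  # lower: a better model of this text
```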

27
Q

Name some applications of NLP.

A

Machine translation, sentiment analysis, chatbots, information retrieval, etc.