NLP Flashcards

1
Q

What is TRUE for Natural Language Processing?

A

improves human-computer communication = TRUE
improves human-human communication = TRUE
improves computer-computer communication = FALSE
distills knowledge from texts = TRUE

2
Q

ELIZA

A
  • was meant to be a parody of a Rogerian psychotherapist
  • programmed by J. Weizenbaum (1966)
  • works by very simple pattern matching on the user's input
  • has no understanding of the conversation
  • still one of the best-known AI chatbots
  • Weizenbaum’s intention WAS NOT to demonstrate AI
3
Q

How does ELIZA work?

A
  • Scans input sentences for keywords
  • Keywords have rank/precedence numbers
  • Commas and periods act as delimiters
  • Analyzes the input according to transformation rules by decomposing sentences
  • keyword + transformation rules = script
  • Response is generated by reassembly rules associated with the decomposition rules (e.g., replacing I with YOU); see the sketch below
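
A minimal Python sketch of the keyword/decomposition/reassembly idea described above. The rule set, wording, and stock answers are invented for illustration; the real DOCTOR script was far larger.

import re

# Tiny illustrative ELIZA-style script: keyword -> (decomposition pattern, reassembly template).
SCRIPT = [
    ("mother", r".*my mother (.*)", "Tell me more about your mother, who {0}."),
    ("i am",   r".*i am (.*)",      "Why do you say you are {0}?"),
    ("you",    r".*you (.*)",       "Why do you think I {0}?"),
]
STOCK_ANSWERS = ["Please go on.", "I see.", "Very interesting."]

def respond(user_input: str) -> str:
    text = user_input.lower().strip(".!?")
    for keyword, pattern, template in SCRIPT:             # keywords tried in rank order
        if keyword in text:
            match = re.match(pattern, text)               # decomposition rule
            if match:
                # Reassembly rule: plug the decomposed fragment into the template.
                # (A full script would also swap pronouns, e.g. "I" -> "you".)
                return template.format(match.group(1))
    return STOCK_ANSWERS[len(text) % len(STOCK_ANSWERS)]  # fall back to a stock answer

print(respond("I am sad about my job"))  # -> "Why do you say you are sad about my job?"
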
4
Q

If ELIZA gets stuck, what does it do?

A
  • returns to keywords from the prior conversation
  • has a few stock answers
  • is sensitive to specific subjects (like family)
5
Q

Check the three properties that are true about ELIZA!

A
  • can only respond with a question about the current input = FALSE
  • scans input sentence for keywords = TRUE
  • is not able to return to previous content = FALSE
  • uses stock answers sometimes = TRUE
  • uses artificial neurons, i.e., a neural network = FALSE
  • can be sensitive to specific subjects (e.g. subject family) = TRUE
6
Q

PARRY

A

counterpart to ELIZA
imitates a paranoid schizophrenic
takes advantage of its persona by giving absurd responses
passed a restricted Turing test in the early 1970s

7
Q

STUDENT

A
  • was able to solve simple high-school math problems
  • the problem had to be typed in natural language
  • viewed every sentence as an equation
  • used trigger words to identify the task
8
Q

SHRDLU

A
  • natural language interface to the blocks world
  • could perform tasks, give names to objects, memorize operations, and answer questions about the state of the world
9
Q

Match the chatbots:

A

SHRDLU = NL interface to the blocks world
PARRY = passed a restricted Turing test
ELIZA = responses are based on reassembly rules
STUDENT = could solve high school math problems

10
Q

Tokenization is a big problem in NLP.

A

TRUE

11
Q

Word Segmentation/Tokenization

A
  • dividing the input text into small semantic entities
  • issues include, e.g., “New York”: one token or two tokens (“New” + “York”)?
  • numbers have different formats
  • abbreviation rules also differ (US vs. U.S.); see the tokenizer sketch below
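
A small sketch of how tokenization choices change the result. The multi-word and abbreviation lists are assumptions for illustration, not from any standard tokenizer.

import re

TEXT = "I visited New York in the U.S. last fall."

# Naive tokenizer: splits "U.S." apart and treats "New" / "York" as separate tokens.
naive_tokens = re.findall(r"\w+", TEXT)

MULTIWORD = ["New York"]      # illustrative multi-word entities
ABBREVIATIONS = ["U.S."]      # illustrative abbreviations to keep intact

def tokenize(text):
    for phrase in MULTIWORD:
        text = text.replace(phrase, phrase.replace(" ", "_"))   # "New York" -> "New_York"
    pattern = "|".join(re.escape(a) for a in ABBREVIATIONS) + r"|\w+"
    return re.findall(pattern, text)

print(naive_tokens)    # ['I', 'visited', 'New', 'York', 'in', 'the', 'U', 'S', 'last', 'fall']
print(tokenize(TEXT))  # ['I', 'visited', 'New_York', 'in', 'the', 'U.S.', 'last', 'fall']
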
12
Q

Part-Of-Speech-Tagging (POS)

A
  • Each word in a sentence can be assigned to a category
  • noun, verb, adjective, etc.
  • POS tagging uses both the definitions of the words (thesaurus) and their context (grammar rules) to decide the category
  • can be formulated as a sequence-learning task and solved, e.g., with Hidden Markov Models (HMMs) or Conditional Random Fields (CRFs); see the tagging example below
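
A quick illustration of the input/output shape of a POS tagger using NLTK's built-in tagger. This assumes the nltk package is installed and its data can be downloaded; it uses NLTK's default averaged-perceptron tagger rather than an HMM or CRF, but any sequence tagger produces the same kind of output.

import nltk

# One-time downloads of tokenizer and tagger models (assumes network access).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "The quick brown fox jumps over the lazy dog"
tokens = nltk.word_tokenize(sentence)
tags = nltk.pos_tag(tokens)   # list of (word, POS category) pairs
print(tags)
# e.g. [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'),
#       ('jumps', 'VBZ'), ('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN')]
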
13
Q

What is NOT a fundamental NLP task?

A

lexical analysis = TRUE
syntactic analysis
semantic analysis
word segmentation

14
Q

Modern NLP systems use knowledge-based MT systems instead of deep learning.

A

False

15
Q

The key idea behind Word2Vec is that any word, represented as a vector, will be mapped close to similar (related) words.

A

True

16
Q

Which of the following is an advantage of using a system like Word2Vec?

A

capable of capturing syntactic and semantic relationships between different words = FALSE
can have out-of-vocabulary words as well = TRUE
effort for humans to tag data is lower, because it is an unsupervised technique = TRUE
vector size is not directly proportional to vocabulary size = TRUE

17
Q

Machine Translation

A
  • has been studied early on
  • very hard because of ambiguity
  • Neural Machine Translation allows end-to-end learning
  • input is the raw input text (no grammatical analysis)
  • Encoder = a neural network that maps the input into an intermediate representation (embedding)
  • Decoder = another neural network that maps the embedding into text in a different language
  • Results are much better than with previous systems, but still not perfect (see the encoder-decoder sketch below)
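
A skeletal encoder-decoder in PyTorch to make the Encoder/Decoder split above concrete. The vocabulary sizes and dimensions are arbitrary assumptions; a real NMT system would add attention, batching, and a training loop.

import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 1000, 1200, 64, 128   # illustrative sizes

class Encoder(nn.Module):
    """Maps raw source token ids to an intermediate representation (the embedding)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, src_ids):
        _, hidden = self.rnn(self.embed(src_ids))
        return hidden                      # summary vector of the source sentence

class Decoder(nn.Module):
    """Maps the intermediate representation to target-language token scores."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)
    def forward(self, tgt_ids, hidden):
        output, _ = self.rnn(self.embed(tgt_ids), hidden)
        return self.out(output)            # per-position scores over the target vocabulary

src = torch.randint(0, SRC_VOCAB, (1, 7))  # one source sentence of 7 token ids
tgt = torch.randint(0, TGT_VOCAB, (1, 5))  # target prefix fed to the decoder during training
logits = Decoder()(tgt, Encoder()(src))
print(logits.shape)                        # torch.Size([1, 5, 1200])
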
18
Q

Transformer Networks

A

BERT = bidirectional Transformer encoder
GPT = autoregressive Transformer decoder
Both rely on self-attention, not convolution or recurrence.
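
A quick way to see the two model families side by side, using the Hugging Face transformers pipeline API (assuming the transformers package is installed and the pretrained bert-base-uncased and gpt2 checkpoints can be downloaded):

from transformers import pipeline

# BERT: bidirectional encoder, trained to fill in masked tokens.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Paris is the [MASK] of France.")[0]["token_str"])

# GPT: autoregressive decoder, trained to continue text left to right.
generate = pipeline("text-generation", model="gpt2")
print(generate("Natural language processing is", max_new_tokens=10)[0]["generated_text"])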

19
Q

NLP

A

= maps the query to a structured representation of the content

20
Q

3 out of 4 points are reasons why Natural Language Processing (NLP) is hard. Which point does not match the others?

A

The grammar is very complex for NLP = TRUE
Natural languages are semantically and in a discourse very ambiguous.
There are often hidden meanings that are not obvious from the message itself. (jokes, puns, sarcasm, …)
Natural languages are lexically and syntactically highly ambiguous.

21
Q

Neural Machine Translation allows end-to-end learning. In which order?

A
  1. Input of the raw input text (no grammatical analysis)
  2. Encoder: a neural network maps the input into an intermediate representation (embedding)
  3. Decoder: another neural network maps the embedding into text in a different language
  4. Results typically better than previous systems (but still not perfect)
22
Q

The basic idea of ”Information Retrieval” is: a document is regarded as a vector in an n-dimensional space and is a linear combination of the base vectors. Linear algebra can be used for various computations.
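
A minimal numerical sketch of this vector-space idea. The documents and query are invented toy data; real IR systems would use tf-idf weighting and an inverted index.

import numpy as np
from collections import Counter

docs = ["cats chase mice", "dogs chase cats", "stock markets fall"]
query = "cats and mice"

vocab = sorted({w for d in docs + [query] for w in d.split()})   # base vectors = vocabulary terms

def to_vector(text):
    counts = Counter(text.split())
    return np.array([counts[w] for w in vocab], dtype=float)     # linear combination of base vectors

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

q = to_vector(query)
for d in docs:
    print(f"{cosine(to_vector(d), q):.2f}  {d}")   # highest score = most relevant document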

23
Q

The definition (after Grishman 1997, Eikvil 1999) of ’Information Extraction’ is: ”The identification and extraction of instances of a particular class of events or relationships in a natural language text and their transformation into a structured representation (e.g. a database).”
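
A toy illustration of the definition above: extracting instances of one relationship ("person works for organization") from free text into a structured record. The pattern and sentences are invented; real IE systems use far richer patterns or learned models.

import re

TEXT = ("Alice Smith works for Acme Corp. "
        "Bob Jones works for Globex Inc. "
        "The weather was nice yesterday.")

# One hand-written extraction pattern for the works_for relationship.
PATTERN = re.compile(r"([A-Z]\w+ [A-Z]\w+) works for ([A-Z]\w+ \w+)")

# Transformation into a structured representation (here a list of dicts; could be DB rows).
records = [{"person": p, "organization": o, "relation": "works_for"}
           for p, o in PATTERN.findall(TEXT)]
print(records)
# [{'person': 'Alice Smith', 'organization': 'Acme Corp', 'relation': 'works_for'},
#  {'person': 'Bob Jones', 'organization': 'Globex Inc', 'relation': 'works_for'}]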

24
Q

Natural Language Processing (NLP) distills knowledge from texts in different ways. There are ”information retrieval” (IR) and ”information extraction” (IE):
a) IR retrieves relevant documents from collections (e.g., search engines).

b) IE retrieves relevant information from documents (e.g., comparison shoppers).

25
Q

The ”Bag of Words Model” assumes that the document has been generated by drawing once a specified number of words (without replacing them) out of a bag of words.

26
Q

Which sentences about ”Bag of Words Model” are correct?

A

Each word in the bag can appear once. Each word is therefore drawn with the same probability. = FALSE
It’s like drawing letters out of a Scrabble bag (without replacement). = FALSE
The ”Bag of Words Model” assumes that the document has been generated by drawing once a specified number of words out of a bag of words. = FALSE
It’s like drawing letters out of a Scrabble bag, but with replacement. = TRUE
Words in the bag may occur multiple times, some more frequently than others. So each word is drawn with a different probability. = TRUE
The ”Bag of Words Model” assumes that the document has been generated by repeatedly drawing one word out of a bag of words. = TRUE (see the sampling sketch below)
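
A tiny simulation of the "drawing with replacement" view. The bag contents and document length are invented for illustration.

import random
from collections import Counter

# A bag of words for one document class: words occur with different frequencies,
# so each word is drawn with a different probability (illustrative counts).
bag = Counter({"the": 10, "goal": 3, "match": 3, "team": 2, "won": 2})

words, weights = zip(*bag.items())
# Generate a document by repeatedly drawing one word WITH replacement.
document = random.choices(words, weights=weights, k=15)
print(" ".join(document))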

27
Q

It’s about the intuition of ”Information Retrieval” or ”The Vector-Space Model”.
Documents that are ’close together’ in the vector space talk about the same things.

28
Q

Semantic Analysis traditionally has two levels (representing the meaning of individual words, and word sense disambiguation).
Nowadays Semantic Analysis is often done via deep learning (semantic embeddings). There are semantic embeddings of words (e.g., Word2Vec) and of sentences (e.g., BERT).

29
Q

The key idea of ”Word2Vec” (a semantic embedding) is to find a distributed word representation, i.e., each word is represented as a lower-dimensional, non-sparse vector. This allows, for example, computing cosine similarities between words. The general approach is to train a (deep) neural network in a supervised way, using the context of a word as additional input.
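
A minimal training sketch with gensim, assuming a recent gensim (4.x, where the size parameter is called vector_size). The toy corpus is far too small to learn anything meaningful and is only meant to show the API shape.

from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (real training needs millions of tokens).
sentences = [
    ["the", "king", "rules", "the", "country"],
    ["the", "queen", "rules", "the", "country"],
    ["the", "dog", "chases", "the", "cat"],
]

# Each word becomes a dense 50-dimensional vector; the network is trained to
# predict a word from its context window (skip-gram when sg=1).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

vec = model.wv["king"]                       # the lower-dimensional, non-sparse vector
print(vec.shape)                             # (50,)
print(model.wv.similarity("king", "queen"))  # cosine similarity between two words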

30
Q

Is the following statement true? Weizenbaum’s NLP model ”ELIZA” already had a deep understanding of the conversations it held.

A

False. ELIZA had no understanding of the conversation; it relied on simple pattern matching.

31
Q

Match the following descriptions to their associated term in NLP.

A
  1. Lexical = same word, different meanings
  2. Discourse = meaning of sentence might depend on previous sentence
  3. Syntactic = same sentence, different interpretations
  4. Semantic = sentence interpretation may depend on context
32
Q

Word Segmentation

A

Tokenization

33
Q

Syntactic Analysis

A
  • Parsing
  • grammatical analysis of a sentence to determine which word belongs to which part of the sentence
    • 1) S → NP VP
    • 2) NP → Det N (+ PP)
    • 3) VP → V NP or VP PP
    • 4) PP → P NP
  • Marx Brothers joke: “I shot an elephant in my pajamas. How he got in my pajamas, I don’t know.” (PP-attachment ambiguity; see the parse sketch below)
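
A sketch of these grammar rules in NLTK, using the Marx Brothers sentence to show the two parses behind the joke. This assumes the nltk package is installed; the grammar is the toy version above plus a small hand-written lexicon.

import nltk

# Toy grammar following the rules above (PP optional inside NP, plus a tiny lexicon).
grammar = nltk.CFG.fromstring("""
S  -> NP VP
NP -> Det N | Det N PP | 'I'
VP -> V NP | VP PP
PP -> P NP
Det -> 'an' | 'my'
N  -> 'elephant' | 'pajamas'
V  -> 'shot'
P  -> 'in'
""")

parser = nltk.ChartParser(grammar)
sentence = "I shot an elephant in my pajamas".split()
for tree in parser.parse(sentence):   # two parses: who is wearing the pajamas?
    print(tree)
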
34
Q

Semantic Analysis

A

  • Semantic embeddings
  • Finding a representation of an utterance (word, phrase, sentence, document) in Euclidean space
  • Distance in this space captures similarity in meaning
  • Utterances that are close to each other have similar meanings
  • Traditional approach: Bag of Words, where each word has one dimension
  • Nowadays semantic embeddings are done via deep learning
    • of words (e.g., Word2Vec)
    • of sentences (e.g., BERT); see the sentence-embedding sketch below
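
A short sketch of sentence embeddings, assuming the sentence-transformers package and the pretrained all-MiniLM-L6-v2 model can be downloaded; any BERT-style sentence encoder would work the same way.

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # small pretrained BERT-style encoder

sentences = [
    "A man is eating pizza.",
    "Someone is having a meal.",
    "The stock market crashed today.",
]
embeddings = model.encode(sentences)              # one vector per sentence

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings[0], embeddings[1]))  # semantically close -> high similarity
print(cosine(embeddings[0], embeddings[2]))  # unrelated -> lower similarity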

35
Q

WHY is NLP so hard?

A

NLP is highly ambiguous

Lexical:
The same word may have different meanings, e.g., Russian “zamok”, which can mean either “castle” or “lock” depending on stress.

Syntactic:
The same sentence may have different interpretations. See: “The chicken is ready to eat.”

Semantic:
The interpretation of the sentence may require a deeper understanding.

Discourse:
The meaning of a sentence depends on the prior sentences in the text.
1st sentence:
“John told Paul he made a mistake.”
2nd sentence:
“He apologized immediately.”

Hidden meanings: sarcasm, jokes, etc.

36
Q

Match the words to their definitions in terms of why NLP is difficult:

A

Syntactic = the same sentence may have different interpretations
Discourse = the meaning of a sentence also depends on the previous sentences in the conversation/text
Semantic = the interpretation of a sentence may depend on its context
Lexical = the same word may have different meanings

37
Q

What is true for the ”Bag of Words” model?

A

-Different classes have different bags of words = TRUE
-Normally, bags do not contain highly context-dependent words, because they mostly don’t appear that often = TRUE
-Assumes the document is created only with words from one subset of the vocabulary = FALSE
-Words in the bags can occur multiple times, depending on their usual frequency of use in this context = TRUE

38
Q

The simple Naïve Bayes classifier for a text uses the probability p(ti | c) with which the word ti = wi occurs in the document class c.
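
A compact sketch of estimating p(ti | c) from toy training texts and using it to classify a new document. Add-one (Laplace) smoothing is included so unseen words do not zero out the product; all data is invented for illustration.

import math
from collections import Counter

# Toy training data: (document, class)
train = [
    ("the team won the match", "sports"),
    ("great goal in the final", "sports"),
    ("stocks fell on the market", "finance"),
    ("the bank raised interest rates", "finance"),
]

counts = {c: Counter() for _, c in train}
for doc, c in train:
    counts[c].update(doc.split())

vocab = {w for cnt in counts.values() for w in cnt}

def log_p_word(word, c):
    # p(t_i | c) with add-one (Laplace) smoothing
    return math.log((counts[c][word] + 1) / (sum(counts[c].values()) + len(vocab)))

def classify(doc):
    # Naive Bayes: pick the class maximizing the sum of log p(t_i | c) (uniform priors here)
    return max(counts, key=lambda c: sum(log_p_word(w, c) for w in doc.split()))

print(classify("the team scored a goal"))   # -> "sports"
print(classify("interest on the market"))   # -> "finance"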

39
Q

What is NOT true about PARRY, the chatbot?

A

passed a restricted Turing test
was able to give absurd responses
was the predecessor of ELIZA = CORRECT
attempted to behave like a paranoid schizophrenic