Week 1 - Intro to NLU and NLU tasks Flashcards

1
Q

What is NLP

A

Natural language processing
Converts unstructured data into a structured (possibly machine-readable) form

2
Q

What is NLU

A

Natural language understanding
A subfield (specialisation) of NLP
Determines the intended meaning of natural language expressions; focuses on the comprehension of human language by machines
makes agents more intelligent

3
Q

What is NLG

A

Natural language generation
A subfield (specialisation) of NLP
produces natural language expressions

4
Q

What are the 4 main NLU task categories

A
  1. Sequence classification
  2. Pairwise sequence classification
  3. Sequence labelling
  4. Span-based operations
5
Q

What are single-problem applications of NLU tasks

A

Focus on addressing a specific task or problem within natural language understanding

6
Q

What are multi-problem applications of NLU tasks

A

Involves addressing multiple NLU tasks within a single application or system
Relate complex applications to underlying NLU tasks, by decomposing them into subtasks

7
Q

What is sequence classification and what is its output

A

Takes a sequence of tokens (eg a sentence, tweet or document)
Output: a single classification category for the whole sequence

8
Q

What is pairwise sequence classification and what is its output

A

Classifies the relationship between 2 input sequences
Output: a relationship label, eg neutral, contradicts, entails

9
Q

What is sequence labelling and what is its output

A

Classification at the level of the individual tokens in the sequence
output: eg noun/verb
Can also label a subsequence of tokens if the context is relevant
eg john smith = person

10
Q

What is a Unit

A

A subsequence of tokens

11
Q

What is the BIO scheme

A

B: beginning of a subsequence of interest

I: Inside the subsequence of interest
- Will also be used for last token in subsequence

O: outside of a subsequence of interest
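
A minimal sketch of how BIO tags can be grouped back into units. It assumes the common typed variant of the scheme (eg B-PER, I-PER), which this card does not spell out; the function name bio_to_spans and the PER label are hypothetical.

```python
def bio_to_spans(tokens, tags):
    """Group BIO tags into (label, start, end, text) units; end is exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An I- tag continues the open unit only if its label matches.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Any other tag closes the unit that was open, if there is one.
        if label is not None:
            spans.append((label, start, i, " ".join(tokens[start:i])))
            start, label = None, None
        # B- opens a new unit; a stray I- is leniently treated as an opener.
        if tag != "O":
            start, label = i, tag[2:]
    # Close a unit that runs to the end of the sequence.
    if label is not None:
        spans.append((label, start, len(tags), " ".join(tokens[start:])))
    return spans


tokens = ["John", "Smith", "flew", "with", "United", "Airlines"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG"]
print(bio_to_spans(tokens, tags))
```

Note that the last token of each unit ("Smith", "Airlines") carries an I tag, as described above.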

12
Q

What are span-based operations

A

Analysis of a span - not necessarily a full sentence nor full document

Takes a sequence, finds all possible spans

13
Q

What is a span

A

A contiguous sequence of tokens

14
Q

What is the total number of spans in a sequence of T tokens (max span length = T)

A

Total = (T(T+1)) / 2
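
The formula follows from counting spans by length: there are T spans of length 1, T-1 of length 2, and so on down to 1 span of length T, which sums to T(T+1)/2. A small sketch that enumerates every span and checks the count (all_spans is a hypothetical helper name):

```python
def all_spans(tokens):
    """Return every contiguous span as (start, end), with end exclusive."""
    n = len(tokens)
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


tokens = ["her", "two-handed", "backhand"]
spans = all_spans(tokens)
T = len(tokens)

# The enumeration agrees with the closed-form count T(T+1)/2.
assert len(spans) == T * (T + 1) // 2

for i, j in spans:
    print(" ".join(tokens[i:j]))
```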

15
Q

What are the 3 subtasks of span operations

A

Identification
Classification
Relation Classification

16
Q

What is Identification: span-based

A

Identifying spans of interest as binary classification
Input: sequence and question (eg find the keyphrases)
“her best groundstroke is her two-handed backhand”
output: “groundstroke” “two-handed backhand”

17
Q

What is classification: span-based

A

classifying spans according to a set of labels
Eg “United Airlines” : ORG

18
Q

What is an embedded entity

A

Eg ORG “United Airlines Holdings”,
ORG “United Airlines”
Here we have an ORG within an ORG

19
Q

Why can span-based classification be better than sequence labeling

A

It can identify embedded entities, which a single tag per token cannot represent
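
A rough illustration of why one tag per token is not enough here, using the "United Airlines Holdings" example. The helper bio_tags and the typed B-ORG/I-ORG tags are assumptions made for this sketch.

```python
def bio_tags(span, label, n):
    """Build the BIO tag sequence for one (start, end) span; end is exclusive."""
    start, end = span
    tags = []
    for i in range(n):
        if i == start:
            tags.append("B-" + label)
        elif start < i < end:
            tags.append("I-" + label)
        else:
            tags.append("O")
    return tags


tokens = ["United", "Airlines", "Holdings"]

# Span-based view: both entities are stored side by side, keyed by span.
entities = {(0, 3): "ORG", (0, 2): "ORG"}

# Sequence-labelling view: each entity needs its own tag sequence.
outer = bio_tags((0, 3), "ORG", len(tokens))
inner = bio_tags((0, 2), "ORG", len(tokens))

# Positions where the two sequences disagree cannot share a single tag.
conflicts = [i for i, (a, b) in enumerate(zip(outer, inner)) if a != b]
print(outer, inner, conflicts)
```

"Holdings" would need to be both I-ORG (outer entity) and O (inner entity), so a single tag sequence can only keep one of the two entities.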

20
Q

What is relation classification: span based

A

Classifying relations between spans
eg Output
EMPLOYEE-OF(“Jane Vickers”, “United Airlines Holdings”)

21
Q

Underlying NLU task: Sentiment Analysis

A

Identify the overall sentiment: negative, positive or neutral
Sequence classification

22
Q

Underlying NLU task: Emotion Recognition

A

Identify the emotions in the input text (eg: sad, lonely)
Sequence classification

23
Q

Underlying NLU task: Hate Speech Detection

A

Determine whether text contains hate speech
Some also aim to define the type: race/gender/etc
Sequence classification

24
Q

Underlying NLU task: NLI

A

Natural language inference: takes a premise and a hypothesis
Decide whether, given the premise, the hypothesis is true (entails), false (contradicts) or neutral
Pairwise sequence classification

25
Q

Underlying NLU task: Paraphrase Identification

A

Determine whether one is a paraphrase of another (degree of similarity)
eg plagiarism
Pairwise sequence classification

26
Q

Underlying NLU task: NER

A

Named entity recognition: identify subsequences corresponding to entity categories (eg person, organisation)
Sequence labelling or span-based classification

27
Q

Underlying NLU task: Entity Linking

A

Link a subsequence to its standard form in a vocabulary
Pairwise sequence classification or span-based classification (find the span in the document)

eg linking a name in wikipedia article to its own wikipedia page

28
Q

Underlying NLU task: Semantic Role Labelling (SRL)

A

Identify predicate-argument structures: who did what to whom; which bit is linked grammatically (labelled as Pred (verb) or Arg (noun))
Sequence labelling or span-based classification

29
Q

Underlying NLU task: Relation extraction

A

Identify relation type that holds between two spans
EG Bill was born on April 13th in Seattle
Bill, Seattle: BORN-IN relationship

Span based relation classification

30
Q

Underlying NLU task: Coreference Resolution

A

Determine if the spans refer to the same real-world entity or concept
EG If “Tom” and “he” refer to the same real-world man

Span based relation classification

31
Q

Multi-underlying NLU task: Aspect-based sentiment Analysis

A

Identify the target and then aspect category
Eg “Horrible services. The room was dirty and unpleasant.”
Target: Room
Aspects: price, location, comfort, cleanliness → cleanliness
span-based classification(target)
span-based classification(aspect)
span-based relation classification

32
Q

Multi-underlying NLU task: Fact Verification

A

Determine whether information is supported by facts or not
involves 3 tasks:

  1. Claim identification
    determine whether piece of text is worth fact checking
    sequence classification or span based classification
  2. Evidence Retrieval
    find (within a pre-existing support corpus) pieces of text which are relevant to a given claim
    pairwise sequence classification
  3. Automated Verification
    determine if a piece of text contains information that is supported or refuted by provided pieces of evidence
    pairwise sequence classification
33
Q

Multi-underlying NLU task: Argument mining

A

Identify argumentative structures
involves 2 tasks:

  1. Argument component identification
    identify claims and premises
    span-based classification
  2. Argument relation classification
    classify whether a premise supports a claim, i.e., whether the relationship between them is support or oppose
    span-based relation classification
34
Q

Multi-underlying NLU task: Question answering (Extractive)

A

Given two pieces of text: a passage (context) and a question
Identify the span of text in the passage that answers the question

(combination) pairwise, span-based identification
(identifying whether span is of interest in relation to the question)
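
One common way to pick the answer span, sketched under assumptions: a model (not shown) is assumed to give each passage token a score for starting the answer and a score for ending it, and the best span is the pair with the highest combined score. This is an illustration, not necessarily the method used in the course; best_span and all the scores below are made up.

```python
def best_span(start_scores, end_scores, max_len):
    """Return the (start, end) pair, end inclusive, with the highest summed score."""
    best, best_score = None, float("-inf")
    n = len(start_scores)
    for i in range(n):
        # Only consider spans of at most max_len tokens that start at i.
        for j in range(i, min(i + max_len, n)):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


# Question (hypothetical): "Where was Bill born?"
tokens = ["Bill", "was", "born", "on", "April", "13th", "in", "Seattle"]
# Hypothetical per-token scores; a real system would compute these from
# the question and the passage together.
start_scores = [0.1, 0.0, 0.0, 0.0, 0.3, 0.0, 0.2, 2.0]
end_scores = [0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.0, 2.5]

start, end = best_span(start_scores, end_scores, max_len=4)
print(" ".join(tokens[start:end + 1]))
```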

35
Q

Multi-underlying NLU task: Event extraction

A

Given a sequence and list of named entities
To identify events, i.e., the event trigger and event participants
has 2 subtasks:

  1. Event Trigger Detection
    identify the word that denotes the event and its type
    span-based classification or
    sequence labelling
  2. Event Participant Identification (which named entities are actually involved in the event)
    to determine the relationship that holds between a named entity and the event trigger
    span-based relation classification

eg given: “Tom was hit by Harry”, Tom = person, Harry = person
want to identify the event trigger (“hit”), which indicates a conflict event
and who is involved (Harry = attacker, Tom = victim)

36
Q

What is the difference between Pairwise sequence classification and span-based relation identification

A

Pairwise sequence classification: the two inputs are usually whole sequences taken from two different sources

Span-based: the inputs are spans (shorter extracts), usually taken from the same source