Week 6 - Sequence Labelling Flashcards

1
Q

What is the task of sequence labelling defined as

A

the task of:
- assigning a label yi to each token xi in an input token sequence X
- the output sequence Y has the same length as X
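A minimal sketch of this definition (the sentence and tags follow the "Janet will back the bill" example that appears in a later card):

```python
# Sequence labelling: each input token x_i receives exactly one label y_i,
# so the output sequence Y has the same length as the input sequence X.
X = ["Janet", "will", "back", "the", "bill"]
Y = ["NOUN", "AUX", "VERB", "DET", "NOUN"]

assert len(X) == len(Y)  # same length, one tag per token

for x, y in zip(X, Y):
    print(f"{x}/{y}")  # e.g. Janet/NOUN
```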

2
Q

What are the 8 main word classes for POS tagging

A

nouns
pronouns
verbs
adjectives
adverbs

determiners
conjunctions
prepositions

3
Q

What are determiners

A

a, the, an
used to specify nouns

4
Q

What are prepositions

A

in, of, from
denote spatial information

5
Q

What are word classes defined based on

A

(1) Their grammatical relationship with neighbouring words
eg I went for a walk / I will walk to work

(2) morphological properties (eg of suffixes)
dance (VB), danced (VBD), dancing (VBG)
ie VBD - past tense

6
Q

What are POS tags broadly categorised into

A

closed class vs open class

7
Q

what is a closed class

A

members are fixed; unlikely that new words are added

8
Q

what is an open class

A

new words likely to be added/coined over time

9
Q

POS: Task, Input, Output

A

Task: assign a POS tag to each word in a sequence
Input: sequence x1,…,xn of words and a tagset
Output: sequence y1,…,yn of tags, where each tag corresponds to an input token
eg Janet/NOUN will/AUX …bill/NOUN

10
Q

Why is POS tagging difficult

A

words are syntactically ambiguous
eg “back” can be a noun, adjective, verb, or adverb

11
Q

How do we measure POS accuracy

A

proportion of POS tags that match gold standard POS tags
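The accuracy measure above can be sketched as a small helper (the function name is illustrative):

```python
def pos_accuracy(predicted, gold):
    """Proportion of predicted tags that match the gold-standard tags."""
    assert len(predicted) == len(gold)
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# 3 of the 4 tags match the gold standard
acc = pos_accuracy(["NOUN", "AUX", "VERB", "DET"],
                   ["NOUN", "AUX", "NOUN", "DET"])
# acc == 0.75
```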

12
Q

What is Semantic Role Labelling (SRL)

A

identifying predicate-argument structures
Answers the question:
“who did what to whom where and when?”

13
Q

What is predicate (SRL)

A

word(s) expressing the event
ie the what

14
Q

What is argument (SRL)

A

the participants in the events
ie the who, whom, where, when

15
Q

What is the semantic role (SRL)

A

the role that each argument (of a predicate) takes

16
Q

What is the Task of SRL

A
  • automatically find the semantic role of each argument of each predicate (every argument and predicate)
  • in principle: the predicate is pre-identified in the input
  • in practice: most SRL models detect the predicate
17
Q

What is the Proposition Bank (PropBank) scheme (SRL)

A

roles are specific to a verb and are named with numbers
“the waiter spilled the soup”
ARG0: agent (initiator of an action) “the waiter”
ARG1: patient (entity undergoing the effect of an action) “the soup”
ARG2 and so on: depend on the verb/action
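One way to hold the analysis above in code (a plain dict; the structure is illustrative, not a standard PropBank format):

```python
# PropBank-style analysis of "the waiter spilled the soup"
srl = {
    "predicate": "spilled",   # the "what": word expressing the event
    "ARG0": "the waiter",     # agent: initiator of the action
    "ARG1": "the soup",       # patient: entity undergoing the effect
}
```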

18
Q

What is the FrameNet scheme (SRL)

A

Frame: a set of related concepts that together comprise background knowledge about some event

19
Q

What are the main roles in the FrameNet scheme

A

item, attribute, initial_value, final_value, difference
not every sentence will have all of these elements

20
Q

What is Named Entity Recognition (NER)

A

named entity: anything that can be referred to with a proper name (person, organisation, location, geo-political entity)
but can also include expressions like dates, times, prices

21
Q

Why is NER hard

A

names are ambiguous
Washington - person name, organisation, location, and GPE

22
Q

How can NER be framed as sequence labelling

A

Individual tokens are assigned named entity tags
BIO - beginning, inside, outside
IO - inside, outside
BIOES - beginning, inside, outside, end, single
single means the entity consists of only one token
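The BIO scheme can be sketched as a small conversion from entity spans to per-token tags (the helper name, example tokens, and spans are illustrative):

```python
def to_bio(tokens, entities):
    """Convert (start, end, type) entity spans into per-token BIO tags.
    `end` is exclusive; tokens outside any entity get 'O' (outside)."""
    tags = ["O"] * len(tokens)
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"              # beginning of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"              # inside the entity
    return tags

tokens = ["Jane", "Villanueva", "of", "United", "Airlines"]
tags = to_bio(tokens, [(0, 2, "PER"), (3, 5, "ORG")])
# tags == ['B-PER', 'I-PER', 'O', 'B-ORG', 'I-ORG']
```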

23
Q

What are Conditional Random Fields (CRFs)

A

Model that discriminates among all possible tag sequences:
Ŷ = argmax_Y P(Y|X)
It assigns a probability to an entire sequence Y, for every possible sequence Y, given the input sequence X
(selects the highest-probability sequence)

24
Q

What is a global feature (CRFs)

A

Fk
A property of the entire sequences X and Y, computed as a sum of local features fk at each position i in Y

25
Q

What is a local feature (CRFs)

A

fk
makes use of the current output tag yi, the previous output tag yi-1, any part of the input sequence X, and the current position i

26
Q

What is K and wk in CRFs

A

K = number of features
wk = weight of feature k

27
Q

What is Z(X) in CRFs

A

The normalisation factor: ensures the probabilities of all possible tag sequences sum to 1
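The pieces from the last few cards (local features fk summed into global features Fk, weights wk, the normalisation factor Z(X), and the argmax over sequences) can be combined in a brute-force sketch. The tagset, features, and weights here are invented for illustration; real CRFs use dynamic programming rather than enumerating every sequence:

```python
from itertools import product
from math import exp

TAGS = ["NOUN", "VERB"]  # toy tagset

def local_features(y_prev, y, X, i):
    """f_k: may use the current tag, previous tag, any part of X, and i."""
    return [
        1.0 if X[i].endswith("ed") and y == "VERB" else 0.0,  # suffix clue
        1.0 if y_prev == "NOUN" and y == "VERB" else 0.0,     # tag bigram
    ]

W = [2.0, 1.0]  # w_k: one weight per feature (invented values)

def global_score(Y, X):
    """Sum over positions i of w_k * f_k, i.e. the weighted global features F_k."""
    score = 0.0
    for i in range(len(X)):
        y_prev = Y[i - 1] if i > 0 else "<s>"
        score += sum(w * f for w, f in zip(W, local_features(y_prev, Y[i], X, i)))
    return score

def crf_posterior(X):
    """P(Y|X) = exp(score(Y)) / Z(X) for every possible tag sequence Y."""
    seqs = list(product(TAGS, repeat=len(X)))
    Z = sum(exp(global_score(Y, X)) for Y in seqs)  # normalisation factor Z(X)
    return {Y: exp(global_score(Y, X)) / Z for Y in seqs}

probs = crf_posterior(["dogs", "barked"])
best = max(probs, key=probs.get)  # Y^ = argmax_Y P(Y|X)
# best == ("NOUN", "VERB")
```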

28
Q

What is word shape features (CRF)

A

Abstract letter pattern of a given word
all lowercase letters mapped to ‘x’, all uppercase to ‘X’
all digits mapped to ‘d’, punctuation retained as-is

29
Q

What is short word shape features (CRF)

A

like word shape, but runs of consecutive identical character types are collapsed to one
eg token = I.M.F
word shape = X.X.X, short word shape: X.X.X
eg2 token = DC10-30
word shape = XXdd-dd, short word shape: Xd-d
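Both shape features can be sketched directly from the definitions above (function names are illustrative):

```python
def word_shape(token):
    """Lowercase -> 'x', uppercase -> 'X', digit -> 'd'; keep punctuation."""
    out = []
    for ch in token:
        if ch.islower():
            out.append("x")
        elif ch.isupper():
            out.append("X")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)  # punctuation and anything else retained
    return "".join(out)

def short_word_shape(token):
    """Like word_shape, but collapse runs of identical shape characters."""
    shape = word_shape(token)
    out = []
    for ch in shape:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)

# word_shape("DC10-30") == "XXdd-dd"; short_word_shape("DC10-30") == "Xd-d"
# word_shape("I.M.F")   == "X.X.X";   short_word_shape("I.M.F")   == "X.X.X"
```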

30
Q

What is affixes feature (CRFs)

A

prefixes and/or suffixes of size 1 to n
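A sketch of extracting these affix features (the function and feature names are illustrative):

```python
def affixes(token, n):
    """Prefixes and suffixes of size 1 to n (shorter for short tokens)."""
    feats = {}
    for k in range(1, min(n, len(token)) + 1):
        feats[f"prefix{k}"] = token[:k]
        feats[f"suffix{k}"] = token[-k:]
    return feats

feats = affixes("danced", 2)
# feats == {'prefix1': 'd', 'suffix1': 'd', 'prefix2': 'da', 'suffix2': 'ed'}
```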

31
Q

What is gazetteer feature (CRFs)

A

presence of a word in a dictionary of entities (of interest)

32
Q

How are features binarised

A

every feature value (eg the POS tag NNP) is turned into a binary feature
so it can be 0 or 1 depending on the token
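Binarisation can be sketched as turning each (feature, value) pair into its own 0/1 indicator (the feature names below are illustrative):

```python
def binarise(token_features):
    """Turn categorical features like {'pos': 'NNP'} into binary
    indicator features like {'pos=NNP': 1}."""
    return {f"{name}={value}": 1 for name, value in token_features.items()}

feats = binarise({"pos": "NNP", "short_shape": "Xx", "suffix1": "a"})
# feats["pos=NNP"] == 1; indicators not present are implicitly 0
```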

33
Q

How is BERT finetuned for sequence labelling

A

On top of BERT, add a classifier (eg a single feedforward layer)
- takes as input the output vector for each token
- produces a softmax distribution over all tags
- the label with the highest probability is chosen as output
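A toy sketch of such a classifier head, assuming the per-token BERT output vector is already given; the tiny dimensionality and random, untrained weights are purely illustrative (real BERT-base vectors have 768 dimensions):

```python
import random
from math import exp

TAGS = ["B-PER", "I-PER", "O"]
HIDDEN = 4  # illustrative; BERT-base uses 768

random.seed(0)
# One weight row per tag: a single feedforward layer (no bias, untrained).
W = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in TAGS]

def softmax(scores):
    exps = [exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(token_vector):
    """Map one token's output vector to a distribution over all tags,
    and pick the label with the highest probability."""
    scores = [sum(w * v for w, v in zip(row, token_vector)) for row in W]
    probs = softmax(scores)
    best = TAGS[max(range(len(TAGS)), key=lambda k: probs[k])]
    return probs, best

probs, best = classify([0.1, -0.3, 0.8, 0.05])  # stand-in BERT output vector
```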

34
Q

Why is BERT a local approach

A

Does not take into account dependencies between tags
eg B-PER followed by B-PER
(which should be very uncommon)

35
Q

What is ## in BERT

A

represents a subword token
Crashes -> Crash, ##es

36
Q

How do we solve BERT being a local approach

A

Add a CRF layer on top of the classifier
- take the softmax output from the classifier
- pass it on to the CRF layer, which takes a global approach: it takes into account the label of the previous token
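A sketch of the idea with invented emission (softmax) and transition scores; a real CRF layer learns the transition scores and uses Viterbi decoding instead of brute force:

```python
from itertools import product
from math import log

TAGS = ["B-PER", "I-PER", "O"]

# Illustrative transition scores: 0 by default, with penalties for
# tag sequences that should be very uncommon.
TRANS = {(a, b): 0.0 for a in TAGS + ["<s>"] for b in TAGS}
TRANS[("B-PER", "B-PER")] = -5.0   # B-PER directly after B-PER
TRANS[("O", "I-PER")] = -5.0       # I-PER cannot follow O
TRANS[("<s>", "I-PER")] = -5.0     # I-PER cannot start the sequence

def decode(emissions):
    """Pick the tag sequence maximising emission log-probs plus
    transition scores (brute force over all sequences)."""
    best, best_score = None, float("-inf")
    for Y in product(TAGS, repeat=len(emissions)):
        score, prev = 0.0, "<s>"
        for probs, y in zip(emissions, Y):
            score += log(probs[y]) + TRANS[(prev, y)]
            prev = y
        if score > best_score:
            best, best_score = Y, score
    return list(best)

# Locally, the classifier slightly prefers B-PER for both tokens ...
emissions = [{"B-PER": 0.5, "I-PER": 0.4, "O": 0.1},
             {"B-PER": 0.5, "I-PER": 0.4, "O": 0.1}]
# ... but the global transition score vetoes B-PER -> B-PER.
tags = decode(emissions)  # -> ['B-PER', 'I-PER']
```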