Week 5 - Word Sense Disambiguation Flashcards

Question 1

Q

Word Sense

Answer

A

One of the meanings of a word in a linguistic context

Question 2

Q

Word Sense Disambiguation

Answer

A

(WSD) is the NLP task of selecting which sense of a word is used in a given piece of text (e.g. a sentence) from a set of multiple known possibilities

Question 3

Q

WSD applications

Answer

A

Machine translation - lexical choices for words that need different translations for different senses

Information Retrieval - Search choices for queries that are relevant to different topics for different senses

Bioinformatics - Assign a species identifier (e.g. human, mouse) to gene and gene-product entities (e.g. proteins)

Medical Applications - Find the correct meaning of acronyms in clinical text

Question 4

Q

Typical WSD approaches

Answer

A

Knowledge-based
- Use external lexical resources such as dictionaries and thesauri

Supervised machine learning
- Use labelled training examples

Question 5

Q

Lesk Algorithm

Answer

A

Examine the definition overlap in all possible sense combinations among all the words in a given text

Implementation
- Retrieve from the dictionary all sense definitions of the words in the given piece of text
- Calculate the definition overlaps for all possible sense configurations
- Choose the sense configuration that offers the highest overlap

Disadvantage:
Very impractical for long sentences

Disambiguating all words in the sentence requires evaluating m_1 × m_2 × m_3 × … × m_n sense configurations, where m_i is the number of definitions of the i-th word
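
The exhaustive search can be sketched as follows. This is a minimal illustration, not a reference implementation: the toy dictionary, its sense labels and its shortened definitions are made up for the example, and the overlap is simply the number of distinct shared words.

```python
from itertools import product

# Hypothetical toy dictionary: word -> {sense label: definition}.
TOY_DICT = {
    "pine": {
        "tree": "kind of evergreen tree with needle shaped leaves",
        "yearn": "waste away through sorrow or illness",
    },
    "cone": {
        "shape": "solid body which narrows to a point",
        "fruit": "fruit of certain evergreen tree",
    },
}


def overlap(def_a, def_b):
    """Number of distinct words shared by two definitions."""
    return len(set(def_a.split()) & set(def_b.split()))


def lesk_all_words(words, dictionary):
    """Score every sense configuration and return the best one."""
    sense_lists = [list(dictionary[w]) for w in words]
    best_config, best_score, n_configs = None, -1, 0
    # product() enumerates all m_1 x m_2 x ... x m_n configurations.
    for config in product(*sense_lists):
        n_configs += 1
        defs = [dictionary[w][s] for w, s in zip(words, config)]
        # Sum the pairwise definition overlaps within this configuration.
        score = sum(
            overlap(defs[i], defs[j])
            for i in range(len(defs))
            for j in range(i + 1, len(defs))
        )
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, n_configs


print(lesk_all_words(["pine", "cone"], TOY_DICT))  # (('tree', 'fruit'), 3, 4)
```

With two words of two senses each there are only 4 configurations, but the count multiplies with every extra word, which is why this version is impractical for long sentences.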

Question 6

Q

Simplified Lesk Algorithm

Answer

A

A faster version of Lesk for longer sentences

Examines the overlap between each sense definition of a word and its current context

Compare the senses to the context (the given sentence)

Disambiguating all words in the sentence requires m_1 + m_2 + m_3 + … + m_n definition-to-context comparisons, where m_i is the number of definitions of the i-th word
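
A minimal sketch of the simplified version, assuming a made-up sense inventory for "bass" and a small hand-picked stopword list (both are illustrative assumptions, not part of the course material). Each sense is compared once against the sentence, so the cost grows with the sum of the sense counts rather than their product.

```python
# Hypothetical sense inventory and stopword list, for illustration only.
TOY_SENSES = {
    "bass": {
        "fish": "a type of fish found in rivers and seas",
        "music": "the lowest part in music played on a guitar or sung",
    },
}
STOPWORDS = {"a", "the", "in", "of", "on", "or", "and", "he", "she"}


def content_words(text):
    """Lower-case, split on whitespace and drop stopwords."""
    return set(text.lower().split()) - STOPWORDS


def simplified_lesk(target, sentence, senses):
    """Pick the sense whose definition shares most words with the sentence."""
    context = content_words(sentence) - {target}
    best_sense, best_overlap = None, -1
    for sense, definition in senses[target].items():
        n_shared = len(context & content_words(definition))
        if n_shared > best_overlap:  # ties keep the earlier sense
            best_sense, best_overlap = sense, n_shared
    return best_sense, best_overlap


print(simplified_lesk("bass", "he played the bass guitar in a band", TOY_SENSES))
# ('music', 2)
```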

Question 7

Q

Corpus Lesk approach

Answer

A

Enhance performance using labelled data

Enhance the sense definition with labelled data

Add labelled examples to the definitions

Weight each overlapping word
Example: use the IDF (inverse document frequency) of each word that overlaps between the target sentence and the sense definition
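
A sketch of an IDF-weighted overlap, assuming IDF is computed as log(N / document frequency) over a small collection of definitions and labelled examples. The three "documents" below are invented for illustration, and other IDF variants are equally possible.

```python
import math


def idf_table(documents):
    """Map each word to log(N / number of documents containing it)."""
    tokenised = [set(doc.lower().split()) for doc in documents]
    vocab = set().union(*tokenised)
    n_docs = len(tokenised)
    return {
        w: math.log(n_docs / sum(w in doc for doc in tokenised))
        for w in vocab
    }


def weighted_overlap(context, signature, idf):
    """Sum the IDF weights of the words shared by context and signature."""
    shared = set(context.lower().split()) & set(signature.lower().split())
    return sum(idf.get(w, 0.0) for w in shared)


# Hypothetical definitions / labelled examples treated as documents.
docs = ["the fish swam", "the guitar played", "the band played"]
idf = idf_table(docs)

# "the" appears in every document, so its IDF is log(3/3) = 0 and it
# contributes nothing; only "played" (log(3/2)) adds to the score.
score = weighted_overlap("the band played", "the guitar played", idf)
```

The effect is that very common words stop inflating the overlap, while rarer, more sense-specific words count for more.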

Question 8

Q

Supervised machine learning - goal

Answer

A

To predict the output for an input data pattern

Question 9

Q

Training examples

Answer

A

A set of example data patterns is provided, where the ground-truth output is known for each example

Question 10

Q

Predictive mapping

Answer

A

A mapping from an input data pattern to the desired output, built from training examples

Question 11

Q

Annotated training corpus

Answer

A

A collection of training examples

Question 12

Q

Classification

Answer

A

Assign an input data pattern to one of a pre-defined set of classes (categorical output)

Question 13

Q

Converting WSD to classification

Answer

A

An input data pattern: a word in context

Pre-defined set of classes: dictionary senses (called the tag set)

Training corpus: A collection of words tagged in context with their sense

One option is to train one classifier per word to identify that word's sense. N words in the dictionary then require N classifiers

Question 14

Q

Building a WSD classifier

Answer

A

Given an annotated corpus

Find a way to characterise each word pattern (along with its context) with a set of features (feature extraction)

With existing tools:
Choose a classifier (classification algorithm)

Train the classifier using the training examples

Test the trained classifier using new examples (evaluation)

Question 15

Q

Bag of word features (WSD)

Example:
“An electric guitar and bass player stand off to one side not really part of the scene”
With a +/-2 window, what is the set of features for “bass”?

Answer

A

Based on words occurring anywhere within a window around the target word

Consider frequency (occurrence counts)

Answer: {guitar, and, player, stand}
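
The window extraction for this example can be sketched as below, assuming simple lower-casing and whitespace tokenisation (the function name and tokenisation are illustrative choices). Counts are kept so that frequency is available if a word repeats inside the window.

```python
from collections import Counter


def window_features(tokens, target_index, window=2):
    """Count the words up to `window` positions either side of the target."""
    left = tokens[max(0, target_index - window):target_index]
    right = tokens[target_index + 1:target_index + 1 + window]
    return Counter(left + right)


sentence = ("An electric guitar and bass player stand off to one side "
            "not really part of the scene")
tokens = sentence.lower().split()
features = window_features(tokens, tokens.index("bass"))

print(sorted(features))  # ['and', 'guitar', 'player', 'stand']
```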

Question 16

Q

Feature vector

Example:
[fishing, sound, player, fly, rod, pound, stand, runs, guitar, band]

Convert: {and, guitar, player, stand} into a feature vector

Answer

A

[0, 0, 1, 0, 0, 0, 1, 0, 1, 0]
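
A short sketch of the conversion: each position in the vector holds the count of the corresponding vocabulary word in the feature set. The function name is an illustrative choice. Words outside the vocabulary, such as "and", are simply dropped.

```python
VOCABULARY = ["fishing", "sound", "player", "fly", "rod",
              "pound", "stand", "runs", "guitar", "band"]


def to_feature_vector(features, vocabulary):
    """Return one count per vocabulary word, in vocabulary order."""
    return [features.count(word) for word in vocabulary]


vector = to_feature_vector(["and", "guitar", "player", "stand"], VOCABULARY)
print(vector)  # [0, 0, 1, 0, 0, 0, 1, 0, 1, 0]
```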

Question 17

Q

Naive Bayes classifier for WSD

Do Question 4 of week 5 homework sheet

Answer

A

See Week 5 homework solution sheet
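
For revision, a generic naive Bayes sense classifier might look like the sketch below. It does not reproduce the homework question: the training examples are invented, add-one (Laplace) smoothing is an assumed choice, and words never seen in training are ignored.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """examples: list of (context_words, sense) pairs."""
    sense_counts = Counter(sense for _, sense in examples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def predict(words, model):
    """Return the sense maximising log p(sense) + sum of log p(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_logp = None, -math.inf
    for sense, count in sense_counts.items():
        logp = math.log(count / total)  # prior
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in words:
            if w in vocab:  # skip words unseen in training
                # Add-one smoothed likelihood (an assumed choice).
                logp += math.log((word_counts[sense][w] + 1) / denom)
        if logp > best_logp:
            best_sense, best_logp = sense, logp
    return best_sense


# Hypothetical labelled contexts for the word "bass".
model = train_naive_bayes([
    (["guitar", "player"], "music"),
    (["band", "sound"], "music"),
    (["fishing", "rod"], "fish"),
    (["fly", "fishing"], "fish"),
])
print(predict(["guitar", "band"], model))  # music
```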

Question 18

Q

Classifier

Answer

A

Depending on the classifier chosen, it computes one of the following (x: input feature vector, y: class)
- discriminant function f(x)
- posterior probabilities p(y|x)
- joint probabilities p(x,y)
- conditional probabilities p(x|y)

Possible Options:
- logistic regression
- Fisher's linear discriminant analysis
- naive Bayes classifier
- support vector machine
- neural networks
- k-nearest neighbour
- …

Question 19

Q

Generating more training examples

Answer

A

One sense per collocation - A word recurring in collocation with the same word will almost surely have the same sense.
e.g. “play” often occurs with the music sense of “bass”
“fish” often occurs with the fish sense of “bass”

One sense per discourse - The sense of a word is highly consistent within a document, especially for topic-specific words

Automatically generate more training examples with rules, to be combined with hand-labelled training examples.

This is considered semi-supervised learning
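
A sketch of rule-based labelling using the one-sense-per-collocation idea. The rule table and sentences are hypothetical, and a real system would need far more careful rules. A sentence is labelled only when the collocates present all point to the same sense.

```python
# Hypothetical collocation rules: collocate -> sense of "bass".
COLLOCATION_RULES = {"play": "music", "fish": "fish"}


def auto_label(sentences, target, rules):
    """Label sentences containing `target` when the rules agree on a sense."""
    labelled = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        if target not in tokens:
            continue
        senses = {rules[t] for t in tokens if t in rules}
        if len(senses) == 1:  # skip unmatched or conflicting sentences
            labelled.append((sentence, senses.pop()))
    return labelled


unlabelled = [
    "They play bass in a band",
    "We fish for bass at dawn",
    "The bass was loud",
]
extra_examples = auto_label(unlabelled, "bass", COLLOCATION_RULES)
# [('They play bass in a band', 'music'), ('We fish for bass at dawn', 'fish')]
```

The automatically labelled pairs can then be added to the hand-labelled training set.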

Question 20

Q

Sequence labelling - popular machine learning techniques

Answer

A

Structured support vector machines

Conditional random field

Hidden Markov model

Recurrent neural network

Question 21

Q

WSD Evaluation

Answer

A

Check sense accuracy (intrinsic evaluation)
- % of words given the same sense tag as the manual (human) annotation
- usually evaluate using held-out data from same labelled corpus (train-test split)

Task Based Evaluation (extrinsic evaluation)
Embed WSD in an NLP task and see whether the task is done better (e.g. use the WSD results in a machine translation task and check whether they improve the translations)
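
The intrinsic sense accuracy above reduces to a simple ratio. A minimal sketch, with made-up predicted and gold tags for held-out words:

```python
def sense_accuracy(predicted, gold):
    """Fraction of words whose predicted sense matches the human tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical held-out test tags.
predicted = ["music", "fish", "fish", "music"]
gold = ["music", "fish", "music", "music"]
print(sense_accuracy(predicted, gold))  # 0.75
```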

Question 22

Q

WSD baselines for comparison

Answer

A

Assign the most frequent sense
Simplified Lesk (~42% accuracy)

Human agreement on all-words corpora with WordNet-style senses is around 75%-80%
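
The most-frequent-sense baseline can be sketched in a few lines, assuming sense frequencies are taken from a labelled training corpus (the tags below are invented). Every occurrence of the word is then assigned that single sense.

```python
from collections import Counter


def most_frequent_sense(training_senses):
    """Return the sense that occurs most often in the training tags."""
    return Counter(training_senses).most_common(1)[0][0]


# Hypothetical training tags for "bass".
baseline = most_frequent_sense(["fish", "music", "music"])
print(baseline)  # music
```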