WORD SENSES Flashcards
What is a word sense
In linguistics, one of the distinct meanings of a word
What is word sense disambiguation (WSD)
an NLP task of selecting which sense of a word is used in a given piece of text (e.g., a sentence) from a set of multiple known possibilities (sense candidates)
Where is WSD Applied
Machine translation
Search Engines
What are the Two Typical Types of WSD Approaches
Knowledge based approaches
Supervised Machine Learning approaches
What are knowledge based approaches for WSD
Use external lexical resources, e.g. dictionaries
These days, most dictionaries are Machine Readable Dictionaries (MRDs),
including thesauri and semantic networks (e.g. WordNet)
What are Supervised ML approaches for WSD
Train a classifier on labelled training examples (words annotated with their correct senses)
What is the (Simplified) Lesk Algorithm
The simplified Lesk algorithm examines the overlap between each sense definition of a word and its current context
1. Retrieve the dictionary definitions (glosses) for every sense of the target word
2. Calculate the overlap between each sense definition and the current context
3. Choose the sense with the highest overlap
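The three steps above can be sketched as follows; the two-sense inventory for "bank" is a hypothetical example, not a real dictionary entry:

```python
# Minimal sketch of the simplified Lesk algorithm (assumed inputs:
# a context sentence and a dict mapping sense name -> definition).
def simplified_lesk(context, sense_definitions):
    """Pick the sense whose definition overlaps most with the context."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, definition in sense_definitions.items():
        signature = set(definition.lower().split())
        overlap = len(signature & context_words)  # step 2: count shared words
        if overlap > best_overlap:                # step 3: keep the best sense
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical two-sense inventory for "bank"
senses = {
    "bank_finance": "a financial institution that accepts deposits",
    "bank_river": "sloping land beside a body of water",
}
print(simplified_lesk("the bank approved my deposits", senses))  # bank_finance
```

Here "deposits" appears in both the context and the finance gloss, so that sense wins; real implementations also strip stopwords before counting the overlap.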
What is the corpus + lesk approach (WSD)
To slightly improve the Lesk algorithm we can add labelled corpus examples of each sense to that sense's signature
We can also weigh each overlapped word, e.g. by raw counts or by inverse document frequency (idf)
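One way to get the idf weights is to treat each sense signature as a "document"; a hedged sketch, with invented example definitions:

```python
import math

# Sketch: idf weight for each word, treating each sense definition as
# one document. Words shared by all definitions get weight 0, so they
# contribute nothing to the overlap score.
def idf_weights(definitions):
    docs = [set(d.lower().split()) for d in definitions]
    n = len(docs)
    vocab = set().union(*docs)
    return {w: math.log(n / sum(w in d for d in docs)) for w in vocab}

defs = ["a financial institution that accepts deposits",
        "sloping land beside a body of water"]
w = idf_weights(defs)
# "a" appears in both definitions -> idf = log(2/2) = 0
# "water" appears in only one    -> idf = log(2/1) > 0
```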
How do we build a WSD classifier
Given a word-sense annotated corpus:
- Characterise each word (along with its context) by a set of features (feature extraction)
- Train a classifier on the training examples
- Test the trained classifier on new examples
How to use Naive Bayes for WSD classifier
We want to estimate the probability of seeing a word given a class (sense), e.g. P(fish | class1):
count the number of times "fish" appears in all examples labelled class1,
divided by the total number of words in all examples labelled class1 (including repetitions) plus the vocabulary size |V| (not including repetitions), which is add-one smoothing:
P(word | class) = (count(word, class) + 1) / (count(class) + |V|)
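The smoothed estimate can be computed directly from counts; the tiny two-class corpus below is invented for illustration:

```python
from collections import Counter

# Add-one (Laplace) smoothed estimate:
# P(word|class) = (count(word,class) + 1) / (count(class) + |V|)
class1_text = "fish smoked fish line".split()   # toy examples labelled class1
class2_text = "line guitar jazz".split()        # toy examples labelled class2

counts = {"class1": Counter(class1_text), "class2": Counter(class2_text)}
vocab = set(class1_text) | set(class2_text)     # |V| = 5, no repetitions

def p_word_given_class(word, cls):
    return (counts[cls][word] + 1) / (sum(counts[cls].values()) + len(vocab))

print(p_word_given_class("fish", "class1"))  # (2 + 1) / (4 + 5) = 1/3
```

The +1 in the numerator and |V| in the denominator ensure that an unseen word still gets a small nonzero probability instead of zeroing out the whole product.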
What is sequence labelling
The task of assigning a sequence of labels to a sequence of words (tokens, observations)
WSD as sequence labelling (hidden markov model)
Use the transition probability × emission probability for each input word and output label, and choose the label sequence with the highest overall probability
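Scoring one candidate label sequence under this model can be sketched as below; the probability tables and the labels S1/S2 are toy values, not estimates from real data:

```python
# Sketch: score a label sequence as the product of transition and
# emission probabilities, as in an HMM. "<s>" marks the sequence start.
transition = {("<s>", "S1"): 0.6, ("S1", "S2"): 0.4}   # P(label_i | label_{i-1})
emission = {("S1", "bass"): 0.2, ("S2", "play"): 0.5}  # P(word | label)

def score(words, labels):
    p, prev = 1.0, "<s>"
    for w, l in zip(words, labels):
        p *= transition[(prev, l)] * emission[(l, w)]
        prev = l
    return p

print(score(["bass", "play"], ["S1", "S2"]))  # 0.6*0.2 * 0.4*0.5 = 0.024
```

A full decoder would not enumerate every label sequence like this; the Viterbi algorithm finds the highest-scoring sequence efficiently with dynamic programming.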