Topic 5: Word Sense Flashcards

1
Q

recap on word ambiguity

A

the same word can be used to mean different things

“mouse”

  • small rodent
  • hand-operated device to control a cursor

“bank”

  • financial institution that holds money and investments
  • sloping land beside a river

such a word is called polysemous, from the Greek for “having many senses”

2
Q

what is a word sense?

A

a discrete representation of one aspect of the meaning of a word

3
Q

WordNet

A

an online thesaurus: a database that represents word senses

4
Q

Word sense disambiguation

A

the task of determining which sense of a word is being used in a particular context

5
Q

Homonyms and orthographic form

A

example: bank1 and bank2 have the same orthographic form, but their senses are unrelated

in this case, they are homonyms

homonyms are words that have the same spelling and pronunciation but different meanings and origins

6
Q

Homograph

A

words that share the same spelling but are not necessarily pronounced the same, with different meanings and origins

bow1 and bow2

7
Q

Homophones

A

each of two or more words that have the same pronunciation but different meanings, origins, or spelling

8
Q

Dictionaries or thesauruses

A

a document/database that gives textual definitions for senses, called glosses

9
Q

Dictionary

A

dictionaries contain many fine-grained senses to capture meaning differences

10
Q

Glosses

A

not a formal meaning representation; glosses are written for people

11
Q

Sentence embedding

A

glosses come with example sentences; embedding these sentences helps build sense representations

12
Q

Relations between senses: synonymy

A

two senses of two different words that are identical or nearly identical

synonymy is a relationship between senses rather than words

examples:
couch/sofa
vomit/throw up
car/automobile

13
Q

Relations between senses: antonymy

A
words with an opposite meaning
example:
long/short
big/little
fast/slow
cold/hot
dark/light
rise/fall
up/down
in/out
14
Q

Taxonomic relations: hyponym and hypernym

A

One sense is a hyponym of another sense if the first sense is more specific

example
car is hyponym of vehicle
dog is hyponym of animal
mango is hyponym of fruit

hypernym is the reverse relation: vehicle is a hypernym of car

superordinate is often used instead of hypernym
superordinate – subordinate

15
Q

meronymy

A

part-whole relation

example

leg is part of chair

wheel is part of car

16
Q

WordNet

A

English WordNet consists of nouns, verbs, adjectives, and adverbs

example: 8 senses for the noun “bass”

entries usually have a gloss, a synonym set, and a usage example

17
Q

Synset

A

a set of near-synonyms for a WordNet sense: a way of representing a concept

synsets are the fundamental unit associated with WordNet entries

WordNet also labels each synset with a lexicographic category drawn from a semantic field
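As a sketch, a synset entry can be represented as a small Python record (the structure and the sample entry are invented for illustration; this is not the real WordNet/NLTK API):

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    """Toy stand-in for a WordNet synset (illustrative, not the NLTK API)."""
    lemmas: list                 # the near-synonymous lemmas in the set
    gloss: str                   # textual definition, written for people
    examples: list = field(default_factory=list)
    category: str = ""           # lexicographic category / semantic field

# One invented sense entry for "bass" (gloss paraphrased for illustration)
bass_voice = Synset(
    lemmas=["bass", "basso"],
    gloss="the lowest adult male singing voice",
    examples=["he sings bass"],
    category="noun.person",
)

print(bass_voice.lemmas[0], "-", bass_voice.gloss)
```

The synset groups lemmas around one shared gloss, which is why the synset, not the word, is the unit of meaning.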

18
Q

Sense relations in WordNet

A

WordNet has two kinds of taxonomic entities: classes and instances

19
Q

Verb relations in WordNet

A

WordNet records verb relations such as hypernym, troponym, entailment, and antonym

20
Q

Hyponymy Chain

A

Hyponymy chains for two separate senses of the lemma bass.

Note that the chains are completely distinct, converging only at the very abstract level (whole, unit).

21
Q

Thesaurus Method

A

Use the structure of the thesaurus to define word similarity.

Any information can be used, from glosses to synonyms; in practice, the hypernym/hyponym hierarchy is used.

The intuition: words or senses are more similar if there is a shorter path between them in the thesaurus graph.

22
Q

Path Length

A

Measure the number of edges between the two concept nodes in the thesaurus graph and add one:

pathlen(c1, c2) = 1 + the number of edges in the shortest path in the thesaurus graph between sense nodes c1 and c2.
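The pathlen definition can be sketched with a breadth-first search over a toy thesaurus graph (the graph and concept names are invented for illustration):

```python
from collections import deque

# Toy undirected thesaurus graph: each concept maps to its neighbors.
graph = {
    "entity": ["vehicle", "animal"],
    "vehicle": ["entity", "car"],
    "car": ["vehicle"],
    "animal": ["entity", "dog"],
    "dog": ["animal"],
}

def pathlen(c1, c2):
    """1 + number of edges on the shortest path between sense nodes c1 and c2."""
    frontier = deque([(c1, 0)])
    seen = {c1}
    while frontier:
        node, dist = frontier.popleft()
        if node == c2:
            return 1 + dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return float("inf")

print(pathlen("car", "dog"))  # car-vehicle-entity-animal-dog: 4 edges -> 5
```

A node compared with itself gets pathlen 1 (zero edges plus one), so smaller values mean more similar.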

23
Q

Path-Length based Similarity

A

path-length based similarity:

sim_path(c1, c2) = 1 / pathlen(c1, c2)

For most applications we do not have sense-tagged data, so word similarity algorithms give the similarity between words by taking the maximum sense similarity:

wordsim(w1, w2) = max over c1 ∈ senses(w1), c2 ∈ senses(w2) of sim(c1, c2)
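Taking the maximum sense similarity can be sketched as follows (the sense names and similarity values are invented for illustration):

```python
# Toy sense inventories and a made-up sense-to-sense similarity table.
senses = {
    "bass": ["bass_fish", "bass_instrument"],
    "trout": ["trout_fish"],
}

sense_sim = {
    ("bass_fish", "trout_fish"): 0.5,
    ("bass_instrument", "trout_fish"): 0.1,
}

def sim(c1, c2):
    """Look up sense similarity in either order; default 0."""
    return sense_sim.get((c1, c2), sense_sim.get((c2, c1), 0.0))

def wordsim(w1, w2):
    """Word similarity = maximum similarity over all sense pairs."""
    return max(sim(c1, c2) for c1 in senses[w1] for c2 in senses[w2])

print(wordsim("bass", "trout"))  # the fish senses give the max: 0.5
```

Taking the max means the closest pair of senses decides, without needing to know which sense each word actually has in context.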

24
Q

Information-Content Word Similarity

A

Rely on the structure of the thesaurus but also add probabilistic
information derived from a corpus

Define P(c) as the probability that a randomly selected word in a corpus
is an instance of concept c.

P(root) = 1, since any word is subsumed by the root concept.

Intuitively, the lower a concept in the hierarchy, the lower its
probability.
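Computing P(c) and IC(c) over a toy hierarchy might look like this (the hierarchy and corpus counts are invented for illustration):

```python
import math

# Toy hierarchy (child -> parent) with invented corpus counts of words
# tagged directly at each concept; total corpus size is 50.
parent = {"car": "vehicle", "truck": "vehicle", "vehicle": "entity"}
counts = {"car": 30, "truck": 5, "vehicle": 5, "entity": 10}
TOTAL = 50

def ancestors(c):
    """The concept itself plus everything above it in the hierarchy."""
    chain = {c}
    while c in parent:
        c = parent[c]
        chain.add(c)
    return chain

def p(concept):
    """P(c): probability a random corpus word is subsumed by concept c."""
    mass = sum(n for node, n in counts.items() if concept in ancestors(node))
    return mass / TOTAL

def ic(concept):
    """Information content: IC(c) = -log P(c)."""
    return -math.log(p(concept))

print(p("entity"), p("vehicle"))  # root gets 1.0; vehicle gets (30+5+5)/50 = 0.8
```

As the cards say, P(root) = 1, and concepts lower in the hierarchy subsume fewer words and so get lower probability and higher IC.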

25
Q

Thesaurus with Probability

A
A fragment of the WordNet concept hierarchy augmented with the
probabilities P(c)
26
Q

Information Content Theory

A
Need two more definitions for the similarity computation: Information
Content (IC) and Lowest Common Subsumer (LCS)

equation:
IC(c) = −log P(c)

27
Q

Lowest Common Subsumer

A

LCS(c1, c2) = the lowest node in the hierarchy that subsumes both c1 and c2

28
Q

Resnik Similarity

A

Think of the similarity between two words as related to their
common information:

sim_Resnik(c1, c2) = −log P(LCS(c1, c2))
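LCS and Resnik similarity can be sketched over a toy hierarchy (the concept names and P(c) values are invented for illustration):

```python
import math

# Toy hierarchy and invented P(c) values (the root has probability 1).
parent = {"car": "vehicle", "truck": "vehicle", "vehicle": "entity"}
prob = {"car": 0.6, "truck": 0.1, "vehicle": 0.8, "entity": 1.0}

def ancestor_chain(c):
    """c itself, then its parents in order, up to the root."""
    chain = [c]
    while c in parent:
        c = parent[c]
        chain.append(c)
    return chain

def lcs(c1, c2):
    """Lowest node in the hierarchy that subsumes both c1 and c2."""
    above_c2 = set(ancestor_chain(c2))
    for node in ancestor_chain(c1):  # walks upward, so the first hit is lowest
        if node in above_c2:
            return node
    return None

def sim_resnik(c1, c2):
    """Resnik similarity: information content of the LCS, -log P(LCS)."""
    return -math.log(prob[lcs(c1, c2)])

print(lcs("car", "truck"))  # vehicle
```

The more informative (lower-probability) the shared ancestor, the higher the Resnik similarity.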

29
Q

Word Sense Disambiguation recap

A

the task of selecting the correct sense for a word

A WSD algorithm takes as input a word in context and a fixed
inventory of potential word senses, and outputs the correct word sense
for that context

30
Q

WSD Datasets

A

The inventory of sense tags depends on the task.

the training data should come from the same domain as the task

use the sense inventory from a resource such as WordNet, or supersenses if a coarser-grained sense set is wanted

31
Q

Baselines for WSD Systems

A

A surprisingly strong baseline
is simply to choose the most frequent sense for each word from
the senses in a labeled corpus
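The most-frequent-sense baseline is a one-liner over labeled data (the sense labels here are invented for illustration):

```python
from collections import Counter

# Invented sense-tagged training labels for the target word "bass".
training_labels = ["bass_fish", "bass_fish", "bass_instrument", "bass_fish"]

def most_frequent_sense(labels):
    """Baseline: always predict the sense seen most often in the labeled corpus."""
    return Counter(labels).most_common(1)[0][0]

print(most_frequent_sense(training_labels))  # bass_fish
```

Any supervised WSD system is expected to beat this baseline to be worth its training cost.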

32
Q

Supervised Word Sense Disambiguation

A

Labeled dataset - context sentences labeled with the correct sense for
the target word

any standard classification algorithm can be used

features:
- collocation features of words or n-grams of lengths 1, 2, 3
- bag of words – words that occur in the neighborhood
- weighted average of embeddings
- part-of-speech tags (for a window of 3 words on each side,
stopping at sentence boundaries)
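The collocation and bag-of-words features can be sketched for one target occurrence (the sentence and window sizes are illustrative):

```python
# Sketch of WSD features for the target word "bass" in one sentence.
sentence = "an electric guitar and bass player stand off to one side".split()
target = sentence.index("bass")

def collocation_features(tokens, i, window=2):
    """Words at fixed offsets around the target (position matters)."""
    feats = {}
    for off in range(-window, window + 1):
        if off != 0 and 0 <= i + off < len(tokens):
            feats[f"w[{off}]"] = tokens[i + off]
    return feats

def bag_of_words(tokens, i, window=3):
    """Unordered neighborhood words (position is ignored)."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return set(tokens[lo:hi]) - {tokens[i]}

print(collocation_features(sentence, target))
print(sorted(bag_of_words(sentence, target)))
```

Collocation features keep word order (useful for local syntactic cues like “bass player”), while bag-of-words features capture topical context regardless of position.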