Classification, with word sense disambiguation example Flashcards

1
Q

Supervised machine learning

A

Learn from training data in which the word sense is already labelled.

Apply to test data.

Approaches:

  • Bag of words
  • Word n-grams
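A minimal sketch of the two feature representations (the toy context string and function names are illustrative, not from any particular library):

```python
from collections import Counter

def bag_of_words(context):
    # Unordered word counts from the context of the ambiguous word.
    return Counter(context.lower().split())

def word_ngrams(context, n=2):
    # Contiguous word n-grams (here bigrams) preserve local word order.
    tokens = context.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

bow = bag_of_words("the bank of the river")      # counts "the" twice
bigrams = word_ngrams("the bank of the river")   # first bigram: ("the", "bank")
```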
2
Q

Support vector machines

A

Interpret the feature list geometrically, and try to derive a separating hyperplane between positive and negative examples

Encode features as numbers and think of them as coordinates
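To make the geometric view concrete: feature vectors become points, and classification draws a hyperplane w·x + b = 0 between the two classes. The sketch below uses a perceptron rather than a full max-margin SVM, purely to illustrate finding *some* separating hyperplane for linearly separable data; all names and toy values are invented:

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_perceptron(data, epochs=20):
    """data: list of (feature_vector, label) pairs with label in {+1, -1}."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            if y * (dot(w, x) + b) <= 0:                   # misclassified point:
                w = [wi + y * xi for wi, xi in zip(w, x)]  # nudge the hyperplane
                b += y                                     # toward the point
    return w, b

def predict(w, b, x):
    # Which side of the hyperplane does x fall on?
    return 1 if dot(w, x) + b > 0 else -1
```

An SVM additionally maximizes the margin between the hyperplane and the nearest points, which the perceptron update rule does not attempt.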

3
Q

Naive Bayes classifiers

A

Probabilistic model

Learn

  • The probability of each label
  • The probability of each feature co-occurring with each label
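A hedged sketch of the learning step: both probability tables reduce to counting over labelled training instances. The toy data and variable names below are invented for illustration:

```python
from collections import Counter

# Toy labelled training data: (context features, gold sense) per instance.
train = [
    (["money", "loan"], "bank/finance"),
    (["money", "account"], "bank/finance"),
    (["river", "water"], "bank/river"),
]

label_counts = Counter(sense for _, sense in train)
pair_counts = Counter((f, sense) for feats, sense in train for f in feats)

total = sum(label_counts.values())
p_label = {s: c / total for s, c in label_counts.items()}          # P(label)
p_feat_given_label = {                                             # P(feature | label)
    (f, s): c / label_counts[s] for (f, s), c in pair_counts.items()
}
```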
4
Q

The assumptions used to derive the probability of a sense given a feature list

A
  • Bayes rule
  • The assumption of independence between features

5
Q

How to compute the probability of a sense given a feature list

A

P(sense | feature list)

Using Bayes rule, and dropping the denominator P(feature list) because it is constant across senses, the most likely sense equals
argmax over senses of P(feature list | sense) * P(sense)
6
Q

How to compute the probability of a sense

A

P(sense)

= count(sense) / count(all senses)

7
Q

How to compute the probability of a feature given a sense

A

Assume that all features appear independently:
P(feature list | sense)
= P(feature1 | sense) * … * P(featureN | sense)

Calculate using counts (maximum likelihood estimation)

P(“red” | well-known person)
= count(label W & feature “red”) / count(label W)
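Putting the last three cards together (Bayes rule, the independence assumption, and count-based estimates), a minimal end-to-end sketch with invented toy data. Note one addition beyond the pure maximum likelihood estimate on the card: add-one smoothing, so an unseen feature–sense pair does not zero out the whole product; the computation runs in log space for numerical stability:

```python
from collections import Counter
from math import log

# Toy labelled data: context features + gold sense (all values illustrative).
train = [
    (["red", "carpet"], "well-known person"),
    (["red", "famous"], "well-known person"),
    (["red", "wine"], "drink"),
    (["white", "wine"], "drink"),
]

label_counts = Counter(s for _, s in train)
pair_counts = Counter((f, s) for feats, s in train for f in feats)
vocab = {f for feats, _ in train for f in feats}
total = sum(label_counts.values())

def classify(features):
    # argmax over senses of P(sense) * product_i P(feature_i | sense),
    # with add-one smoothing and log probabilities.
    best, best_score = None, float("-inf")
    for sense, c in label_counts.items():
        score = log(c / total)                     # log P(sense)
        for f in features:                         # + sum of log P(feature | sense)
            score += log((pair_counts[(f, sense)] + 1) / (c + len(vocab)))
        if score > best_score:
            best, best_score = sense, score
    return best
```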
