Classification, with word sense disambiguation example
Supervised machine learning
Learn from training data in which each word's sense is already labelled.
Apply the learned model to unseen test data.
Approaches (both sketched below):
- Bag of words
- Word n-grams
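A minimal sketch of both representations in Python, assuming a toy context sentence around the ambiguous word "bank" (the sentence and tokens are illustrative, not from the cards):

```python
from collections import Counter

# Toy tokenised context around the ambiguous word "bank"
tokens = ["she", "sat", "by", "the", "bank", "of", "the", "river"]

# Bag of words: unordered counts of the context words
bag_of_words = Counter(tokens)

# Word n-grams (here bigrams): ordered pairs of adjacent words
bigrams = [tuple(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

print(bag_of_words)  # Counter({'the': 2, 'she': 1, ...})
print(bigrams)       # [('she', 'sat'), ('sat', 'by'), ...]
```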
Support vector machines
Encode the features as numbers and think of them as coordinates, so each feature list becomes a point in space.
Interpret the feature lists geometrically and try to derive a separating hyperplane between the positive and negative examples (see the sketch below).
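A minimal sketch, assuming scikit-learn (the library choice, the toy contexts, and the sense labels are illustrative assumptions, not from the cards):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# Toy labelled contexts for the ambiguous word "bank"
contexts = ["river water bank shore", "money loan bank account"]
labels = ["RIVER_BANK", "FINANCIAL_BANK"]

vectorizer = CountVectorizer()          # encode features as numbers...
X = vectorizer.fit_transform(contexts)  # ...so each context is a point in space

clf = LinearSVC()                       # fits a separating hyperplane
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["deposit money at the bank"])))
```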
Naive Bayes classifiers
Probabilistic model
Learn
- The probability of each label
- The probability of each feature co-occurring with each label
The assumptions used to derive the probability of a sense given a feature list
- Bayes rule
- The assumption that features are independent of one another given the sense
How to compute the probability of a sense given a feature list
P(sense | feature list)
Using Bayes rule, the best sense equals argmax over senses of P(feature list | sense) * P(sense).
(The denominator P(feature list) is the same for every sense, so it can be dropped.)
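A toy hand-computation of the argmax (all probabilities are made-up numbers for illustration):

```python
# Assumed estimates from training data
priors = {"RIVER_BANK": 0.4, "FINANCIAL_BANK": 0.6}           # P(sense)
likelihoods = {"RIVER_BANK": 0.020, "FINANCIAL_BANK": 0.004}  # P(feature list | sense)

# Score each sense by P(feature list | sense) * P(sense), take the argmax
scores = {sense: likelihoods[sense] * priors[sense] for sense in priors}
best_sense = max(scores, key=scores.get)
print(best_sense, scores)  # RIVER_BANK: 0.020*0.4 = 0.008 beats 0.004*0.6 = 0.0024
```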
How to compute the probability of a sense
P(sense)
= count(sense) / count(all senses)
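A minimal sketch of this count-based estimate, assuming a toy list of training labels:

```python
from collections import Counter

# Toy sense labels from a labelled training set (made-up data)
training_labels = ["RIVER_BANK", "FINANCIAL_BANK", "FINANCIAL_BANK",
                   "RIVER_BANK", "FINANCIAL_BANK"]

counts = Counter(training_labels)
total = sum(counts.values())  # count of all senses

priors = {sense: count / total for sense, count in counts.items()}
print(priors)  # {'RIVER_BANK': 0.4, 'FINANCIAL_BANK': 0.6}
```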
How to compute the probability of a feature given a sense
Assume that all features occur independently given the sense:
P(feature list | sense)
= P(feature1 | sense) * … * P(featureN | sense)
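A minimal sketch of the product, assuming the per-feature conditional probabilities have already been estimated (the numbers are illustrative):

```python
import math

# Assumed estimates of P(feature | sense) for one sense
p_feature_given_sense = {"river": 0.30, "water": 0.20, "shore": 0.10}
features = ["river", "water", "shore"]

# Independence assumption: multiply the per-feature probabilities
likelihood = math.prod(p_feature_given_sense[f] for f in features)
print(likelihood)  # 0.3 * 0.2 * 0.1 = 0.006 (up to float rounding)
```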
Calculate using counts (maximum likelihood estimation), e.g. for the label W = “well-known person”:
P(“red” | W)
= count(label W & feature “red”) / count(label W)
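A minimal sketch of this count-based estimate, assuming a toy labelled data set (the labels and feature lists are made up):

```python
# Each instance pairs a sense label with its feature list
data = [
    ("WELL_KNOWN_PERSON", ["red", "carpet"]),
    ("WELL_KNOWN_PERSON", ["famous", "actor"]),
    ("OTHER", ["red", "paint"]),
]

label, feature = "WELL_KNOWN_PERSON", "red"

# count(label W) and count(label W & feature "red")
count_label = sum(1 for sense, _ in data if sense == label)
count_joint = sum(1 for sense, feats in data if sense == label and feature in feats)

print(count_joint / count_label)  # 1 / 2 = 0.5
```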