Building a Feature-Based Classifier: Flashcards

1
Q: What is a classifier? What is its goal?
A: An agent that takes some input data and predicts which class the data belongs to. Its goal is to make as few mistakes as possible and to represent its belief about the class.

2
Q: How do we extract a simple set of features?
A: Average each MFCC (Mel-frequency cepstral coefficient) across all time segments, giving one feature value per coefficient.

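As a concrete sketch of this averaging step (assuming librosa for MFCC extraction and a hypothetical file "clip.wav"; neither is specified by the cards):

```python
import librosa  # assumed MFCC extractor; any other would do

# Load a hypothetical audio clip and compute its MFCC matrix.
y, sr = librosa.load("clip.wav")
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, n_frames)

# One feature per coefficient: average each MFCC across all time segments.
features = mfcc.mean(axis=1)  # shape: (13,)
```
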
3
Q: What approach is used to build a classifier?
A: Bayes' theorem: find p(class | feature x observed) for each class.

4
Q: Give Bayes' theorem.
A: For classes a, c, …:

p(a | b) = p(b | a) p(a) / (p(b | a) p(a) + p(b | c) p(c) + …)

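A quick numeric check of the formula, using two classes a and c with invented priors and likelihoods:

```python
# All numbers are illustrative, not from the cards.
p_a, p_c = 0.5, 0.5        # priors p(a), p(c)
p_b_a, p_b_c = 0.8, 0.3    # likelihoods p(b | a), p(b | c)

evidence = p_b_a * p_a + p_b_c * p_c  # the full denominator
p_a_b = p_b_a * p_a / evidence        # p(a | b) = 0.4 / 0.55 ≈ 0.727
p_c_b = p_b_c * p_c / evidence        # p(c | b) ≈ 0.273; the posteriors sum to 1
```
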
5
Q: What are p(x | C_1) and p(x | C_2)?
A: Probability density functions: the class-conditional densities (likelihoods) of the feature x under each class.

6
Q: What is a probability density function p(x)? What are the mean and the variance?
A: A function such that p(x) dx is the probability that the variable x lies in the interval [x, x + dx].

mean = ∫ x p(x) dx, integrated from −∞ to ∞
variance = (standard deviation)² = ∫ (x − mean)² p(x) dx, integrated from −∞ to ∞
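
Both integrals can be checked numerically. A minimal sketch, assuming the standard normal (mean 0, variance 1) as the example density:

```python
import numpy as np

x = np.linspace(-10, 10, 100001)
p = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # standard normal p(x)
dx = x[1] - x[0]

mean = np.sum(x * p) * dx                   # ∫ x p(x) dx -> ~0.0
variance = np.sum((x - mean)**2 * p) * dx   # ∫ (x − mean)² p(x) dx -> ~1.0
```
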
7
Q: What is the probability density function of the normal distribution?
A: p(x) = 1 / √(2πσ²) · exp(−(x − μ)² / (2σ²)), where μ is the mean and σ is the standard deviation.

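A direct transcription into Python (the function name is ours, not from the cards):

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Normal density with the given mean and standard deviation."""
    return np.exp(-(x - mean) ** 2 / (2 * std ** 2)) / np.sqrt(2 * np.pi * std ** 2)

normal_pdf(0.0, 0.0, 1.0)  # ≈ 0.3989, the peak of the standard normal
```
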
8
Q: How do we approximate p(x | C_1) and p(x | C_2)?
A: With normal distributions fitted to training data from each class.

9
Q: How do we fit a normal distribution to data? How do we calculate these quantities?
A: Set its mean and variance equal to the empirical mean and variance of the data:

m = (1/N) Σ_{n=1}^{N} x(n)
s² = (1/N) Σ_{n=1}^{N} (x(n) − m)²
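
The same estimators as code, for a 1-D array of training values (the helper name is an assumption):

```python
import numpy as np

def fit_normal(samples):
    """Fit a normal distribution: return the empirical mean and variance."""
    n = len(samples)
    m = np.sum(samples) / n              # m = (1/N) Σ x(n)
    s2 = np.sum((samples - m) ** 2) / n  # s² = (1/N) Σ (x(n) − m)²
    return m, s2
```
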
10
Q: How is the probability of the feature vector x interpreted? How can we think of this probability?
A: As a joint probability: p(x) = p(x_1 and x_2 and … and x_d).

This probability can be thought of as a function of d arguments.

11
Q: How do we create a classifier from multiple features? What is this known as?
A: Bayes' theorem still applies:

p(C_1 | x) = p(x | C_1) p(C_1) / (p(x | C_1) p(C_1) + p(x | C_2) p(C_2))

where we calculate p(x | C_1) and p(x | C_2) as products over the individual features:

p(x | C_1) = p(x_1 | C_1) p(x_2 | C_1) … p(x_d | C_1)
p(x | C_2) = p(x_1 | C_2) p(x_2 | C_2) … p(x_d | C_2)

This is known as a naive Bayes classifier.

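Putting the last few cards together, a minimal naive Bayes sketch. The data layout (a dict mapping class names to (n_examples, d) arrays) and all names are illustrative assumptions:

```python
import numpy as np

def normal_pdf(x, mean, var):
    # Normal density, parameterised here by the variance.
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def train(class_data):
    """Fit one normal per (class, feature) and estimate the class priors."""
    total = sum(len(xs) for xs in class_data.values())
    return {c: (xs.mean(axis=0), xs.var(axis=0), len(xs) / total)
            for c, xs in class_data.items()}

def posterior(model, x):
    """p(C | x) for every class C, given a d-dimensional feature vector x."""
    joint = {c: prior * np.prod(normal_pdf(x, means, variances))  # p(x | C) p(C)
             for c, (means, variances, prior) in model.items()}
    evidence = sum(joint.values())  # the Bayes denominator
    return {c: j / evidence for c, j in joint.items()}

# Usage: model = train({"C_1": X1, "C_2": X2}); posterior(model, features)
```
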
12
Q: Why is a naive Bayes classifier "naive"?
A: Because of the assumption that the features are conditionally independent given knowledge of the class.
