Probability Theory and Bayesian Learning Flashcards

1
Q

What is the Sum rule?

A

P(A ∨ B) = P(A) + P(B) - P(A ∧ B)
Prob. of A or B is equal to prob. of A plus prob. of B, minus prob. of A and B

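The sum rule can be checked with a small worked example (a hypothetical fair six-sided die; the events are made up for illustration):

```python
from fractions import Fraction

# A = "roll is even", B = "roll is greater than 3" on a fair six-sided die.
outcomes = range(1, 7)
A = {x for x in outcomes if x % 2 == 0}    # {2, 4, 6}
B = {x for x in outcomes if x > 3}         # {4, 5, 6}

def prob(event):
    # Each outcome is equally likely, so probability = favourable / total.
    return Fraction(len(event), 6)

# Sum rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = prob(A) + prob(B) - prob(A & B)
assert p_a_or_b == prob(A | B)   # agrees with counting A ∪ B directly
print(p_a_or_b)                  # 2/3
```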
2
Q

What is the chain rule?

A

P(A ∧ B) = P(A|B) * P(B)
Probability of A and B is equal to the probability of A given B multiplied by prob. of B

3
Q

What is the formula for P(A|B)?

A

P(A|B) = P(A ∧ B) / P(B)

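A quick numeric illustration of this formula (the counts are invented for the example):

```python
# Hypothetical survey counts: estimate P(A|B) from joint frequencies.
n_a_and_b = 30   # observations where both A and B hold
n_b = 50         # observations where B holds
total = 200      # all observations

p_a_and_b = n_a_and_b / total    # P(A ∧ B) = 0.15
p_b = n_b / total                # P(B)     = 0.25

# P(A|B) = P(A ∧ B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)               # 0.6
```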
4
Q

How do we make predictions with multiple pieces of data?

A
  • Decouple the multiple pieces of data
  • Treat each of them as ‘independent’
5
Q

What is the formula for naive Bayes?

A

P(h | multiple data) ∝ P(data(1) | h) * P(data(2) | h) * … * P(data(n) | h) * P(h)

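The formula can be sketched directly in code. This is a minimal illustration, not a full classifier; the spam/ham priors and word likelihoods are invented numbers:

```python
def naive_bayes_score(prior, likelihoods):
    """Unnormalised score for a hypothesis h:
    P(h) * P(data_1 | h) * P(data_2 | h) * ... * P(data_n | h)."""
    score = prior
    for p in likelihoods:
        score *= p
    return score

# Hypothetical spam filter: two word features, likelihoods given each class.
spam = naive_bayes_score(0.4, [0.8, 0.6])   # 0.4 * 0.8 * 0.6 = 0.192
ham = naive_bayes_score(0.6, [0.1, 0.3])    # 0.6 * 0.1 * 0.3 = 0.018

# The scores are only proportional to the posterior; normalise to compare.
posterior_spam = spam / (spam + ham)
print(round(posterior_spam, 3))             # 0.914
```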
6
Q

What do naive Bayes classifiers assume?

A

They assume that the value of any particular feature is independent of any other feature

Conditional Independence

7
Q

What is the issue if one of the features has probability of 0, and what do we do about it?

A

Then the entire product evaluates to 0, regardless of the other features.

We add a small value (e.g. < 0.000001) to the terms with probability 0 so the product does not collapse to 0.

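A common, closely related technique is Laplace (add-α) smoothing, which bakes the small constant into the likelihood estimate itself. A sketch, with made-up counts:

```python
def smoothed_likelihood(count_v, count_h, n_values, alpha=1.0):
    """P(value | h) with add-alpha (Laplace) smoothing:
    count_v = times this value co-occurred with h,
    count_h = total observations of h,
    n_values = number of distinct values the feature can take."""
    return (count_v + alpha) / (count_h + alpha * n_values)

# A value never seen together with h still gets a small nonzero probability,
# so it cannot zero out the whole naive Bayes product.
p = smoothed_likelihood(0, 10, 5)   # (0 + 1) / (10 + 5) ≈ 0.0667
print(p)
```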
8
Q

What are some other ways to use naive Bayes?

A
  • Part of a KNN classifier
    • Use the priors in membership calculations
9
Q

What are some applications of naive Bayes?

A
  • Used as a standard to compare other algorithms to
  • Spam filtering
  • Text classification
10
Q

What happens with values in real-valued domains?

A

Assume that P(X|Y) follows a Gaussian (normal) distribution

11
Q

How can the Gaussian distribution be described?

A

It can be described with a probability density function

12
Q

What is the Gaussian PDF?

A

p(x) = ( 1 / sq.rt.( 2π * σ² ) ) * e^( -( x - μ )² / ( 2σ² ) ), where μ is the mean and σ is the standard deviation

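The PDF translates directly into code; evaluating it at the mean of a standard normal gives the familiar peak of about 0.399:

```python
import math

def gaussian_pdf(x, mean, std_dev):
    """Density of the normal distribution:
    (1 / sqrt(2*pi*var)) * exp(-(x - mean)^2 / (2*var))."""
    var = std_dev ** 2
    coeff = 1.0 / math.sqrt(2 * math.pi * var)
    return coeff * math.exp(-((x - mean) ** 2) / (2 * var))

# At x = mean the exponential is 1, so the density is just the coefficient.
print(round(gaussian_pdf(0.0, 0.0, 1.0), 4))   # 0.3989
```

In a Gaussian naive Bayes classifier, this function would supply P(X|Y) for a continuous feature, using the per-class mean and standard deviation estimated from the training data.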
13
Q

When and how is naive Bayes flawed?

A

It is flawed when the data for two classes overlap to a large extent.

Overlapping classes have similar means and standard deviations, and therefore similar Gaussian distributions, which makes them hard to tell apart.

14
Q

What does a bayesian belief network represent?

A

The state of some model.

They describe how states are related by their probabilities

15
Q

How can any system be modelled by a BBN?

A

All the possible states of the model are the possible worlds that can exist, i.e. all the possible ways that the parts of the system can be configured

16
Q

What are some examples of BBNs?

A
  • Stock market
  • House diagnostics
  • Car diagnostics
  • Ecosystem simulation
17
Q

What does a BBN contain?

A

A set of interconnected nodes, where each node represents a variable in the dependency model

A set of edges, where each edge represents relationships between variables

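The node-and-edge structure can be illustrated with a toy two-node network (the variables and probabilities are invented for the example):

```python
# Toy BBN: Rain -> WetGrass. Each node carries a probability table;
# the edge encodes that WetGrass depends on Rain.
p_rain = 0.2
p_wet_given_rain = {True: 0.9, False: 0.1}

def p_world(rain, wet):
    """Joint probability of one possible world, via the chain rule:
    P(rain, wet) = P(rain) * P(wet | rain)."""
    pr = p_rain if rain else 1 - p_rain
    pw = p_wet_given_rain[rain] if wet else 1 - p_wet_given_rain[rain]
    return pr * pw

# Marginal P(WetGrass) sums the worlds where the grass is wet.
p_wet = p_world(True, True) + p_world(False, True)
print(p_wet)   # 0.2*0.9 + 0.8*0.1 = 0.26
```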
18
Q

What are the advantages of Bayes learning?

A
  • No explicit model - easy to modify ‘on the fly’
  • Works well on small amounts of data
  • Naive Bayes classifiers tend to be robust
    • Generally underfit
  • Naive Bayes models are efficient
  • Easy to implement
19
Q

What are the disadvantages of Bayes learning?

A
  • Handling of continuous data is bad
    • Not everything can be represented by a Gaussian
  • Conditional independence assumption in naive Bayes can be restrictive
  • Sparse data can be problematic