05 Bayesian Flashcards

Question 1

Q

Independence:

Conditional Independence: P(A,B|C) = …
P(A|B,C) = …
→

Issues of Naive Bayes classifiers:

Answer

A

P(A,B) = P(A) * P(B)

P(A|C) * P(B|C);
P(A|C)
A is conditionally independent of B given C.
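
The two identities above can be checked numerically. The following is a minimal sketch, assuming a made-up distribution over three binary variables in which A and B are drawn independently once C is known; all probabilities and helper names are hypothetical and only serve as illustration.

```python
from itertools import product

# Hypothetical distribution: C is a fair coin; given C, A and B are independent.
p_c = {0: 0.5, 1: 0.5}
p_a_given_c = {0: 0.9, 1: 0.2}  # P(A=1 | C=c)
p_b_given_c = {0: 0.7, 1: 0.1}  # P(B=1 | C=c)


def bern(p, x):
    """Probability of outcome x under a Bernoulli(p) variable."""
    return p if x == 1 else 1.0 - p


# Joint P(A=a, B=b, C=c) built as P(C) * P(A|C) * P(B|C).
joint = {
    (a, b, c): p_c[c] * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Marginal probability of an event such as prob(a=1, c=0)."""
    return sum(
        p
        for (a, b, c), p in joint.items()
        if all({"a": a, "b": b, "c": c}[k] == v for k, v in fixed.items())
    )


def conditionally_independent(tol=1e-12):
    """Check P(A,B|C) = P(A|C) * P(B|C) for every value combination."""
    for a, b, c in product((0, 1), repeat=3):
        lhs = prob(a=a, b=b, c=c) / prob(c=c)
        rhs = (prob(a=a, c=c) / prob(c=c)) * (prob(b=b, c=c) / prob(c=c))
        if abs(lhs - rhs) > tol:
            return False
    return True


def marginally_independent(tol=1e-12):
    """Check P(A,B) = P(A) * P(B) for every value combination."""
    return all(
        abs(prob(a=a, b=b) - prob(a=a) * prob(b=b)) <= tol
        for a, b in product((0, 1), repeat=2)
    )
```

With these numbers the conditional check succeeds while the marginal one fails, which illustrates that conditional independence given C does not imply plain independence.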

Too many redundant attributes cause problems (e.g., identical attributes), because they violate the independence assumption and their evidence is counted more than once.
Many numeric attributes are not normally distributed.
Time complexity:
Calculating conditional probabilities: Time 𝑂(𝑛), where 𝑛 is the number of instances.
Calculating the class: Time 𝑂(𝑐𝑝), where 𝑐 is the number of classes and 𝑝 the number of attributes.
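
The two cost figures can be read off a count-based implementation. This is a minimal sketch for categorical attributes, not a reference implementation: the function names, the toy data, and the Laplace smoothing (`alpha`, `n_values`) are assumptions added for illustration. Training makes one pass over the 𝑛 instances (the number of attributes is treated as fixed), and classification multiplies 𝑝 factors for each of the 𝑐 classes.

```python
from collections import Counter, defaultdict


def train(instances, labels):
    """One pass over the n training instances, collecting counts."""
    class_counts = Counter()
    # value_counts[(attribute index, class)][value] = number of occurrences
    value_counts = defaultdict(Counter)
    for x, y in zip(instances, labels):
        class_counts[y] += 1
        for j, v in enumerate(x):
            value_counts[(j, y)][v] += 1
    return class_counts, value_counts


def classify(x, class_counts, value_counts, alpha=1.0, n_values=2):
    """Score each of the c classes with p attribute lookups: O(c * p)."""
    n = sum(class_counts.values())
    best, best_score = None, -1.0
    for y, cy in class_counts.items():
        score = cy / n  # class prior P(y)
        for j, v in enumerate(x):
            # Laplace smoothing (an assumption here) avoids zero probabilities.
            score *= (value_counts[(j, y)][v] + alpha) / (cy + alpha * n_values)
        if score > best_score:
            best, best_score = y, score
    return best


# Hypothetical toy data with two binary attributes.
X = [(1, 1), (1, 0), (0, 0), (0, 1)]
y = ["spam", "spam", "ham", "ham"]
model = train(X, y)
prediction = classify((1, 1), *model)
```

Appending a copy of the first attribute to every instance would multiply its factor in twice, which is the redundancy problem named above.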

Question 2

Q

Numeric Data: Unknown Distribution
Consider a random variable 𝑋 whose distribution 𝑓(𝑋) is unknown, but for which a sample with a non-uniform distribution is available:
{𝑥₁, 𝑥₂, …, 𝑥ₙ}

Answer

A

Kernel Density Estimation
We want to derive a function 𝑓(𝑥) such that

(1) 𝑓(𝑥) is a probability density function, i.e.
∫ 𝑓(𝑥) 𝑑𝑥 = 1
(2) 𝑓(𝑥) is a smooth approximation of the data points in 𝑋
(3) 𝑓(𝑥) can be used to estimate values 𝑥* which are not in {𝑥₁, 𝑥₂, …, 𝑥ₙ}

Rosenblatt-Parzen Kernel-Density-Estimator:
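
The formula is missing from this card. The commonly cited textbook form of the estimator is f̂_h(x) = 1/(n·h) · Σ_{i=1..n} K((x − x_i)/h), where K is a kernel function that integrates to 1 and h > 0 is the bandwidth. The sketch below assumes a Gaussian kernel, a made-up sample, and h = 0.5; none of these choices come from the card. It also checks property (1) numerically with a simple trapezoidal rule.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, one common choice for the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, sample, h):
    """Kernel density estimate at x: average of kernels centred on the sample."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)


def integrate(f, lo, hi, steps=20000):
    """Trapezoidal rule on [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * width)
    return total * width


# Hypothetical sample and bandwidth.
sample = [1.0, 1.3, 2.1, 4.0, 4.2]
h = 0.5

# Property (1): the estimate should integrate to approximately 1.
area = integrate(lambda x: kde(x, sample, h), -10.0, 15.0)

# Property (3): the estimate is defined at a point that is not in the sample.
density_at_3 = kde(3.0, sample, h)
```

A smaller h follows the individual data points more closely, while a larger h gives a smoother curve, so the bandwidth controls the trade-off in property (2).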

Question 3

Q

Learning Bayes Nets
Parameter Learning: Method for …
* Conditional distributions need to be learned from data
  * Maximize the … and sum up the log-likelihood of training data based on the network
* Evaluation criteria:
  * Akaike Information Criterion (AIC): −2𝐿𝐿 + 2𝐾 …
Structure Learning: Method for …
* Amounts to searching through sets of edges because nodes are fixed
* Examples: K2, Tree Augmented Naive Bayes (TAN)

Answer

A

evaluating the goodness of a given network

joint probability of training data given the network via maximum likelihood estimation

Minimize AIC, with 𝐾=number of parameters

searching through space of possible networks
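
The AIC criterion can be illustrated on the smallest possible structure search: two binary variables A and B, with either no edge or an edge A -> B. This is a rough sketch under stated assumptions. The data set, the function names, and the way parameters are counted (one free parameter per Bernoulli distribution) are hypothetical choices for illustration; parameters are fitted by maximum likelihood, i.e. relative frequencies.

```python
import math
from collections import Counter

# Hypothetical training data over two binary variables (A, B); B mostly copies A.
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40


def bernoulli_log(p, x):
    """Log-probability of outcome x under Bernoulli(p)."""
    # A p of exactly 0 or 1 would need smoothing; the data above avoids that.
    return math.log(p if x == 1 else 1.0 - p)


def log_likelihood_no_edge(rows):
    """Network without edges: parameters P(A) and P(B), so K = 2."""
    n = len(rows)
    p_a = sum(a for a, _ in rows) / n
    p_b = sum(b for _, b in rows) / n
    ll = sum(bernoulli_log(p_a, a) + bernoulli_log(p_b, b) for a, b in rows)
    return ll, 2


def log_likelihood_edge(rows):
    """Network A -> B: parameters P(A), P(B|A=0), P(B|A=1), so K = 3."""
    n = len(rows)
    counts = Counter(rows)
    p_a = sum(a for a, _ in rows) / n
    p_b_given_a = {
        a: counts[(a, 1)] / (counts[(a, 0)] + counts[(a, 1)]) for a in (0, 1)
    }
    ll = sum(
        bernoulli_log(p_a, a) + bernoulli_log(p_b_given_a[a], b) for a, b in rows
    )
    return ll, 3


def aic(ll, k):
    """AIC = -2 * LL + 2 * K; lower is better."""
    return -2.0 * ll + 2.0 * k


aic_no_edge = aic(*log_likelihood_no_edge(data))
aic_edge = aic(*log_likelihood_edge(data))
best = min(("no edge", aic_no_edge), ("A -> B", aic_edge), key=lambda t: t[1])[0]
```

On this data the network with the edge gets the lower AIC: the gain in log-likelihood outweighs the penalty for the extra parameter.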

(3 cards)