Week 9 DSE Flashcards
What does likelihood mean?
The probability of observing the data x given that y equals some particular value.
What can Bayes' rule be used for?
Classifying a binary or other discrete variable.
What does f_k(X) mean?
The conditional density of X (or conditional probability mass function, if X is discrete) for an observation from class k.
What is π_k?
The prior probability that a random observation comes from class k.
It is easily estimated under random sampling: just the fraction of the training observations that belong to class k.
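A minimal sketch of estimating π_k as a class fraction, using hypothetical training labels:

```python
from collections import Counter

# Hypothetical training labels; pi_k is simply the fraction of
# training observations in each class.
y_train = ["A", "A", "B", "A", "B", "B", "B", "A", "A", "A"]
counts = Counter(y_train)
n = len(y_train)
priors = {k: c / n for k, c in counts.items()}
print(priors)  # {'A': 0.6, 'B': 0.4}
```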
What is the assumption underlying the naive Bayes classifier?
It assumes that within class k, all p predictors are independent.
This eliminates the need to estimate their joint distribution:
f_k(x) = f_k1(x_1) × f_k2(x_2) × … × f_kp(x_p)
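A sketch of this factorization, assuming Gaussian marginals with hypothetical per-predictor means and standard deviations for class k:

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical (mu, sigma) estimates for class k, one pair per predictor (p = 3).
params_k = [(0.0, 1.0), (5.0, 2.0), (-1.0, 0.5)]
x = [0.2, 4.5, -0.8]

# Naive Bayes: f_k(x) is the product of the marginal densities.
f_k = 1.0
for xj, (mu, sigma) in zip(x, params_k):
    f_k *= normal_pdf(xj, mu, sigma)
print(f_k)
```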
What to do when the joint distribution is multivariate normal?
Estimate the covariance matrix (tough in high dimensions).
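An illustrative pure-Python sample covariance calculation (made-up data, p = 2). With p predictors there are p(p+1)/2 covariance parameters, which is why estimation gets hard as p grows:

```python
# Hypothetical n = 4 observations of p = 2 predictors.
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 6.2], [4.0, 7.9]]
n, p = len(X), len(X[0])

# Column means, then the unbiased sample covariance matrix.
means = [sum(row[j] for row in X) / n for j in range(p)]
cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in X) / (n - 1)
        for j in range(p)] for i in range(p)]
print(cov)
```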
What is another option for a quantitative/continuous X besides assuming it is drawn from a normal distribution?
Use non-parametric density estimation:
make a histogram of the observations of the j-th predictor within each class,
or use a kernel density estimator.
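A sketch of a histogram density estimate for one predictor within one class (illustrative values; the bin count and range are arbitrary choices):

```python
# Hypothetical values of the j-th predictor for observations in class k.
values = [0.3, 0.5, 0.7, 1.2, 1.4, 1.5, 1.6, 2.8]
lo, hi, n_bins = 0.0, 3.0, 3
width = (hi - lo) / n_bins

# Count observations falling in each bin.
counts = [0] * n_bins
for v in values:
    b = min(int((v - lo) / width), n_bins - 1)
    counts[b] += 1

# Density per bin: fraction of observations divided by bin width,
# so the histogram integrates to 1.
density = [c / (len(values) * width) for c in counts]
print(density)
```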
How to estimate a qualitative X?
What does it mean to be qualitative?
Qualitative means X is discrete (categorical).
Count, within each class, the proportion of training observations of the j-th predictor taking each level.
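A minimal sketch of these class-conditional proportions, using hypothetical categorical values:

```python
from collections import Counter

# Hypothetical values of the j-th (qualitative) predictor, restricted to
# training observations from one class k.
xj_in_class_k = ["red", "blue", "red", "red", "green", "blue"]
counts = Counter(xj_in_class_k)
n = len(xj_in_class_k)

# Estimated P(X_j = level | class k) is just the within-class proportion.
f_kj = {level: c / n for level, c in counts.items()}
print(f_kj)
```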
What is it called when π_1 and π_2 both equal 0.5 (when there are two classes)?
It is called a flat (uniform) prior.
What assumption does the naive Bayes function in R rely on?
The assumption of a normal distribution for quantitative predictors.
What do we ask ourselves when applying statistical methods?
Are the assumptions valid?
What is the purpose of assuming the predictors are independent conditional on class?
It mainly ensures computational tractability.
What is a potential source of high variance? How does naive Bayes address it?
Estimating the joint distribution of the predictors becomes very hard as p grows, requiring a lot of data.
Naive Bayes simplifies this problem, introducing some bias but reducing variance.
Naive bayes is a relatively simple method with little tendency to overfit.
When does naive Bayes reduce variance drastically?
When n is not large relative to p.
When does naive Bayes work best? (slide 29)
It works best when the data have relatively many predictors, so that reliable estimation of the joint densities of the predictors within each class is hard to achieve.