3. Generative models for discrete data (Flashcards)
<b>BAYESIAN CONCEPT LEARNING</b>
p. 67
<b>THE BETA-BINOMIAL MODEL</b>
<b>Likelihood</b>
- what are the sufficient statistics?
- likelihood in the beta-binomial model
<b>Prior</b>
- what is a conjugate prior?
- what's the conjugate prior of the Bernoulli (or Binomial) distribution?
- what are the parameters of the prior called?
- exercise 3.15
- exercise 3.16
- hyperparameters of the uniform prior in the beta-binomial model
<b>Posterior</b>
- posterior in the beta-binomial model
- what are pseudo counts?
- what's the equivalent sample size?
- posterior MAP
- posterior MLE
- when does MAP = MLE?
- posterior mean
- posterior variance
<b>Posterior predictive distribution</b>
- p(x|D)
- add-one smoothing
- beta-binomial distribution (def, mean, var)
<b>A more complex prior</b>
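The beta-binomial posterior update above can be sketched in a few lines; this is a minimal illustration (function name and example data are my own), showing how the prior hyperparameters act as pseudo counts and why MAP equals the MLE under a uniform prior:

```python
# Sketch: beta-binomial posterior update. With a Beta(a, b) prior and data
# containing n1 heads and n0 tails, the posterior is Beta(a + n1, b + n0);
# a and b act as pseudo counts, and a + b is the equivalent sample size.

def beta_binomial_posterior(a, b, n1, n0):
    """Return posterior hyperparameters, posterior mean, MAP, and MLE."""
    a_post, b_post = a + n1, b + n0
    mean = a_post / (a_post + b_post)
    # MAP exists when a_post > 1 and b_post > 1
    map_est = (a_post - 1) / (a_post + b_post - 2)
    mle = n1 / (n1 + n0)  # under a uniform Beta(1, 1) prior, MAP = MLE
    return a_post, b_post, mean, map_est, mle

# Example: uniform prior Beta(1, 1), data with 3 heads and 1 tail
print(beta_binomial_posterior(1, 1, 3, 1))  # Beta(4, 2), mean 2/3, MAP = MLE = 3/4
```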
p. 74
<b>THE DIRICHLET-MULTINOMIAL MODEL</b>
<b>Likelihood</b>
<b>Prior</b>
<b>Posterior</b>
- MAP and MLE
<b>Posterior predictive</b>
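The Dirichlet-multinomial posterior predictive is the categorical generalisation of the beta-binomial case; a minimal sketch (assuming a symmetric Dirichlet prior, names mine), which with alpha = 1 is exactly add-one smoothing:

```python
# Sketch: Dirichlet-multinomial posterior predictive with a symmetric
# Dirichlet(alpha) prior over K categories and observed counts N_k.
# p(X = k | D) = (N_k + alpha) / (N + K * alpha); alpha = 1 gives
# add-one (Laplace) smoothing.

def dirichlet_multinomial_predictive(counts, alpha=1.0):
    """Posterior predictive probability for each of the K categories."""
    total = sum(counts) + alpha * len(counts)
    return [(n_k + alpha) / total for n_k in counts]

# Example: K = 3 categories, observed counts [2, 0, 2]
print(dirichlet_multinomial_predictive([2, 0, 2]))  # [3/7, 1/7, 3/7]
```

Note how the unseen category still gets probability 1/7 rather than the MLE's zero.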
p. 80
<b>NAIVE BAYES CLASSIFIERS</b>
- NBC definition
- binary, categorical, and real-valued features
<b>Model fitting</b>
- log p(D|theta)
- MLE
- BNBC
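The MLE for the naive Bayes parameters reduces to counting; a minimal sketch for binary features (function name and toy data are my own, not from the source). A Bayesian NBC would add Beta/Dirichlet pseudo counts to the same ratios:

```python
# Sketch: MLE fitting of a Bernoulli naive Bayes classifier.
# pi_c = N_c / N and theta_jc = N_jc / N_c, where N_jc is the number of
# class-c examples with feature j switched on.

def fit_bernoulli_nbc(X, y, n_classes):
    """X: list of binary feature vectors, y: class labels. Returns (pi, theta)."""
    n, d = len(X), len(X[0])
    pi = [0.0] * n_classes
    theta = [[0.0] * d for _ in range(n_classes)]
    for xi, yi in zip(X, y):
        pi[yi] += 1
        for j, xij in enumerate(xi):
            theta[yi][j] += xij
    for c in range(n_classes):
        if pi[c] > 0:
            theta[c] = [t / pi[c] for t in theta[c]]  # theta_jc = N_jc / N_c
        pi[c] /= n                                    # pi_c = N_c / N
    return pi, theta

X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [0, 0, 1, 1]
pi, theta = fit_bernoulli_nbc(X, y, 2)
print(pi)     # [0.5, 0.5]
print(theta)  # [[1.0, 0.5], [0.0, 0.5]]
```

The zero and one entries in theta show why the plain MLE overfits: a single unseen feature/class combination zeroes out an entire class posterior, which is what the Bayesian pseudo counts repair.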
<b>Using the model for prediction</b>
- p(y=c|x,D)
- special case if the posterior is Dirichlet
- what if the posterior is approximated by a single point?
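Approximating the posterior by a single point gives the plug-in predictive; a minimal sketch for a Bernoulli NBC (names and numbers are my own). A fully Bayesian version would integrate over the Beta/Dirichlet posterior, which amounts to plugging in posterior-mean parameters instead:

```python
# Sketch: plug-in prediction p(y = c | x, D) for a Bernoulli NBC using a
# single point estimate (pi, theta) of the parameters.

def predict_proba(x, pi, theta):
    """p(y = c | x) by Bayes' rule with a class-conditional Bernoulli product."""
    joint = []
    for c in range(len(pi)):
        p = pi[c]
        for j, xj in enumerate(x):
            p *= theta[c][j] if xj else (1.0 - theta[c][j])
        joint.append(p)
    z = sum(joint)  # with many features this underflows: hence the log-sum-exp trick
    return [p / z for p in joint]

print(predict_proba([1, 0], [0.5, 0.5], [[0.8, 0.5], [0.1, 0.5]]))  # ≈ [0.889, 0.111]
```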
<b>The log-sum-exp trick</b>
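The trick itself fits in three lines; this sketch shows the standard max-subtraction form and a case where the naive computation underflows:

```python
import math

# The log-sum-exp trick: compute log(sum_c exp(b_c)) stably by factoring
# out the max, log sum_c exp(b_c) = M + log sum_c exp(b_c - M), M = max_c b_c,
# so the largest exponent passed to exp() is exactly zero.

def log_sum_exp(log_vals):
    m = max(log_vals)
    return m + math.log(sum(math.exp(v - m) for v in log_vals))

# Naively, exp(-1000) underflows to 0.0 and log(0) fails; the trick recovers:
print(log_sum_exp([-1000.0, -1000.0]))  # -1000 + log(2) ≈ -999.307
```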
<b>Feature selection using mutual information</b>
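Feature selection here ranks each feature by its mutual information with the class label and keeps the top K; a minimal plug-in estimator from empirical counts (function name and toy data are my own):

```python
import math

# Sketch: mutual information between a discrete feature X_j and the class Y,
# I(X_j; Y) = sum_{x,y} p(x, y) * log[ p(x, y) / (p(x) * p(y)) ],
# estimated from empirical frequencies. Features are ranked by I(X_j; Y)
# and only the most informative ones are kept.

def mutual_information(xs, ys):
    """xs, ys: paired lists of discrete values. Returns I(X; Y) in nats."""
    n = len(xs)
    pxy, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        pxy[(x, y)] = pxy.get((x, y), 0.0) + 1 / n
        px[x] = px.get(x, 0.0) + 1 / n
        py[y] = py.get(y, 0.0) + 1 / n
    return sum(p * math.log(p / (px[x] * py[y])) for (x, y), p in pxy.items())

# A feature identical to the label carries I = H(Y) = log(2) nats:
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # ≈ 0.693
# An independent feature carries no information:
print(mutual_information([0, 1, 0, 1], [0, 0, 1, 1]))  # 0.0
```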
<b>Classifying documents using bag of words</b>
- Bernoulli product model (binary independence model)
- x_ij and theta_jc interpretation
- adapt the model to use the number of occurrences of each word
- burstiness phenomenon
- Dirichlet Compound Multinomial (DCM)
- what's a Pólya urn?
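The Pólya urn scheme underlying the DCM can be simulated directly; a minimal sketch (function name and seed are my own), illustrating the self-reinforcing draws that let the DCM model burstiness where a plain multinomial cannot:

```python
import random

# Sketch: Pólya urn scheme. Start with alpha_k balls of each colour k; each
# draw returns the ball plus one extra of the same colour, so colours drawn
# early become more likely later. For words, this is burstiness: once a word
# appears in a document, it tends to appear again.

def polya_urn_draws(alpha, n_draws, seed=0):
    rng = random.Random(seed)
    urn = list(alpha)  # current ball counts per colour
    draws = []
    for _ in range(n_draws):
        total = sum(urn)
        r = rng.uniform(0, total)
        k, acc = 0, urn[0]
        while r > acc:       # pick colour k with probability urn[k] / total
            k += 1
            acc += urn[k]
        draws.append(k)
        urn[k] += 1          # reinforcement: add an extra ball of colour k
    return draws

print(polya_urn_draws([1.0, 1.0], 10))
```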
p. 84