Module 10 - Probabilities and Language Models Flashcards
(6 cards)
Which of the following must be true of the probabilities of two events, A and B, if A is independent of B?
P(A|B) = P(A)
P(B|A) = P(B)
P(A ∨ B) = P(A) + P(B) - P(A ∧ B)
P(A ∧ B) = P(A)P(B)
P(A|B) = P(A)
P(B|A) = P(B)
P(A ∨ B) = P(A) + P(B) - P(A ∧ B)
P(A ∧ B) = P(A)P(B)
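A quick numeric sketch of these identities, using hypothetical values P(A) = 0.5 and P(B) = 0.4 with A independent of B (the inclusion-exclusion identity holds for any two events, independent or not):

```python
# Hypothetical example values: A independent of B.
p_a, p_b = 0.5, 0.4
p_a_and_b = p_a * p_b             # independence: P(A ∧ B) = P(A)P(B)

p_a_given_b = p_a_and_b / p_b     # P(A|B) = P(A ∧ B) / P(B)
p_b_given_a = p_a_and_b / p_a     # P(B|A) = P(A ∧ B) / P(A)
p_a_or_b = p_a + p_b - p_a_and_b  # inclusion-exclusion, always true

print(p_a_given_b == p_a)                # True: P(A|B) = P(A)
print(p_b_given_a == p_b)                # True: P(B|A) = P(B)
print(abs(p_a_or_b - 0.7) < 1e-12)       # True: 0.5 + 0.4 - 0.2 = 0.7
```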
Which of the following are equivalent to the joint probability P(w1, w2, w3, w4)?
Check all that apply.
P(w1)P(w2|w1)P(w3|w2)P(w4|w3)
P(w1|w2,w3,w4)P(w4)P(w3|w4)P(w2|w3,w4)
P(w2,w1)P(w3|w2,w1)P(w4|w3,w2,w1)
P(w1)P(w2|w1)P(w3|w2,w1)P(w4|w3,w2,w1)
P(w1|w2,w3,w4)P(w4)P(w3|w4)P(w2|w3,w4)
P(w2,w1)P(w3|w2,w1)P(w4|w3,w2,w1)
P(w1)P(w2|w1)P(w3|w2,w1)P(w4|w3,w2,w1)
The chain rule. Each correct option conditions every word on its full remaining history; the first option drops history from the conditions, so it is only a bigram approximation, not an equivalent factorization.
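A sketch that checks the factorizations numerically on a small made-up joint distribution over four binary "words" (all names and values here are hypothetical), confirming the exact chain-rule factorizations reproduce the joint probability:

```python
import itertools
import random

random.seed(0)
# Hypothetical joint distribution over four binary variables w1..w4.
outcomes = list(itertools.product([0, 1], repeat=4))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: wt / total for o, wt in zip(outcomes, weights)}

def marg(fixed):
    """Marginal probability of the assignment in `fixed` (dict index -> value)."""
    return sum(p for o, p in joint.items()
               if all(o[i] == v for i, v in fixed.items()))

def cond(target, given):
    """Conditional probability P(target | given)."""
    return marg({**target, **given}) / marg(given)

w = (0, 1, 1, 0)  # an arbitrary outcome to check

# Chain rule, left to right: P(w1)P(w2|w1)P(w3|w2,w1)P(w4|w3,w2,w1)
chain = (marg({0: w[0]}) * cond({1: w[1]}, {0: w[0]})
         * cond({2: w[2]}, {0: w[0], 1: w[1]})
         * cond({3: w[3]}, {0: w[0], 1: w[1], 2: w[2]}))

# Chain rule, right to left: P(w4)P(w3|w4)P(w2|w3,w4)P(w1|w2,w3,w4)
rev = (marg({3: w[3]}) * cond({2: w[2]}, {3: w[3]})
       * cond({1: w[1]}, {2: w[2], 3: w[3]})
       * cond({0: w[0]}, {1: w[1], 2: w[2], 3: w[3]}))

print(abs(chain - joint[w]) < 1e-9)  # True: exact factorization
print(abs(rev - joint[w]) < 1e-9)    # True: exact factorization
```

The bigram-style option P(w1)P(w2|w1)P(w3|w2)P(w4|w3) would generally differ from `joint[w]` on this distribution, since it ignores longer histories.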
T/F
In a bigram model, one assumes that words w{i} and w{i - 2} are independent for i > 2.
True
A bigram model conditions each word only on w{i - 1}, so any word before that is treated as carrying no additional information.
When we compute a trigram model, which quantities do we normalize to add up to 1?
The probabilities of all words w given the context w{i - 2}, w{i - 1}.
A trigram model conditions each word on the previous two words, so for every fixed context the conditional probabilities over the vocabulary must sum to 1.
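A minimal sketch, using a hypothetical three-sentence corpus, that estimates trigram conditional probabilities from counts and verifies the normalization for one context:

```python
from collections import Counter

# Hypothetical toy corpus; <s> pads each sentence start for the trigram context.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
trigrams, contexts = Counter(), Counter()
for sent in corpus:
    toks = ["<s>", "<s>"] + sent.split()
    for i in range(2, len(toks)):
        trigrams[(toks[i - 2], toks[i - 1], toks[i])] += 1
        contexts[(toks[i - 2], toks[i - 1])] += 1

def p(w, w1, w2):
    """Maximum-likelihood estimate of P(w | w{i-2}=w1, w{i-1}=w2) from counts."""
    return trigrams[(w1, w2, w)] / contexts[(w1, w2)]

# For a fixed context, the probabilities over all words sum to 1.
context = ("the", "cat")
vocab = {w for (_, _, w) in trigrams}
print(sum(p(w, *context) for w in vocab))  # 1.0
```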
How can we estimate the probability of a sentence P(w1, w2, β¦, wN)?
By applying the chain rule.
The chain rule follows from repeatedly applying the definition of conditional probability, P(A, B) = P(A|B)P(B).
What is the effect of the Markov assumption in n-gram language models?
It makes it possible to estimate the probabilities from data.
Under the Markov assumption, an n-gram model conditions each word on only the previous n - 1 words, so the conditional probabilities can be estimated from counts in a corpus.
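Putting the last two cards together, a minimal sketch (hypothetical toy corpus) of scoring a sentence with a bigram model: the chain rule factors P(w1, ..., wN), and the Markov assumption reduces each factor to P(w_i | w{i - 1}), which pair counts can estimate:

```python
from collections import Counter

# Hypothetical corpus; <s> marks each sentence start.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
pairs, firsts = Counter(), Counter()
for sent in corpus:
    toks = ["<s>"] + sent.split()
    for a, b in zip(toks, toks[1:]):
        pairs[(a, b)] += 1  # count of bigram (a, b)
        firsts[a] += 1      # count of a as the first element of a bigram

def p_sentence(sent):
    """Bigram estimate of P(w1, ..., wN): chain rule + Markov assumption."""
    toks = ["<s>"] + sent.split()
    prob = 1.0
    for a, b in zip(toks, toks[1:]):
        prob *= pairs[(a, b)] / firsts[a]
    return prob

print(p_sentence("the cat sat"))  # 3/3 * 2/3 * 1/2 = 1/3
```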