Probabilistic Approach to NLP Flashcards
What is the logical or knowledge-based approach to NLP?
- rule-based
e.g. regular expressions and finite automata
What is the probabilistic approach to NLP?
- uses probability theory
e.g. neural networks, kernel methods
What is probabilistic modelling in NLP?
- a general framework for modelling NLP
- it uses random variables, random configurations, and reasoning about the probabilities of those configurations
What are independent variables?
- V1 and V2 are independent when P(V1 = x1, V2 = x2) = P(V1 = x1) P(V2 = x2)
What are conditionally independent variables?
- V1 and V2 are conditionally independent given V3 when P(V1 = x1, V2 = x2 | V3 = x3) = P(V1 = x1 | V3 = x3) P(V2 = x2 | V3 = x3), or equivalently P(V1 = x1 | V2 = x2, V3 = x3) = P(V1 = x1 | V3 = x3)
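The independence definition above can be checked numerically. A minimal sketch, with a toy joint table whose values are assumed for illustration: independence holds iff the joint probability equals the product of the marginals for every configuration.

```python
# Toy joint distribution over two binary variables V1, V2 (values assumed).
# This particular table factorizes, so V1 and V2 are independent.
joint = {
    (0, 0): 0.12, (0, 1): 0.28,
    (1, 0): 0.18, (1, 1): 0.42,
}

def marginal(joint, axis, value):
    """Marginalize: sum the joint table over the other variable."""
    return sum(p for config, p in joint.items() if config[axis] == value)

def independent(joint, tol=1e-9):
    """True iff P(V1=x1, V2=x2) = P(V1=x1) * P(V2=x2) for all x1, x2."""
    return all(
        abs(p - marginal(joint, 0, x1) * marginal(joint, 1, x2)) < tol
        for (x1, x2), p in joint.items()
    )

print(independent(joint))  # True: every cell equals the product of its marginals
```

For example, P(V1=0) = 0.12 + 0.28 = 0.4 and P(V2=1) = 0.28 + 0.42 = 0.7, and indeed 0.4 · 0.7 = 0.28 = P(V1=0, V2=1).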
What are the 4 computation tasks in probabilistic modelling?
- evaluation
- simulation
- inference
- learning
What is the evaluation task in probabilistic modelling?
- calculate the probability of a complete configuration
What is the simulation task in probabilistic modelling?
- generate random configurations
- i.e. produce full configurations according to a given model
What is the inference task in probabilistic modelling?
- 3 sub-tasks:
- marginalization
- conditioning
- completion
What is the learning task in probabilistic modelling ?
- learning parameters of a model from data.
What is marginalization in inference task ?
- computing a marginal probability
What is conditioning in inference task ?
- computing a conditional probability
What is completion in inference task ?
- finding the most probable assignment of some variables
What is the joint distribution model?
- stores the probability of each complete configuration, P(V1 = x1, …, Vn = xn), in a probability table
What is the fully independent model?
- all variables are independent:
P(V1 = x1, …, Vn = xn) = P(V1 = x1) · · · P(Vn = xn)
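Under full independence, the evaluation task reduces to multiplying per-variable probabilities. A minimal sketch with assumed toy distributions:

```python
# Fully independent model: one distribution per variable (toy values, assumed).
model = [
    {"a": 0.3, "b": 0.7},   # P(V1)
    {"a": 0.6, "b": 0.4},   # P(V2)
    {"a": 0.5, "b": 0.5},   # P(V3)
]

def evaluate(model, config):
    """Evaluation task: P(V1=x1, ..., Vn=xn) = P(V1=x1) * ... * P(Vn=xn)."""
    prob = 1.0
    for dist, x in zip(model, config):
        prob *= dist[x]
    return prob

print(evaluate(model, ("a", "b", "a")))  # 0.3 * 0.4 * 0.5 = 0.06
```

Note the trade-off versus the joint distribution model: the table above stores only 6 numbers instead of 2^3 = 8 joint entries, but it can only represent distributions where the variables are truly independent.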
Drawbacks of the joint distribution model?
- memory cost to store the table
- expensive running time
- sparse data problem (not enough data to cover all configurations)
What is Bayes' theorem?
P(a|b) = P(b|a) · P(a) / P(b)
What is the naive Bayes model?
P(V2, V3, …, Vn | V1) = P(V2|V1) · P(V3|V1) · … · P(Vn|V1)
so P(V1, V2, V3, …, Vn) = P(V1) · P(V2|V1) · P(V3|V1) · … · P(Vn|V1)
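The naive Bayes factorization turns classification (a completion task over the output variable V1) into an argmax over the class prior times the per-feature likelihoods. A minimal sketch, with all probabilities and the spam/ham labels assumed for illustration:

```python
import math

# Toy naive Bayes classifier (all probabilities assumed for illustration).
prior = {"spam": 0.4, "ham": 0.6}                 # P(V1 = class)
likelihood = {                                    # P(word | class)
    "spam": {"free": 0.3,  "meeting": 0.05},
    "ham":  {"free": 0.05, "meeting": 0.2},
}

def classify(words):
    """Completion over V1: argmax_c  P(c) * prod_i P(w_i | c).
    Log-space sums avoid underflow when there are many features."""
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for w in words:
            score += math.log(likelihood[c][w])
        scores[c] = score
    return max(scores, key=scores.get)

print(classify(["free"]))     # spam: 0.4*0.3 = 0.12  beats ham: 0.6*0.05 = 0.03
print(classify(["meeting"]))  # ham:  0.6*0.2 = 0.12  beats spam: 0.4*0.05 = 0.02
```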
What are the advantages of the NB model?
- efficiency: good running time and small memory footprint
- mitigates the sparse data problem: less data is needed for training
- good performance in practice, despite the unrealistic independence assumption
What are the disadvantages of the NB model?
- strong independence assumption
- only one output variable
What is smoothing, and why do we use it in probabilistic models?
- avoids zero probabilities for unseen events
- modifies estimated probabilities to compensate for sparse data
What are the smoothing techniques?
- add-one smoothing (Laplace smoothing)
- Witten-Bell smoothing
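Add-one (Laplace) smoothing can be sketched in a few lines: add 1 to every count and renormalize by the total count plus the vocabulary size, so unseen events get nonzero probability. The word counts and vocabulary below are toy values, assumed for illustration:

```python
from collections import Counter

def add_one_estimate(counts, vocab_size):
    """Laplace (add-one) smoothing:
    P(w) = (count(w) + 1) / (N + |V|), so unseen words get nonzero mass."""
    total = sum(counts.values())
    return lambda w: (counts.get(w, 0) + 1) / (total + vocab_size)

counts = Counter(["the", "the", "cat"])        # N = 3 observed tokens
p = add_one_estimate(counts, vocab_size=4)     # vocabulary assumed: {the, cat, dog, sat}
print(p("the"))   # (2 + 1) / (3 + 4) = 3/7
print(p("dog"))   # (0 + 1) / (3 + 4) = 1/7, nonzero even though "dog" was unseen
```

The smoothed probabilities still sum to 1 over the vocabulary: 3/7 + 2/7 + 1/7 + 1/7 = 1.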
What are the computation tasks in the Hidden Markov Model (HMM)?
- evaluation: use the HMM assumption formula
- simulation: generate in the order of the graphical representation
- inference: marginalization, conditioning, and completion
- learning: MLE if labelled data are given
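The evaluation task above can be sketched directly from the HMM assumption formula: the joint probability of a state sequence and an observation sequence is the initial-state probability times a product of transition and emission probabilities. All parameters below are toy values, assumed for illustration:

```python
# Minimal HMM evaluation sketch (toy parameters, all assumed).
start = {"H": 0.6, "C": 0.4}                        # initial state probabilities
trans = {"H": {"H": 0.7, "C": 0.3},                 # P(s_t | s_{t-1})
         "C": {"H": 0.4, "C": 0.6}}
emit  = {"H": {"1": 0.1, "2": 0.4, "3": 0.5},       # P(o_t | s_t)
         "C": {"1": 0.5, "2": 0.4, "3": 0.1}}

def joint_prob(states, obs):
    """Evaluation task: P(states, obs) under the HMM independence assumptions,
    i.e. start * emit, then (transition * emission) at each later step."""
    prob = start[states[0]] * emit[states[0]][obs[0]]
    for t in range(1, len(states)):
        prob *= trans[states[t - 1]][states[t]] * emit[states[t]][obs[t]]
    return prob

print(joint_prob(["H", "H"], ["3", "2"]))  # 0.6 * 0.5 * 0.7 * 0.4 = 0.084
```

Summing this quantity over all state sequences would give the marginal probability of the observations (a marginalization sub-task of inference), and taking the argmax instead gives the most probable state sequence (completion).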