Natural Language Processing Flashcards

Question 1

Q

What is term frequency?

Answer

A

How often the term appears in a document; a measure of the term's prominence within that document.
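A minimal sketch of term frequency, assuming the count is normalized by document length (some courses use the raw count instead). The function name and the example sentence are illustrative only.

```python
from collections import Counter

def term_frequency(term, document_tokens):
    """Relative frequency of `term` in a tokenized document."""
    if not document_tokens:
        return 0.0
    counts = Counter(document_tokens)
    # Assumption: normalize by the number of tokens in the document.
    return counts[term] / len(document_tokens)

doc = "the cat sat on the mat".split()
tf_the = term_frequency("the", doc)  # 2 occurrences out of 6 tokens
tf_dog = term_frequency("dog", doc)  # term absent from the document
```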

Question 2

Q

What is inverse document frequency?

Answer

A

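How infrequently the term appears across documents. Terms that occur in few documents get a high weight; terms that occur in almost every document get a low one.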
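A minimal sketch, assuming the common unsmoothed form idf = log(N / df), where N is the number of documents and df is the number of documents containing the term. Returning 0 for unseen terms is an assumption made here to avoid division by zero.

```python
import math

def inverse_document_frequency(term, documents):
    """log(N / df), where df counts the documents that contain `term`."""
    n_docs = len(documents)
    df = sum(1 for doc in documents if term in doc)
    if df == 0:
        return 0.0  # assumption: unseen terms get weight 0
    return math.log(n_docs / df)

docs = [
    {"the", "cat", "sat"},
    {"the", "dog", "barked"},
    {"the", "cat", "purred"},
]
idf_the = inverse_document_frequency("the", docs)  # in every document: log(3/3)
idf_dog = inverse_document_frequency("dog", docs)  # in one document: log(3/1)
```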

Question 3

Q

What is Bayes' theorem?

Answer

A

P(A | B) = P(B | A) P(A) / P(B)
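A small worked example of the formula, using made-up probabilities for a spam filter (all numbers are hypothetical). P(B) is obtained from the law of total probability.

```python
def posterior(prior_a, likelihood_b_given_a, evidence_b):
    """P(A | B) = P(B | A) P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / evidence_b

# Hypothetical numbers: A = "message is spam", B = "message contains the word".
p_spam = 0.2
p_word_given_spam = 0.5
p_word_given_ham = 0.05

# Law of total probability gives P(B).
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)
p_spam_given_word = posterior(p_spam, p_word_given_spam, p_word)
```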

Question 4

Q

What is Laplace smoothing?

Answer

A

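Add-one (Laplace) smoothing adds a constant, usually 1, to every count before normalizing, so events never seen in training still get a nonzero probability. This reduces overfitting to the training counts.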
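A minimal sketch for unigram counts, assuming a fixed vocabulary that includes one word ("dog") never seen in the training text. The names and the toy sentence are illustrative.

```python
from collections import Counter

def laplace_probability(word, counts, vocabulary, alpha=1.0):
    """(count(word) + alpha) / (total + alpha * |V|); alpha=1 is add-one."""
    total = sum(counts[w] for w in vocabulary)
    return (counts[word] + alpha) / (total + alpha * len(vocabulary))

counts = Counter("the cat sat on the mat".split())
vocabulary = set(counts) | {"dog"}  # 5 seen word types plus 1 unseen

p_the = laplace_probability("the", counts, vocabulary)  # (2 + 1) / (6 + 6)
p_dog = laplace_probability("dog", counts, vocabulary)  # (0 + 1) / (6 + 6)
```

The smoothed probabilities still sum to 1 over the vocabulary, and the unseen word no longer has probability 0.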

Question 5

Q

What is information theory?

Answer

A

Study of transmission, storage and retrieval of digital information.

Question 6

Q

What is entropy?

Answer

A

The average uncertainty of a random variable:
H(p) = H(X) = -Σ_x p(x) log2 p(x)
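A minimal sketch of the entropy sum in bits, using the usual convention that outcomes with probability 0 contribute nothing.

```python
import math

def entropy(probabilities):
    """H(X) = -sum of p(x) * log2 p(x), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

fair_coin = entropy([0.5, 0.5])    # maximal uncertainty for two outcomes
biased_coin = entropy([0.9, 0.1])  # more predictable, so lower entropy
certain = entropy([1.0])           # no uncertainty at all
```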

Question 7

Q

What is joint entropy?

Answer

A

The amount of information needed to specify the values of two variables together.
Standard definition: H(X, Y) = -Σ_x Σ_y p(x, y) log2 p(x, y)
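A minimal sketch, assuming the standard definition H(X, Y) = -Σ p(x, y) log2 p(x, y); the slides may use different notation. The joint distribution is represented as a dict from (x, y) pairs to probabilities.

```python
import math

def joint_entropy(joint):
    """H(X, Y) for a dict mapping (x, y) pairs to p(x, y)."""
    return -sum(p * math.log2(p) for p in joint.values() if p > 0)

# Two independent fair coins: four equally likely outcomes.
two_coins = {(x, y): 0.25 for x in "HT" for y in "HT"}
h_xy = joint_entropy(two_coins)  # one bit per coin
```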

Question 8

Q

What is conditional entropy?

Answer

A

Given one variable, how much additional information is needed to specify the other.
Standard definition: H(Y | X) = -Σ_x Σ_y p(x, y) log2 p(y | x)
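A minimal sketch, assuming the standard definition H(Y | X) = -Σ p(x, y) log2 p(y | x) with p(y | x) = p(x, y) / p(x); the slides may present it differently.

```python
import math
from collections import defaultdict

def conditional_entropy(joint):
    """H(Y | X) for a dict mapping (x, y) pairs to p(x, y)."""
    marginal_x = defaultdict(float)
    for (x, _), p in joint.items():
        marginal_x[x] += p
    return -sum(
        p * math.log2(p / marginal_x[x])  # p / p(x) is p(y | x)
        for (x, _), p in joint.items()
        if p > 0
    )

# Y is an exact copy of X: knowing X leaves nothing to specify.
copied = {("H", "H"): 0.5, ("T", "T"): 0.5}
# Y is independent of X: knowing X does not help at all.
independent = {(x, y): 0.25 for x in "HT" for y in "HT"}

h_copied = conditional_entropy(copied)
h_independent = conditional_entropy(independent)
```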

Question 9

Q

What is the chain rule?

Answer

A

H(X, Y) = H(X) + H(Y | X)
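A quick numerical check of the chain rule on a small, made-up joint distribution. The helper name `H` and the probabilities are illustrative.

```python
import math

def H(dist):
    """Entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical joint distribution p(x, y).
joint = {("a", 0): 0.4, ("a", 1): 0.1, ("b", 0): 0.2, ("b", 1): 0.3}

# Marginal p(x), obtained by summing out y.
p_x = {}
for (x, _), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p

h_xy = H(joint)
h_x = H(p_x)
# H(Y | X) computed directly from p(y | x) = p(x, y) / p(x).
h_y_given_x = -sum(p * math.log2(p / p_x[x]) for (x, _), p in joint.items())

chain_rule_holds = math.isclose(h_xy, h_x + h_y_given_x)
```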

Question 10

Q

What is mutual information?

Answer

A

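The reduction in uncertainty about one variable from knowing the other.
By the chain rule: H(X, Y) = H(X) + H(Y | X) = H(Y) + H(X | Y)
Therefore: I(X; Y) = H(X) - H(X | Y) = H(Y) - H(Y | X)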
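A minimal sketch that computes mutual information directly as Σ p(x, y) log2 [p(x, y) / (p(x) p(y))] and compares it with the two entropy differences. The joint distribution is made up for illustration.

```python
import math

def marginal(joint, axis):
    """Sum out the other variable; axis 0 gives p(x), axis 1 gives p(y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    p_x, p_y = marginal(joint, 0), marginal(joint, 1)
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

# Hypothetical joint distribution p(x, y).
joint = {("a", 0): 0.4, ("a", 1): 0.1, ("b", 0): 0.2, ("b", 1): 0.3}
h_x = entropy(marginal(joint, 0))
h_y = entropy(marginal(joint, 1))
h_xy = entropy(joint)
h_x_given_y = h_xy - h_y  # chain rule
h_y_given_x = h_xy - h_x  # chain rule
mi = mutual_information(joint)
```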

Question 11

Q

What is the noisy channel model?

Answer

A

A model, first used in speech recognition, that reconstructs the most likely original message from the noisy output of a channel.

Question 12

Q

What is a statistical language model?

Question 13

Q

What are the advantages and disadvantages of neural networks?

Answer

A

Advantages:
- Unlimited input length
- Model size is independent of input size
- History dependent
- Model parameters shared across time steps

Disadvantages:
- Long delays (long-range dependencies) are hard to capture
- Can’t see future inputs
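Several of these points can be seen in a toy recurrent network. This is only a sketch, assuming a single scalar hidden state and made-up weights: the same two parameters are reused at every time step, the input can be any length, and the state depends on the history but never on later inputs.

```python
import math

def run_rnn(inputs, w_in=0.5, w_rec=0.8, h0=0.0):
    """Scalar recurrent network: h_t = tanh(w_in * x_t + w_rec * h_(t-1))."""
    h = h0
    for x in inputs:  # the same w_in and w_rec are shared across time steps
        h = math.tanh(w_in * x + w_rec * h)
    return h

short_state = run_rnn([1.0, -1.0])      # 2 inputs
long_state = run_rnn([1.0] * 50)        # 50 inputs, same parameters
with_history = run_rnn([1.0, 0.0])      # earlier input still affects the state
without_history = run_rnn([0.0, 0.0])
```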

Question 14

Q

What is the statistical language model?

Answer

A

mi* = argmax over mi of p(zh | mi) p(mi)
p(zh | mi) is the translation model
p(mi) is the language model
Translation models are estimated from aligned corpora, which makes p(zh | mi) hard to estimate.
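A toy sketch of the argmax, assuming both models are given as lookup tables. The candidate names and all probabilities are hypothetical placeholders, not real model outputs.

```python
def decode(observed, candidates, translation_model, language_model):
    """Pick the candidate mi that maximizes p(zh | mi) * p(mi)."""
    return max(
        candidates,
        key=lambda mi: translation_model[(observed, mi)] * language_model[mi],
    )

# Hypothetical tables: p(zh | mi) and p(mi) for two candidate outputs.
translation_model = {("zh_sentence", "mi_1"): 0.6, ("zh_sentence", "mi_2"): 0.3}
language_model = {"mi_1": 0.01, "mi_2": 0.05}

best = decode("zh_sentence", ["mi_1", "mi_2"], translation_model, language_model)
```

Here mi_1 fits the observed sentence better, but mi_2 wins because the language model considers it far more likely (0.3 * 0.05 > 0.6 * 0.01).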

Question 15

Q

What are some facts about ChatGPT?

Answer

A

Training data: 93% English, 7% other languages.
By June 2020: 175 x 10^9 parameters defined f

Question 16

Q

What is the attention model?

Answer

A

This model allows the RNN to pay attention to the specific parts of the input that are considered important, which improves the performance of the resulting model in practice.
Check slide for formula
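A minimal sketch of one common formulation, which may differ from the one on the slide: the context passed to the output is a weighted sum of the input activations, with weights that sum to 1. The activations and weights are made-up values.

```python
def context_vector(weights, activations):
    """Weighted sum of activation vectors, one weight per input position."""
    dim = len(activations[0])
    return [
        sum(w * a[i] for w, a in zip(weights, activations))
        for i in range(dim)
    ]

# Hypothetical activations for three input positions (2 dimensions each).
activations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.5, 0.25, 0.25]  # most attention on the first position

context = context_vector(weights, activations)
```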

Question 17

Q

What is attention weight?

Answer

A

The amount of attention that the output should pay to the activation.
Check slide for formula
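A minimal sketch, assuming the weights are obtained by applying a softmax to per-position alignment scores, which is a common choice but may not match the slide. The scores are hypothetical; in a real model they would come from a small learned network.

```python
import math

def attention_weights(scores):
    """Softmax over alignment scores; the result is positive and sums to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical scores for three input positions
alphas = attention_weights(scores)
```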

(17 cards)