Prediction and Part-Of-Speech Tagging Flashcards
Corpus
A body of text that has been collected for some purpose.
Balanced Corpus
Contains texts which represent different genres.
Prediction
Given a sequence of words, we want to determine what’s most likely to come next.
N-gram Model
A type of Markov chain in which the sequence of the prior n-1 words is used to predict the next.
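For illustration, a minimal sketch in Python of collecting n-gram counts by sliding a window over a token list (the toy corpus and names here are invented, not from the notes):

from collections import Counter

def ngrams(tokens, n):
    # Slide a window of length n over the token list.
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

# Toy corpus, invented for illustration.
tokens = "the cat sat on the mat".split()
counts = Counter(ngrams(tokens, 2))
print(counts[("the", "cat")])  # 1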
Trigram
Uses the preceding two words.
Bigram
Uses the preceding word.
Unigram
Uses no context at all.
Bigrams Model
Assigns a probability to a word based on the previous word alone.
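A minimal sketch of estimating bigram probabilities by relative frequency, P(word | prev) = count(prev, word) / count(prev); the toy corpus is invented for illustration:

from collections import Counter

tokens = "the cat sat on the mat".split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
unigram_counts = Counter(tokens)

def bigram_prob(prev, word):
    # P(word | prev) = count(prev, word) / count(prev)
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(bigram_prob("the", "cat"))  # 0.5: "the" occurs twice, once followed by "cat"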
Viterbi Algorithm
A dynamic programming technique for efficiently applying n-grams in speech recognition and other applications to find the highest-probability sequence. It is usually described in terms of an FSA.
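A minimal sketch of Viterbi decoding over a hand-built HMM; the tags, words, and probabilities below are invented for illustration, and a real system would estimate them from data:

def viterbi(words, tags, start_p, trans_p, emit_p):
    # best[t] holds the probability of the best path ending in tag t.
    best = {t: start_p[t] * emit_p[t].get(words[0], 0.0) for t in tags}
    back = []
    for w in words[1:]:
        prev_best, best, pointers = best, {}, {}
        for t in tags:
            # Choose the predecessor tag that maximises the path probability.
            p, prev = max((prev_best[s] * trans_p[s][t], s) for s in tags)
            best[t] = p * emit_p[t].get(w, 0.0)
            pointers[t] = prev
        back.append(pointers)
    # Follow the backpointers from the best final tag.
    path = [max(best, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

# Invented toy model for illustration.
tags = ["DET", "NOUN"]
start_p = {"DET": 0.8, "NOUN": 0.2}
trans_p = {"DET": {"DET": 0.1, "NOUN": 0.9}, "NOUN": {"DET": 0.5, "NOUN": 0.5}}
emit_p = {"DET": {"the": 0.9}, "NOUN": {"cat": 0.5, "mat": 0.5}}
print(viterbi(["the", "cat"], tags, start_p, trans_p, emit_p))  # ['DET', 'NOUN']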
Smoothing
A way of allowing for sparse data: we make some assumption about the probability of unseen or very infrequently seen events and distribute that probability appropriately.
Add-one Smoothing
Add one to all counts; not theoretically sound, but simple to implement.
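A minimal sketch of add-one smoothing for bigrams: every count is incremented by one, and the vocabulary size V is added to the denominator so the probabilities still sum to one (toy corpus invented for illustration):

from collections import Counter

tokens = "the cat sat on the mat".split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
unigram_counts = Counter(tokens)
vocab_size = len(unigram_counts)

def smoothed_bigram_prob(prev, word):
    # P(word | prev) = (count(prev, word) + 1) / (count(prev) + V)
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + vocab_size)

print(smoothed_bigram_prob("cat", "the"))  # unseen bigram gets a small nonzero probability (1/6)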
Backoff
If an n-gram has not been observed (or is too rare to estimate reliably), back off to the corresponding lower-order n-gram probability.
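A simplified sketch of backoff: use the bigram estimate when the bigram has been seen, otherwise fall back to the unigram estimate (real schemes such as Katz backoff also discount and renormalise the probabilities, which is omitted here; toy corpus invented for illustration):

from collections import Counter

tokens = "the cat sat on the mat".split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
unigram_counts = Counter(tokens)
total = len(tokens)

def backoff_prob(prev, word):
    if bigram_counts[(prev, word)] > 0:
        # Seen bigram: use the bigram estimate.
        return bigram_counts[(prev, word)] / unigram_counts[prev]
    # Unseen bigram: back off to the unigram estimate.
    return unigram_counts[word] / total

print(backoff_prob("cat", "the"))  # unseen bigram backs off to P("the") = 2/6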
Part of Speech Tagging
Associating words in a corpus with a tag indicating some syntactic information that applies to that particular use of the word. POS tagging makes it easier to extract some types of information.
Stochastic POS-tagging
POS tagging using probabilities estimated from a tagged training corpus, typically a hidden Markov model decoded with the Viterbi algorithm. Too complex to cover fully on a flashcard; see pages 22-24 in the notes.
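As a hedged sketch of the general idea (the tiny tagged corpus below is invented; real taggers are trained on large corpora): an HMM tagger needs tag-transition probabilities P(tag | previous tag) and word-emission probabilities P(word | tag), both estimated by counting over a tagged training corpus and then decoded with the Viterbi algorithm as above:

from collections import Counter

# Tiny tagged corpus, invented for illustration.
tagged = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
          ("on", "PREP"), ("the", "DET"), ("mat", "NOUN")]

tag_counts = Counter(tag for _, tag in tagged)
trans_counts = Counter((t1, t2) for (_, t1), (_, t2) in zip(tagged, tagged[1:]))
emit_counts = Counter(tagged)

def trans_prob(prev_tag, tag):
    # P(tag | prev_tag), estimated from adjacent tag pairs.
    return trans_counts[(prev_tag, tag)] / tag_counts[prev_tag]

def emit_prob(word, tag):
    # P(word | tag), estimated from word/tag counts.
    return emit_counts[(word, tag)] / tag_counts[tag]

print(trans_prob("DET", "NOUN"))  # 1.0 in this toy corpus
print(emit_prob("the", "DET"))    # 1.0 in this toy corpus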
Evaluation of POS tagging
POS tagging algorithms are evaluated in terms of the percentage of correct tags. Success rates of 95% can be misleading, since the baseline of choosing each word's most common tag (based on the training set) already gives about 90% accuracy.
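A minimal sketch of the most-common-tag baseline; the train and test data are invented for illustration:

from collections import Counter, defaultdict

# Invented toy data for illustration.
train = [("the", "DET"), ("cat", "NOUN"), ("run", "VERB"),
         ("run", "NOUN"), ("run", "VERB")]
test = [("the", "DET"), ("run", "VERB"), ("run", "NOUN")]

# For each word, pick its most frequent tag in the training set.
counts = defaultdict(Counter)
for word, tag in train:
    counts[word][tag] += 1
most_common = {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

correct = sum(most_common.get(word) == tag for word, tag in test)
print(correct / len(test))  # 2/3 on this toy data; around 0.9 on real corpora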