Question Answering Flashcards
Question answering (QA) main focus
QA systems focus on factoid questions, that is, questions that can be answered with simple facts.
Question answering (QA) 2 paradigms
Text-based QA:
- use efficient algorithms (information retrieval) to find relevant documents in a text collection
- use reading comprehension algorithms (machine reading) on the relevant documents to select the span of text containing the answer
Knowledge-based QA:
- produce a semantic representation of the query
- match the semantic representation against fact databases
Text-based QA steps
1. Information retrieval (IR): maps input queries to a set of documents from some collection, ordered by relevance.
2. Machine reading:
- the input is a factoid question along with a passage that could contain the answer
- the output is the answer fragment, or else NULL if there is no answer in the passage
EXAMPLE:
question: “How tall is Mt. Everest?”
passage: “Mount Everest, reaching 29,029 feet at its summit, is located in Nepal and Tibet …”
output fragment: “29,029 feet”
Machine reading model
Let q = q1, …, qn be a query and let p = p1, …, pm be a passage, where qt and pt are tokens.
A span is any fragment pi, …, pj of p.
The goal is to compute the probability P(pi, …, pj|q, p) that span pi, …, pj is the answer to q.
Assumption:
P(pi, …, pj|q, p) = Pstart(i|q,p) * Pend(j|q,p)
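Under this factorization, finding the most likely answer reduces to maximizing Pstart(i) * Pend(j) over valid spans with i ≤ j. A minimal Python sketch; best_span and the max_len cap are illustrative choices, not part of the model definition:

import numpy as np

def best_span(p_start, p_end, max_len=30):
    # Return (i, j) maximizing Pstart(i) * Pend(j) subject to i <= j.
    # max_len caps the span length, a common practical restriction.
    best, best_score = (0, 0), 0.0
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            score = p_start[i] * p_end[j]
            if score > best_score:
                best_score, best = score, (i, j)
    return best

p_start = np.array([0.05, 0.05, 0.1, 0.7, 0.1])   # toy Pstart over 5 positions
p_end   = np.array([0.05, 0.05, 0.1, 0.2, 0.6])   # toy Pend
print(best_span(p_start, p_end))                  # -> (3, 4)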
Answer span extraction using contextual embeddings; start and end vectors, score and fine-tuning loss
Pre-trained BERT (Bidirectional Encoder Representations from Transformers) is used to encode the question and the passage as a single string, separated by a [SEP] token:
- Let e(pi) be the BERT embedding of token pi within passage p.
Start vector S is learned to estimate the start probability for each position i, using a dot product and softmax:
- Pstart(i|q,p) = exp(S·e(pi)) / Σj exp(S·e(pj))
- Similarly, we learn an end vector E to estimate Pend(j|q,p).
- The score of a candidate span from position i to j is:
- score(i,j) = S·e(pi) + E·e(pj)
TRAINING:
For each training instance, compute the negative sum of the log-likelihoods of the correct start and end positions:
L = -log Pstart(i|q,p) - log Pend(j|q,p)
Averaging over all training instances gives the fine-tuning loss.
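A minimal PyTorch sketch of this scoring and loss; the random tensors stand in for actual BERT outputs, and names like gold_i are illustrative:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
m, d = 20, 768                 # passage length; BERT-base hidden size
H = torch.randn(m, d)          # stand-in for BERT embeddings e(p1)..e(pm)
S = torch.randn(d, requires_grad=True)   # start vector (learned)
E = torch.randn(d, requires_grad=True)   # end vector (learned)

start_logits = H @ S           # dot products S·e(pi), one per position
end_logits   = H @ E           # dot products E·e(pj)
p_start = F.softmax(start_logits, dim=0)   # Pstart(i|q,p)
p_end   = F.softmax(end_logits, dim=0)     # Pend(j|q,p)

gold_i, gold_j = 3, 5          # gold start/end for this training instance
loss = -(torch.log(p_start[gold_i]) + torch.log(p_end[gold_j]))
loss.backward()                # gradients reach S and E (and BERT, in practice)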
Negative examples in contextual embeddings
Many datasets contain negative examples, that is, (q,p) pairs in which the answer to q is not in the passage p.
Negative examples are conventionally treated as having start and end indices pointing to the [CLS] special token.
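Since [CLS] occupies position 0 of the BERT input, this convention amounts to labeling unanswerable pairs with start = end = 0, so the same loss applies. A small illustrative helper (span_labels is a hypothetical name):

def span_labels(answer_span, has_answer):
    # Unanswerable (q, p) pairs point both indices at [CLS] (position 0).
    if not has_answer:
        return (0, 0)
    return answer_span                  # (start, end) token positions otherwise

print(span_labels((3, 5), True))        # -> (3, 5)
print(span_labels(None, False))         # -> (0, 0)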
Stanford attentive reader; bilinear product and attention
The Stanford attentive reader is a neural model based on RNNs that uses an attention-like mechanism.
Assume a query q = q1, …, qN and a passage p = p1, …, pM, where qt and pt’ are tokens.
Let e(qt), e(pt’) be the non-contextual embeddings associated with tokens qt and pt’, respectively.
We use a bidirectional LSTM to encode the individual tokens of the query and of the passage (a code sketch follows the list):
- We start with the monodirectional encodings: a forward LSTM and a backward LSTM over the passage yield hidden states hf(t’) and hb(t’) at each position t’ (and likewise for the query).
- We concatenate the monodirectional encodings to encode individual passage tokens: p(t’) = [hf(t’) ; hb(t’)].
- We pick the boundary query encodings to encode the entire query q: q = [hf(N) ; hb(1)], the final states of the forward and backward LSTMs over the query.
- We derive an attention distribution by computing a vector of bilinear products and applying softmax: α(t’) = softmax(q^T W p(t’)), where W is a learned matrix.
- We then use attention to combine the passage tokens and compute an output vector: o = Σt’ α(t’) p(t’).
- Finally, the score of each candidate c is the inner product between the output vector and a learned output embedding of the candidate: score(c) = a(c)·o.
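A minimal PyTorch sketch of the attention step, assuming the BiLSTM encodings are already computed (the random tensors stand in for them):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
M, h = 30, 128                # passage length; LSTM hidden size per direction
P = torch.randn(M, 2 * h)     # BiLSTM passage encodings p(t')
q = torch.randn(2 * h)        # query encoding from boundary BiLSTM states
W = torch.randn(2 * h, 2 * h, requires_grad=True)   # bilinear weight matrix

alpha = F.softmax(P @ (q @ W), dim=0)   # bilinear products q^T W p(t'), softmaxed
o = alpha @ P                           # attention-weighted output vector

C = torch.randn(5, 2 * h)               # output embeddings of 5 candidates
scores = C @ o                          # inner-product score per candidate
print(scores.argmax().item())           # index of the highest-scoring candidate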
Machine reading system evaluation measures
Machine reading systems are often evaluated using two metrics (a sketch of both follows):
- Exact match: the percentage of predicted answers that match the gold answer exactly.
- F1 score: compute precision and recall between the system and gold answers, viewed as bags of tokens, and return the average F1 over all questions.
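A simplified Python sketch of both metrics; real evaluation scripts (e.g. SQuAD's) also strip punctuation and articles before comparing:

from collections import Counter

def exact_match(pred, gold):
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred, gold):
    # Precision and recall over bags of tokens, combined into F1.
    pred_toks, gold_toks = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(exact_match("29,029 feet", "29,029 feet"))               # -> 1
print(round(token_f1("about 29,029 feet", "29,029 feet"), 2))  # -> 0.8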
Knowledge-based QA, entity linking
Text-based QA uses unstructured textual information, such as text found on the web.
Knowledge-based QA answers a natural language question by mapping it to a query over some structured knowledge repository.
Two main approaches to knowledge-based QA:
- Graph-based QA: models the knowledge base as a graph, with entities as nodes and relations as edges.
- QA by semantic parsing: maps queries to logical formulas, which are then used to query a fact database.
Both approaches to knowledge-based QA require algorithms for entity linking.
Entity linking, mention
Entity linking is the task of associating a mention in text with the representation of some real-world entity in an ontology.
The most common ontology for factoid question-answering is Wikipedia, in which case the task is called wikification.
Entity linking is done in (roughly) two stages:
- mention detection
- mention disambiguation
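A toy Python sketch of the two-stage pipeline; the dictionaries and function names are hypothetical stand-ins for learned mention detectors and Wikipedia-derived candidate tables:

MENTIONS = {"everest", "mount everest"}
CANDIDATES = {
    "everest": ["Mount_Everest", "Everest_(2015_film)"],
}

def detect_mentions(text):
    # Stage 1: mention detection (here: naive single-token dictionary lookup).
    return [tok for tok in text.lower().split() if tok in MENTIONS]

def disambiguate(mention):
    # Stage 2: mention disambiguation (here: pick the first candidate,
    # standing in for the most frequent Wikipedia target).
    return CANDIDATES.get(mention, [None])[0]

for m in detect_mentions("How tall is Everest ?"):
    print(m, "->", disambiguate(m))   # everest -> Mount_Everest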