Question Answering Flashcards

1
Q

Question answering (QA) main focus

A

QA systems focus on factoid questions, that is, questions that can be answered with simple facts.

2
Q

Question answering (QA) 2 paradigms

A

Text-based QA:

  1. use efficient algorithms (information retrieval) to find relevant documents in a text collection
  2. use reading comprehension algorithms (machine reading) on the relevant documents to select the span of text containing the answer

Knowledge-based QA:

  1. produce a semantic representation of the query
  2. match the semantic representation against fact databases
3
Q

Text-based QA steps

A

1. Information retrieval (IR): maps input queries to a set of documents from some collection, ordered by relevance.

2. Machine reading:
- the input is a factoid question along with a passage that could contain the answer
- the output is the answer fragment, or else NULL if there is no answer in the passage

EXAMPLE:
question: “How tall is Mt. Everest?”
passage: “Mount Everest, reaching 29,029 feet at its summit, is located in Nepal and Tibet …”
output fragment: “29,029 feet”
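
A hedged end-to-end sketch of these two steps (not part of the original card): a toy word-overlap retriever stands in for real IR, and the Hugging Face transformers question-answering pipeline stands in for the machine reader; both are illustrative choices.

# Step 1 (IR, toy version): rank documents by word overlap with the question.
# Step 2 (machine reading): extract an answer span from the retrieved passage.
import re
from transformers import pipeline

docs = [
    "Mount Everest, reaching 29,029 feet at its summit, is located in Nepal and Tibet.",
    "K2 is the second-highest mountain on Earth.",
]

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents):
    return max(documents, key=lambda d: len(tokens(question) & tokens(d)))

question = "How tall is Mt. Everest?"
passage = retrieve(question, docs)

reader = pipeline("question-answering")             # downloads a default extractive QA model
print(reader(question=question, context=passage))   # e.g. {'answer': '29,029 feet', ...}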

4
Q

Machine reading model

A

Let q = q1, …, qn be a query and let p = p1, …, pm be a passage, where qt and pt are tokens.

A span is any fragment pi, …, pj of p, with i ≤ j.

The goal is to compute the probability P(pi, …, pj|q, p) that span pi, …, pj is the answer to q.

Assumption (the start and end positions are treated as independent given q and p):
P(pi, …, pj|q, p) = Pstart(i|q,p) * Pend(j|q,p)
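
A minimal sketch of how this factorization is used at prediction time (an illustration, not from the card): the predicted answer is the span (i, j) with i ≤ j that maximizes Pstart(i|q,p) * Pend(j|q,p), usually with a cap on span length.

import numpy as np

def best_span(p_start, p_end, max_len=15):
    """Return the (i, j) with i <= j maximizing Pstart(i) * Pend(j)."""
    best, best_score = (0, 0), -1.0
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            score = p_start[i] * p_end[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best, best_score

# Toy distributions over a 5-token passage.
p_start = np.array([0.1, 0.6, 0.1, 0.1, 0.1])
p_end   = np.array([0.05, 0.1, 0.7, 0.1, 0.05])
print(best_span(p_start, p_end))   # -> ((1, 2), 0.42)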

5
Q

Answer span extraction using contextual embeddings; start and end vectors, score and fine-tuning loss

A

A pre-trained BERT model (Bidirectional Encoder Representations from Transformers) is used to encode the question and the passage as a single string, separated by a [SEP] token:

  1. Let e(pi) be the contextual BERT embedding of token pi within passage p.
  2. A start vector S is learned to estimate the start probability of each position i, using a dot product and a softmax:
    - Pstart(i|q,p) = exp(S · e(pi)) / sum over j of exp(S · e(pj))
    - Similarly, we learn an end vector E to estimate Pend(j|q,p).
  3. The score of a candidate span from position i to j is:
    - score(i,j) = S · e(pi) + E · e(pj)

TRAINING:

For each training instance, compute the negative sum of the log-likelihoods of the correct start and end positions:

L = -log Pstart(i|q,p) - log Pend(j|q,p)

Averaging over all training instances gives the fine-tuning loss.
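
A minimal PyTorch sketch of the span head and loss described above (an assumption-laden illustration: the contextual embeddings are faked with random tensors instead of an actual BERT encoder, and the start/end vectors S and E are packed into one linear layer):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpanHead(nn.Module):
    """Learns the start vector S and end vector E as one linear layer."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.se = nn.Linear(hidden_dim, 2)   # column 0 ~ S·e(pi), column 1 ~ E·e(pi)

    def forward(self, h):                    # h: (batch, seq_len, hidden_dim)
        logits = self.se(h)                  # (batch, seq_len, 2)
        start_logits, end_logits = logits.unbind(dim=-1)
        return start_logits, end_logits      # each (batch, seq_len)

# Stand-in for BERT output: 2 question+passage sequences, 12 tokens, 768 dims.
h = torch.randn(2, 12, 768)
gold_start = torch.tensor([3, 7])            # gold answer start positions
gold_end   = torch.tensor([5, 9])            # gold answer end positions

head = SpanHead(768)
start_logits, end_logits = head(h)

# L = -log Pstart(i|q,p) - log Pend(j|q,p), averaged over the batch:
# cross_entropy applies the softmax over positions and averages the NLL.
loss = F.cross_entropy(start_logits, gold_start) + F.cross_entropy(end_logits, gold_end)
print(loss.item())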

6
Q

Negative examples in contextual embeddings

A

Many datasets contain negative examples, that is, (q,p) pairs in which the answer to q is not in the passage p.

Negative examples are conventionally treated as having start and end indices pointing to the [CLS] special token.
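
A tiny illustration of this labeling convention (a hypothetical helper, assuming the usual "[CLS] question [SEP] passage [SEP]" input layout, in which [CLS] sits at position 0):

CLS_INDEX = 0   # position of [CLS] in the usual "[CLS] question [SEP] passage [SEP]" input

def make_span_labels(answer_span):
    """Return gold (start, end) indices; negative examples point at [CLS]."""
    if answer_span is None:                  # negative example: answer not in passage
        return CLS_INDEX, CLS_INDEX
    return answer_span

print(make_span_labels(None))      # -> (0, 0)
print(make_span_labels((17, 19)))  # -> (17, 19)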

7
Q

Stanford attentive reader; bilinear product and attention

A

The Stanford attentive reader is a neural model based on RNNs that uses an attention-like mechanism.

Assume a query q = q1, …, qN and a passage p = p1, …, pM, where qt and pt’ are tokens.

Let e(qt), e(pt’) be the non-contextual embeddings associated with tokens qt and pt’, respectively.

We use a bidirectional LSTM to encode the individual tokens of the query and the passage (a sketch of these steps follows the list below):

  1. We start with the monodirectional embeddings: write them
  2. We concatenate the monodirectional embeddings to encode each individual passage token: write it
  3. We take the boundary query embeddings to encode the entire query q: write it
  4. We derive an attention distribution by computing a vector of bilinear products and applying a softmax: write them
  5. We then use the attention weights to combine the passage tokens into an output vector: write it
  6. Finally, the score of each candidate c is computed as an inner product: write it
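
A possible sketch of steps 3–6 in code (based on the standard attentive-reader formulation; the exact equations expected here may differ, and all tensors below are random toy values): the attention weight of passage token t’ comes from a softmax over the bilinear products between the query encoding and each passage encoding, the output vector is the attention-weighted sum of passage encodings, and each candidate c is scored by an inner product with that output vector.

import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

d, M = 8, 5                              # toy hidden size and passage length
rng = np.random.default_rng(0)

q_vec = rng.normal(size=d)               # step 3: query encoding (boundary BiLSTM states)
P = rng.normal(size=(M, d))              # step 2: BiLSTM encodings of the M passage tokens
W = rng.normal(size=(d, d))              # bilinear attention matrix (learned in practice)

alpha = softmax(P @ W @ q_vec)           # step 4: bilinear products followed by softmax
o = alpha @ P                            # step 5: attention-weighted output vector

e_c = rng.normal(size=d)                 # embedding of one candidate answer c
score = o @ e_c                          # step 6: inner-product score of candidate c
print(alpha.round(3), score)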
8
Q

Machine reading system evaluation measures

A

Machine reading systems are often evaluated using two metrics (a minimal sketch follows the list):

  • Exact match: the percentage of predicted answers that match the gold answer exactly
  • F1 score: compute precision and recall between the system answer and the gold answer, each viewed as a bag of tokens, and return the average F1 over all questions.
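
A minimal sketch of the two metrics for a single question (illustrative only; real evaluation scripts typically also lowercase and strip punctuation and articles before comparing):

from collections import Counter

def exact_match(pred, gold):
    """1.0 if the predicted answer string matches the gold answer exactly, else 0.0."""
    return float(pred.strip() == gold.strip())

def token_f1(pred, gold):
    """F1 between prediction and gold answer, each viewed as a bag of tokens."""
    pred_toks, gold_toks = pred.split(), gold.split()
    overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(exact_match("29,029 feet", "29,029 feet"))      # 1.0
print(token_f1("about 29,029 feet", "29,029 feet"))   # precision 2/3, recall 1.0 -> 0.8
# Corpus-level scores are the averages of these per-question values.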
9
Q

Knowledge-based QA, entity linking

A

Text-based QA uses unstructured textual information, for example documents from the web.

Knowledge-based QA answers a natural language question by mapping it to a query over some structured knowledge repository.

Two main approaches to knowledge-based QA:

  • Graph-based QA: models the knowledge base as a graph, with entities as nodes and relations as edges.
  • QA by semantic parsing: maps queries to logical formulas, which are then executed against a fact database.

Both approaches to knowledge-based QA require algorithms for entity linking.
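
A toy illustration of the graph-based view (all entities and relations below are made-up placeholders, not entries from a real knowledge base): after entity linking picks out the entity and relation extraction picks out the edge label, the answer is the node at the other end of that edge.

# Toy knowledge graph stored as (subject entity, relation) -> object entity.
KB = {
    ("Mount_Everest", "located_in"): "Nepal",
    ("Mount_Everest", "height_feet"): "29,029",
}

def graph_qa(entity, relation):
    """Graph-based QA: follow the edge labeled `relation` from `entity`."""
    return KB.get((entity, relation))

# "How tall is Mt. Everest?" -> entity linking: Mount_Everest, relation: height_feet
print(graph_qa("Mount_Everest", "height_feet"))   # -> 29,029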

10
Q

Entity linking, mention

A

Entity linking is the task of associating a mention in text (a span of text that refers to an entity) with the representation of some real-world entity in an ontology.

The most common ontology for factoid question-answering is Wikipedia, in which case the task is called wikification.

Entity linking is done in (roughly) two stages:

  • mention detection
  • mention disambiguation