W11 User Interaction Flashcards
how can the search engine learn from user interactions?
- query modification behaviour (query suggestions)
- interactions with documents (clicks)
query suggestions
goal: find related queries in the query log, based on
- common substring
- co-occurrence in session
- term clustering
- clicks
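A minimal sketch of the session co-occurrence idea from this card; the session data and the `suggest` helper are made up for illustration:

```python
from collections import Counter
from itertools import combinations

# Toy query log grouped into sessions (assumed format for illustration).
sessions = [
    ["cheap flights", "cheap flights amsterdam", "klm tickets"],
    ["cheap flights", "last minute flights"],
    ["klm tickets", "cheap flights amsterdam"],
]

# Count how often two distinct queries co-occur in the same session.
co_occurrence = Counter()
for session in sessions:
    for q1, q2 in combinations(sorted(set(session)), 2):
        co_occurrence[(q1, q2)] += 1

def suggest(query, k=3):
    """Return the k queries that co-occur most often with `query`."""
    scores = Counter()
    for (q1, q2), count in co_occurrence.items():
        if q1 == query:
            scores[q2] += count
        elif q2 == query:
            scores[q1] += count
    return [q for q, _ in scores.most_common(k)]

print(suggest("cheap flights"))
```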
how can we use log data for evaluation?
use clicking and browsing behaviour in addition to queries:
- click-through rate: how often a document is clicked relative to how often it is shown
- dwell time: time spent on a document
- scrolling behaviour: how users interact with the page
- stopping information: does the user abandon the search engine after a click?
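A rough sketch of how two of these signals (click-through rate and dwell time) could be computed from a log; the log schema and the numbers are assumptions for illustration:

```python
from collections import defaultdict

# Toy interaction log (assumed schema: one record per impression).
log = [
    {"doc": "d1", "clicked": True,  "dwell_seconds": 45},
    {"doc": "d1", "clicked": False, "dwell_seconds": 0},
    {"doc": "d2", "clicked": True,  "dwell_seconds": 3},
    {"doc": "d1", "clicked": True,  "dwell_seconds": 120},
]

impressions = defaultdict(int)
clicks = defaultdict(int)
dwell = defaultdict(list)

for record in log:
    doc = record["doc"]
    impressions[doc] += 1
    if record["clicked"]:
        clicks[doc] += 1
        dwell[doc].append(record["dwell_seconds"])

for doc in impressions:
    ctr = clicks[doc] / impressions[doc]
    mean_dwell = sum(dwell[doc]) / len(dwell[doc]) if dwell[doc] else 0.0
    print(f"{doc}: CTR={ctr:.2f}, mean dwell={mean_dwell:.1f}s")
```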
what are the limitations of query logs?
information need is unknown (can be partly deduced from previous queries)
relevance assessments unknown (deduce from clicks + dwell time)
learning from interaction data
implicit feedback, needed if we don’t have explicit relevance assessments
assumption: when the user clicks on a result, it is relevant to them
3 limitations of implicit feedback
noisy: a non-relevant document might be clicked or a relevant document might not be clicked
biased: clicks happen for reasons other than relevance
- position bias: higher-ranked documents get more attention
- selection bias: users can only interact with documents that were retrieved
- presentation bias: results that are presented differently are treated differently
what is the interpretation of a non-click? => either the document didn’t seem relevant or the user did not see the document
probabilistic model of user clicks
P(clicked(d) | relevance(d), position(d)) = P(clicked(d) | relevance(d), observed(d)) * P(observed(d) | position(d))
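A small numeric illustration of this examination-hypothesis factorisation: a click requires the document to be observed (which depends on its position) and then judged relevant. All probabilities below are made-up values:

```python
# Assumed observation probabilities per rank and click probability once observed.
p_observed_at_position = {1: 0.95, 2: 0.80, 3: 0.60, 5: 0.30, 10: 0.10}
p_click_given_relevant_and_observed = 0.9

for position, p_obs in p_observed_at_position.items():
    p_click = p_obs * p_click_given_relevant_and_observed
    print(f"position {position}: P(click | relevant) = {p_click:.2f}")
```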
how to measure the effect of position bias?
Idea: changing the position of a document doesn’t change its relevance, so all changes in click behaviour come from the position bias
intervention in the ranking:
1. swap two documents in the ranking
2. present the modified ranking to some users (A/B test)
3. record the clicks on the document in both original and modified rankings
4. measure the probability of a document being observed based on the clicks
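A sketch of how such a swap experiment could be analysed, assuming toy click counts for a document shown at rank 1 to one group of users and at rank 4 to another:

```python
# Swap experiment (assumed toy numbers): the same document is shown at rank 1
# and, via a rank-1 <-> rank-4 swap, at rank 4 to a comparable group of users.
clicks_at_rank = {1: 480, 4: 150}
impressions_at_rank = {1: 1000, 4: 1000}

ctr = {r: clicks_at_rank[r] / impressions_at_rank[r] for r in clicks_at_rank}

# Relevance is unchanged by the swap, so the CTR ratio estimates the ratio of
# observation probabilities (propensities) between the two ranks.
relative_propensity_rank4 = ctr[4] / ctr[1]
print(f"P(observed | rank 4) / P(observed | rank 1) ≈ {relative_propensity_rank4:.2f}")
```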
how to correct for position bias?
Inverse Propensity Scoring (IPS) estimators can remove bias
Main idea: weight clicks by their observation probability => clicks near the top get a low weight, clicks near the bottom get a high weight
formula on slide 20, lecture 11
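The slide formula is not reproduced here; the sketch below only illustrates the general IPS idea of weighting each click by the inverse of its estimated observation probability (the propensity values are assumed):

```python
# Toy clicks, each with the estimated probability that the user observed the
# document at the rank where it was shown (assumed propensity values).
clicked = [
    {"doc": "d3", "rank": 1, "p_observed": 0.95},
    {"doc": "d7", "rank": 8, "p_observed": 0.15},
]

# IPS: weight each click by 1 / P(observed), so clicks far down the ranking
# count more, compensating for the fact that they are rarely seen.
ips_weighted_clicks = {c["doc"]: 1.0 / c["p_observed"] for c in clicked}
print(ips_weighted_clicks)  # {'d3': 1.05..., 'd7': 6.66...}
```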
simulation of interaction
session simulation:
- simulate queries
- simulate clicks
- simulate user satisfaction
requires a model of the range of user behaviour
- users do not always behave deterministically
- might make non-optimal choices
- models need to contain noise
click models
How do users examine the result list and where do they click?
cascade assumption: user examines result list from top to bottom
Dependent Click Model (DCM)
1. users traverse the result list from top to bottom
2. users examine each document as it is encountered
3. the user decides whether to click on the document or skip it
4. after each clicked document, the user decides whether or not to continue examining the document list
5. relevant documents are more likely to be clicked than non-relevant documents
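A minimal click simulation under these DCM assumptions; the per-rank relevance probabilities and the single continuation probability (DCM proper uses rank-dependent continuation parameters) are illustrative choices:

```python
import random

def simulate_dcm_clicks(relevance_probs, continue_prob=0.5, seed=None):
    """Simulate one session under the Dependent Click Model: traverse the
    ranking top to bottom, click with probability given by the document's
    relevance, and after every click decide whether to keep examining."""
    rng = random.Random(seed)
    clicks = []
    for rank, p_rel in enumerate(relevance_probs, start=1):
        if rng.random() < p_rel:               # relevant docs are clicked more often
            clicks.append(rank)
            if rng.random() >= continue_prob:  # user may stop after a click
                break
    return clicks

# Example ranking with assumed relevance probabilities per rank.
print(simulate_dcm_clicks([0.8, 0.3, 0.6, 0.1, 0.05], continue_prob=0.4, seed=42))
```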
advantages of simulation of interaction
- Investigate how the system behaves under certain behaviour
- Potentially a large amount of user data
- Relatively low cost to create and use
- Enable the exact same circumstances to be replicated, repeated, re-used
- Encapsulates our understanding of the process
disadvantages of simulation of interaction
- Models can become complex if we want to mirror realistic user behaviour
- Simulations enable us to explore many possibilities, but which ones should we explore, why, and how do we make sense of the resulting data?
Does it represent actual user behavior/performance?
What claims can we make? In what context?
query expansion
- easy to experiment with in a live search engine because no changes to the index are required
- can potentially examine multiple documents to aggregate evidence
document expansion
- documents are longer than queries, so more context for a model to choose appropriate expansion terms
- can be applied at index time, and in parallel to multiple documents
Doc2Query
document expansion: train a sequence-to-sequence model that, given a text from a corpus, produces queries for which that document might be relevant
- train on relevant pairs of documents-queries
- use model to predict relevant queries for docs
- append predicted queries to documents
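A hedged sketch of this pipeline using Hugging Face transformers; the checkpoint name is an assumption (a publicly released docT5query-style model trained on MS MARCO document-query pairs), so substitute whichever sequence-to-sequence model you actually train:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Assumed checkpoint; swap in your own trained Doc2Query model.
model_name = "castorini/doc2query-t5-base-msmarco"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

document = "The Manhattan Project produced the first nuclear weapons during World War II."

inputs = tokenizer(document, return_tensors="pt", truncation=True)
# Sample several predicted queries for the document.
outputs = model.generate(
    **inputs, max_length=32, do_sample=True, top_k=10, num_return_sequences=3
)
predicted_queries = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Append the predicted queries to the document text before indexing.
expanded_document = document + " " + " ".join(predicted_queries)
print(expanded_document)
```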
conversational search: different methods
retrieval-based: select best response from a collection of responses
generation-based: generate response in natural language
hybrid: retrieve information, then generate response
1) pros of retrieval-based methods
2) cons of retrieval-based methods
1)
source is transparent
efficient
evaluation straightforward
2)
answer space is limited
potentially not fluent
less interactive
1) pros of generation-based methods
2) cons of generation-based methods
1)
fluent and human-like
tailored to user and input
more interactive
2)
not necessarily factual, potentially toxic
GPU-heavy
evaluation is challenging
how to evaluate conversational search methods?
retrieval-based methods:
- Precision@n
- Mean Reciprocal Rank (MRR)
- Normalized Discounted Cumulative Gain (NDCG)
generation-based methods (measure word overlap):
- BLEU
- ROUGE (recall-oriented)
- METEOR
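A small sketch of two of the retrieval-based metrics, assuming binary relevance judgements for each ranked result list:

```python
def precision_at_n(ranked_relevance, n):
    """Fraction of the top-n results that are relevant (relevance given as 0/1)."""
    return sum(ranked_relevance[:n]) / n

def mean_reciprocal_rank(rankings):
    """MRR over a set of queries: 1 / rank of the first relevant result, averaged."""
    total = 0.0
    for ranked_relevance in rankings:
        for rank, rel in enumerate(ranked_relevance, start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(rankings)

# Toy relevance judgements for two queries' result lists (assumed values).
rankings = [[0, 1, 0, 1], [1, 0, 0, 0]]
print(precision_at_n(rankings[0], n=3))   # 0.33...
print(mean_reciprocal_rank(rankings))     # (1/2 + 1/1) / 2 = 0.75
```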
challenges in conversational search
*coreference issues (referring back to earlier concepts)
*dependence on previous user and system turns
*explicit feedback
*topic-switching user behaviour
*logical self-consistency: semantic coherence and internal logic
*safety, transparency, controllability: difficult to control the output of a generative model (could lead to hate speech)
*efficiency: time and memory-consuming training and inference
ConvPR (conversational passage retrieval)
coreference issues (referring back to earlier concepts)
dependence on previous user and system turns
explicit feedback
topic-switching user behaviour