C11: user interaction Flashcards
how can the search engine learn from user interactions?
- query modification behaviour (query suggestions)
- interactions with documents (clicks)
query suggestions
goal: find related queries in the query log, based on
- common substring
- co-occurrence in a session (see the sketch after this list)
- term clustering
- clicks
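A minimal sketch of the session co-occurrence idea; the log format (a list of sessions, each a list of query strings) is an illustrative assumption, not the lecture's:

```python
from collections import defaultdict

def suggest_by_session_cooccurrence(sessions, query, top_k=5):
    """Rank candidate suggestions by how often they co-occur with
    `query` inside the same session."""
    counts = defaultdict(int)
    for session in sessions:
        if query in session:
            for other in session:
                if other != query:
                    counts[other] += 1
    return sorted(counts, key=counts.get, reverse=True)[:top_k]

sessions = [
    ["cheap flights", "cheap flights amsterdam", "klm"],
    ["cheap flights", "last minute flights"],
]
print(suggest_by_session_cooccurrence(sessions, "cheap flights"))
```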
how can we use log data for evaluation?
use clicking and browsing behaviour in addition to queries:
- click-through rate (CTR): the fraction of times a document is clicked when it is shown (see the sketch after this list)
- dwell time: time spent on a document
- scrolling behaviour: how users interact with the page
- stopping information: does the user abandon the search engine after a click?
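A minimal sketch of computing CTR and mean dwell time from a log; the (doc_id, clicked, dwell_seconds) record format is a hypothetical assumption:

```python
from collections import defaultdict

def ctr_and_dwell(log):
    """Aggregate per-document CTR and mean dwell time from
    (doc_id, clicked, dwell_seconds) records."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    dwells = defaultdict(list)
    for doc_id, was_clicked, dwell_seconds in log:
        shown[doc_id] += 1
        if was_clicked:
            clicked[doc_id] += 1
            dwells[doc_id].append(dwell_seconds)
    return {
        d: {
            "ctr": clicked[d] / shown[d],
            "mean_dwell": sum(dwells[d]) / len(dwells[d]) if dwells[d] else 0.0,
        }
        for d in shown
    }

log = [("d1", True, 35.0), ("d1", False, 0.0), ("d2", True, 8.0)]
print(ctr_and_dwell(log))
```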
what are the limitations of query logs?
- information need is unknown (can be partly deduced from previous queries)
- relevance assessments unknown (deduce from clicks + dwell time)
learning from interaction data
implicit feedback, needed if we don’t have explicit relevance assessments
assumption: when the user clicks on a result, it is relevant to them
limitations of implicit feedback
- noisy: a non-relevant document might be clicked, and a relevant document might not be clicked
- biased: clicks happen for reasons other than relevance
- position bias: higher ranked documents get more attention
- selection bias: only interactions on retrieved documents
- presentation bias: results that are presented differently will be treated differently
what is the interpretation of a non-click? => either the document didn’t seem relevant or the user did not see the document
probabilistic model of user clicks
P(clicked(d) = 1 | relevance(d), position(d)) = P(clicked(d) = 1 | relevance(d), observed(d) = 1) * P(observed(d) = 1 | position(d))
i.e. a click requires that the document is observed (which depends on its position) and that it is judged attractive (which depends on its relevance).
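A minimal sketch of this position-based model; the observation and click-given-observation parameter tables are illustrative assumptions, not values from the lecture:

```python
def click_probability(relevance, position, obs_prob_at, click_given_obs):
    """Position-based model (examination hypothesis): a click requires
    the document to be observed (depends on position) and judged
    attractive (depends on relevance)."""
    return obs_prob_at[position] * click_given_obs[relevance]

# assumed parameters: observation probability decays with rank,
# click probability given observation grows with the relevance grade
obs_prob_at = {1: 0.95, 2: 0.75, 3: 0.50, 4: 0.30, 5: 0.15}
click_given_obs = {0: 0.05, 1: 0.40, 2: 0.85}

print(click_probability(relevance=2, position=3,
                        obs_prob_at=obs_prob_at, click_given_obs=click_given_obs))
```

This factorisation also explains the ambiguity of a non-click noted above: a document may be unclicked because its observation probability was low or because its relevance was low.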
how to measure the effect of position bias?
Idea: changing the position of a document doesn’t change its relevance, so all changes in click behaviour come from the position bias
intervention in the ranking:
1. swap two documents in the ranking
2. present the modified ranking to some users (A/B test)
3. record the clicks on the document in both original and modified rankings
4. estimate the probability of a document being observed at each position from the difference in clicks between the two rankings (see the sketch after this list)
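A sketch of why the swap yields a propensity estimate: the document's relevance is unchanged by the swap, so its relevance term cancels in the CTR ratio. The click-through rates below are made-up numbers:

```python
def propensity_ratio(ctr_at_k, ctr_at_1):
    """Swap experiment: the same document is shown at position k in the
    original ranking and at position 1 in the swapped ranking. Under the
    position-based model, CTR = P(observed at position) * attractiveness,
    so the attractiveness cancels and the CTR ratio estimates p_k / p_1."""
    return ctr_at_k / ctr_at_1

# illustrative CTRs for one document at the two positions
print(propensity_ratio(ctr_at_k=0.02, ctr_at_1=0.10))  # ~ p_k / p_1 = 0.2
```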
how to correct for position bias?
Inverse Propensity Scoring (IPS) estimators can remove bias
Main idea: weigh clicks depending on their observation probability => clicks near the top get low weight, clicks near bottom get large weight
formula on slide 20, lecture 11
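The slide's exact formula is not reproduced here; a common form of the IPS estimator from counterfactual learning to rank (an assumption about what the slide shows, not a verbatim copy) weights each click by the inverse of its observation probability:

$$\hat{\Delta}_{\text{IPS}} = \sum_{d \,:\, \text{clicked}(d)} \frac{\lambda(\text{rank}(d))}{P(\text{observed}(d))}$$

where λ(rank(d)) is the rank-based weight of the evaluation metric and P(observed(d)) is the propensity, e.g. estimated with the swap intervention above. Dividing by a large propensity (top positions) gives a small weight; dividing by a small propensity (bottom positions) gives a large weight, matching the main idea above.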
simulation of interaction
session simulation:
- simulate queries
- simulate clicks
- simulate user satisfaction
requires a model of the range of user behaviour:
- users do not always behave deterministically
- users might make non-optimal choices
- models need to contain noise
click models
How do users examine the result list and where do they click?
cascade assumption: user examines result list from top to bottom
Dependent Click Model (DCM)
- users traverse result lists from top to bottom
- users examine each document as it is encountered
- user decides whether to click on the document or skip it
- after each clicked document the user decides whether or not to continue examining the document list
- relevant documents are more likely to be clicked than non-relevant documents
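A minimal simulation of one DCM user session; the attractiveness and continuation parameters are illustrative assumptions:

```python
import random

def simulate_dcm_session(attractiveness, continuation, rng=random):
    """Simulate one session under the Dependent Click Model.
    attractiveness[k]: P(click | examined) at rank k (proxy for relevance);
    continuation[k]: P(continue examining | clicked at rank k)."""
    clicks = []
    for k, attr in enumerate(attractiveness):
        if rng.random() < attr:                    # examined and clicked
            clicks.append(k)
            if rng.random() >= continuation[k]:    # satisfied: stop here
                break
        # after a skip (no click), DCM assumes the user moves to rank k + 1
    return clicks

attractiveness = [0.7, 0.4, 0.2, 0.1]   # more relevant docs ranked higher
continuation = [0.5, 0.5, 0.5, 0.5]
print(simulate_dcm_session(attractiveness, continuation))
```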
advantages of simulation of interaction
- Investigate how the system behaves under specific user behaviours
- Potentially a large amount of user data
- Relatively low cost to create and use
- Enable the exact same circumstances to be replicated, repeated, re-used
- Encapsulates our understanding of the process
disadvantages of simulation of interaction
- Models can become complex if we want to mirror realistic user behaviour
- Simulations enable us to explore many possibilities, but which ones, why, and how do we make sense of the resulting data?
- Does it represent actual user behavior/performance?
- What claims can we make? In what context?
query expansion
expand the query with similar or related terms: easy to experiment with in a live search engine because no changes to the index are required (see the sketch below)
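A minimal sketch of dictionary-based expansion; the related_terms mapping is a hypothetical stand-in for terms found via, e.g., query-log clustering or click data:

```python
def expand_query(query, related_terms, max_new=2):
    """Naive query expansion: append up to `max_new` related terms
    per query word, skipping terms already in the query."""
    terms = query.split()
    extra = []
    for t in terms:
        extra.extend(related_terms.get(t, [])[:max_new])
    return " ".join(terms + [e for e in extra if e not in terms])

related_terms = {"flights": ["airfare", "tickets"], "cheap": ["budget"]}
print(expand_query("cheap flights", related_terms))
# -> "cheap flights budget airfare tickets"
```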