Week 1 Flashcards
What is text retrieval
Given a collection of text documents, the system returns the documents relevant to a user's query.
A subfield of information retrieval.
An empirically defined problem: results are evaluated by users.
What do search engines return
The relevant documents that the TR system retrieves for the user
Text retrieval vs SQL retrieval
Free text vs structured data
Ambiguous vs rigorous semantics
Retrieve relevant docs vs matched records
What is vocabulary
V = {w1, w2, ..., wN} - the set of all distinct words in the document collection
What is query
q - the user's query, a sequence of words from V
What is document
di - the i-th document in the collection, a sequence of words from V
What is collection
C = {d1, d2, ..., dM} - the set of all documents
What is word count
c(w, d) - the number of times word w occurs in document d
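The notation on the cards above can be sketched in a few lines of Python. This is only an illustration: the toy documents, the whitespace-and-lowercase tokenizer, and the names `collection`, `vocabulary`, `tokenize`, and `word_count` are all assumptions, not part of the definitions.

```python
from collections import Counter

# Toy collection C = {d1, d2}; the documents are made up for illustration.
collection = {
    "d1": "text retrieval is a form of information retrieval",
    "d2": "search engines return relevant documents",
}

def tokenize(text):
    # Assumption: lowercase the text and split on whitespace.
    return text.lower().split()

# Vocabulary V: every distinct word that occurs anywhere in the collection.
vocabulary = {w for text in collection.values() for w in tokenize(text)}

def word_count(w, d):
    # c(w, d): number of times word w occurs in document d.
    return Counter(tokenize(collection[d]))[w]

print(word_count("retrieval", "d1"))  # 2
print(word_count("retrieval", "d2"))  # 0
```

A `Counter` returns 0 for a word it has not seen, which matches c(w, d) = 0 when w does not occur in d.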
What is set of relevant documents
R(q) ⊆ C - the documents in C that are relevant to query q
What is TR task
Compute R’(q), an approximation of R(q)
List two TR strategies
Document ranking and document selection
What is document selection
R’(q) = { d ∈ C | f(d,q) = 1 } where f(d,q) ∈ {0,1} is a binary classifier
Each document is either chosen or not
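A minimal sketch of document selection, assuming a very simple binary classifier: a document is chosen only if it contains every query word. The classifier `contains_all_terms`, the helper `select`, and the toy documents are hypothetical and only illustrate the set definition.

```python
# Toy collection C; the documents are made up for illustration.
collection = {
    "d1": "text retrieval is a form of information retrieval",
    "d2": "search engines return relevant documents",
    "d3": "text documents in a collection",
}

def contains_all_terms(doc, query):
    # Hypothetical binary classifier f(d, q) in {0, 1}:
    # 1 if every query word occurs in the document, else 0.
    doc_words = set(doc.lower().split())
    return 1 if all(w in doc_words for w in query.lower().split()) else 0

def select(collection, query, f):
    # R'(q) = {d in C | f(d, q) = 1}: an unordered set of chosen documents.
    return {name for name, text in collection.items() if f(text, query) == 1}

print(select(collection, "text retrieval", contains_all_terms))  # {'d1'}
print(select(collection, "text", contains_all_terms))  # d1 and d3, in no particular order
```

The result is a plain set, so the chosen documents carry no order or degree of relevance.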
What is document ranking
R’(q) = { d ∈ C | f(d,q) > θ } where f(d,q) ∈ ℝ is a relevance measure function and θ is a cutoff
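A minimal sketch of document ranking, assuming a deliberately naive relevance measure: the total number of query-word occurrences in the document. The functions `score` and `rank`, the `theta` cutoff parameter (defaulting to 0), and the toy documents are hypothetical; real systems use far better scoring functions.

```python
# Toy collection C; the documents are made up for illustration.
collection = {
    "d1": "text retrieval is a form of information retrieval",
    "d2": "search engines return relevant documents",
    "d3": "text documents in a collection",
}

def score(doc, query):
    # Hypothetical real-valued f(d, q): total occurrences of query words in doc.
    words = doc.lower().split()
    return float(sum(words.count(w) for w in query.lower().split()))

def rank(collection, query, theta=0.0):
    # Keep documents whose score exceeds the cutoff, best score first.
    scored = [(score(text, query), name) for name, text in collection.items()]
    kept = [(s, name) for s, name in scored if s > theta]
    # Sort by descending score; ties are broken by document name.
    return [name for s, name in sorted(kept, key=lambda p: (-p[0], p[1]))]

print(rank(collection, "text retrieval"))             # ['d1', 'd3']
print(rank(collection, "text retrieval", theta=1.0))  # ['d1']
```

Unlike selection, the output is an ordered list, so the user can decide how far down to read.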