10 XML Retrieval Flashcards

1
Q

Structured retrieval

A

Search over structured documents

2
Q

XML

A

Extensible Markup Language. A standard for encoding structured documents, and the most widely used such standard.

3
Q

XML element

A

A node in a tree

4
Q

XML attribute

A

An element can have one or more XML attributes

5
Q

XML DOM

A

Document Object Model. The standard way of accessing and processing XML documents. The DOM represents elements, attributes, and text within elements as nodes in a tree. With a DOM API, we can process an XML document by starting at the root element and then descending the tree from parents to children.
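
The root-to-children traversal can be sketched with Python's standard-library `xml.dom.minidom`. The sample play document and the `walk` helper are illustrative assumptions, not part of the card:

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
XML = ('<play><title>Macbeth</title>'
       '<act number="I"><scene number="vii">'
       '<verse>Will I with wine and wassail</verse>'
       '</scene></act></play>')

def walk(node, depth=0, out=None):
    """Depth-first walk from a node down to its children.

    Records (depth, tag name) for every element node; text nodes
    are skipped.
    """
    if out is None:
        out = []
    if node.nodeType == node.ELEMENT_NODE:
        out.append((depth, node.tagName))
        for child in node.childNodes:
            walk(child, depth + 1, out)
    return out

doc = minidom.parseString(XML)
visited = walk(doc.documentElement)  # start at the root element
```

Here `visited` lists each element together with its depth below the root, parents before children.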

6
Q

XML Context/contexts

A

An XML path.

7
Q

Schema

A

A schema puts constraints on the structure of allowable XML documents for a particular application. A schema for Shakespeare's plays may stipulate that scenes can only occur as children of acts and that only acts and scenes have the number attribute.

8
Q

NEXI

A

Narrowed Extended XPath I. A common format for XML queries.

9
Q

Structured document retrieval principle

A

The principle is as follows: A system should always retrieve the most specific part of a document answering the query.

10
Q

Indexing unit

A

Which parts of a document to index.

11
Q

Schema heterogeneity/diversity

A

In many cases, several different XML schemas occur in a collection because the XML documents in an IR application often come from more than one source. This is called schema heterogeneity. It presents two challenges:

  1. Comparable elements may have different names, such as author vs. creator.
  2. The structural organization of the schemas may differ: in one schema, author names are direct descendants of the node author, while in another, firstname and lastname are direct children of author.
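
One way to cope with both challenges is to normalize element names and flatten nested name parts before matching. The sketch below assumes a hand-written synonym table (`SYNONYMS`) and a hypothetical helper `author_names`; neither comes from the card:

```python
import xml.etree.ElementTree as ET

# Hypothetical synonym table mapping alternative tag names to one canonical name.
SYNONYMS = {"creator": "author"}

def author_names(root):
    """Collect author names regardless of tag name or internal structure."""
    names = []
    for elem in root.iter():
        if SYNONYMS.get(elem.tag, elem.tag) == "author":
            # itertext() also gathers text from nested firstname/lastname children.
            names.append(" ".join(" ".join(elem.itertext()).split()))
    return names

flat = ET.fromstring("<book><author>Jane Doe</author></book>")
nested = ET.fromstring(
    "<book><creator><firstname>Jane</firstname>"
    "<lastname>Doe</lastname></creator></book>"
)
flat_names = author_names(flat)
nested_names = author_names(nested)
```

Both documents yield the same name list even though their schemas differ in tag name and nesting.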
12
Q

Extended query

A

We can support the user by interpreting all parent-child relationships in queries as descendant relationships with any number of intervening nodes allowed. These are extended queries.
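
The difference between a strict parent-child match and its extended (descendant) interpretation can be sketched with the limited XPath support in Python's `xml.etree.ElementTree`. The sample document is an assumption made for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the title is a grandchild of book, not a child.
root = ET.fromstring(
    "<book><chapter><title>Julius Caesar</title></chapter></book>"
)

# Strict reading of book/title: title must be a direct child of book.
strict = root.findall("title")

# Extended reading: title may be any descendant of book.
extended = root.findall(".//title")
```

The strict query finds nothing, while the extended query still reaches the title through the intervening chapter node.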

13
Q

Structural term

A

An XML-context/term pair. One indexing approach is to index all paths that end in a single vocabulary term, in other words, all XML-context/term pairs; each such pair is called a structural term.
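
A minimal sketch of extracting such pairs, assuming whitespace tokenization and lowercasing. It is a simplification: it emits only the full path from the root as the context and ignores text that follows a child element. The helper name `structural_terms` is hypothetical:

```python
import xml.etree.ElementTree as ET

def structural_terms(elem, path=""):
    """Yield (XML context, term) pairs for an element and its descendants."""
    context = f"{path}/{elem.tag}"
    # Pair each token of the element's own text with the element's path.
    for token in (elem.text or "").lower().split():
        yield (context, token)
    for child in elem:
        yield from structural_terms(child, context)

root = ET.fromstring(
    "<book><title>Julius Caesar</title><author>Julius Caesar</author></book>"
)
pairs = list(structural_terms(root))
```

The same word appears under two different contexts, so "caesar" in a title and "caesar" in an author element become distinct structural terms.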

14
Q

Text-centric XML

A

XML retrieval in which we match the text of the query with the text of the XML documents.

15
Q

Data-centric XML

A

Mainly encodes numerical and nontext attribute-value data. When querying data-centric XML, we want to impose exact-match conditions in most cases. A query can be: "Find employees whose salary is the same this month as it was 12 months ago." This query requires no ranking. It is purely structural, and an exact match of the salaries in the two time periods is probably sufficient to meet the user's information need.
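
The salary query can be sketched as a plain exact-match filter with no ranking. The element and attribute names (`employee`, `salary`, `month`, `name`) and the sample data are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical data-centric document.
XML = """<employees>
  <employee name="Ada">
    <salary month="2023-05">5000</salary>
    <salary month="2024-05">5000</salary>
  </employee>
  <employee name="Bob">
    <salary month="2023-05">4000</salary>
    <salary month="2024-05">4200</salary>
  </employee>
</employees>"""

def unchanged_salary(root, now, year_ago):
    """Return names of employees whose salary is identical in both months."""
    names = []
    for emp in root.findall("employee"):
        pay = {s.get("month"): Decimal(s.text) for s in emp.findall("salary")}
        # Exact match only: both months must be present and equal.
        if now in pay and year_ago in pay and pay[now] == pay[year_ago]:
            names.append(emp.get("name"))
    return names

root = ET.fromstring(XML)
result = unchanged_salary(root, "2024-05", "2023-05")
```

The result is an unordered set of matches rather than a ranked list, which is what distinguishes this kind of query from text-centric retrieval.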
