Information Extraction Flashcards
Information extraction
the activity of populating a structured information repository (database) from an unstructured / free-text information source
Difference in strengths between information retrieval (IR) and information extraction (IE)
IR:
- can search huge collections quickly
- insensitive to genre & domain of texts
- relatively straightforward to implement
IE:
- extracts facts from texts, not just relevant texts from a text collection
- resulting structured data source has many applications
Difference in weaknesses between IR and IE
IR:
- returns documents, not information/answers, so further reading is required
- not discriminating enough
IE:
- systems are genre/domain specific; porting to new genres/domains is time-consuming and difficult
- limited accuracy
- computationally demanding
information extraction task
Given: a document collection and a predefined set of entities, relations and events
Return: a structured representation of all mentions of the specified entities, relations and/or events
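A minimal sketch of what such a structured representation could look like, using Python dataclasses; the record and field names (EntityMention, RelationMention, ExtractionResult) are illustrative assumptions, not a standard schema:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record types for the structured output of one IE run.
@dataclass
class EntityMention:
    text: str    # surface form as it appears in the document
    etype: str   # e.g. "PERSON", "ORGANISATION"
    start: int   # character offset where the mention starts (its extent)
    end: int     # character offset just past the mention

@dataclass
class RelationMention:
    rtype: str   # e.g. "works_for"
    arg1: EntityMention
    arg2: EntityMention

@dataclass
class ExtractionResult:
    doc_id: str
    entities: List[EntityMention] = field(default_factory=list)
    relations: List[RelationMention] = field(default_factory=list)
    # event mentions would be represented analogously
```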
Named Entity Recognition
For each textual mention of an entity of one of a fixed set of types, identify its extent (the position interval it spans in the text) and its type (e.g. organisation, person)
Types of entities
named individuals, named kinds (objects), times, measures
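A toy sketch of identifying extent and type with a hand-made gazetteer; the gazetteer entries and type labels are invented for illustration and stand in for a real recogniser:

```python
import re

# Tiny illustrative gazetteer mapping known names to entity types.
GAZETTEER = {"Alice Smith": "PERSON", "Acme Corp": "ORGANISATION"}

def find_named_entities(text):
    """Return (start, end, surface form, type) for every gazetteer match."""
    mentions = []
    for name, etype in GAZETTEER.items():
        for m in re.finditer(re.escape(name), text):
            mentions.append((m.start(), m.end(), name, etype))
    return sorted(mentions)

print(find_named_entities("Alice Smith joined Acme Corp in 2020."))
# [(0, 11, 'Alice Smith', 'PERSON'), (19, 28, 'Acme Corp', 'ORGANISATION')]
```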
Coreference task
link together all the different textual expressions that refer to the same real-world entity, regardless of their surface form
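A crude sketch of linking different surface forms of the same entity by token overlap; real coreference resolution also handles pronouns, definite descriptions and context, so this is only an illustration:

```python
from collections import defaultdict

def cluster_mentions(mentions):
    """Group mentions whose tokens are a subset of a longer mention's tokens."""
    clusters = defaultdict(list)
    for m in sorted(mentions, key=len, reverse=True):  # longest mention = canonical form
        for canon in clusters:
            if set(m.lower().split()) <= set(canon.lower().split()):
                clusters[canon].append(m)
                break
        else:
            clusters[m].append(m)
    return dict(clusters)

print(cluster_mentions(["Barack Obama", "Obama", "Acme Corp", "Acme"]))
# {'Barack Obama': ['Barack Obama', 'Obama'], 'Acme Corp': ['Acme Corp', 'Acme']}
```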
relation extraction
identify all assertions of relations (usually binary) between entities identified in entity extraction; divided into 2 subtasks
relation detection
find pairs of entities between which a relation holds
relation classification
for those pairs of entities, determine what the relation is
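A minimal sketch of both subtasks over already-recognised entity mentions, using a single hand-written pattern for an assumed works_for relation; a real system would use many patterns or a trained classifier:

```python
import re
from itertools import combinations

def extract_relations(sentence, entities):
    """entities: list of (surface form, type). Returns (arg1, arg2, relation) triples."""
    relations = []
    for (e1, t1), (e2, t2) in combinations(entities, 2):
        # Relation detection: do the two mentions occur in a relation-bearing context?
        if re.search(re.escape(e1) + r"\s+works for\s+" + re.escape(e2), sentence):
            # Relation classification: the matching pattern and argument types name the relation.
            if t1 == "PERSON" and t2 == "ORGANISATION":
                relations.append((e1, e2, "works_for"))
    return relations

print(extract_relations(
    "Alice Smith works for Acme Corp.",
    [("Alice Smith", "PERSON"), ("Acme Corp", "ORGANISATION")]))
# [('Alice Smith', 'Acme Corp', 'works_for')]
```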
event detection & event classification
identify all reports of event instances, typically of a small set of classes; divided into 2 subtasks:
- event detection: finds mentions of events in a text
- event classification: assigns each detected event to one of a set of classes
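A toy trigger-word sketch of the two subtasks; the trigger list and event classes are illustrative assumptions:

```python
# Illustrative trigger words mapped to a small set of event classes.
EVENT_TRIGGERS = {"acquired": "ACQUISITION", "hired": "HIRING", "resigned": "RESIGNATION"}

def extract_events(text):
    events = []
    for i, token in enumerate(text.split()):
        word = token.strip(".,").lower()
        if word in EVENT_TRIGGERS:                          # event detection
            events.append((i, word, EVENT_TRIGGERS[word]))  # event classification
    return events

print(extract_events("Acme Corp acquired Widget Ltd and hired 200 engineers."))
# [(2, 'acquired', 'ACQUISITION'), (6, 'hired', 'HIRING')]
```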
knowledge engineering approaches
use manually authored rules and can be divided into:
- deep: linguistically inspired language understanding systems.
- shallow: systems engineered to the IE task, typically using pattern-action rules
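A sketch of one shallow pattern-action rule: a surface regular expression (the pattern) and code that adds a tuple to the repository when it fires (the action); the rule and relation name are made up:

```python
import re

# Pattern part: a surface-level regular expression with named groups.
CEO_PATTERN = re.compile(r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+), CEO of (?P<org>[A-Z][A-Za-z ]+)")

def apply_rule(text, repository):
    # Action part: on every match, add a structured fact to the repository.
    for m in CEO_PATTERN.finditer(text):
        repository.append({"relation": "ceo_of", "person": m.group("person"), "org": m.group("org")})

facts = []
apply_rule("Jane Doe, CEO of Acme Corp, announced the merger.", facts)
print(facts)  # [{'relation': 'ceo_of', 'person': 'Jane Doe', 'org': 'Acme Corp'}]
```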
supervised learning approaches
for each entity/relation in a given text, create a training instance represented in terms of features, so that systems may learn patterns that match extraction targets and classifiers that classify tokens
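A minimal sketch of the feature-based classifier idea using scikit-learn; the feature set, tokens and labels are invented, and any feature-based learner could stand in for logistic regression:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def token_features(token):
    # One training instance per token, represented in terms of simple features.
    return {"lower": token.lower(), "is_capitalised": token[0].isupper(), "length": len(token)}

train_tokens = ["Alice", "works", "for", "Acme", "in", "Paris"]
train_labels = ["PERSON", "O", "O", "ORGANISATION", "O", "LOCATION"]

vec = DictVectorizer()
X = vec.fit_transform([token_features(t) for t in train_tokens])
clf = LogisticRegression(max_iter=1000).fit(X, train_labels)

test_tokens = ["Bob", "joined", "Acme"]
X_test = vec.transform([token_features(t) for t in test_tokens])
print(list(zip(test_tokens, clf.predict(X_test))))  # toy data, so predictions are unreliable
```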
bootstrapping approaches
minimally supervised systems are given seed tuples and/or seed patterns to search the text for:
- occurrences of the seed tuples, then extract patterns that match the context of the seed tuples, from which new tuples are harvested
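A compressed sketch of one bootstrapping iteration: find occurrences of a seed tuple, keep the context between its elements as a pattern, then use that pattern to harvest new tuples; the corpus and the capital-of seed are invented:

```python
import re

corpus = ["Paris is the capital of France.", "Tokyo is the capital of Japan."]
seed_tuples = {("Paris", "France")}

def bootstrap_once(corpus, tuples):
    patterns, new_tuples = set(), set()
    # 1. Find occurrences of the seed tuples and keep the context between the two elements.
    for e1, e2 in tuples:
        for sent in corpus:
            m = re.search(re.escape(e1) + r"(.+?)" + re.escape(e2), sent)
            if m:
                patterns.add(m.group(1))  # e.g. " is the capital of "
    # 2. Apply the harvested patterns to the corpus to extract new tuples.
    for pat in patterns:
        for sent in corpus:
            m = re.search(r"(\w+)" + re.escape(pat) + r"(\w+)", sent)
            if m:
                new_tuples.add((m.group(1), m.group(2)))
    return patterns, new_tuples

print(bootstrap_once(corpus, seed_tuples))  # harvests ('Tokyo', 'Japan') as a new tuple
```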
distant supervision approaches
assumes a semi-structured data source which contains tuples of entities standing in the relation and a pointer to a source text
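A toy sketch of distant supervision: every sentence of the pointed-to source text that mentions both members of a known tuple is (optimistically, hence noisily) labelled as a positive training example for the relation; the tuples and sentences are invented:

```python
# Semi-structured data source: tuples known to stand in the born_in relation.
kb_tuples = [("Marie Curie", "Warsaw"), ("Alan Turing", "London")]

source_text = [
    "Marie Curie was born in Warsaw in 1867.",
    "Marie Curie gave a lecture in Warsaw.",   # noisy positive: no birth is stated
    "Alan Turing studied in Cambridge.",       # not labelled: tuple members absent
]

def distant_label(sentences, tuples, relation="born_in"):
    """Label any sentence containing both members of a tuple as a positive example."""
    examples = []
    for sent in sentences:
        for e1, e2 in tuples:
            if e1 in sent and e2 in sent:
                examples.append((sent, e1, e2, relation))
    return examples

for example in distant_label(source_text, kb_tuples):
    print(example)
```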
Evaluation of the IE system
Keys = correct answers, produced manually for each extraction task
Responses = the system's results
Scoring of responses against keys is done automatically
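A small sketch of the automatic scoring step, comparing responses against keys with exact-match precision, recall and F-measure (the standard IE scoring measures); the mention tuples are made up:

```python
def score(keys, responses):
    """keys, responses: sets of (start, end, type) mentions; exact-match scoring."""
    correct = len(keys & responses)
    precision = correct / len(responses) if responses else 0.0
    recall = correct / len(keys) if keys else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

keys = {(0, 11, "PERSON"), (19, 28, "ORGANISATION")}   # manually produced answers
responses = {(0, 11, "PERSON"), (30, 34, "DATE")}      # system output
print(score(keys, responses))  # (0.5, 0.5, 0.5)
```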