Information Extraction Flashcards

1
Q

information extraction task (definition)

A

the activity of populating a structured information repository (database) from an unstructured/free-text information source

2
Q

the difference in strengths between information extraction and information retrieval

A

IR:

  • can search huge collections quickly
  • insensitive to genre & domain of texts
  • relatively straightforward to implement

IE:

  • extracts facts from texts, not just texts from text collection
  • resulting structured data source has many applications
3
Q

the difference in weaknesses between information retrieval and information extraction

A

IR:

  • returns documents not information/answers so further reading required
  • not discriminating enough

IE:

  • systems are genre/domain specific, and porting them to new genres/domains is time-consuming and difficult
  • limited accuracy
  • computationally demanding

4
Q

information extraction task

A

given: a document collection and a predefined set of entities, relations and/or events
return: a structured representation of all mentions of the specified entities, relations and/or events

5
Q

Named Entity Recognition

A

for each textual mention of an entity of one of a fixed set of types, identify its extent (the position interval of the mention in the text) and its type (e.g. organisation, person)
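
As a rough illustration of "extent plus type", the sketch below looks names up in a small gazetteer and reports character offsets. The gazetteer entries, the type labels and the function name `recognise_entities` are assumptions made for this example; real NER systems are usually rule-based or learned rather than plain lookups.

```python
import re

# Hypothetical gazetteer: surface form -> entity type (illustrative only).
GAZETTEER = {
    "Acme Corp": "ORGANISATION",
    "Jane Smith": "PERSON",
}

def recognise_entities(text):
    """Return (start, end, type) for every gazetteer name found in text."""
    mentions = []
    for name, etype in GAZETTEER.items():
        for m in re.finditer(re.escape(name), text):
            # The extent is the character interval [start, end) of the mention.
            mentions.append((m.start(), m.end(), etype))
    return sorted(mentions)

text = "Jane Smith joined Acme Corp in 2019."
mentions = recognise_entities(text)
for start, end, etype in mentions:
    print(text[start:end], start, end, etype)
```

Each result carries both pieces of information the card asks for: where the mention sits in the text and which type it belongs to.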

6
Q

Types of entities

A

named individuals, named kinds (objects), times, measures

7
Q

Coreference task

A

link together all the different textual expressions that refer to the same real-world entity, regardless of their surface form

8
Q

relation extraction

A

identify all assertions of relations, usually binary, between entities identified in entity extraction; divided into 2 subtasks (relation detection and relation classification)

9
Q

relation detection

A

find pairs of entities between which a relation holds

10
Q

relation classification

A

for those pairs of entities, determine what the relation is
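
The two relation subtasks can be sketched as two separate functions. This is a toy, assuming entity mentions are already given in order of appearance and that a trigger phrase between two mentions signals a relation; the trigger table, the relation labels and the function names are hypothetical.

```python
# Hypothetical trigger phrases and the relation label each one signals.
TRIGGERS = {"works for": "EMPLOYED_BY", "founded": "FOUNDER_OF"}

def text_between(sentence, a, b):
    """Text between the first occurrence of a and the first occurrence of b."""
    return sentence[sentence.find(a) + len(a):sentence.find(b)]

def detect(sentence, entities):
    """Relation detection: entity pairs with some trigger phrase between them."""
    pairs = []
    for i, a in enumerate(entities):
        for b in entities[i + 1:]:
            if any(t in text_between(sentence, a, b) for t in TRIGGERS):
                pairs.append((a, b))
    return pairs

def classify(sentence, pair):
    """Relation classification: decide which relation holds for a detected pair."""
    for trigger, label in TRIGGERS.items():
        if trigger in text_between(sentence, *pair):
            return label
    return None

sentence = "Jane Smith works for Acme Corp."
entities = ["Jane Smith", "Acme Corp"]  # assumed output of an earlier NER step
pairs = detect(sentence, entities)
relations = [(a, classify(sentence, (a, b)), b) for a, b in pairs]
print(relations)
```

Keeping detection and classification apart mirrors the cards: the first step only says that some relation holds, the second names it.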

11
Q

event detection & event classification

A

identify all reports of event instances, typically of a small set of classes; divided into 2 subtasks:

  • event detection: find mentions of events in a text
  • event classification: assign detected events to one of a set of classes

12
Q

knowledge engineering approaches

A

use manually authored rules and can be divided into:

  • deep: linguistically inspired language understanding systems.
  • shallow: systems engineered to the IE task, typically using pattern-action rules
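
A shallow pattern-action rule can be pictured as a regular expression (the pattern) paired with a function that fills a record (the action). The two rules, the field names and the event labels below are invented for illustration and are not taken from any particular system.

```python
import re

# Each rule pairs a hand-written pattern with an action that builds a record.
RULES = [
    (re.compile(r"(?P<buyer>[A-Z]\w+) acquired (?P<target>[A-Z]\w+)"),
     lambda m: {"event": "ACQUISITION",
                "buyer": m["buyer"], "target": m["target"]}),
    (re.compile(r"(?P<person>[A-Z]\w+ [A-Z]\w+) was appointed (?P<role>[A-Z]+)"),
     lambda m: {"event": "APPOINTMENT",
                "person": m["person"], "role": m["role"]}),
]

def apply_rules(text):
    """Run every rule over the text and collect the records its action emits."""
    records = []
    for pattern, action in RULES:
        for m in pattern.finditer(text):
            records.append(action(m))
    return records

text = "Initech acquired Globex. Jane Smith was appointed CEO."
records = apply_rules(text)
print(records)
```

Rules like these are tied to how one domain phrases its facts, which is one reason hand-built systems are hard to port.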
13
Q

supervised learning approaches

A

for each entity/relation in a given text, create a training instance represented in terms of features, so that systems may learn patterns that match extraction targets and classifiers that classify tokens

14
Q

bootstrapping approaches

A

minimally supervised systems are given seed tuples and/or seed patterns and search the text for occurrences of the seed tuples, then extract a pattern that matches the context of those tuples and use it to harvest new tuples
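
The seed-tuple loop can be sketched in a few lines. This is a toy under strong assumptions: the corpus, the single seed and the rule "the pattern is the literal text between the two tuple members" are all made up for the example, and real systems add pattern scoring to limit drift.

```python
import re

corpus = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Madrid is the capital of Spain.",
    "Rome, the capital of Italy, is busy.",
]
seeds = {("Paris", "France")}

def induce_patterns(corpus, tuples):
    """Turn the context between the members of each known tuple into a pattern."""
    patterns = set()
    for x, y in tuples:
        for sentence in corpus:
            m = re.search(re.escape(x) + r"(.+?)" + re.escape(y), sentence)
            if m:
                patterns.add(r"(\w+)" + re.escape(m.group(1)) + r"(\w+)")
    return patterns

def harvest(corpus, patterns):
    """Apply the patterns to the corpus and collect the tuples they match."""
    found = set()
    for pattern in patterns:
        for sentence in corpus:
            for m in re.finditer(pattern, sentence):
                found.add((m.group(1), m.group(2)))
    return found

tuples = set(seeds)
for _ in range(2):  # a couple of bootstrapping rounds
    patterns = induce_patterns(corpus, tuples)
    tuples |= harvest(corpus, patterns)

print(sorted(tuples))
```

The Rome/Italy sentence is phrased differently, so the one induced pattern never reaches it; coverage depends on how varied the learned patterns become.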

15
Q

distant supervision approaches

A

assume a semi-structured data source which contains tuples of entities standing in the relation and a pointer to a source text
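
One common way to use such a source is to label automatically every sentence of the pointed-to text that mentions both entities of a tuple, producing training data without manual annotation. The sketch below assumes a tiny in-memory source; the `KB` and `DOCS` structures, their field names and the naive sentence split are hypothetical.

```python
# Hypothetical semi-structured source: relation tuples, each with a pointer
# (a document id) to the text it came from.
KB = [
    {"relation": "born_in", "args": ("Ada Lovelace", "London"), "source": "doc1"},
    {"relation": "born_in", "args": ("Alan Turing", "London"), "source": "doc2"},
]
DOCS = {
    "doc1": "Ada Lovelace was born in London. She wrote about the Analytical Engine.",
    "doc2": "Alan Turing studied at Cambridge. Alan Turing was born in London.",
}

def label_sentences(kb, docs):
    """Label each sentence that mentions both tuple members with the relation."""
    examples = []
    for entry in kb:
        a, b = entry["args"]
        for sentence in docs[entry["source"]].split(". "):  # naive splitter
            if a in sentence and b in sentence:
                examples.append((sentence.rstrip("."), entry["relation"]))
    return examples

examples = label_sentences(KB, DOCS)
print(examples)
```

The labels are noisy by construction: a sentence can mention both entities without asserting the relation, so a learner trained on them has to tolerate some wrong examples.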

16
Q

Evaluation of the IE system

A

Keys = the correct answers, produced manually for each extraction task
Responses = the system's results, scored against the keys automatically
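
Automatic scoring of responses against keys is often reported as precision, recall and F1. The card does not name these measures, so treating them as the scoring method is an assumption of this sketch, as are the example keys, responses and the function name `score`.

```python
def score(keys, responses):
    """Compare system responses with manually produced keys (exact match)."""
    keys, responses = set(keys), set(responses)
    correct = len(keys & responses)
    precision = correct / len(responses) if responses else 0.0
    recall = correct / len(keys) if keys else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical keys (gold answers) and responses (system output).
keys = {("Jane Smith", "PERSON"), ("Acme Corp", "ORGANISATION"), ("London", "LOCATION")}
responses = {("Jane Smith", "PERSON"), ("Acme Corp", "PERSON")}

precision, recall, f1 = score(keys, responses)
print(precision, recall, f1)
```

Here one of the two responses is right (precision 0.5) and one of the three keys is found (recall 1/3), which is why both figures are needed to describe a system.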