Chapter 1 Introduction Flashcards

1
Q

Why is textual information important?

A
  1. Human knowledge is recorded as text. Examples: scientific literature, textbooks, technical manuals.
  2. Text is the most common form of human communication. Examples: email, text messages.
  3. Text is the most expressive form of information.
2
Q

Text data management and analysis includes two parts

A

Text retrieval and text mining

3
Q

Text retrieval

A

The amount of digital information is beyond any human's capacity to even skim over.

This creates the need for intelligent text retrieval systems that help people access relevant information quickly and accurately.
Example: web search engines

4
Q

Text Mining

A

Because the amount of information is overwhelming, people need intelligent software tools that help them discover relevant knowledge and optimize their decisions.

5
Q

The goal of text mining

A

The goal of text mining is to discover knowledge and patterns in text data to support a user’s task.

6
Q

Provenance

A

Users usually need to go back to the original raw text data to obtain appropriate context and verify the trustworthiness of the discovered knowledge.

7
Q

Combined intelligence

A

A general challenge in all applications of text data management and analysis is how to divide the work between humans and machines so that their collaboration is optimized and their “combined intelligence” is maximized with minimum human effort.

8
Q

TIS (text information system)

A

Information access (text retrieval):
search engines fetch the information that users query for;
recommender systems push relevant information to users.

Knowledge acquisition (text analysis, text mining)

Text organization
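
The two information access modes above can be sketched in a few lines of Python. This is only a toy illustration, not how a real search engine or recommender works: the documents, the whitespace tokenizer, the function names `search` and `recommend`, and the `min_overlap` threshold are all assumptions made for the example.

```python
# Toy collection; the documents are made up for illustration.
DOCS = {
    "d1": "text retrieval helps users find relevant documents",
    "d2": "recommender systems push new items to users",
    "d3": "clustering groups similar text objects",
}

def tokens(text):
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return set(text.lower().split())

def search(query, docs):
    """Search: the user issues a query and gets back the matching doc ids."""
    q = tokens(query)
    # A document matches when it contains every query term.
    return sorted(doc_id for doc_id, text in docs.items() if q <= tokens(text))

def recommend(profile, new_doc, min_overlap=2):
    """Recommendation: the system checks a new document against a stored
    interest profile and decides whether to push it to the user."""
    return len(tokens(profile) & tokens(new_doc)) >= min_overlap

print(search("text retrieval", DOCS))
print(recommend("text mining clustering", "new clustering methods for text"))
```

In the search case the user takes the initiative with a query; in the recommendation case the system takes the initiative whenever a new document arrives.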

9
Q

Text mining can be approached from two perspectives: data mining and natural language processing (NLP). What is the difference between the two?

A

From a data mining perspective, text data is seen as a special type of data. Following the goal of general data mining, the goal of text mining is to discover patterns in text data.

From an NLP perspective, text mining can be regarded as attempting to understand natural language text, convert it into some form of knowledge representation, and make limited inferences based on the extracted knowledge.
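
As a rough illustration of the data mining perspective, the sketch below treats text as plain data and looks for one simple kind of pattern: adjacent word pairs that recur across documents. The function name `frequent_bigrams`, the sample reviews, and the `min_count` threshold are assumptions for the example, and the counting involves no language understanding at all.

```python
from collections import Counter

def frequent_bigrams(texts, min_count=2):
    """Count adjacent word pairs and keep those seen at least min_count times."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        # zip(words, words[1:]) yields each pair of neighbouring words.
        counts.update(zip(words, words[1:]))
    return {bigram: n for bigram, n in counts.items() if n >= min_count}

# Made-up sample data.
reviews = [
    "battery life is great",
    "great battery life overall",
    "screen is too dim",
]
print(frequent_bigrams(reviews))
```

An NLP-style approach would instead try to represent what each review says, for example that the battery is being praised, before drawing any inference.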

10
Q

Some common functions for managing and analyzing text information

A

Search

Filter/Recommendation

Categorization: classify a text object into one or more predefined categories, where the categories vary by application

Summarization: e.g., news summarizers and opinion summarizers

Topic analysis: when combined with companion non-textual data such as time, location, authors, and other metadata, topic analysis can reveal many interesting patterns

Information extraction: e.g., constructing entity-relation graphs

Clustering: discover groups of similar text objects; enables quick understanding of a large text data set

Visualization: visually display patterns in text data
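
To make one of these functions concrete, here is a minimal sketch of categorization with predefined categories. It is a keyword-matching toy, not a trained classifier: the category names, the keyword sets, and the function name `categorize` are all hypothetical.

```python
# Hypothetical predefined categories, each described by a few keywords.
CATEGORY_KEYWORDS = {
    "sports": {"game", "team", "score"},
    "finance": {"stock", "market", "bank"},
}

def categorize(text, category_keywords=CATEGORY_KEYWORDS):
    """Assign every category whose keyword set overlaps the words in the text."""
    words = set(text.lower().split())
    return sorted(c for c, kws in category_keywords.items() if words & kws)

print(categorize("The team owner sold stock after the game"))
```

Because a text object may belong to one or several categories, the function returns a list rather than a single label. Swapping in a different keyword table changes the categories, which mirrors how they vary by application.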
