Text Mining Flashcards

1
Q

Definition of Text Mining

A

Text mining, also called text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text.

2
Q

Text mining tasks

A

Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities).
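
As a rough illustration of the first task in that list, text categorization, here is a minimal sketch that assigns a text to the category whose keywords it mentions most often. The category names, the keyword lists, and the helper names `tokenize` and `categorize` are all invented for this example; practical systems usually learn the categories from labeled data rather than from hand-written keyword lists.

```python
import re
from collections import Counter

# Hypothetical category keyword lists, invented for this sketch.
CATEGORY_KEYWORDS = {
    "sports": {"match", "team", "goal", "season"},
    "finance": {"stock", "market", "bank", "profit"},
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def categorize(text):
    """Return the category whose keywords occur most often, or None if no keyword matches."""
    counts = Counter(tokenize(text))
    scores = {
        category: sum(counts[word] for word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(categorize("The team scored a late goal to win the match."))  # sports
print(categorize("Bank profit rose as the stock market rallied."))  # finance
```

A text that mentions none of the keywords returns `None`, which keeps the sketch from forcing every document into a category.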

3
Q

Definition of Text Analytics

A

The term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation.[1]

The term “text analytics” is now used more frequently in business settings, while “text mining” is used in some of the earliest application areas, dating to the 1980s,[4] notably life-sciences research and government intelligence.

4
Q

Why text analytics

A

It is a truism that 80 percent of business-relevant information originates in unstructured form, primarily text.[5] These techniques and processes discover and present knowledge – facts, business rules, and relationships – that is otherwise locked in textual form, impenetrable to automated processing.

5
Q

The state of text analytics technology and practice

A

Prof. Marti A. Hearst, in the paper “Untangling Text Data Mining”:

“I suggest that to make progress we do not need fully artificial intelligent text analysis; rather, a mixture of computationally-driven and user-guided analysis may open the door to exciting new results.”

6
Q

What is natural language processing (NLP)?

A

Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.

Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural language generation.

7
Q

Basic concepts in NLP

A

lexical -> syntactic -> semantic -> inference

Ambiguity is the killer.

Robust and general NLP tends to be shallow while deep understanding doesn’t scale up.
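
To make the ambiguity point concrete at the lexical level, the sketch below counts how many part-of-speech readings a short sentence admits when each word may carry more than one tag. The toy lexicon, its tag names, and the helper `candidate_taggings` are assumptions made up for this example, not part of any real tagger.

```python
from itertools import product
from math import prod

# Toy lexicon, invented for this sketch: each word maps to its possible part-of-speech tags.
LEXICON = {
    "time": ["NOUN", "VERB"],
    "flies": ["NOUN", "VERB"],
    "like": ["VERB", "ADP"],
    "an": ["DET"],
    "arrow": ["NOUN"],
}

def candidate_taggings(words):
    """Enumerate every tag sequence the lexicon allows for the given words."""
    return list(product(*(LEXICON[w] for w in words)))

sentence = "time flies like an arrow".split()
taggings = candidate_taggings(sentence)

print(len(taggings))                            # 8
print(prod(len(LEXICON[w]) for w in sentence))  # 8, computed without enumerating
```

Three ambiguous words already give eight candidate readings, and the count multiplies with every further ambiguous word, which is one way to see why deeper analysis becomes expensive on longer text.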

8
Q

What can we do with NLP?

A

Part-of-speech tagging: >90% accuracy
Parsing: >90% accuracy
The rest: ???
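
The card does not say how accuracy is measured. Assuming it means per-token accuracy (the share of words that receive the correct tag), a figure like this could be computed as in the sketch below; the function name `tagging_accuracy` and the tag sequences are invented for illustration.

```python
def tagging_accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold (reference) tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

# Made-up reference tags and tagger output for a five-word sentence.
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
predicted = ["DET", "NOUN", "VERB", "DET", "VERB"]

print(tagging_accuracy(gold, predicted))  # 0.8
```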

9
Q

Text representation and enabled analysis

A

Text Rep | Generality | Enabled Analysis | Examples of Application
String | | String processing | Compression
Words | | Word relation analysis; topic analysis; sentiment analysis | Thesaurus discovery; topic and opinion related applications
+ Syntactic structures | | Syntactic graph analysis | Stylistic analysis; structure based feature extraction
+ Entities & relations | | Knowledge graph analysis; information network analysis | Discovery of knowledge and opinions about specific entities
+ Logic predicates | | Integrative analysis of scattered knowledge; logic inference | Knowledge assistant for
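
The first two rows can be contrasted in a few lines of Python. This is only a sketch under simple assumptions: the sample text is made up, `zlib` stands in for "compression", and a word count stands in for the simplest kind of word-level analysis.

```python
import re
import zlib
from collections import Counter

# Made-up sample text, repeated so that compression has something to exploit.
text = "the cat sat on the mat and the cat slept " * 20

# String level: treat the text as raw bytes; compression needs no notion of words.
raw = text.encode("utf-8")
compressed = zlib.compress(raw)
print(len(raw), len(compressed))

# Word level: split into words, which enables counting-based analysis.
words = re.findall(r"[a-z]+", text.lower())
print(Counter(words).most_common(2))  # [('the', 60), ('cat', 40)]
```

The string view works on any text but supports only string processing, while the word view already supports frequency-based analysis; the rows marked with "+" would need a parser, an entity extractor, or a logic representation on top of this.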
