E7 Flashcards
1
Q
What is text mining?
A
Finding interesting information (e.g., patterns) in collections of text
2
Q
What are common steps for cleaning and preprocessing text?
A
- Case normalization
- Removing punctuation
- Removing numbers
- Removing stopwords
- Word stemming and stem completion
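The steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `STOPWORDS` is a tiny made-up list, `naive_stem` is a hypothetical suffix stripper standing in for a real stemmer, and stem completion is left out.

```python
import re
import string

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "are", "to"}

def naive_stem(word):
    """Strip a few common English suffixes (crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Apply the cleaning steps in order and return a list of stemmed tokens."""
    text = text.lower()  # case normalization
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    text = re.sub(r"\d+", "", text)  # remove numbers
    tokens = [t for t in text.split() if t not in STOPWORDS]  # remove stopwords
    return [naive_stem(t) for t in tokens]  # word stemming

print(preprocess("The 3 miners are mining texts, and mined 42 documents!"))
# → ['miner', 'min', 'text', 'min', 'document']
```

Note that "mining" and "mined" both collapse to the stem "min", which is not a real word. Stem completion is the step that would map such a stem back to a readable form.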
3
Q
A token/term
A
A unit of text, e.g., a word or a group of words
4
Q
A document
A
One piece of text
5
Q
A corpus
A
A collection of documents
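How token, document, and corpus relate can be shown with a minimal sketch. The example sentences and the helper names `tokenize` and `bigrams` are assumptions made for illustration, and whitespace splitting is only one simple way to form tokens.

```python
# A corpus is a collection of documents; here each document is one string.
corpus = [
    "Text mining finds patterns",
    "A corpus collects documents",
]

def tokenize(document):
    """Split one document into single-word tokens (lowercased)."""
    return document.lower().split()

def bigrams(tokens):
    """Build tokens that are groups of two adjacent words."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]

for document in corpus:
    tokens = tokenize(document)
    print(tokens, bigrams(tokens))
```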