Text Mining With R Flashcards

1
Q

What is a tibble?

A

A tibble is a modern class of data frame, available in the dplyr and tibble packages. It has a convenient print method, does not convert strings to factors, and does not use row names.
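
The properties listed above can be checked directly. This is a minimal sketch, assuming the tibble package is installed; the column names and values are hypothetical example data, not from the deck.

```r
library(tibble)

# Hypothetical example data: two lines of text with their line numbers
tb <- tibble(line = 1:2, text = c("first line", "second line"))

# Convenient print method: shows dimensions and column types
print(tb)

# Strings stay character vectors; they are not converted to factors
class(tb$text)

# Tibbles do not carry row names
has_rownames(tb)
```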

2
Q

Token

A

A token is a meaningful unit of text, most often a word

3
Q

Tokenization

A

Tokenization is the process of splitting text into tokens, for example into individual words.

4
Q

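Which library and which function split text into one-word tokens?

A

tidytext, with unnest_tokens(). The result is a single word per row, lower case, with punctuation stripped and the line number retained.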
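
The output described above (single word, lower case, punctuation stripped, line number retained) matches the default behaviour of unnest_tokens() in the tidytext package. This is a minimal sketch, assuming tibble and tidytext are installed; the sample sentences and the names text_df and tidy_words are hypothetical.

```r
library(tibble)
library(tidytext)

# Hypothetical input: one row per line of text
text_df <- tibble(
  line = 1:2,
  text = c("The quick brown fox,", "jumps over the lazy dog!")
)

# Split the `text` column into one word per row, stored in a new `word` column.
# By default tokens are lowercased and punctuation is stripped;
# the other columns (here `line`) are kept.
tidy_words <- unnest_tokens(text_df, word, text)

tidy_words
```

Each row of tidy_words holds one token together with the line it came from, which is the shape the remaining cards assume.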

5
Q

How do you remove stop words?

A

Use anti_join() to drop every row whose word appears in a table of stop words.
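
A minimal sketch of the anti_join() approach, assuming dplyr, tibble and tidytext are installed. It uses the stop_words data set that ships with tidytext as the stop-word table, which is an assumption since the card does not name one; the sample words are hypothetical.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical tokenized text: one word per row
tidy_words <- tibble(
  line = c(1L, 1L, 1L),
  word = c("the", "fox", "dog")
)

# Keep only the rows whose `word` has no match in the stop-word table
cleaned <- anti_join(tidy_words, stop_words, by = "word")

cleaned
```

Any data frame with a word column can replace stop_words, so a custom stop-word list works the same way.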
