Quantitative text analysis Flashcards

1
Q

What is quantitative text analysis?

A

Converting text into numerical values and using statistical analysis to identify patterns, trends and relationships within the text.

2
Q

Computational text analysis

A

Using automated and semi-automated computational techniques to process, analyze & interpret textual data.

3
Q

KWIC

A

Keywords in context.
Returns a list of keyword occurrences, identifying for each one the source text and the word's index position within that text.
Shows the context in which keywords appear.
Can be used to identify and extract paragraphs of interest.
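
A minimal sketch of a keyword-in-context lookup, using only the Python standard library. The function name `kwic`, the `window` parameter and the two sample texts are illustrative assumptions, not part of any particular package.

```python
import re

def kwic(texts, keyword, window=3):
    """Return (source, word_index, left_context, keyword, right_context) tuples."""
    hits = []
    for source, text in texts.items():
        # Lowercase and split into word tokens (a deliberately simple tokenizer).
        tokens = re.findall(r"\w+", text.lower())
        for i, tok in enumerate(tokens):
            if tok == keyword.lower():
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                hits.append((source, i, left, tok, right))
    return hits

# Hypothetical sample texts, keyed by source name.
docs = {
    "speech_a": "We will cut taxes and we will grow the economy.",
    "speech_b": "The economy needs investment, not lower taxes.",
}
rows = kwic(docs, "economy", window=2)
for row in rows:
    print(row)
```

Each row names the source text and the word index of the hit, so the surrounding passage can be located and extracted afterwards.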

4
Q

Keyness

A

Compares how strongly keywords are associated with a target group versus a reference group.
Which words are used more by one group, relative to the other?
The most common words are often similar in both groups, but the focus is on the words that distinguish between the two groups.
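
A rough sketch of a keyness ranking. As an assumption made for brevity, it scores each word with a smoothed log ratio of relative frequencies; chi-squared or log-likelihood statistics are common choices in practice. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def keyness(target_texts, reference_texts):
    """Rank words by smoothed log2 ratio of relative frequency (target vs reference)."""
    target = Counter(t for doc in target_texts for t in tokenize(doc))
    reference = Counter(t for doc in reference_texts for t in tokenize(doc))
    vocab = set(target) | set(reference)
    # Add-one smoothing so words absent from one group do not divide by zero.
    n_target = sum(target.values()) + len(vocab)
    n_reference = sum(reference.values()) + len(vocab)
    scores = {
        w: math.log2(((target[w] + 1) / n_target) / ((reference[w] + 1) / n_reference))
        for w in vocab
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy corpora for the two groups.
target_docs = ["the economy and jobs", "jobs and the economy"]
reference_docs = ["the climate and energy", "energy and the climate"]
ranked = keyness(target_docs, reference_docs)
for word, score in ranked:
    print(f"{word:10s} {score:+.2f}")
```

Words shared equally by both groups (here "the" and "and") score zero, while the distinguishing words land at the two ends of the ranking.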

5
Q

Lexical dispersion plot

A

Visualizes the occurrences of particular terms throughout the text.

Shows not only how often a term is used, but also where in the text (e.g. a speech) it is used.
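
A minimal text-only sketch of the idea behind a dispersion plot: record the token positions of a term and mark them on a strip that represents the whole text. The function name `dispersion`, the `width` parameter and the sample text are assumptions made for illustration; a real plot would normally use a graphics library.

```python
import re

def dispersion(text, term, width=40):
    """Return the token positions of `term` and a one-line strip marking them."""
    tokens = re.findall(r"\w+", text.lower())
    positions = [i for i, tok in enumerate(tokens) if tok == term.lower()]
    strip = ["-"] * width
    for i in positions:
        # Scale the token index onto the strip; clamp to the last cell.
        strip[min(width - 1, i * width // len(tokens))] = "|"
    return positions, "".join(strip)

# Hypothetical text: "peace" appears at the start and twice at the end.
speech = "peace " + "word " * 8 + "peace peace"
positions, strip = dispersion(speech, "peace", width=11)
print(positions)
print(strip)
```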

6
Q

Lexical diversity

A

Measure of how many different words are used in a text
How rich is the vocabulary?
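
One simple way to put a number on this is the type-token ratio (distinct words divided by total words). The sketch below assumes a basic regex tokenizer, and the sample sentences are made up; note that the ratio is sensitive to text length, so it is best compared across texts of similar size.

```python
import re

def type_token_ratio(text):
    """Distinct word types divided by total word tokens (0.0 for empty text)."""
    tokens = re.findall(r"\w+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

ttr_repetitive = type_token_ratio("the cat and the cat and the cat")
ttr_varied = type_token_ratio("a quick brown fox jumps over lazy dogs")
print(ttr_repetitive, ttr_varied)
```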

7
Q

Lexical density

A

Measure of the proportion of lexical items (i.e. nouns, verbs, adjectives and some adverbs) in the text.
How complex is the text itself?
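
A crude sketch of lexical density. Identifying lexical items properly needs part-of-speech tagging; as a loudly stated simplification, this version treats every word outside a small, hand-picked function-word list as lexical. The list `FUNCTION_WORDS` and the sample sentence are assumptions for illustration only.

```python
import re

# Assumed, incomplete list of function words; everything else counts as lexical.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "in", "on", "at", "to",
    "is", "are", "was", "were", "it", "this", "that", "he", "she", "they",
    "we", "i", "you", "not", "with", "for", "by",
}

def lexical_density(text):
    """Share of tokens that are not in the function-word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(lexical) / len(tokens)

density = lexical_density("The committee approved the new budget on Monday")
print(density)
```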

8
Q

Co-occurrence

A
  • Measures co-occurrences of features within a user-defined context, such as:
      • a document
      • a window within a collection of documents
  • Can be plotted as a co-occurrence network.
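
A minimal sketch of both contexts, document and window, using only the standard library. The function name `cooccurrence`, its `window` parameter and the sample documents are illustrative assumptions.

```python
import re
from collections import Counter

def cooccurrence(docs, window=None):
    """Count unordered word pairs per document (window=None) or within a token window."""
    counts = Counter()
    for doc in docs:
        tokens = re.findall(r"\w+", doc.lower())
        if window is None:
            # Document context: each distinct pair counts once per document.
            types = sorted(set(tokens))
            for i, a in enumerate(types):
                for b in types[i + 1:]:
                    counts[(a, b)] += 1
        else:
            # Window context: pair each token with the next `window` tokens.
            for i, a in enumerate(tokens):
                for b in tokens[i + 1:i + 1 + window]:
                    if a != b:
                        counts[tuple(sorted((a, b)))] += 1
    return counts

# Hypothetical sample documents.
docs = ["tax cuts help growth", "tax cuts hurt services"]
by_doc = cooccurrence(docs)
by_window = cooccurrence(docs, window=1)
print(by_doc.most_common(3))
print(by_window.most_common(3))
```

The resulting pair counts can serve as weighted edges when drawing a co-occurrence network.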
9
Q

What is a dictionary?

A

Dictionary – exclusive – each feature is linked to exactly one key

Thesaurus – not exclusive – a set of features is linked to one key

10
Q

LIWC: Linguistic Inquiry and Word Count

A
  • Uses a dictionary to calculate the percentage of words in the text that match each of up to 82 language dimensions
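
A toy sketch of the percentage calculation. The two categories and their word lists are invented for illustration and are not taken from the actual LIWC dictionary; the sketch also matches whole words only.

```python
import re

# Hypothetical two-category dictionary: key -> set of features (words).
TOY_DICTIONARY = {
    "positive": {"good", "great", "happy"},
    "negative": {"bad", "sad", "angry"},
}

def category_percentages(text, dictionary):
    """Percentage of tokens in `text` that match each dictionary key."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return {key: 0.0 for key in dictionary}
    return {
        key: 100 * sum(tok in words for tok in tokens) / len(tokens)
        for key, words in dictionary.items()
    }

scores = category_percentages(
    "Good food and great service but bad parking all day", TOY_DICTIONARY
)
print(scores)
```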
11
Q

Coding scheme in dictionary

A

Hierarchy
First level - domain
Second level - subdomain
Other levels: may be additional sub-domains.

12
Q

What are dictionaries for?

A

Describe the text
Measure expressed concepts in documents
Identify words that separate different categories, such as policy categories
Measure how often the categories apply in the text

13
Q

Supervised classification

A

Manually code a subset of the data
Use a supervised classifier to learn the relation between the words and the labels/categories.
Infer labels for the rest of the dataset.
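
The three steps above can be sketched with a small multinomial Naive Bayes classifier. Naive Bayes is only one possible classifier, chosen here as an assumption because it fits in a few lines of standard-library Python; the hand-coded examples and labels are made up.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def train(labelled):
    """Learn word counts per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labelled:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest log posterior under Naive Bayes."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)  # log prior
        for tok in tokenize(text):
            # Add-one smoothing handles words unseen for this label.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Step 1: a manually coded subset (hypothetical).
coded = [
    ("cut taxes and boost growth", "economy"),
    ("jobs wages and growth", "economy"),
    ("protect forests and rivers", "environment"),
    ("clean air and clean rivers", "environment"),
]
# Step 2: learn the relation between words and labels.
word_counts, label_counts = train(coded)
# Step 3: infer labels for the uncoded documents.
uncoded = ["growth and jobs", "clean rivers"]
predictions = [predict(doc, word_counts, label_counts) for doc in uncoded]
print(predictions)
```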

14
Q

Unsupervised classification

A

Discover the main themes/topics in an unstructured corpus
- Infer hidden variables

15
Q

Structural topic model

A

How are covariates (document metadata) associated with the prevalence of topics?
