cadc Flashcards

1
Q

phonetics

A

Sounds that people use in language

2
Q

phonology

A

systems of sounds in particular languages

3
Q

morphology

A

how words are formed

4
Q

syntax

A

how sentences are formed from words

5
Q

semantics

A

what sentences mean

6
Q

pragmatics

A

how language is used in context

7
Q

tokenization

A

splitting an input text into pieces (tokens) of a chosen token type, e.g., words, sentences, or characters
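A minimal sketch of this idea in Python, using only the standard library. The function name `tokenize`, the three token types, and the simple regular expressions are assumptions made for illustration; real tokenizers handle many more edge cases (abbreviations, hyphens, non-Latin scripts).

```python
import re

def tokenize(text, token_type="word"):
    """Split text into tokens of the requested type (hypothetical helper)."""
    if token_type == "word":
        # A run of letters, digits, or apostrophes counts as one word token.
        return re.findall(r"[A-Za-z0-9']+", text.lower())
    if token_type == "sentence":
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if token_type == "character":
        return list(text)
    raise ValueError(f"unknown token type: {token_type}")

text = "Language is used in context. Context matters!"
words = tokenize(text, "word")
sentences = tokenize(text, "sentence")
```

The same input yields different pieces depending on the token type, which is the point of the definition above.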

8
Q

sparsity

A

when most entries in the data are zero (typical of document-feature matrices, since each document uses only a few of all features)
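One way to quantify this is the share of zero cells in a matrix. A small sketch in Python; the function name `sparsity` and the example counts are invented for illustration.

```python
def sparsity(matrix):
    """Return the share of cells in a list-of-lists matrix that are zero."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)

# Hypothetical counts: 3 documents (rows) by 4 features (columns).
counts = [
    [2, 0, 0, 1],
    [0, 0, 3, 0],
    [0, 1, 0, 0],
]
share_of_zeros = sparsity(counts)  # 8 of the 12 cells are zero
```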

9
Q

accuracy

A

share of correct classifications overall

10
Q

precision

A

probability that a positively coded document is relevant

11
Q

recall

A

probability that a relevant document is coded positively

12
Q

F1-Score

A

harmonic mean of precision and recall
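Accuracy, precision, recall, and F1 can all be computed from the four confusion-matrix counts. A minimal sketch in Python, assuming F1 is the harmonic mean of precision and recall (the usual definition); the function name and the example counts are made up for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute evaluation measures from confusion-matrix counts.

    tp: relevant documents coded positively (true positives)
    fp: irrelevant documents coded positively (false positives)
    fn: relevant documents coded negatively (false negatives)
    tn: irrelevant documents coded negatively (true negatives)
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean: low if either precision or recall is low.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for 100 coded documents.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

With these counts, accuracy is 70/100, precision is 40/50, recall is 40/60, and F1 works out to 80/110.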

13
Q

supervised

A

start from labeled data, train the algorithm on it, then apply the trained algorithm to new data

14
Q

unsupervised

A

no labeled data; the algorithm discovers the structure (e.g., groupings) in the data on its own

15
Q

independent variables

A

input features

16
Q

dependent variables

A

output class

17
Q

overfitting

A

when an algorithm predicts the training data perfectly but does not generalize to new data

18
Q

computational social science

A

field of social science that uses algorithmic tools and large/unstructured data to understand human and social behavior

19
Q

text analysis

A

a research technique for making replicable and valid inferences from texts (or other meaningful matter) to the contexts of their use

20
Q

feature creation

A

breaking down text into the features that we want to analyze

21
Q

feature transformation

A

involves text cleaning such as stopword removal

22
Q

feature selection

A

e.g., frequency trimming (dropping features that occur too rarely or too often)

23
Q

creation of structured data

A

e.g., a document-feature matrix (DFM), also called a document-term matrix (DTM)
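The four steps on the preceding cards (feature creation, transformation, selection, and creation of structured data) can be sketched end to end in plain Python. Everything here is an illustrative assumption: the function name `build_dfm`, the tiny stopword list, whitespace tokenization, and the `min_docfreq` threshold are not taken from any particular library.

```python
from collections import Counter

STOPWORDS = {"the", "a", "is", "of", "and"}  # tiny illustrative list

def build_dfm(docs, min_docfreq=2):
    """Return (features, matrix) with one row per document."""
    # Feature creation: break each text into lowercase word tokens.
    tokenized = [doc.lower().split() for doc in docs]
    # Feature transformation: clean the tokens by removing stopwords.
    tokenized = [[t for t in toks if t not in STOPWORDS] for toks in tokenized]
    # Feature selection: frequency trimming, keeping only features that
    # appear in at least min_docfreq documents.
    docfreq = Counter(t for toks in tokenized for t in set(toks))
    features = sorted(t for t, n in docfreq.items() if n >= min_docfreq)
    # Structured data: count each kept feature in each document.
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        matrix.append([counts[f] for f in features])
    return features, matrix

docs = ["the cat sat", "the cat and the dog", "a dog barked"]
features, dfm = build_dfm(docs)
```

Here only "cat" and "dog" survive trimming, so the matrix has three rows (documents) and two columns (features).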