cadc Flashcards
phonetics
sounds that people use in language
phonology
systems of sounds in particular languages
morphology
how words are formed
syntax
how sentences are formed from words
semantics
what sentences mean
pragmatics
how language is used in context
tokenization
splitting an input into pieces (tokens) that correspond to a chosen token type, such as words, sentences, or characters
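Example (a minimal sketch, not a production tokenizer): the input and the token type are both passed in, and the input is split into pieces of that type. The `TOKEN_PATTERNS` table and the `tokenize` helper are hypothetical names chosen for illustration, and the regular expressions are simplifying assumptions.

```python
import re

# Hypothetical mapping from token type to a regex (illustrative assumptions only).
TOKEN_PATTERNS = {
    "word": r"\w+",                 # runs of letters, digits, underscores
    "sentence": r"[^.!?]+[.!?]?",   # text up to and including ., ! or ?
    "character": r"\S",             # every non-whitespace character
}

def tokenize(text, token_type):
    """Split text into pieces that correspond to the requested token type."""
    pattern = TOKEN_PATTERNS[token_type]
    pieces = re.findall(pattern, text)
    # Drop surrounding whitespace and any empty pieces.
    return [p.strip() for p in pieces if p.strip()]

words = tokenize("Cats sleep. Dogs bark!", "word")
sentences = tokenize("Cats sleep. Dogs bark!", "sentence")
```

The same input gives different tokens depending on the type: four word tokens, but two sentence tokens.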
sparsity
when most of the values in the data are zeros
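Example (a sketch under stated assumptions): sparsity can be measured as the share of zero entries. The `doc_term` matrix below is made-up illustration data in which rows stand for documents and columns for term counts; the `sparsity` helper is a hypothetical name.

```python
def sparsity(matrix):
    """Return the share of entries in a list-of-lists matrix that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total

# Made-up document-term counts: 3 documents, 4 terms.
doc_term = [
    [1, 0, 0, 2],
    [0, 0, 1, 0],
    [0, 3, 0, 0],
]

share_of_zeros = sparsity(doc_term)  # 8 of the 12 entries are zero
```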
accuracy
share of correct classifications overall
precision
probability that a positively coded document is relevant
recall
probability that a relevant document is coded positively
F1-Score
harmonic mean of precision and recall: 2 × precision × recall / (precision + recall)
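Example (a minimal sketch): accuracy, precision, recall, and F1 computed from the same set of binary predictions, with F1 taken as the harmonic mean of precision and recall. The labels are made-up illustration data (1 = relevant / coded positively, 0 = not), and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant, coded positive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not relevant, coded positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant, coded negative
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not relevant, coded negative

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up labels: 2 true positives, 1 false positive, 2 false negatives, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here accuracy is 5/8, precision is 2/3, recall is 2/4, and F1 is 4/7, which sits below the arithmetic average of precision and recall because the harmonic mean is pulled toward the smaller value.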
supervised
learning from labeled data: train the algorithm on labeled examples, then use it to label new data
unsupervised
learning from unlabeled data: the algorithm finds groups or structure on its own, without being given labels
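Example (a toy sketch, not a recommended method): the same one-dimensional data handled both ways. The supervised part uses given labels to fit a nearest-centroid rule and then labels a new point; the unsupervised part runs a simple two-cluster k-means that never sees the labels. The data, the function names, and the choice of these two techniques are all assumptions made for illustration.

```python
def fit_centroids(xs, labels):
    """Supervised: average the training points within each given label."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(v) / len(v) for label, v in groups.items()}

def predict(centroids, x):
    """Assign a new point the label of the closest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(xs, iterations=10):
    """Unsupervised: split points into two clusters without any labels."""
    centers = [min(xs), max(xs)]  # simple deterministic starting centers
    assignments = []
    for _ in range(iterations):
        assignments = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                       for x in xs]
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignments) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignments

# Made-up data: three small values and three large ones.
xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(xs, labels)   # uses the labels
prediction = predict(centroids, 4.5)    # label for new, unseen data
clusters = two_means(xs)                # ignores the labels entirely
```

The supervised rule returns a named label for the new point, while the clustering only returns anonymous group numbers; deciding what each group means is left to the analyst.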
independent variables
input features