Classification Flashcards
When is classification in modeling useful? (read: used)
When the response (outcome Variable) is nominal or categorial.
Must there be an order of the levels in the categorial response in order to use classification?
No, there must be no order in the levels of the categorial response, otherwise other techniques may and should be applied.
In classification, what is the vocabulary?
The vocabulary is the set of Classes C that the response value can be devided into. f.i. {yes, no} or {bus, tram, bike, train}
What does a classifier do?
It takes as an input the features (IV’s) and predicts what class (C) most likely belongs to categorial resonse Y.
What is Multi-Label Classification?
When the instances we test may be associated with more than one class.
f.i. an image containing a house and a dog may be classified as BOTH a house and a dog.
What is Hierarchical or Multilevel Classification?
The classes in set C can be divided into subclasses, so it hierarchical.
What is Structured Classification?
If there is structure in the input, there must also be a structure in the output.
f.i. each word of a sentence can be classed as {noun, verb, etc} but a sentence has to contain a certain structure to be comprehendable.
In Binaray or Multiclass Classification, how do we calculate Accuracy?
Where TP/TN means True Positive/Negative
Is accuracy a good metric for classification?
Only if both cases are well represented, this is because of imbalance.
In classification, what is imbalance?
If one response appears more frequent than the other responses in our data, there is an natural skew in the classifier towards classifying this response class.
In classification, what is recall and how do we calculate it?
fraction of predicted that are correct.
(TP/NF) = true positive/ negative false
In classification, what is precision and how do we calculate it?
fraction of predicted that are correct.
(TP/NF) = true positive/ negative false
In classification, what is the F-score?
And what does it’s value represent?
The F-score is an harmonic mean to balance Precision and Recall.
A high F-score means both P and R are high as well, generally a good thing.
For what types of classification can we use classification as regression?
Only binary classification. However for classification as logistic regression the multi levels can be seperated as well.
In classification as regression, what type of regression is useful?
Logisitc Regression, as the sponse space ensures the probabilty of the link space between the binary values of 0 and 1, hence the threshold is always crossed.