6. Classification Flashcards

Question 1

Q

What are the components of a classification system?

Answer

A

Components:

Sensing module
Preprocessing module
Feature extraction mechanism
Classifier (with training set if it uses supervised learning)

Question 2

Q

What are features and a feature space? What makes a good feature?

Answer

A

Features are measurable quantities that help distinguish between different classes. A good feature has high discriminatory power: the better it separates the classes, the better the feature.

A feature space is the space spanned by all selected features, with one axis per feature. Each pattern is a point (its feature vector) in this space, so the space can be shown graphically as a scatter plot.

How to know if the selected features are valuable?

If the classes form clusters in the feature space that are:
- Far apart, then the right features have been selected
- Close together or even overlapping, then the selected features have little or no discriminatory power
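One hedged way to put a number on "far apart" is a Fisher-style separation score for a single feature: the squared difference of the class means divided by the sum of the class variances. The sketch below is an illustration only; the function name and the measurements are hypothetical and not part of the original cards.

```python
from statistics import mean, pvariance

def fisher_score(class_a, class_b):
    """Separation of two classes along one feature.

    Squared distance between the class means divided by the summed
    class variances. Assumes at least one class has non-zero spread.
    """
    spread = pvariance(class_a) + pvariance(class_b)
    return (mean(class_a) - mean(class_b)) ** 2 / spread

# Hypothetical measurements of two candidate features for two classes.
length_a, length_b = [1.0, 1.2, 0.8], [5.0, 5.2, 4.8]  # clusters far apart
weight_a, weight_b = [2.0, 3.0, 4.0], [2.5, 3.5, 4.5]  # clusters overlap

good = fisher_score(length_a, length_b)  # large score: useful feature
poor = fisher_score(weight_a, weight_b)  # small score: weak feature
```

A large score means the clusters sit far apart relative to their spread; a score near zero means they overlap.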

Question 3

Q

What are supervised vs unsupervised learning classifiers?

Answer

A

Supervised classification - relies on having a set of examples (feature vectors) whose true class is known (this is machine learning)

- The desired outputs are called targets and are also known
- Training patterns (training feature vectors) are the examples used to train the ML algorithm
- Based on the training set, the algorithm generalizes so that it responds correctly to all possible inputs
- Exs: template matching, decision trees, neural networks, naive Bayes

Unsupervised classification - does not rely on the possession of examples from a known class

- How? identifies similarities between the inputs so that similar inputs are categorized together
- Ex: clustering techniques

Question 4

Q

What features are used for image classification?

Answer

A

There is no perfect, universal set of features for image recognition, however:
- Geometric moments can be used as features for BLOB classification, so long as they are able to distinguish well between the different classes of objects.

However, they only work when the training patterns and the newcomer (the new pattern to classify) have the same size and orientation.
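As a rough illustration, the raw geometric moment of order (p, q) of an image I is the sum over all pixels of x^p * y^q * I(x, y). The sketch below computes it for a small binary BLOB; the image and the function names are made up for the example.

```python
def raw_moment(image, p, q):
    """Raw geometric moment m_pq = sum of x**p * y**q * I(x, y) over all pixels."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )

def centroid(image):
    """Centre of mass of the BLOB: (m10 / m00, m01 / m00)."""
    area = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / area, raw_moment(image, 0, 1) / area

# Hypothetical 4x4 binary image containing a 2x2 BLOB.
blob = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

area = raw_moment(blob, 0, 0)  # m00 is the BLOB area in pixels
centre = centroid(blob)
```

Because m00 grows with the BLOB's area, the same shape at a different size yields different raw moments, which is why size and orientation have to match.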

Question 5

Q

What features are used for speech recognition?

Answer

A

There is no standard set of features suitable for distinguishing between certain words in all ASR (Automatic Speech Recognition) applications.

However, commonly used features include:
- the frequency spectrum, calculated using an FFT on time-domain signals
- spectrograms
- Mel-frequency Cepstral Coefficients (MFCC), based on the cepstrum: a Fourier-type spectrum of the logarithmic amplitude spectrum of a signal (a spectrum of a spectrum), computed on the mel frequency scale
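To make the first feature concrete, here is a minimal sketch of a magnitude spectrum computed with a direct discrete Fourier transform. An FFT produces the same values, only faster; the naive form is used here so the example needs nothing beyond the standard library. The test signal is a hypothetical sine wave, not speech data.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Magnitudes of DFT bins 0..N/2 of a real-valued signal (naive O(N^2) DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        spectrum.append(abs(coeff))
    return spectrum

# Hypothetical signal: a sine wave completing 4 cycles in 32 samples.
n_samples = 32
signal = [math.sin(2 * math.pi * 4 * t / n_samples) for t in range(n_samples)]

spectrum = magnitude_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)  # strongest frequency bin
```

The energy concentrates in the bin matching the sine's frequency. A spectrogram is this kind of spectrum computed over successive short windows of the signal.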

Question 6

Q

What are mel cepstrum coefficients?

Answer

A

The elements of the vector produced by MFCC analysis of a signal. The first 12 to 13 coefficients form the feature vector.

Question 7

Q

What are the principles of rule-based classifiers? Give examples.

Answer

A

The decision rules that distinguish between different classes are a collection of if…then statements. The general form of a rule is: Condition -> y, where y is the class assigned when the condition holds.

Ex:
if (hasHair == yes) -> Mammal
if (hasFeathers == yes) -> Bird
if (hasScales == yes) && (hasGills == yes) -> Fish
if (hasScales == yes) && (hasGills == no) -> Reptile
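The animal rules above could be sketched as a small function. The parameter names mirror the attributes in the rules; the "Unknown" fallback for animals that match no rule is an added assumption, not something the cards specify.

```python
def classify_animal(has_hair=False, has_feathers=False,
                    has_scales=False, has_gills=False):
    """Apply the if...then rules in order and return the first matching class."""
    if has_hair:
        return "Mammal"
    if has_feathers:
        return "Bird"
    if has_scales and has_gills:
        return "Fish"
    if has_scales and not has_gills:
        return "Reptile"
    return "Unknown"  # assumption: no rule fired

example = classify_animal(has_scales=True, has_gills=True)
```

The rules are checked in order, so an animal satisfying several conditions receives the class of the first rule that fires.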

Question 8

Q

What are the principles of template matching? (distance, examples) What are its advantages and disadvantages?

Answer

A

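Each class is represented by a template (a single reference pattern); the templates are stored in a database. A new input is compared with every template using a distance measure and is assigned to the class of the closest template.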

Adv: Very simple and effective

Disadv: Computationally intensive, and less straightforward when inputs and templates don’t have the same size or when other distortions of the characters occur (ex: translation and rotation)
- Improvement: use in combination with other features
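A minimal sketch of template matching on tiny binary character grids follows. It picks the template with the smallest Hamming distance (the number of differing pixels) to the input. The 3x3 templates and the choice of Hamming distance are illustrative assumptions; other distance measures work the same way.

```python
# Hypothetical 3x3 binary templates, one reference pattern per class.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}

def hamming_distance(pattern, template):
    """Number of pixels at which two equally sized grids differ."""
    return sum(
        p != t
        for p_row, t_row in zip(pattern, template)
        for p, t in zip(p_row, t_row)
    )

def match_template(pattern):
    """Return the label of the template closest to the input pattern."""
    return min(TEMPLATES, key=lambda label: hamming_distance(pattern, TEMPLATES[label]))

noisy_t = ["111", "010", "011"]  # a "T" with one flipped pixel
best = match_template(noisy_t)
```

The pixel-by-pixel comparison only makes sense when input and template share the same size and alignment, which is the disadvantage noted above.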

Question 9

Q

What are neural networks? What do their diagrams look like? How do they work as classifiers? Give an example

Answer

A

An (artificial) neural network (NN) is a machine learning system inspired by biology which makes use of a structure of neurons, allowing classification between multiple classes.

A diagram of an NN consists of an input layer connected by weighted links (each weight expresses the importance of an input) to an output layer whose neurons output just 1s and 0s. Multilayer NNs additionally have hidden layers, which have no direct connection to the outside world.

It works by:

- The input layer passes the signals on, and the output layer consists of McCulloch-Pitts neurons that perform the calculations (a weighted sum of their inputs)
- The activation function is a threshold function: a neuron outputs 1 if its weighted sum reaches the threshold and 0 otherwise
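A single McCulloch-Pitts style neuron with a threshold activation can be sketched as below. The weights and threshold are hand-picked (hypothetical) so that the neuron behaves as a logical AND of two binary inputs; in a trained network they would be learned from the training set.

```python
def threshold_neuron(inputs, weights, threshold):
    """Output 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0

# Hand-picked parameters: both inputs must be active for the neuron to fire.
AND_WEIGHTS = [1, 1]
AND_THRESHOLD = 2

truth_table = {
    (a, b): threshold_neuron([a, b], AND_WEIGHTS, AND_THRESHOLD)
    for a in (0, 1)
    for b in (0, 1)
}
```

Lowering the threshold to 1 turns the same neuron into a logical OR, showing how weights and threshold together define the decision boundary.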

Question 10

Q

What is the essence of Bayes approach?

Question 11

Q

How do you calculate P(A|B) using Bayes' rule?

Answer

A

P(A|B) is the conditional probability that event A is true given that B is true, where A and B are random events. Bayes' rule gives: P(A|B) = P(B|A) * P(A) / P(B)
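Bayes' rule states that P(A|B) = P(B|A) * P(A) / P(B). The short worked example below applies it with A = "it is a bird" and B = "it can fly"; all probabilities are made-up numbers chosen only to show the arithmetic.

```python
def bayes_posterior(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical probabilities (illustration only).
p_bird = 0.2                 # P(A)
p_fly_given_bird = 0.9       # P(B|A)
p_fly_given_not_bird = 0.05  # P(B|not A)

# Total probability of B: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_fly = p_fly_given_bird * p_bird + p_fly_given_not_bird * (1 - p_bird)

p_bird_given_fly = bayes_posterior(p_fly_given_bird, p_bird, p_fly)
```

With these numbers P(B) = 0.22, so P(A|B) = 0.18 / 0.22, roughly 0.82: observing that the animal can fly raises the probability of "bird" from 0.2 to about 0.82.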

Question 12

Q

What are the principles of the naïve Bayes classifier?

Answer

A

Bayesian classifiers try to infer the conditional probability that a newcomer belongs to a certain class given its feature vector
    -   Ex: calculating the probability that "it is a bird" given that "it is yellow and it can fly"

Naïve: it makes two assumptions (which do not always hold): the features are mutually independent, and all attributes that influence a classification decision are observable and represented
- Ex: when classifying animals, features such as "it is yellow" and "it can fly" are treated as unrelated to each other
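A compact sketch of a naïve Bayes classifier over binary features follows. For each class it multiplies the class prior by the per-feature conditional probabilities, which is exactly where the independence assumption enters. The training samples are invented, and the add-one (Laplace) smoothing is an extra assumption that keeps unseen feature values from producing a zero probability.

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """samples: list of (feature_dict, label). Returns the counts needed for prediction."""
    class_counts = Counter(label for _, label in samples)
    value_counts = defaultdict(Counter)  # (label, feature name) -> Counter of values
    for features, label in samples:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
    return class_counts, value_counts

def predict(model, features):
    """Return (best label, unnormalised score per label)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for name, value in features.items():
            # P(feature = value | class) with add-one smoothing for binary features
            score *= (value_counts[(label, name)][value] + 1) / (count + 2)
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical training set.
samples = [
    ({"yellow": True, "flies": True}, "bird"),
    ({"yellow": False, "flies": True}, "bird"),
    ({"yellow": True, "flies": False}, "other"),
    ({"yellow": False, "flies": False}, "other"),
    ({"yellow": False, "flies": False}, "other"),
]

model = train_naive_bayes(samples)
label, scores = predict(model, {"yellow": True, "flies": True})
```

The scores are proportional to the posterior probabilities; dividing each by their sum would give P(class | features).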

Question 13

Q

How do Hidden Markov Models work as classifiers?

Answer

A

A Hidden Markov Model (HMM) is a graphical, Bayesian probabilistic reasoning model suitable for modelling dynamic systems (ex: speech recognition, location, activity recognition systems)

At moment t the system is in state x(t); at the next clock tick it jumps into state x(t+1), which can be the same state or another one. Each state depends only on the previous state (not on older ones) and may be hidden, i.e. not directly visible.
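One common way to use HMMs for classification, sketched here as an assumption rather than as the method the cards have in mind, is to keep one model per class and assign an observation sequence to the model that gives it the highest likelihood. The likelihood is computed with the forward algorithm. Both models and all their probabilities below are hypothetical.

```python
def sequence_likelihood(observations, start, transition, emission):
    """Forward algorithm: probability of the observation sequence under one HMM."""
    states = list(start)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emission[s][obs] * sum(alpha[prev] * transition[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Two hypothetical models sharing hidden states X, Y and symbols "a", "b".
START = {"X": 0.5, "Y": 0.5}
EMISSION = {"X": {"a": 0.9, "b": 0.1}, "Y": {"a": 0.1, "b": 0.9}}
MODELS = {
    # tends to stay in the same hidden state
    "sticky": {"X": {"X": 0.9, "Y": 0.1}, "Y": {"X": 0.1, "Y": 0.9}},
    # tends to alternate between hidden states
    "switchy": {"X": {"X": 0.1, "Y": 0.9}, "Y": {"X": 0.9, "Y": 0.1}},
}

def classify(observations):
    """Pick the model (class) under which the sequence is most likely."""
    return max(
        MODELS,
        key=lambda name: sequence_likelihood(observations, START, MODELS[name], EMISSION),
    )

steady_class = classify("aaaa")
alternating_class = classify("abab")
```

Only the emitted symbols are observed; the forward algorithm sums over every possible hidden state path, relying on the property that each state depends only on its predecessor.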

Question 14

Q

What is the principle of unsupervised learning? What is clustering?

Answer

A

Unsupervised classification - does not rely on the possession of examples from a known class

- How? identifies similarities between the inputs so that similar inputs are categorized together
- Ex: clustering techniques, which group the feature vectors into clusters of mutually similar inputs
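As one example of a clustering technique, the sketch below runs k-means on one-dimensional data. k-means is not named in the cards; it is used here only as a familiar illustration. The data, the starting centroids, and the fixed iteration count are all assumptions.

```python
def k_means_1d(values, centroids, iterations=10):
    """Alternate between assigning values to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups.
values = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
```

No class labels are supplied: the grouping emerges purely from the similarity (here, distance) between the inputs.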

Question 15

Q

What is the principle of reinforcement learning?

Question 16

Q

What is a confusion matrix? Give an example (TP, TN, FP, FN)

Answer

A

The confusion matrix is a tool used to summarize the results of a classification test. It counts the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).
TP rate: True-positive Rate aka Detection/Hit Rate (the proportion of positive instances that are correctly classified as positive), TP / (TP + FN)
FP rate: False-positive Rate aka False Alarm Rate (the proportion of negative instances that are erroneously classified as positive), FP / (FP + TN)
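A small sketch that tallies the four cells of a two-class confusion matrix and derives the two rates from them. The labels and predictions are invented for the example.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, TN, FP and FN for a two-class problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts

# Hypothetical test results for a spam filter ("spam" is the positive class).
actual    = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
predicted = ["spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham"]

cm = confusion_counts(actual, predicted, positive="spam")
tp_rate = cm["TP"] / (cm["TP"] + cm["FN"])  # hit rate
fp_rate = cm["FP"] / (cm["FP"] + cm["TN"])  # false alarm rate
```

Here 2 of the 3 spam messages are detected (TP rate of about 0.67) and 1 of the 5 legitimate messages raises a false alarm (FP rate of 0.2).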

(16 cards)