Basic Concepts in Machine Learning Flashcards

1
Q

Learning

A

Learning is the acquisition of new information or knowledge, or the process of acquiring knowledge or a skill by systematic study or by trial and error

2
Q

What is Machine Learning?

A

Machine learning is "the field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel)

3
Q

A machine learning system consists of the following four components:

A
  • Dataset S: a set of samples generated by some system or process; the
    samples can be single data points or pairs of input and output values
  • Model M: an adjustable and compact representation of a certain class of
    input/output relationships that is hypothesized to be capable of modeling the
    system or process that generates S
  • Objective Function L: a function that encodes the current performance of M
    (e.g. loss or reward)
  • Algorithm A: the learning algorithm that adjusts M based on S and L
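The four components can be illustrated with a minimal sketch; the 1-D linear model, the learning rate of 0.05, and the toy data below are illustrative assumptions, not part of the card:

```python
# Dataset S: input/output pairs generated by the process y = 2x
S = [(x, 2.0 * x) for x in [0.0, 1.0, 2.0, 3.0]]

# Model M: a single adjustable parameter w, predicting y = w * x
w = 0.0

# Objective function L: mean squared error of M on S
def loss(w, S):
    return sum((w * x - y) ** 2 for x, y in S) / len(S)

# Algorithm A: gradient descent adjusts M based on S and L
for _ in range(100):
    grad = sum(2 * (w * x - y) * x for x, y in S) / len(S)
    w -= 0.05 * grad

print(round(w, 3))  # w approaches 2.0
```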
4
Q

Machine learning is an important prerequisite for the implementation of a broad range of cognitive functions in artificial cognitive systems:

A
  • Learning and Development: modeling and implementation of biological
    learning mechanisms (operant conditioning, implicit learning, explicit
    learning, perception, etc.)
  • Memory, Knowledge, and Internal Simulation: modeling and implementation
    of the encoding, storage, and retrieval of facts, experiences, and actions
    (e.g. associative memory)
  • Perception: learning basic features to detect and categorize perceptual stimuli
    (e.g. unsupervised learning of visual features)
  • Autonomy: dynamic adaptation to changes in the environment (e.g. continuous
    online learning from a live data stream)
5
Q

Examples of Practical Applications of Machine Learning

A
  • Image classification
  • Speech recognition
  • Autonomous driving
  • Recommendation systems
  • Threat protection
  • Control systems
6
Q

Definition of the Machine Learning Task

A

Train a model M in a hypothesis space H using a learning algorithm A so that
M minimizes the loss L

7
Q

Types of Machine Learning

A

Unsupervised Learning
* Solely unlabeled data
* Discovery of structural features in the data set

Reinforcement Learning
* Interaction with the environment
* Reward signal encodes feedback for the policy

Semi-Supervised Learning
* Labeled and unlabeled training samples
* A priori assumptions on input data required

Supervised Learning
* All training samples are labeled
* Desired output is specified exactly

8
Q

Combining Hypotheses to Ensembles

A

Ensemble methods in machine learning are a simple way to extend hypothesis
spaces by combining a set of hypotheses h1, h2, ..., hn ∈ H into a new hypothesis
h* ∈ Hn.
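A minimal sketch of combining hypotheses into an ensemble by averaging; the three toy hypotheses and the averaging rule are illustrative assumptions (other combination rules, e.g. voting, are equally valid):

```python
# Three simple hypotheses h1..h3 on the same input (toy examples)
def h1(x): return x + 1.0
def h2(x): return 2.0 * x
def h3(x): return x - 0.5

# Ensemble hypothesis h*: the average of the individual predictions
def h_star(x, hypotheses=(h1, h2, h3)):
    return sum(h(x) for h in hypotheses) / len(hypotheses)

print(h_star(2.0))  # (3.0 + 4.0 + 1.5) / 3
```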

9
Q

Boosting

A

Boosting algorithms compute a strong learner by incrementally constructing an
ensemble of hypotheses:
* Every training sample si ∈ S is assigned a weight wi; initially, all weights are
set to the same value
* Weights of incorrectly learned samples are increased
* The training of new hypotheses focuses on samples with high weights
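The weight update described above can be sketched in an AdaBoost-style form; the card does not name a specific algorithm, so the factor alpha and the toy data are assumptions:

```python
import math

weights = [0.25, 0.25, 0.25, 0.25]   # initially all weights are equal
correct = [True, True, False, True]  # hypothesis h got sample 3 wrong

# Weighted error of h and the resulting update factor (AdaBoost-style)
eps = sum(w for w, c in zip(weights, correct) if not c)
alpha = 0.5 * math.log((1 - eps) / eps)

# Increase weights of incorrectly learned samples, decrease the others
weights = [w * math.exp(alpha if not c else -alpha)
           for w, c in zip(weights, correct)]
total = sum(weights)
weights = [w / total for w in weights]  # renormalize to sum to 1
```

After the update, the misclassified sample carries half of the total weight, so the next hypothesis focuses on it.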

10
Q

Underfitting vs. Overfitting

A
  • Underfitting: h fits the training data poorly and does not model the underlying process because H is not expressive enough
  • Overfitting: h fits the training data very well but does not model the underlying process because it does not generalize well
11
Q

Generalization

A

The predictive performance of h on data that were not considered
during the training phase

12
Q

Occam's Razor

A

Of two competing theories, the simpler explanation of an
entity is to be preferred

13
Q

Generative and Discriminative Models

A
  • Discriminative models are based on the posterior probabilities P(y|x)
  • Generative models are based on the class-conditional probabilities P(x|y) and the priors P(y); predictions can be computed by applying Bayes' theorem: P(y|x) = P(x|y)P(y)/P(x).
    Generative models are compact representations of the training data that have considerably fewer parameters than the dataset S
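Bayes' theorem P(y|x) = P(x|y)P(y)/P(x) can be checked with a small numeric example; the prior and likelihood values below are made up for illustration:

```python
# Two classes y in {0, 1} with assumed prior and likelihood values
prior = {0: 0.6, 1: 0.4}          # P(y)
likelihood = {0: 0.2, 1: 0.7}     # P(x|y) for one observed input x

# Evidence P(x), obtained by marginalizing over y
p_x = sum(likelihood[y] * prior[y] for y in prior)

# Posterior P(y|x) for each class via Bayes' theorem
posterior = {y: likelihood[y] * prior[y] / p_x for y in prior}
print(posterior)  # the posteriors sum to 1
```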
14
Q

Training, Validation, and Test

A
  • Training set: the samples used in the training phase by the learning algorithm to search for a hypothesis h in the hypothesis space H
  • Validation set: a set of samples that are used to assess the performance of a
    hypothesis h that was computed in the training phase; based on the performance of h, the parameters of the training phase can be adjusted
  • Test set: a set of samples (or real-world data) that is used to assess the performance of the final model
15
Q

Cross-Validation

A
  • The dataset is partitioned into k subsets, and training is performed in k iterations
  • In every iteration, a different subset is selected as the validation set; the remaining k − 1 subsets form the training set
  • The overall performance corresponds to the averaged performance of the k
    iterations
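The procedure can be sketched as follows; the interleaved partitioning and the placeholder performance measure are illustrative assumptions:

```python
# Minimal k-fold cross-validation sketch
def k_fold(dataset, k):
    folds = [dataset[i::k] for i in range(k)]  # partition into k subsets
    for i in range(k):
        validation = folds[i]                  # one subset for validation
        training = [s for j, f in enumerate(folds) if j != i for s in f]
        yield training, validation

data = list(range(10))
scores = []
for train, val in k_fold(data, 5):
    # placeholder "performance": fraction of the data used for training
    scores.append(len(train) / len(data))

print(sum(scores) / len(scores))  # averaged performance over the k folds
```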
16
Q

Techniques for the Avoidance of Overfitting

A

Regularization
Overfitting is only possible with hypotheses h that are complex enough to capture statistical features that do not explain the data (e.g. noise). A regularization term in the objective function L guides the learning process
towards simpler solutions by penalizing complexity.

More Training Data
Overfitting can be reduced by increasing the size of the
dataset.

Dataset Augmentation
If not enough training data is available, the size of the dataset can be increased
by applying transformations to the training samples (e.g. adding noise, applying shifts,
rotations, etc.).
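The regularization idea can be sketched as an L2 penalty added to the data loss; the penalty form and the trade-off factor lambda_ are illustrative assumptions, since the card does not prescribe a specific regularizer:

```python
# L2-regularized objective: the penalty term punishes large weights,
# guiding the learning process towards simpler solutions
def regularized_loss(weights, data_loss, lambda_=0.1):
    penalty = lambda_ * sum(w * w for w in weights)
    return data_loss + penalty

print(regularized_loss([3.0, -2.0], data_loss=1.0))  # 1.0 + 0.1 * 13 = 2.3
```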

17
Q

The Perceptron

A

The perceptron is a linear classifier that is based on a single neuron with a hard threshold function

18
Q

Perceptron Learning Rule Properties

A
  • If a solution exists, i.e. if the data set is linearly separable, then the perceptron learning algorithm finds a solution within a finite number of steps (perceptron convergence theorem)
  • The solution computed by the algorithm depends on the initialization of the
    parameters and on the order of presentation of the training samples
  • The algorithm does not converge for data sets that are not linearly separable
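The learning rule whose properties are listed above can be sketched as follows; the AND data set, the ±1 labels, and the update w ← w + y·x are a common textbook form assumed here, not taken from the card:

```python
# Perceptron with a hard threshold activation on ±1 labels
def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Linearly separable toy data: the AND function
samples = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = [0.0, 0.0], 0.0

for _ in range(100):  # a finite number of passes suffices (convergence theorem)
    errors = 0
    for x, y in samples:
        if predict(w, b, x) != y:
            # learning rule: w <- w + y*x, b <- b + y
            w = [wi + y * xi for wi, xi in zip(w, x)]
            b += y
            errors += 1
    if errors == 0:  # all samples classified correctly: a solution was found
        break
```

Presenting the samples in a different order (or starting from different weights) yields a different, equally valid separating hyperplane.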
19
Q

Interpolation vs. Regression

A

The interpolation function f(·) must
be consistent with S: f(xi) = yi for all i.

The hypothesis h(·) should minimize L
and generalize well to new samples.

20
Q

Classifying Multiple Classes Problems

A

One-Versus-the-Rest Classifier
* Separation of K classes with K − 1 binary discriminant functions
* Every discriminant function separates one class from all others

One-Versus-One Classifier
* Pairwise separation of K classes with K(K − 1)/2 binary
discriminant functions
* The class of a data sample is assigned by a majority vote

In both cases, some regions are classified ambiguously!
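The number of binary discriminant functions required by the two schemes can be expressed directly:

```python
# One-versus-the-rest: K - 1 binary discriminant functions for K classes
def one_vs_rest(K):
    return K - 1

# One-versus-one: one discriminant per pair of classes, K(K - 1)/2 in total
def one_vs_one(K):
    return K * (K - 1) // 2

print(one_vs_rest(4), one_vs_one(4))  # 3 6
```

For growing K, the one-versus-one scheme requires quadratically many discriminant functions, while one-versus-the-rest stays linear.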
