Multilayer Perceptrons Flashcards

1
Q

Architecture-wise inspirations from biology

A
  1. simple elements
  2. massively parallel systems
  3. low precision and robustness
  4. distributed representation of information
  5. no separation between data and program
2
Q

Learning-wise inspirations from biology (inductive learning)

A
  1. data-driven
  2. adaptive systems: learning and self-organization vs. deduction and programming
  3. biologically inspired learning rules
3
Q

The basic features of multilayer perceptrons

A
  1. The model of each neuron in the network includes a nonlinear activation function that is differentiable.
  2. The network contains one or more layers that are hidden from both the input and output nodes.
  3. The network exhibits a high degree of connectivity, the extent of which is determined by the synaptic weights of the network.
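As a worked illustration of feature 1 (the notation here is assumed, not given on the card): each neuron forms an activation potential from its weighted inputs and passes it through a differentiable nonlinearity such as tanh:

$$v = \sum_i w_i x_i + b, \qquad y = \varphi(v), \qquad \varphi(v) = \tanh(v), \qquad \varphi'(v) = 1 - \tanh^2(v).$$

The differentiability of $\varphi$ is what makes the gradient computations of the later cards possible.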
4
Q

Recurrent neural network applications

A
  1. models of dynamic systems
  2. spatio-temporal pattern analysis
  3. sequence processing
  4. associative memory and pattern completion (Hopfield networks, Boltzmann machines, infinite impulse response (IIR) networks, long short-term memory networks (LSTM))
  5. The brain is a recurrent system (!)
5
Q

Feed-forward neural network applications

A
  1. association between variables
  2. prediction of attributes (Multilayer-Perceptron (MLP), Radial Basis Function (RBF) network, Support Vector Machines (SVM))
6
Q

Prediction of attributes

A

Regression (find the best fit for one or several values) and classification (assign novel data to one of several classes)

7
Q

Error or cost functions

A

The cost function quantifies the cost of a wrong prediction and determines the “goodness” of a solution (the more wrong the prediction, the higher the cost).

8
Q

Cost function types

A

The quadratic error function strikes a balance: it punishes very wrong answers strongly but forgives small errors. The linear error function is similar, but it punishes small errors more and large errors less than the quadratic one. A maximum penalty caps the cost of a wrong prediction; sometimes, small errors can be tolerated and not punished at all. The quadratic error function is the maximum-likelihood estimator for Gaussian noise.
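In formulas (standard textbook forms assumed here, not quoted from the card), with prediction error $e = y - t$ between prediction $y$ and target $t$:

$$E_{\text{quad}}(e) = e^2, \qquad E_{\text{lin}}(e) = |e|, \qquad E_{\text{cap}}(e) = \min(e^2, c), \qquad E_{\varepsilon}(e) = \max(0, |e| - \varepsilon),$$

where $c$ is the maximum penalty and the $\varepsilon$-insensitive form tolerates errors smaller than $\varepsilon$ at no cost.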

9
Q

Generalization error

A

Measures the performance of the model as the average error cost per prediction on novel (unseen) data.
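A common formalization (notation assumed here): the expected cost over the data distribution, estimated in practice by the average cost on $N$ held-out test patterns:

$$E_{\text{gen}} = \mathbb{E}_{(x,t)}\big[e(f(x;w),\,t)\big] \approx \frac{1}{N}\sum_{n=1}^{N} e\big(f(x_n;w),\,t_n\big).$$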

10
Q

Gradient descent

A

An optimization method that interprets the training error as an error landscape over the model parameters w, so that an improvement of the model is achieved by changing w in the direction opposite to the steepest gradient of the error landscape.
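A minimal sketch of the update rule $w \leftarrow w - \eta \nabla E(w)$ (the toy error landscape and learning rate below are illustrative assumptions, not from the card):

```python
import numpy as np

def gradient_descent(grad_E, w0, eta=0.1, n_steps=100):
    """Repeatedly step opposite to the gradient of the error E."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - eta * grad_E(w)  # move against the direction of steepest ascent
    return w

# Toy error landscape E(w) = ||w - w*||^2 with its minimum at w* = (1, -2)
w_star = np.array([1.0, -2.0])
grad_E = lambda w: 2.0 * (w - w_star)

print(gradient_descent(grad_E, w0=[0.0, 0.0]))  # approaches [1, -2]
```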

11
Q

The partial derivative of the individual cost can be split up via ______ into ___________

A
  1. applying the chain rule,
  2. one term which depends on the cost/error function and a second term which depends on the model class
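Written out (notation assumed, for a per-pattern cost $E_n$ and model output $y_n = f(x_n; w)$):

$$\frac{\partial E_n}{\partial w} = \underbrace{\frac{\partial E_n}{\partial y_n}}_{\text{cost/error function}} \cdot \underbrace{\frac{\partial y_n}{\partial w}}_{\text{model class}}.$$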

12
Q

The main problems with the standard gradient descent

A
  1. It does not always converge, and when it does, convergence can be quite slow.
  2. The correct step size along the gradient (the learning rate) is hard to choose.
  3. It easily gets stuck in local minima.
13
Q

Backpropagation

A

The repeated use of the chain rule, with clever notation in the form of recursively defined local errors, to obtain the required gradient. Backpropagation of errors is a computationally efficient method for calculating the derivatives required to determine the parameters of Multilayer Perceptrons (MLPs) via gradient descent.

14
Q

The backpropagation phases (first)

A
In the forward phase, the synaptic weights of the network are fixed and the input signal is propagated through the network, from parents to children, until it reaches the output. The function signals of the network are computed on a neuron-by-neuron basis; thus, in this phase, changes are confined to the activation potentials and outputs of the neurons in the network. Forward propagation step: calculation of activities.
15
Q

The backpropagation phases (second)

A
In the backward phase, an error signal is produced by comparing the output of the network with a desired response. The resulting error signal is propagated through the network, but this time in the backward direction, from children to parents. In this second phase, successive adjustments are made to the synaptic weights of the network. The backward pass starts at the output layer and passes the error signals leftward through the network, layer by layer, recursively computing the δ (i.e., the local gradient) for each neuron. This recursive process permits the synaptic weights of the network to undergo changes in accordance with the delta rule. Backpropagation step: calculation of "local errors".
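A minimal numpy sketch of both phases for a one-hidden-layer MLP with tanh hidden units, a linear output, and quadratic cost (architecture, initialization, and learning rate are illustrative assumptions, not from the cards):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MLP: 2 inputs -> 3 hidden units (tanh) -> 1 linear output
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)
eta = 0.1  # learning rate (assumed)

def train_step(x, t):
    global W1, b1, W2, b2
    # --- forward phase: weights fixed, activities computed neuron by neuron
    v1 = W1 @ x + b1            # activation potentials of the hidden layer
    y1 = np.tanh(v1)            # hidden-layer outputs (function signals)
    y2 = W2 @ y1 + b2           # network output (linear output unit)

    # --- backward phase: propagate local errors (deltas) from output to input
    delta2 = y2 - t                          # output delta for quadratic cost
    delta1 = (W2.T @ delta2) * (1 - y1**2)   # chain rule through tanh'

    # delta rule: adjust the synaptic weights using the local gradients
    W2 -= eta * np.outer(delta2, y1); b2 -= eta * delta2
    W1 -= eta * np.outer(delta1, x);  b1 -= eta * delta1

# e.g. fit a single input/target pattern
for _ in range(200):
    train_step(np.array([0.5, -1.0]), np.array([0.3]))
```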
16
Q

Cross-validation

A
  1. split the data set into n subsets Dj
  2. train the model on all patterns except those in one subset
  3. evaluate on the held-out subset
  4. repeat and average over the results from the different subsets
  5. when hyperparameters for a model have to be selected, this should also be done within the cross-validation loop (see the sketch after this list)
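A minimal sketch of the loop in plain numpy (the fold count and the model interface `fit`/`error` are assumptions for illustration, not a fixed API):

```python
import numpy as np

def cross_validate(X, T, make_model, n_folds=5, seed=0):
    """Estimate the generalization error by n-fold cross-validation."""
    rng = np.random.default_rng(seed)
    # 1. split the (shuffled) data set into n index subsets D_j
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for j, test_idx in enumerate(folds):
        # 2. train on all patterns except those in subset D_j
        train_idx = np.concatenate([f for i, f in enumerate(folds) if i != j])
        model = make_model()  # fresh model per fold (assumed factory)
        model.fit(X[train_idx], T[train_idx])
        # 3. evaluate on the held-out subset D_j
        errors.append(model.error(X[test_idx], T[test_idx]))
    # 4. average over the results from the different subsets
    return np.mean(errors)
```

Any hyperparameter search would wrap this function, so that model selection also happens inside the cross-validation loop (point 5).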