Multilayer Perceptrons Flashcards

1
Q

Architecture-wise inspirations from biology

A
  1. simple elements
  2. massively parallel systems
  3. low precision and robustness
  4. distributed representation of information
  5. no separation between data and program
2
Q

Learning-wise inspirations from biology (inductive learning)

A
  1. data-driven
  2. adaptive systems: learning and self-organization vs. deduction and programming
  3. biologically inspired learning rules
3
Q

The basic features of multilayer perceptrons

A
  1. The model of each neuron in the network includes a nonlinear activation function that is differentiable.
  2. The network contains one or more layers that are hidden from both the input and output nodes.
  3. The network exhibits a high degree of connectivity, the extent of which is determined by the synaptic weights of the network.
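As a worked illustration of feature 1 (the notation here is assumed, not given on the card): each neuron forms an activation potential from its weighted inputs and passes it through a differentiable nonlinearity such as tanh:

$$v = \sum_i w_i x_i + b, \qquad y = \varphi(v), \qquad \varphi(v) = \tanh(v), \qquad \varphi'(v) = 1 - \tanh^2(v).$$

The differentiability of $\varphi$ is what makes the gradient computations of the later cards possible.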
4
Q

Recurrent neural network applications

A
  1. models of dynamic systems
  2. spatio-temporal pattern analysis
  3. sequence processing
  4. associative memory and pattern completion (Hopfield networks, Boltzmann machines, infinite impulse response (IIR) networks, long short-term memory networks (LSTM))
  5. The brain is a recurrent system (!)
5
Q

Feed-forward neural network applications

A
  1. association between variables
  2. prediction of attributes (Multilayer-Perceptron (MLP), Radial Basis Function (RBF) network, Support Vector Machines (SVM))
6
Q

Prediction of attributes

A

Regression (find the best fit for one or several values) and classification (assign novel data to one of several classes)

7
Q

Error or cost functions

A

The cost function quantifies the cost of a wrong prediction and determines the “goodness” of a solution (the more wrong the prediction, the higher the cost).

8
Q

Cost function types

A

The quadratic error function strikes a balance: it punishes very wrong answers strongly but forgives small errors. The linear error function is similar, but it punishes small errors more and large errors less than the quadratic one. A maximum penalty caps the cost of a wrong prediction; sometimes, small errors can be tolerated and not punished at all. The quadratic error function is the maximum-likelihood estimator for Gaussian noise.
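In formulas (standard textbook forms assumed here, not quoted from the card), with prediction error $e = y - t$ between prediction $y$ and target $t$:

$$E_{\text{quad}}(e) = e^2, \qquad E_{\text{lin}}(e) = |e|, \qquad E_{\text{cap}}(e) = \min(e^2, c), \qquad E_{\varepsilon}(e) = \max(0, |e| - \varepsilon),$$

where $c$ is the maximum penalty and the $\varepsilon$-insensitive form tolerates errors smaller than $\varepsilon$ at no cost.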

9
Q

Generalization error

A

Measures the performance of the model as the average error cost per prediction on novel (unseen) data.
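A common formalization (notation assumed here): the expected cost over the data distribution, estimated in practice by the average cost on $N$ held-out test patterns:

$$E_{\text{gen}} = \mathbb{E}_{(x,t)}\big[e(f(x;w),\,t)\big] \approx \frac{1}{N}\sum_{n=1}^{N} e\big(f(x_n;w),\,t_n\big).$$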

10
Q

Gradient descent

A

An optimization method that interprets the training error as an error landscape over the model parameters w, so that an improvement of the model is achieved by changing w in the direction opposite to the steepest gradient of the error landscape.
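A minimal sketch of the update rule $w \leftarrow w - \eta \nabla E(w)$ (the toy error landscape and learning rate below are illustrative assumptions, not from the card):

```python
import numpy as np

def gradient_descent(grad_E, w0, eta=0.1, n_steps=100):
    """Repeatedly step opposite to the gradient of the error E."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - eta * grad_E(w)  # move against the direction of steepest ascent
    return w

# Toy error landscape E(w) = ||w - w*||^2 with its minimum at w* = (1, -2)
w_star = np.array([1.0, -2.0])
grad_E = lambda w: 2.0 * (w - w_star)

print(gradient_descent(grad_E, w0=[0.0, 0.0]))  # approaches [1, -2]
```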

11
Q

The partial derivative of the individual cost can be split up via ______ into ___________

A
  1. applying the chain rule,
  2. one term which depends on the cost/error function and a second term which depends on the model class
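Written out (notation assumed, for a per-pattern cost $E_n$ and model output $y_n = f(x_n; w)$):

$$\frac{\partial E_n}{\partial w} = \underbrace{\frac{\partial E_n}{\partial y_n}}_{\text{cost/error function}} \cdot \underbrace{\frac{\partial y_n}{\partial w}}_{\text{model class}}.$$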

12
Q

The main problems with the standard gradient descent

A
  1. It does not always converge, and when it does, convergence can be quite slow.
  2. The correct step size along the gradient (the learning rate) is hard to choose.
  3. It easily gets stuck in local minima.
13
Q

Backpropagation

A

The repeated use of the chain rule, with clever notation in the form of recursively defined local errors, to obtain the required gradient. Backpropagation of errors is a computationally efficient method for calculating the derivatives required to determine the parameters of Multilayer Perceptrons (MLPs) via gradient descent.

14
Q

The backpropagation phases (first)

A
In the forward phase, the synaptic weights of the network are fixed and the input signal is propagated through the network, from parents to children, until it reaches the output. The function signals of the network are computed on a neuron-by-neuron basis; thus, in this phase, changes are confined to the activation potentials and outputs of the neurons in the network. Forward propagation step: calculation of activities.
15
Q

The backpropagation phases (second)

A
In the backward phase, an error signal is produced by comparing the output of the network with a desired response. The resulting error signal is propagated through the network, but this time in the backward direction, from children to parents. In this second phase, successive adjustments are made to the synaptic weights of the network. The backward pass starts at the output layer and passes the error signals leftward through the network, layer by layer, recursively computing the δ (i.e., the local gradient) for each neuron. This recursive process permits the synaptic weights of the network to undergo changes in accordance with the delta rule. Backpropagation step: calculation of "local errors".
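A minimal numpy sketch of both phases for a one-hidden-layer MLP with tanh hidden units, a linear output, and quadratic cost (architecture, initialization, and learning rate are illustrative assumptions, not from the cards):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MLP: 2 inputs -> 3 hidden units (tanh) -> 1 linear output
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)
eta = 0.1  # learning rate (assumed)

def train_step(x, t):
    global W1, b1, W2, b2
    # --- forward phase: weights fixed, activities computed neuron by neuron
    v1 = W1 @ x + b1            # activation potentials of the hidden layer
    y1 = np.tanh(v1)            # hidden-layer outputs (function signals)
    y2 = W2 @ y1 + b2           # network output (linear output unit)

    # --- backward phase: propagate local errors (deltas) from output to input
    delta2 = y2 - t                          # output delta for quadratic cost
    delta1 = (W2.T @ delta2) * (1 - y1**2)   # chain rule through tanh'

    # delta rule: adjust the synaptic weights using the local gradients
    W2 -= eta * np.outer(delta2, y1); b2 -= eta * delta2
    W1 -= eta * np.outer(delta1, x);  b1 -= eta * delta1

# e.g. fit a single input/target pattern
for _ in range(200):
    train_step(np.array([0.5, -1.0]), np.array([0.3]))
```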
16
Q

Cross-validation

A
  1. split the data set into n subsets Dj
  2. train the model on all patterns except those in one subset
  3. evaluate on the held-out subset
  4. repeat and average over the results from the different subsets
  5. when hyperparameters for a model have to be selected, this should also be done within the cross-validation loop (see the sketch after this list)
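A minimal sketch of the loop in plain numpy (the fold count and the model interface `fit`/`error` are assumptions for illustration, not a fixed API):

```python
import numpy as np

def cross_validate(X, T, make_model, n_folds=5, seed=0):
    """Estimate the generalization error by n-fold cross-validation."""
    rng = np.random.default_rng(seed)
    # 1. split the (shuffled) data set into n index subsets D_j
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for j, test_idx in enumerate(folds):
        # 2. train on all patterns except those in subset D_j
        train_idx = np.concatenate([f for i, f in enumerate(folds) if i != j])
        model = make_model()  # fresh model per fold (assumed factory)
        model.fit(X[train_idx], T[train_idx])
        # 3. evaluate on the held-out subset D_j
        errors.append(model.error(X[test_idx], T[test_idx]))
    # 4. average over the results from the different subsets
    return np.mean(errors)
```

Any hyperparameter search would wrap this function, so that model selection also happens inside the cross-validation loop (point 5).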