BIS II - Introduction to ANN Flashcards
ANN Applications:
- Spam detection, e.g. in emails
- Time series prediction, e.g. in the financial domain
- Pattern recognition, e.g. objects, faces, etc.
- Computer games, e.g. Go
Basic concept of ANN
- Computer technology that attempts to build computers that operate like a human brain
- The machines possess simultaneous memory storage and work with ambiguous information
- Perceptron: early neural network structure that uses no hidden layer
Basic concepts of Neural Networks – A Neuron
- Neurons: cells (processing elements) of a biological or artificial neural network
- Nucleus: the central processing portion of a neuron
- Dendrite: the part of a biological neuron that provides inputs to the cell
- Axon: an outgoing connection (i.e. terminal) from a biological neuron
- Synapse: the connection (where the weights are) between processing elements in a neural network
Information Processing in ANN – Process Steps
Different inputs are weighted: each input consists of the output of the sending unit and the weight between the sending and receiving units
Propagation function: the propagation (summation) function determines how the net input is computed; usually a linear combination is used: net_j = Σ_i (o_i · w_ij), i.e. the sum over all sending units i of output o_i times connection weight w_ij
Transformation function (also: activation/transfer function)
- Computes the internal stimulation (activity level) of the neuron. Based on this level, the neuron may or may not produce an output (fire)
- The relationship between the internal activation level and the output can be linear or nonlinear
5 types of activation functions:
1. Linear activation function
2. Linear threshold function
3. Binary threshold function
4. Sigmoid activation functions
   - Logistic
   - Hyperbolic tangent (tanh)
5. Normally distributed activation function
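The five types above can be sketched as follows; the concrete parameters (threshold theta, slope a, saturation max_out, Gaussian mu/sigma) are illustrative choices of this sketch, not values from the notes.

```python
import math

def linear(x, a=1.0):
    """1. Linear: output proportional to the net input."""
    return a * x

def linear_threshold(x, theta=0.0, a=1.0, max_out=1.0):
    """2. Linear threshold: zero below theta, linear above, then saturates."""
    return min(max(a * (x - theta), 0.0), max_out)

def binary_threshold(x, theta=0.0):
    """3. Binary threshold: fires (1) only at or above the threshold theta."""
    return 1.0 if x >= theta else 0.0

def logistic(x):
    """4a. Sigmoid (logistic): smooth, differentiable, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_act(x):
    """4b. Sigmoid (hyperbolic tangent): output in (-1, 1)."""
    return math.tanh(x)

def gaussian(x, mu=0.0, sigma=1.0):
    """5. Normally distributed (Gaussian): strongest response near mu."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
```

The sigmoid variants matter later: backpropagation and gradient descent require differentiable activation functions, which the binary threshold is not.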
Output:
- Sometimes a threshold function is used
- Most software packages do not distinguish between activation level and output function
Connection Weights:
- Are associated with each link in a neural network model and express the relative strength of the input data
- Are determined by the neural network's learning algorithms
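The full processing chain above (weighted inputs, propagation function, transformation function, output) can be sketched for a single artificial neuron; the function names and the example inputs/weights are illustrative, not from the notes.

```python
import math

def neuron_output(inputs, weights, activation):
    # Propagation function: linear combination of inputs and connection weights.
    net_input = sum(i * w for i, w in zip(inputs, weights))
    # Transformation (activation/transfer) function applied to the net input.
    return activation(net_input)

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Example: two inputs with their connection weights.
out = neuron_output([1.0, 0.5], [0.8, -0.2], logistic)
```

As the notes say, most software packages do not separate the activation level from a distinct output function, so here the activation value is returned directly as the output.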
Further Concepts in ANN – Topologies
- Topology: the way in which neurons are organized in a neural network
- Network structure: (three layers)
1. Input layer
1. Each input corresponds to a single attribute, e.g. income level, age, etc.
2. Several types of data can be used
3. Preprocessing may be needed to convert the data into meaningful inputs
2. Hidden layers
1. The middle layer(s) of an ANN that has three or more layers
2. Each additional layer increases the training effort exponentially
3. Output layer
1. Contains the solution to a problem
2. Purpose of the network is to compute the output values
Basic Concepts of ANN: NN architectures
- Feedforward with backpropagation
- Associative memory
- Recurrent network
- Kohonen’s self-organizing maps
- Hopfield networks
Learning in ANN: Learning Process
- Learning:
1. Done by comparing computed outputs to desired outputs of historical cases
2. Is defined as a change of weights between units
- The process of learning involves three tasks:
1. Compute temporary outputs
2. Compare outputs with desired targets
3. Adjust the weights and repeat the process if the desired output is not achieved
Basic Concepts of Learning in ANN
- Incremental Learning
- Batch Training
Types of Learning
- Supervised learning: method of training ANN in which sample cases are shown to the network as input and the weights are adjusted to minimize the error in its outputs
- Unsupervised learning: a method of training ANN in which only input stimuli are shown to the network, which is self-organizing
- Reinforcement learning: in contrast to supervised learning, the desired outputs are not shown to the network, only feedback on whether the output was correct or not
- Direct design methods (hardwired systems): the weights are not adjusted. There is no learning in terms of weights modification
Time of learning:
Incremental learning:
- Weights are adjusted after each sample case
- Also called pattern-by-pattern or online learning
Batch training:
- Weights are adjusted after all sample cases have been put into ANN
- Also called epoch or offline learning
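The difference between the two timings can be sketched for a single linear unit; the learning rate of 0.1 and the toy samples are illustrative assumptions.

```python
def incremental_epoch(weights, samples, lr=0.1):
    """Incremental (online): adjust weights after EACH sample case."""
    w = list(weights)
    for inputs, target in samples:
        out = sum(x * wi for x, wi in zip(inputs, w))
        err = target - out
        w = [wi + lr * err * x for wi, x in zip(w, inputs)]
    return w

def batch_epoch(weights, samples, lr=0.1):
    """Batch (epoch): accumulate changes over ALL sample cases, adjust once."""
    w = list(weights)
    deltas = [0.0] * len(w)
    for inputs, target in samples:
        out = sum(x * wi for x, wi in zip(inputs, w))
        err = target - out
        deltas = [d + lr * err * x for d, x in zip(deltas, inputs)]
    return [wi + d for wi, d in zip(w, deltas)]

samples = [([1.0], 1.0), ([1.0], 1.0)]
w_online = incremental_epoch([0.0], samples)  # [0.19]: second sample sees updated weight
w_batch = batch_epoch([0.0], samples)         # [0.2]: both samples see the old weight
```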
Learning rules of ANN
- Learning algorithm: the training procedure used by an ANN
- Overview of learning rules in ANN:
- Delta rule
- Gradient descent
- Backpropagation
- Hebbian rule (what fires together, wires together)
- Competitive learning
Delta Rule & Propositional Calculus
- Special form of the steepest gradient descent approach
- Also called Widrow-Hoff rule (1960) or Least Mean Square (LMS) rule
- Based on a comparison of desired and calculated output
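A minimal sketch of the delta (Widrow-Hoff/LMS) rule for a single linear unit: each weight moves proportionally to the difference between desired and calculated output. The learning rate, epoch count, bias encoding (x0 = 1), and the 0.5 decision threshold are illustrative assumptions.

```python
def delta_rule_train(samples, n_weights, lr=0.1, epochs=100):
    w = [0.0] * n_weights
    for _ in range(epochs):
        for inputs, target in samples:
            output = sum(x * wi for x, wi in zip(inputs, w))  # calculated output
            error = target - output                           # desired - calculated
            # Delta rule: w_i <- w_i + lr * error * x_i
            w = [wi + lr * error * x for wi, x in zip(w, inputs)]
    return w

# Learn the AND function; the first input is a constant bias (x0 = 1).
AND = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w = delta_rule_train(AND, 3)

# Thresholding the linear output at 0.5 recovers the AND truth table.
preds = [1 if sum(x * wi for x, wi in zip(inputs, w)) >= 0.5 else 0
         for inputs, target in AND]
```

This works because AND is linearly separable; the next section shows why the same single-unit approach fails for XOR.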
Learning in ANN: XOR function and linear separability
- A single neuron (perceptron) represents a hyperplane in instance space
- Linear separability: the classes can be divided by such a hyperplane (illustrated graphically in the original slide)
- A single perceptron is not sufficient to build an XOR classifier -> multilayer perceptron is needed
- The three operations AND, OR, and NOT can each be represented using a perceptron -> any expression from propositional calculus can be converted into a multilayer perceptron, e.g. x1 XOR x2 = (x1 OR x2) AND NOT (x1 AND x2)
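The construction above can be sketched directly: binary threshold perceptrons for OR and AND feed a third perceptron that realizes the AND NOT combination. The specific weights and thresholds are illustrative; many other choices work.

```python
def perceptron(inputs, weights, theta):
    """Binary threshold unit: fires iff the weighted sum reaches theta."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= theta else 0

def xor(x1, x2):
    # Layer 1: one perceptron for OR, one for AND.
    or_out = perceptron([x1, x2], [1, 1], theta=0.5)
    and_out = perceptron([x1, x2], [1, 1], theta=1.5)
    # Layer 2: fires when OR is active but AND is not (NOT via the -1 weight),
    # i.e. (x1 OR x2) AND NOT (x1 AND x2).
    return perceptron([or_out, and_out], [1, -1], theta=0.5)
```

No single perceptron can do this, because no single hyperplane separates {(0,1), (1,0)} from {(0,0), (1,1)}.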
Learning in ANN: Backpropagation
The delta rule applies only to ANNs without hidden layers
- For some problems, ANN with hidden layers are needed, e.g. for solving the XOR problem
- Problem: the desired activation levels (outputs) of hidden layers are unknown
- Backpropagation:
1. The error (similar to the delta rule) is propagated back through the network
2. This makes the calculation of the weight changes for hidden layers possible as well
Steps in Backpropagation:
- Initialize weights with random values and set other parameters
- Read in the input vector and the desired output
- Compute the actual output via the calculations, working forward through the layers (Forward-pass)
- Compute the error
- Change the weights by working backward from the output layers through the hidden layers (Backward-Pass)
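The steps above can be sketched for a tiny 2-2-1 sigmoid network performing one training iteration. The weights are fixed here instead of random so the example is reproducible (step 1 of the real procedure would initialize them randomly); the network size, learning rate, and sample are illustrative assumptions.

```python
import math

def sig(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w_hid, w_out, x, target, lr=0.5):
    # Forward pass: compute the actual output, working forward through the layers.
    h = [sig(sum(xi * wij for xi, wij in zip(x, col))) for col in w_hid]
    o = sig(sum(hi * wi for hi, wi in zip(h, w_out)))
    # Error term at the output unit (scaled by the sigmoid's slope).
    delta_o = (target - o) * o * (1 - o)
    # Backward pass: propagate the error back to the hidden units.
    delta_h = [w_out[j] * delta_o * h[j] * (1 - h[j]) for j in range(len(h))]
    # Change the weights, working backward from the output layer.
    new_w_out = [wi + lr * delta_o * hi for wi, hi in zip(w_out, h)]
    new_w_hid = [[wij + lr * delta_h[j] * xi for xi, wij in zip(x, col)]
                 for j, col in enumerate(w_hid)]
    return new_w_hid, new_w_out, (target - o) ** 2

w_hid = [[0.1, -0.2], [0.3, 0.4]]  # fixed "initial" weights per hidden unit
w_out = [0.5, -0.5]
w_hid, w_out, err1 = train_step(w_hid, w_out, [1.0, 0.0], 1.0)
w_hid, w_out, err2 = train_step(w_hid, w_out, [1.0, 0.0], 1.0)
```

After the first backward pass the squared error on the same sample (err2) is already smaller than before (err1); in a full run this loop repeats over all samples until the error is below the acceptable level.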
Learning in ANN: Gradient Descent – Limitations & Workarounds
- Find the combination of all weights w, so that the sum of the squared error F is minimized
- Problem: high computational complexity in high dimensional spaces
- Solution: steepest gradient descent method
- The gradient is an n-dimensional vector of partial derivatives (one component per weight)
- The negative gradient gives the direction where to move in the next iteration
- Premise for usage: differentiable propagation, activation, and output functions
- Limitations of gradient descent:
1) Local minimum
2) Plateaus
3) Skipping of minima
4) Direct oscillation
5) Indirect oscillation
6) Saddle points
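The method and the "skipping of minima / oscillation" limitation can be sketched on a toy one-dimensional error surface F(w) = (w - 3)^2; the surface, learning rates, and step counts are illustrative assumptions.

```python
def gradient_descent(w, lr, steps):
    for _ in range(steps):
        grad = 2 * (w - 3)  # partial derivative dF/dw of F(w) = (w - 3)^2
        w = w - lr * grad   # move in the direction of the negative gradient
    return w

w_good = gradient_descent(0.0, lr=0.1, steps=100)  # converges toward w = 3
w_bad = gradient_descent(0.0, lr=1.1, steps=100)   # overshoots the minimum every step
```

With lr = 0.1 each step shrinks the distance to the minimum; with lr = 1.1 each step jumps past the minimum to a point farther away, so the iterates oscillate with growing amplitude, illustrating why the learning rate is a key tuning parameter.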
Workaround for Limitations of Gradient Descent
1. Change initial weights
2. Change the starting point of the gradient descent approach
3. Change the type of initialization
4. Change learning parameters
1. Increase learning rate
2. Decrease learning rate
3. Vary learning rates
5. Define different learning rates for different layers
6. Insert momentum parameter
7. Apply simulated annealing
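Workaround 6 (momentum) can be sketched on the same kind of toy surface: the previous weight change, scaled by a momentum parameter alpha, is added to the current one, which damps oscillation and speeds travel across plateaus. The values of lr, alpha, and the surface F(w) = (w - 3)^2 are illustrative assumptions.

```python
def momentum_descent(w, lr=0.05, alpha=0.9, steps=200):
    velocity = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)                    # dF/dw for F(w) = (w - 3)^2
        velocity = alpha * velocity - lr * grad  # keep a fraction of the last change
        w = w + velocity
    return w
```

With alpha = 0 this reduces to plain gradient descent; larger alpha lets successive small steps in the same direction accumulate.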
Development Process of an ANN
- Data Collection and preparation: data used for training and testing must include all the attributes that are useful for solving the problem
- Separate into training and test set: split available dataset into two subsets, one used for learning, the other for testing the learned model
- Selection of network structure: selection of a topology, i.e. the way in which neurons are organized in a neural network. Determination of:
o Input nodes & output nodes
o # of hidden layers & # of hidden nodes
- Learning algorithm selection: identify a set of connection weights that best cover the training data and have the best predictive accuracy
- Set parameters and values, initialize weights: set parameters such as weights, learning rate, etc.
- Transform data into network outputs: define transfer function, e.g. sigmoid
- Network training: an iterative process that starts from a random set of weights and gradually improves the fit of the network model to the known data set. The iteration continues until the error sum converges to below a preset acceptable level.
- Testing:
o Comparing test results to actual results
o Test plan should include routine cases and potentially problematic situations
o If the testing reveals large deviations, the training set must be reexamined, and the training process may have to be repeated
- Implementation of an ANN:
o Often requires interfaces with other computer-based information systems and user training
o Ongoing monitoring and feedback to the developers are recommended for system improvements and to ensure that the system is accepted and used properly
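The "separate into training and test set" step of the development process can be sketched as follows; the 70/30 split ratio and the fixed shuffling seed are illustrative assumptions, not values from the notes.

```python
import random

def train_test_split(dataset, train_fraction=0.7, seed=42):
    """Split a dataset into a training subset (for learning the weights)
    and a test subset (for checking predictive accuracy on unseen cases)."""
    data = list(dataset)
    random.Random(seed).shuffle(data)  # shuffle so the split is not ordered
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

train_set, test_set = train_test_split(range(10))
```

Keeping the test cases out of training is what makes the later testing step a fair comparison of test results to actual results.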