Neural Networks Flashcards
What is a perceptron?
An artificial neuron that takes in many input signals and produces a single binary output signal (0 or 1)
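A minimal sketch of this computation (the weights, bias, and AND example here are illustrative, not part of the definition):

```python
def perceptron(inputs, weights, bias):
    # Weighted sum of inputs plus bias, thresholded to a binary output.
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

# Example: weights/bias chosen by hand so the perceptron computes logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5
print(perceptron([1, 1], and_weights, and_bias))  # 1
print(perceptron([0, 1], and_weights, and_bias))  # 0
```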
Explain the role of bias terms in a NN
- bias terms add a level of flexibility and adaptability to the model.
- they “shift” the activation function, providing every neuron with a trainable constant value, in addition to the inputs
Perceptron convergence theorem
If the data is linearly separable and therefore a set of weights exist that are consistent with the data, then the Perceptron algorithm will eventually converge to a consistent set of weights
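A sketch of the perceptron update rule on linearly separable data (AND), assuming a standard learning-rate update; on separable data the error count reaches zero and the loop stops, as the theorem guarantees:

```python
def train_perceptron(data, lr=0.1, epochs=100):
    # data: list of ((x1, x2), target) pairs; bias trained alongside the weights.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred
            if err != 0:
                errors += 1
                w[0] += lr * err * x1
                w[1] += lr * err * x2
                b += lr * err
        if errors == 0:  # converged: every example classified correctly
            break
    return w, b

# AND is linearly separable, so training converges to consistent weights.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
```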
Perceptron cycling theorem
If the data is not linearly separable, the Perceptron algorithm will eventually repeat a set of weights and threshold at the end of some epoch and therefore enter an infinite loop
What is the main limitation of a Single Perceptron, also give a solution.
Can only handle linearly separable problems
Solution: use a multi-layer perceptron (MLP)
What is impressive about an MLP with just one hidden layer
By the universal approximation theorem, a single hidden layer (with enough hidden units) is enough to represent an approximation of any continuous function to an arbitrary degree of accuracy
Why multilayer NN over a single layer?
- shallow NNs may need a very wide architecture (many hidden units) to match what a deeper network can represent
- shallow networks are more prone to overfitting
What is the purpose of the cost function in a NN
Also known as the loss function, it quantifies the inconsistency between predicted values and the corresponding correct values
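As a concrete example, one common choice of cost function is mean squared error; the values below are illustrative:

```python
def mse(predictions, targets):
    # Mean squared error: average squared difference between
    # predicted values and the corresponding correct values.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

print(mse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))  # 0.5/3 ≈ 0.1667
```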
Explain the role of activation functions in NN
They play a crucial role by introducing non-linearities to the model, which are essential for enabling NN to learn complex patterns in the data
If you were to simply use the identity function (f(x) = x) as an activation function, what is the class of functions that you will be restricted to, in terms of learning?
linear functions
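A quick illustrative check of why: composing layers with identity activations collapses into a single linear map, so depth adds no expressive power (weights here are random, for demonstration only):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))

def two_layer_identity(x):
    # Two layers with identity activation: f(x) = W2 @ (W1 @ x)
    return W2 @ (W1 @ x)

# The composition equals one linear layer with weights W2 @ W1.
W_combined = W2 @ W1
x = rng.normal(size=2)
assert np.allclose(two_layer_identity(x), W_combined @ x)
```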
Explain the difference between two common activation functions of your choice
Sigmoid vs TanH
1. Output Range:
- Sigmoid: (0,1): used for binary classification
- tanh: (-1, 1): suitable for zero-centred data
2. Symmetry:
- Sigmoid is asymmetric, biased towards positive values
- tanh is symmetric around the origin (0, 0)
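The two properties above can be checked numerically (a small sketch using the standard definitions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Sigmoid maps into (0, 1) and its outputs are always positive;
# tanh maps into (-1, 1) and is odd: tanh(-x) == -tanh(x).
print(sigmoid(0.0))    # 0.5 (midpoint of (0, 1))
print(math.tanh(0.0))  # 0.0 (zero-centred)
print(math.tanh(-2.0), math.tanh(2.0))  # a symmetric pair around 0
```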
Advantages of NNs
- Can learn and model non-linear and complex relationships.
- Work well even when training data is noisy or inaccurate.
- Fast performance once a network is trained
Disadvantages of NNs
- Often require a large number of training examples.
- Training time can be very long.
- Network is like a “black box”. A human cannot look inside and easily understand the model or interpret the outputs
Give an example of a simple function that can’t be learned by a single perceptron
XOR (exclusive OR) — it is not linearly separable, so no single perceptron can represent it (inclusive OR, by contrast, is linearly separable)
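A multi-layer perceptron solves it easily; here is a hand-wired two-layer sketch (the specific weights and thresholds are illustrative):

```python
def step(z):
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    # Hidden unit h1 computes OR, h2 computes AND;
    # the output unit computes "h1 AND NOT h2", i.e. XOR.
    h1 = step(x1 + x2 - 0.5)   # OR
    h2 = step(x1 + x2 - 1.5)   # AND
    return step(h1 - h2 - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```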
Briefly explain the role of the gradient descent algorithm in training neural networks
It aims to minimise the error of the NN by iteratively adjusting the model parameters (weights and biases) in the direction of the negative gradient of the cost function with respect to those parameters
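A minimal sketch of the idea on a one-parameter model y = w * x with a mean-squared-error loss (dataset and learning rate are illustrative):

```python
# Toy dataset following y = 2x; gradient descent should drive w towards 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, lr = 0.0, 0.05

for _ in range(200):
    # dL/dw for L = mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # step against the gradient to reduce the loss

print(round(w, 4))  # 2.0
```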