Neural Networks Flashcards
Activation function
is a function that takes the weighted input of a neuron and produces an output, which serves as the input to the next layer of neurons. Its primary role is to introduce non-linearity into the network, enabling it to learn complex patterns
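For illustration, a minimal NumPy sketch of two common activation functions (the example inputs are mine, not from the deck):

import numpy as np

def relu(z):
    # ReLU: passes positive inputs through, zeroes out negatives
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid: squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])   # example weighted inputs
print(relu(z))      # [0. 0. 3.]
print(sigmoid(z))   # [0.119 0.5 0.953] (rounded)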
Perceptron
a single-layer neural network
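A minimal sketch of a perceptron computing a weighted sum plus bias followed by a step function; the weights implementing logical AND are an illustrative choice:

import numpy as np

def perceptron(x, w, b):
    # single-layer perceptron: weighted sum followed by a step function
    return 1 if np.dot(w, x) + b > 0 else 0

# illustrative weights/bias implementing logical AND on binary inputs
w, b = np.array([1.0, 1.0]), -1.5
print(perceptron(np.array([1, 1]), w, b))   # 1
print(perceptron(np.array([0, 1]), w, b))   # 0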
Gradient Descent
Is an optimiser. A gradient is a vector; each element is the partial derivative of the loss with respect to one weight, i.e. how much the loss changes when that weight changes. The descent part iteratively adjusts the weights in the direction of steepest descent of the loss function, converging toward a minimum (in general a local one).
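A minimal sketch of the update rule on a toy one-weight loss L(w) = (w - 3)^2, whose gradient is 2(w - 3); the learning rate is an illustrative choice:

w = 0.0    # initial weight
lr = 0.1   # learning rate (step size)
for _ in range(50):
    grad = 2 * (w - 3)   # gradient of the loss at the current weight
    w -= lr * grad       # step in the direction of steepest descent
print(w)   # approaches 3.0, the minimiser of the loss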
Loss Function
Measures the difference between the target values and the network's actual predictions; training aims to minimise it.
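For example, mean squared error is one common loss function; a minimal NumPy sketch (the example values are illustrative):

import numpy as np

def mse(y_true, y_pred):
    # mean squared error: average squared difference between
    # targets and predictions
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8])
print(mse(y_true, y_pred))   # ≈ 0.03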
Training a NN
The process of finding the optimal set of weights.
Bias
Error from erroneous assumptions in the learning algorithm. High bias arises from overly simple, underfitted models.
Neurons
also called nodes or units; the basic elements of a neural network that carry out its calculations
Weights
the numbers assigned to each connection between one neuron in a layer and the neurons from the previous layer
Bias (nn)
an additional parameter associated with each neuron that allows the model to better fit the data; it gives the neuron the flexibility to shift the activation function to the left or right
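A minimal sketch of how weights and the bias combine in a single neuron (the sigmoid activation and all values are illustrative choices):

import numpy as np

x = np.array([0.5, -1.0, 2.0])   # outputs of the previous layer
w = np.array([0.1, 0.4, -0.2])   # one weight per incoming connection
b = 0.3                          # bias shifts the weighted sum

z = np.dot(w, x) + b             # weighted input: w . x + b
a = 1.0 / (1.0 + np.exp(-z))     # sigmoid activation of the neuron
print(z, a)                      # -0.45 0.389 (approximately)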
Softmax
maps the input to a probability distribution over a set of possible outcomes, using an exponential function; commonly used in the output layer of a neural network for multi-class classification tasks
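A minimal NumPy sketch of softmax (subtracting the max before exponentiating is a standard numerical-stability trick, not from the deck):

import numpy as np

def softmax(z):
    # exponentiate, then normalise so the outputs sum to 1,
    # yielding a probability distribution over the classes
    e = np.exp(z - np.max(z))   # shift by max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw output-layer scores
print(softmax(logits))   # [0.659 0.242 0.099] (rounded), sums to 1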
Forward Pass/Propagation
process of passing input data through the network to obtain the output (predictions)
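A minimal sketch of a forward pass through a two-layer network; the layer sizes, random parameters, and ReLU choice are illustrative:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)                           # input vector

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # hidden-layer parameters
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)    # output-layer parameters

h = np.maximum(0.0, W1 @ x + b1)   # hidden activations (ReLU)
y = W2 @ h + b2                    # network output (predictions)
print(y)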
Optimisation
the optimiser/algorithm used to adjust the weights of the network to minimise the loss function
Backpropagation
trains NNs by adjusting the weights of connections; efficiently calculates the gradient of the loss function with respect to each weight in the network, allowing the model to learn by updating these weights to minimise the error
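A minimal sketch of backpropagation for a single sigmoid neuron with squared-error loss, applying the chain rule by hand (all values are illustrative):

import numpy as np

x, target = np.array([0.5, -1.0]), 1.0
w, b = np.array([0.2, 0.3]), 0.0
lr = 0.5   # learning rate

for _ in range(100):
    z = np.dot(w, x) + b            # forward pass: weighted input
    a = 1.0 / (1.0 + np.exp(-z))    # prediction (sigmoid activation)
    dL_da = 2 * (a - target)        # loss gradient w.r.t. the activation
    da_dz = a * (1 - a)             # sigmoid derivative
    grad_w = dL_da * da_dz * x      # chain rule down to each weight
    grad_b = dL_da * da_dz
    w -= lr * grad_w                # gradient descent update
    b -= lr * grad_b
print(a)   # prediction moves toward the target of 1.0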
Automatic Differentiation
uses a computational graph to break calculations down into individual operations that are easier to analyse and manipulate. Forward mode is efficient for functions with few inputs and many outputs; reverse mode is efficient for functions with many inputs and few outputs (the typical NN case: many weights, one scalar loss).
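A minimal sketch of forward-mode automatic differentiation using dual numbers; the Dual class is a hypothetical illustration, not a library API:

class Dual:
    # carries a value together with its derivative; every operation
    # propagates both, so derivatives come out exactly, not numerically
    def __init__(self, val, der):
        self.val, self.der = val, der
    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)
    def __mul__(self, other):
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

# differentiate f(x) = x*x + x at x = 3: f'(x) = 2x + 1 = 7
x = Dual(3.0, 1.0)    # seed the input's derivative as 1
f = x * x + x
print(f.val, f.der)   # 12.0 7.0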
Regularisation (NN)
Early Stopping: interrupt training when performance on the validation set starts dropping
ℓ1 and ℓ2 Regularisation: modify the cost function by adding a weight penalty, λ·Σ|wᵢ| for ℓ1 or λ·Σwᵢ² for ℓ2 (see the sketch after this list)
Data Augmentation: generating new training instances from existing ones, artificially boosting the size of the training set
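A minimal sketch of an ℓ2-regularised loss in NumPy (the λ value and example data are illustrative):

import numpy as np

def regularised_loss(y_true, y_pred, w, lam=0.01):
    # data loss (here MSE) plus the l2 penalty lambda * sum(w^2),
    # which discourages large weights
    data_loss = np.mean((y_true - y_pred) ** 2)
    penalty = lam * np.sum(w ** 2)
    return data_loss + penalty

w = np.array([0.5, -1.2, 3.0])                # current weights
print(regularised_loss(np.array([1.0]),
                       np.array([0.8]), w))   # ≈ 0.1469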