Unit 1: Fundamentals of Deep Learning Flashcards
What is deep learning?
- Subset of machine learning that mimics the way the human brain functions
- Its layered networks of artificial neurons are loosely modelled on how neurons in the brain connect and pass signals
Example of deep learning:
How a toddler learns what a dog is: by seeing many examples and being corrected, much like a network learning from labelled data
Types of neural networks used in deep learning:
- Convolutional neural network (CNN)
- Recurrent neural network (RNN)
- Long short-term memory (LSTM)
What is a Perceptron?
- Basic unit used to build an ANN
- Takes real-valued inputs (not only Boolean)
- Calculates a linear combination of these inputs and generates an output
- If the result obtained from the perceptron is greater than a threshold, the output is 1, otherwise -1 (see the sketch below)
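A minimal sketch of a single perceptron in NumPy; the input, weight, bias, and threshold values below are made up for illustration:

```python
import numpy as np

def perceptron(x, w, b, threshold=0.0):
    """Output 1 if the linear combination exceeds the threshold, else -1."""
    z = np.dot(w, x) + b             # linear combination of real-valued inputs
    return 1 if z > threshold else -1

x = np.array([0.5, -1.2, 3.0])       # made-up inputs
w = np.array([0.4, 0.7, -0.2])       # made-up weights
print(perceptron(x, w, b=0.1))       # -> 1 or -1
```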
What is a multilayer perceptron?
Like neurons in a human brain, perceptrons form a dense network with each other, and this dense network is known as a multilayer perceptron
What is a feed forward neural network?
- Type of neural network consisting of three types of layers:
1. Input layer
2. Hidden layers
3. Output layer
- Inputs are passed through the layers in the forward direction only
Types of Feed forward network:
- Single-layer feed-forward network (2 layers: input and output)
- Multilayer feed-forward network (3 or more layers: input, hidden, output); a forward-pass sketch follows this list
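A rough sketch of one forward pass through a multilayer feed-forward network, assuming ReLU activations and randomly initialised weights (all shapes and values are illustrative):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, params):
    """Forward pass: each layer computes relu(W @ a + b) and feeds the next."""
    a = x
    for W, b in params:
        a = relu(W @ a + b)
    return a

# Toy network: 3 inputs -> 4 hidden units -> 2 outputs (shapes made up)
rng = np.random.default_rng(0)
params = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(2, 4)), np.zeros(2))]
print(forward(np.array([1.0, -0.5, 2.0]), params))
```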
What is back-propagation?
- Used in multilayered feed-forward networks
- Algorithm used to adjust the weights and biases by minimizing the error
- The error is propagated backwards towards the input layer, and each neuron adjusts its weights and biases accordingly
- Uses gradient descent to find the optimal weights and biases
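A minimal sketch of one back-propagation step for a single sigmoid neuron with squared error; the data, initial weights, and learning rate are illustrative, and a real network repeats this chain rule layer by layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, target = np.array([0.5, -1.0]), 1.0     # one training example (made up)
w, b, lr = np.array([0.1, 0.2]), 0.0, 0.5  # initial weights, bias, learning rate

z = w @ x + b                        # forward pass: linear combination
y = sigmoid(z)                       # forward pass: activation

# backward pass: chain rule for the loss 0.5 * (y - target)**2
dz = (y - target) * y * (1.0 - y)    # d(loss)/dz
grad_w, grad_b = dz * x, dz

w -= lr * grad_w                     # gradient-descent update of the weights
b -= lr * grad_b                     # and the bias
```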
What are weights? (BPN)
Define the strength of the connections between neurons
What are biases? (BPN)
Additional parameter that shifts the activation function to the left or the right
What is gradient descent?
- An iterative optimization algorithm
- Aim is to minimise the cost function, i.e. the error between predicted and actual values
- Finds a local or global minimum (point of convergence) of a differentiable function
Process of gradient descent (a toy sketch follows this list):
- An arbitrary starting point is selected
- Calculate the slope (gradient) at that point
- Update the weights and biases in the direction opposite to the slope
- Repeated updates make the slope progressively flatter until it reaches the minimum value
- The learning rate sets the step size: too large can overshoot the minimum, too small makes convergence slow
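A toy sketch of gradient descent on the one-dimensional cost function f(x) = (x - 3)^2; the starting point and learning rate are arbitrary choices:

```python
def f_prime(x):
    return 2 * (x - 3)       # slope of the cost function f(x) = (x - 3)**2

x = 10.0                     # arbitrary starting point
lr = 0.1                     # learning rate: too large overshoots, too small is slow
for _ in range(100):
    x -= lr * f_prime(x)     # step against the slope
print(x)                     # converges close to the minimum at x = 3
```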
Solutions to the vanishing gradient problem:
- ReLU
- LSTM (constant gradient size)
- Gradient clipping
- Residual neural networks
- Multi-level hierarchy
What is the vanishing gradient problem?
As the error is propagated backwards through the layers, the gradients become smaller and smaller until they are almost 0, so the early layers barely learn (illustrated below)
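A small illustration of why gradients vanish, assuming sigmoid activations whose derivative is at most 0.25 (the 20-layer depth is made up):

```python
import numpy as np

def sigmoid_prime(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)     # at most 0.25, reached at z = 0

# The chain rule multiplies one such factor per layer, so with many
# layers the product shrinks towards 0.
grad = 1.0
for _ in range(20):
    grad *= sigmoid_prime(0.0)
print(grad)                  # ~9e-13: early layers receive almost no gradient
```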
What is an activation function?
Decides whether a neuron will fire or not for a given set of inputs, depending on a rule or threshold
It can be thought of as a mathematical gate between the input feeding the current neuron and the output going to the next layer
What is ReLU?
Rectified linear unit
Most commonly used activation function; it is computationally cheap, which allows neurons to respond quickly
Characteristics of ReLU:
- It is nonlinear
- It is continuous, but not differentiable at zero (in practice a subgradient is used there)
- It supports back propagation
- It does not have a fixed output range
- It is not zero-centred
- It does not suffer from the vanishing gradient problem (its gradient is a constant 1 for positive inputs)
Drawback of ReLU:
Dying ReLU problem: neurons whose inputs are always negative output 0 forever and stop learning
What is Leaky ReLU?
Modification of ReLU which replaces the zero output for negative inputs with a small negative slope
What is exponential ReLU (ELU)?
It is very similar to ReLU but has an extra alpha constant; for negative inputs the output smoothly saturates towards -alpha instead of being cut off at zero (all three functions are sketched below)
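Sketches of the three activation functions above; the alpha and slope defaults are common choices, not taken from the notes:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)             # 0 for negative inputs, identity otherwise

def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)  # small negative slope instead of 0

def elu(z, alpha=1.0):
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))  # saturates at -alpha

z = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(z), leaky_relu(z), elu(z), sep="\n")
```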
What are hyperparameters?
- Parameters that contribute to the architecture of the network
- Set before the training process begins
- Top-level parameters that control the learning process
- Not updated during training; their values remain constant till the end
Types of Hyperparameters:
- Layer size
- Learning rate
- Momentum (all three appear in the sketch below)
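A sketch of where these hyperparameters appear in a training loop, using SGD with momentum on a toy one-weight problem (all concrete values are illustrative):

```python
layer_size = 64       # width of each hidden layer (an architecture choice;
                      # unused in this toy loop, shown for completeness)
learning_rate = 0.01  # step size of each weight update
momentum = 0.9        # fraction of the previous update carried forward

velocity, weight = 0.0, 5.0
for _ in range(300):
    grad = 2 * (weight - 1.0)             # gradient of the toy cost (w - 1)**2
    velocity = momentum * velocity - learning_rate * grad
    weight += velocity                    # momentum-accelerated step
print(weight)                             # converges close to 1.0
```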
What is regularisation?
A technique used to reduce the overfitting of a model by penalizing complexity (large weight matrices). The cost function is updated by adding a regularisation term to it
What is the regularisation parameter?
The lambda hyperparameter, which scales the regularisation term and is tuned for better results
What are the types of regularisation?
L1 - penalizes the sum of the absolute values of the weights (can drive some weights to exactly zero)
L2 - weight decay; forces weights to decay toward zero (but not exactly zero)
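A sketch of how an L1 or L2 penalty is added to the cost function; lam plays the role of the lambda hyperparameter and its value is illustrative:

```python
import numpy as np

def regularised_loss(base_loss, weights, lam=0.01, kind="l2"):
    """Add an L1 or L2 penalty term to an existing cost value."""
    if kind == "l1":
        penalty = lam * np.sum(np.abs(weights))  # L1: sum of |w|
    else:
        penalty = lam * np.sum(weights ** 2)     # L2: sum of w^2 (weight decay)
    return base_loss + penalty

w = np.array([0.5, -2.0, 1.5])
print(regularised_loss(1.0, w, kind="l1"))  # base loss plus L1 penalty
print(regularised_loss(1.0, w, kind="l2"))  # base loss plus L2 penalty
```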
What is dropout regularisation?
Regularisation method in which randomly chosen neurons are intentionally removed (dropped) from the neural network during training
What is DropConnect regularisation?
Generalisation of dropout where a few randomly chosen individual weights are disabled instead of disabling the whole node, so that the node remains partially active (both masks are sketched below)
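A sketch contrasting the two masks, assuming a drop probability of 0.5 (the inverted-dropout rescaling by 1/(1-p) used in practice is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                              # drop probability, an illustrative choice

# Dropout: silence whole neurons (entries of the activation vector)
activations = rng.normal(size=4)
neuron_mask = rng.random(size=4) > p
dropped_out = activations * neuron_mask

# DropConnect: disable individual weights, so each node stays partially active
W = rng.normal(size=(4, 3))
weight_mask = rng.random(size=(4, 3)) > p
W_dropconnect = W * weight_mask
print(dropped_out, W_dropconnect, sep="\n")
```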