Deep Learning Flashcards
What is the heart of deep learning?
Neural networks, which are composed of neurons.
How is a neuron defined?
- Input = a vector of numeric inputs
- Output = a scalar
- Parameters
1) A vector of weights (one per input), plus a bias term b
2) An activation function f
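A minimal NumPy sketch of this definition (all values below are illustrative): the neuron reduces its input vector to a scalar by taking the weighted sum plus the bias, then applying the activation function.

```python
import numpy as np

def neuron(x, w, b, f):
    """A single neuron: weighted sum of inputs plus bias, passed through f."""
    return f(np.dot(w, x) + b)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])   # input vector
w = np.array([0.1, 0.4, -0.3])   # one weight per input
b = 0.2                          # bias term
print(neuron(x, w, b, sigmoid))  # scalar output
```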
What is a single-neuron neural network?
A perceptron
How to train a neural network?
A single-neuron network can be trained with the perceptron algorithm, in which each full pass over the training data is termed an “epoch” (networks with hidden layers require back propagation; see below).
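A sketch of the classic perceptron algorithm, assuming labels in {-1, +1} and toy data (both illustrative): weights are nudged towards each misclassified example, and one full pass over the data is one epoch.

```python
import numpy as np

def perceptron_train(X, y, epochs=10):
    """Perceptron algorithm: update weights on each misclassified example."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):                    # one pass over the data = one epoch
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:  # misclassified (or on the boundary)
                w += yi * xi
                b += yi
    return w, b

# Toy linearly separable data (illustrative).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print(perceptron_train(X, y))
```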
What are some common non-linear activation functions?
- (Logistic) sigmoid
- Hyperbolic tan (“tanh”)
- Rectified Linear Unit (ReLU)
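The three functions as a quick NumPy reference (a generic sketch, nothing beyond the standard definitions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes to (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes to (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # 0 for negatives, identity for positives

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```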
How to get a neural network to be equivalent to logistic regression?
A neural network with a single neuron and a sigmoid activation.
What is the power of neural nets?
Stacking multiple neurons together in different ways.
- Layers of neurons operating in parallel, of varying sizes
- Feeding the outputs of one layer into further hidden layers of varying sizes
e.g. a “fully-connected feed-forward neural network” takes the following form:
- the INPUT LAYER is made up of the individual features
- each HIDDEN LAYER is made up of an arbitrary number of neurons, each of which is connected to all neurons in the preceding layer, and all neurons in the following layer
- the OUTPUT LAYER combines the inputs from the preceding layer into the output
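A minimal forward-pass sketch of such a network (layer sizes and random weights are illustrative; the same activation is applied at every layer for simplicity):

```python
import numpy as np

def forward(x, layers, f):
    """Fully-connected feed-forward pass: each layer sees every output
    of the preceding layer."""
    a = x
    for W, b in layers:
        a = f(W @ a + b)
    return a

relu = lambda z: np.maximum(0.0, z)
rng = np.random.default_rng(0)

# Hypothetical shapes: 4 input features -> hidden layers of 5 and 3 -> 1 output.
layers = [(rng.normal(size=(5, 4)), np.zeros(5)),
          (rng.normal(size=(3, 5)), np.zeros(3)),
          (rng.normal(size=(1, 3)), np.zeros(1))]
print(forward(rng.normal(size=4), layers, relu))
```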
What would be the Output Layer Activation Function in a neural net if you wanted to do multiclass classification?
SOFTMAX
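A sketch of softmax, which turns the output layer's raw scores into a probability distribution over the classes:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()          # normalise so the outputs sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))  # one probability per class
```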
What would be the Output Layer Activation Function in a neural net if you wanted to do regression?
Identity function, or sigmoid/tanh (for bounded targets)
What is the requirement for activation functions in Neural Nets?
They must be non-linear; otherwise the whole network collapses down to a single linear model (demonstrated below).
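A quick numeric demonstration of the collapse (shapes and random values illustrative): two stacked layers with identity activations compute exactly the same function as one linear layer.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

two_linear_layers = W2 @ (W1 @ x + b1) + b2           # no non-linearity anywhere
one_linear_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)     # collapsed weights and bias
print(np.allclose(two_linear_layers, one_linear_layer))  # True
```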
Describe the Universal Approximation Theorem, what’s so good about it?
A feed-forward neural network with a single hidden layer (and a finite number of neurons) can approximate any continuous function to arbitrary precision.
It’s good because it means a feed-forward neural net with non-linear activation functions can learn any continuous basis function DYNAMICALLY, unlike e.g. SVMs, where the kernel is a hyperparameter fixed in advance.
How to train a Neural Net with Hidden Layers?
Train neural nets with BACK PROPAGATION
- Compute the error at the output layer, and the gradient of the loss with respect to each weight, using partial differentiation (the chain rule)
- Propagate those errors back through the hidden layers towards the input layer
- STILL HAS A LEARNING RATE
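A sketch of one back-propagation update for a single-hidden-layer net, assuming sigmoid activations, squared-error loss, and an illustrative learning rate (all assumptions, not a fixed recipe):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, y, W1, b1, W2, b2, lr=0.1):
    # Forward pass.
    h = sigmoid(W1 @ x + b1)                    # hidden activations
    y_hat = sigmoid(W2 @ h + b2)                # network output
    # Error at the output layer (derivative of 0.5 * ||y_hat - y||^2).
    delta2 = (y_hat - y) * y_hat * (1 - y_hat)
    # Propagate the error back to the hidden layer via the chain rule.
    delta1 = (W2.T @ delta2) * h * (1 - h)
    # Gradient-descent updates, scaled by the learning rate.
    W2 -= lr * np.outer(delta2, h);  b2 -= lr * delta2
    W1 -= lr * np.outer(delta1, x);  b1 -= lr * delta1
    return W1, b1, W2, b2

# Toy call with illustrative shapes: 2 features -> 3 hidden -> 1 output.
rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
W1, b1, W2, b2 = backprop_step(np.array([0.5, -0.2]), np.array([1.0]),
                               W1, b1, W2, b2)
```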
CONS of Neural Nets?
Prone to chronic overfitting
- Due to the large number of parameters
Mitigations:
- REGULARISATION is critical
- EARLY STOPPING: stop training when performance peaks on the dev. data
- DROPOUT: randomly deactivate a fraction of neurons on each training pass (sketched below)
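A sketch of (inverted) dropout, assuming an illustrative drop probability p of 0.5: a random fraction of activations is zeroed during training, and the survivors are rescaled so expected activations match at test time.

```python
import numpy as np

def dropout(a, p=0.5, training=True, rng=np.random.default_rng()):
    if not training:
        return a                                   # no dropout at test time
    mask = (rng.random(a.shape) >= p) / (1.0 - p)  # zero ~p of units, rescale the rest
    return a * mask

h = np.array([0.3, 1.2, -0.7, 0.9])
print(dropout(h, p=0.5))   # roughly half the activations zeroed out
```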
Theoretical Properties of Neural Networks
- Can be applied to either classification or regression
- Parametric
- Batch
- Relies on continuous features
- Assuming at least one hidden layer
- Complex to train, but produce relatively compact models
Why is a perceptron (which uses a sigmoid activation function) equivalent to logistic regression?
- A perceptron has a weight associated with each input
- The output is obtained by applying the activation function f(z) = 1 / (1 + e^-z) to the linear combination of inputs, i.e. f(b + sum_i(w_i * x_i)) - this is exactly the logistic regression function
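A tiny numeric check of the equivalence (weights and input illustrative): a single sigmoid neuron computes exactly the logistic-regression probability P(y=1 | x) = 1 / (1 + e^-(w·x + b)).

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

w, b = np.array([0.8, -0.5]), 0.1
x = np.array([1.0, 2.0])

neuron_output = sigmoid(np.dot(w, x) + b)                 # single sigmoid neuron
logreg_prob = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))   # logistic regression
print(np.isclose(neuron_output, logreg_prob))             # True
```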