Neural Networks Flashcards
What is a neural network?
“… a neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
◦ Knowledge is acquired by the network through a learning process.
◦ Inter-neuron connection strengths known as synaptic weights are used to store the knowledge”.
What is the structure of a neural network?
◦ Input layer: Input data; each variable goes into an input neuron.
◦ Hidden layers: This is where learning occurs. They process the original patterns using “weights” (the model's parameters!).
◦ In a shallow NN, we would expect one or two hidden layers.
◦ In a deep neural network, we would expect tens, hundreds, or even more!
◦ Each successive hidden layer learns more complex patterns!
◦ Output layer: The final layer, which takes the outputs from the last hidden layer and produces the required output; e.g., in a binary classification problem it would give a probability between 0 and 1.
◦ Transfer functions: Functions that collect the outputs from one layer and deliver them to the next.
Why are transfer functions important?
Transfer functions: Functions that collect the outputs from one layer and deliver them to the next.
The choice of transfer functions “shapes” the neural network:
◦ The outputs depend on them.
◦ The speed of training is severely affected by them.
What is the Softmax function used for?
Extension of the logistic function to multinomial classes. When there are two classes, it reduces to the logistic (sigmoid!) function. Commonly used as the output function for probability estimates!
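As a minimal sketch (assuming NumPy), the softmax and its two-class reduction to the sigmoid look like this:

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # softmax is invariant to shifting all logits by a constant.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# With two classes and logits [z, 0], the first softmax output
# equals the logistic sigmoid of z: 1 / (1 + exp(-z)).
```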
What is the Hyperbolic Tangent Function?
Very common in shallow networks, especially for (bounded) regression!
f(x) = tanh(βx)
It is deemed less plausible than other activations, but it is very helpful when both positive and negative activations are likely.
◦ Its bounded output suits regression.
◦ Complex to optimize.
What is the Rectified Linear Unit (ReLU) Activation?
Very popular as a transfer function in Deep models!
◦ Reason: Easy to calculate, and in complex networks regularization occurs at a model level.
f(x) = max(0, βx)
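A sketch of the ReLU (assuming NumPy; `beta` is the β parameter), showing why it is cheap: it is just an element-wise maximum.

```python
import numpy as np

def relu(x, beta=1.0):
    # f(x) = max(0, beta * x): negative inputs are zeroed,
    # positive inputs pass through (scaled by beta).
    return np.maximum(0.0, beta * x)
```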
Explain how loss functions work in neural networks
The final step in constructing a neural network is choosing the loss function, and the proper way to optimize it.
Just like the other models we have used so far, neural networks use standard losses:
◦ Binary classification problems: binary cross-entropy.
◦ Multiclass classification problems: categorical cross-entropy.
◦ Regression problems: mean squared error.
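As a sketch (assuming NumPy; function names are illustrative), the three standard losses are:

```python
import numpy as np

def binary_cross_entropy(y, p):
    # y: true labels in {0, 1}; p: predicted probability of class 1.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(Y, P):
    # Y: one-hot true labels; P: rows of predicted class probabilities.
    return -np.mean(np.sum(Y * np.log(P), axis=1))

def mean_squared_error(y, yhat):
    # Average squared difference between targets and predictions.
    return np.mean((y - yhat) ** 2)
```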
However, there is one major difference:
Neural networks are trained in batches!
◦ We use only a tiny sample of the training set (a batch) to make each adjustment.
◦ A set of batches represents an epoch: an iteration in which the neural network has “seen” a representative chunk of the training data.
◦ This method of training is a generalization of stochastic gradient descent.
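The batch/epoch loop above can be sketched for a hypothetical linear model trained with mean squared error (all data and hyperparameters are illustrative, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: 100 samples, 3 features, known true weights.
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

w = np.zeros(3)
batch_size, lr = 10, 0.1

for epoch in range(50):                       # one epoch = one pass over all batches
    order = rng.permutation(len(X))           # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size] # one tiny sample: a batch
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)  # MSE gradient on the batch only
        w -= lr * grad                        # adjust using just this batch
```

With `batch_size = 1` this is exactly stochastic gradient descent; with `batch_size = len(X)` it is full-batch gradient descent.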
What is backpropagation?
Given that a neural network is a sequence of simple functions, we can use the chain rule to “propagate backwards” the error.
This is the idea behind “backpropagation”.
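A minimal sketch of this idea on a hypothetical two-weight network (sigmoid activations, squared error; all values are illustrative). Each backward line is one application of the chain rule:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 1.5, 1.0          # single input and target (hypothetical)
w1, w2 = 0.4, -0.3       # one weight per layer

# Forward pass: a chain of simple functions.
h = sigmoid(w1 * x)
yhat = sigmoid(w2 * h)
loss = (yhat - y) ** 2

# Backward pass: chain rule, applied layer by layer from the loss.
dloss_dyhat = 2 * (yhat - y)
dyhat_dz2 = yhat * (1 - yhat)          # sigmoid derivative at layer 2
dloss_dw2 = dloss_dyhat * dyhat_dz2 * h
dloss_dh = dloss_dyhat * dyhat_dz2 * w2
dh_dz1 = h * (1 - h)                   # sigmoid derivative at layer 1
dloss_dw1 = dloss_dh * dh_dz1 * x
```

The analytic gradients can be checked against finite differences, which is a standard sanity test for a backpropagation implementation.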
What is a convolutional neural network?
They try to mimic how we perceive patterns, and how we identify the world visually.
◦ Thus, they are very appropriate for image classification.
◦ However, we can adapt them to identify almost anything!
A convolutional neural network is a NN built using convolutional layers.
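The core operation of a convolutional layer can be sketched as a valid-mode 2D convolution (assuming NumPy; the edge-detecting kernel below is a hypothetical example, showing how such a layer responds to a local visual pattern):

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; each output value is the
    # element-wise product of the kernel with one image patch, summed.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A simple horizontal-difference kernel: it responds only where
# pixel values change from left to right (a vertical edge).
edge = np.array([[1.0, -1.0]])
```

In a real CNN the kernel values are not hand-chosen like this; they are the weights the network learns during training.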