Deep nets overview Flashcards
What is the sigmoid function?
σ(x) = 1 / (1 + e^(-x)); it squashes any real input into the range (0, 1)
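A minimal NumPy sketch of this definition (the function name and test values are illustrative):

```python
import numpy as np

def sigmoid(x):
    # Logistic function: maps any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))                          # 0.5
print(sigmoid(np.array([-5.0, 0.0, 5.0])))   # ~[0.0067, 0.5, 0.9933]
```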
What are layers in a deep neural net?
The building blocks of the net; each layer transforms its inputs and passes its outputs on to the next layer
What is the first layer called?
The input layer
What is the main rationale behind a deep net?
We can use the outputs of one layer as the inputs to the next layer
What is the last layer called?
The output layer
The layer whose outputs we compare to the targets
What are the layers between the input and output layers called?
The hidden layers
What are the building blocks of hidden layers called?
Hidden units or hidden nodes
What is the width of a layer?
The number of hidden units in a hidden layer
What are some examples of hyperparameters?
Width, depth, and the learning rate
What are examples of parameters?
Weights (w)
Biases (b)
What are the differences between Parameters and Hyperparameters?
Hyperparameters are pre-set by us
Parameters are found by optimizing the model
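A minimal sketch of the distinction, with hypothetical names and placeholder gradients: width and learning_rate stay fixed, while W and b are what training adjusts.

```python
import numpy as np

# Hyperparameters: pre-set by us before training starts
width = 8            # hypothetical layer width
learning_rate = 0.01

# Parameters: initialized, then found by optimizing the model
rng = np.random.default_rng(0)
W = rng.standard_normal((2, width))  # weights (w)
b = np.zeros(width)                  # biases (b)

# One illustrative gradient-descent step with placeholder gradients:
# the parameters W and b change, the hyperparameters do not.
grad_W, grad_b = np.ones_like(W), np.ones_like(b)
W -= learning_rate * grad_W
b -= learning_rate * grad_b
```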
Why is non-linearity needed?
Non-linearity lets the model represent more complicated relationships.
It is also what makes stacking layers meaningful: a stack of purely linear layers collapses into a single equivalent linear layer, so without non-linearities depth adds nothing.
To have deep nets that find complex relationships through arbitrary functions, we need non-linearities (see the sketch below).
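A small NumPy sketch of the collapse argument, using ReLU as an example non-linearity:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4)
W1 = rng.standard_normal((4, 4))
W2 = rng.standard_normal((4, 4))

# Two stacked linear layers...
two_linear = (x @ W1) @ W2
# ...equal one linear layer with combined weights W1 @ W2,
# so the extra layer added no expressive power.
one_linear = x @ (W1 @ W2)
print(np.allclose(two_linear, one_linear))  # True

# A non-linearity between the layers (ReLU here) breaks the collapse:
# no single weight matrix reproduces this mapping in general.
relu = lambda z: np.maximum(z, 0.0)
nonlinear = relu(x @ W1) @ W2
```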
In machine learning, what are non-linearities called?
Activation functions or transfer functions
What do activation functions do?
They transform a layer's linear combination of inputs non-linearly, producing outputs of a different kind (e.g. squashed into a bounded range)
What are 4 common activation functions? (each sketched in code below)
- Sigmoid (logistic function)
- TanH (hyperbolic tangent)
- ReLU (rectified linear unit)
- Softmax
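Minimal NumPy sketches of all four, assuming their standard definitions:

```python
import numpy as np

def sigmoid(x):
    # Logistic function: output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: output in (-1, 1)
    return np.tanh(x)

def relu(x):
    # Rectified linear unit: 0 for negative inputs, identity otherwise
    return np.maximum(x, 0.0)

def softmax(x):
    # Maps a vector to probabilities that sum to 1
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()
```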