Activation+ Flashcards
Exam 2
Should a linear function be used as the activation function in the hidden layers of a neural network?
No, because stacked linear layers compose into a single linear transformation, so the network is no different from linear regression (see the sketch below)
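A minimal NumPy sketch of that collapse (the weight values here are arbitrary, for illustration only): two linear "hidden layers" produce exactly the same output as one combined linear layer.

```python
import numpy as np

# Two "hidden layers" with linear (identity) activation.
W1 = np.array([[1.0, 2.0], [3.0, 4.0]])
W2 = np.array([[0.5, -1.0], [2.0, 0.0]])
x = np.array([1.0, -2.0])

# Forward pass through both layers: W2 @ (W1 @ x)
deep_output = W2 @ (W1 @ x)

# The same result from a single collapsed linear layer: (W2 @ W1) @ x
collapsed_output = (W2 @ W1) @ x

print(np.allclose(deep_output, collapsed_output))  # True
```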
Can a linear activation function be used at the output layer?
Yes, in a regression problem where the output can be positive or negative
What activation function is recommended for the hidden layer of neural networks because it speeds up training time?
ReLU (Rectified Linear Unit)
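For reference, ReLU is simply max(0, x); its gradient is 1 for positive inputs, which is part of why it trains faster than saturating functions. A minimal NumPy sketch:

```python
import numpy as np

def relu(x):
    # ReLU passes positive values through unchanged and zeroes out negatives.
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```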
Can the sigmoid function be used in the hidden layer of a neural network?
Yes, but training converges more slowly than with ReLU, because the sigmoid's gradient saturates (approaches zero) for large positive or negative inputs
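A short sketch of the sigmoid and its derivative, illustrating the saturation: the gradient σ(x)(1 − σ(x)) is at most 0.25 and vanishes as |x| grows.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: sigma(x) * (1 - sigma(x)), at most 0.25.
    s = sigmoid(x)
    return s * (1.0 - s)

# Gradients shrink toward zero away from the origin ("saturation").
print(sigmoid_grad(np.array([0.0, 2.0, 5.0, 10.0])))
# -> [0.25, ~0.105, ~0.0066, ~0.000045]
```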
What activation function should be used at the output layer of a model that predicts home prices?
ReLU, since home prices are never negative
What activation function should be used at the output layer of a model that predicts temperatures?
Linear, since temperatures can be positive or negative
What activation function should be used at the output layer of a model that recognizes images of 15 animals?
SoftMax, which outputs a probability distribution over the 15 classes
What activation function should be used at the output layer of a binary classification problem?
Sigmoid, which outputs a single probability between 0 and 1
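A hedged Keras sketch tying the last four cards together (assumes TensorFlow; the layer widths and 10-feature input shape are placeholders, not from the cards): choose the output activation to match the task.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Regression with non-negative targets (e.g., home prices): ReLU output.
price_model = tf.keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(10,)),
    layers.Dense(1, activation="relu"),
])

# Regression with targets of either sign (e.g., temperatures): linear output.
temp_model = tf.keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(10,)),
    layers.Dense(1, activation="linear"),
])

# Multi-class classification over 15 animals: SoftMax output.
animal_model = tf.keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(10,)),
    layers.Dense(15, activation="softmax"),
])

# Binary classification: sigmoid output.
binary_model = tf.keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(10,)),
    layers.Dense(1, activation="sigmoid"),
])
```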
The SoftMax output of a 4 class classification problem outputs [.1,.1,.1,.7] - what class is the prediction?
The class with the highest probability is the prediction: here the fourth class (index 3), whose probability is 0.7
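A quick NumPy check of that card: np.argmax returns the index of the winning class.

```python
import numpy as np

probs = np.array([0.1, 0.1, 0.1, 0.7])  # SoftMax output for 4 classes
predicted_class = np.argmax(probs)       # index of the highest probability
print(predicted_class)                   # 3 (the fourth class)
```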
What role do activation functions play in creating the decision boundary of classification problems?
Non-linear activation functions allow the network to learn non-linear decision boundaries; without them, every decision boundary would be linear
Is [.1, .1, .1, .1] a possible output of the SoftMax function with 4 classes?
No. SoftMax outputs always sum to 1, and [.1, .1, .1, .1] sums to 0.4. A model that is equally uncertain across four classes would output [.25, .25, .25, .25]
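A NumPy sketch confirming the constraint (the logit values below are arbitrary): whatever the inputs, SoftMax outputs are positive and sum to 1, so [.1, .1, .1, .1] can never occur.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5, 0.0])  # arbitrary example logits
probs = softmax(logits)
print(probs, probs.sum())  # the probabilities sum to 1 (up to float rounding)
```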
Explain what the Adam optimizer does
The Adam optimizer adapts the learning rate for each parameter individually, combining momentum (a running average of gradients) with a running average of squared gradients. This typically leads to faster convergence and better performance than gradient descent with a single fixed learning rate
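A minimal sketch of one Adam update step (the hyperparameter defaults follow the original paper; the function name and test values are illustrative, not a real library API):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: exponentially decayed average of gradients (momentum).
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: exponentially decayed average of squared gradients.
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized moment estimates.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter update: the effective step size adapts via sqrt(v_hat).
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# One step on a single parameter with gradient 0.5 (illustrative values).
w, m, v = np.array(1.0), np.array(0.0), np.array(0.0)
w, m, v = adam_step(w, 0.5, m, v, t=1)
print(w)  # slightly below 1.0 after the first step
```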