Chapter 5 Flashcards
Framework
a body of pre-written code that provides a structure and set of conventions for developing applications or libraries. By handling common, recurring tasks for you, a framework saves time and effort and lets you focus on the parts of the problem specific to your application.
Keras
an open-source deep learning framework written in Python. It provides a user-friendly, intuitive interface for building and training deep neural networks. Keras runs on top of lower-level frameworks such as TensorFlow and Theano, which serve as its computational backends, and lets you quickly prototype and experiment with different architectures and models.
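For example, a small classifier can be defined and compiled in a few lines. A minimal sketch of the Keras Sequential API (the layer sizes are illustrative, not taken from the chapter):

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),    # flatten image to a 784-vector
    keras.layers.Dense(64, activation="relu"),     # hidden layer
    keras.layers.Dense(10, activation="softmax"),  # 10 class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```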
MNIST
a large database of handwritten digits that is commonly used as a benchmark for evaluating image classification models in machine learning. It contains 60,000 training images and 10,000 test images, each a 28x28 pixel grayscale image of a handwritten digit.
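Keras ships MNIST as a built-in dataset, so loading it takes one line:

```python
from tensorflow import keras

# Keras downloads MNIST on first use and caches it locally.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)  # (60000, 28, 28) -- 60,000 training images
print(x_test.shape)   # (10000, 28, 28) -- 10,000 test images
```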
Saturated Neuron
a neuron whose output has been pushed to an extreme end of its activation range, usually by a large-magnitude input. When a neuron is saturated, its output barely changes with further changes in input, so during backpropagation the gradients flowing through it become very small or vanish, making it difficult for the network to learn and update its weights effectively. Techniques such as careful weight initialization, regularization, and activation function selection can prevent or mitigate saturation.
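A quick numerical sketch of saturation with the sigmoid function: as the input grows, the output pins near 1 and the gradient collapses toward zero.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid's derivative is sigmoid(z) * (1 - sigmoid(z)).
for z in [0.0, 5.0, 20.0]:
    s = sigmoid(z)
    grad = s * (1.0 - s)
    print(f"z={z:5.1f}  output={s:.6f}  gradient={grad:.2e}")

# z=  0.0  output=0.500000  gradient=2.50e-01
# z=  5.0  output=0.993307  gradient=6.65e-03
# z= 20.0  output=1.000000  gradient=2.06e-09  <- saturated
```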
Vanishing Gradient
a phenomenon that can occur while training deep neural networks: the gradients that flow back through the network during backpropagation become extremely small as they propagate toward the earlier layers. As a result, the weights of those earlier layers are updated very slowly or not at all, and the network may fail to learn useful features and perform poorly on its task. The problem is often caused by activation functions whose derivatives are small over much of their input range, such as the sigmoid, and can be mitigated with alternative activation functions (e.g. ReLU), careful weight initialization, skip connections, and batch normalization.
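A back-of-the-envelope sketch of why depth makes this worse: the sigmoid's derivative never exceeds 0.25, and the chain rule multiplies one such factor per layer, so the gradient reaching the earliest layers shrinks geometrically with depth.

```python
# Upper bound on the gradient factor contributed by a stack of
# sigmoid layers (each layer multiplies in at most 0.25).
max_sigmoid_grad = 0.25
for depth in [2, 5, 10, 20]:
    print(f"{depth} layers: gradient factor <= {max_sigmoid_grad ** depth:.2e}")

#  2 layers: gradient factor <= 6.25e-02
#  5 layers: gradient factor <= 9.77e-04
# 10 layers: gradient factor <= 9.54e-07
# 20 layers: gradient factor <= 9.09e-13
```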
Weight Initialization
the process of setting the initial values of the weights in a neural network. Proper weight initialization matters for training deep networks because the initial weights determine the scale of the activations and gradients early in training. Poorly initialized weights can cause the network to converge to a suboptimal solution, slow down training, or prevent the network from training altogether.
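A minimal sketch of choosing an initializer in Keras (He initialization, a common choice for ReLU layers; Glorot uniform is the default for Dense layers):

```python
from tensorflow import keras

layer = keras.layers.Dense(
    64,
    activation="relu",
    kernel_initializer=keras.initializers.HeNormal(),  # scaled for ReLU
    bias_initializer="zeros",
)
```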
Input Standardization
a preprocessing technique that normalizes the input data before it is fed into the neural network, typically by subtracting each feature's mean and dividing by its standard deviation so that every feature has roughly zero mean and unit variance. Standardization improves the stability and speed of training by putting all inputs on a comparable scale.
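A minimal sketch with hypothetical data; the key point is that the statistics come from the training set and are reused at test time:

```python
import numpy as np

x_train = np.random.rand(1000, 784)  # hypothetical raw training data

# Standardize each feature: subtract the training-set mean,
# divide by the training-set standard deviation.
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)
x_train_std = (x_train - mean) / std

# Reuse the *training* statistics on any test data so the
# network always sees inputs on the same scale.
```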
Batch Normalization
a technique that improves the stability and speed of training by normalizing the activations of the neurons in each layer of the network. The basic idea is to center and scale each neuron's outputs using the mean and standard deviation computed over a batch of training examples, then apply a learnable scale and shift so the layer can still represent any output range it needs.
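A minimal NumPy sketch of the core computation (in practice Keras provides this as the keras.layers.BatchNormalization layer):

```python
import numpy as np

def batch_norm(activations, gamma=1.0, beta=0.0, eps=1e-5):
    # Center and scale using statistics computed over the batch axis.
    mean = activations.mean(axis=0)
    var = activations.var(axis=0)
    normalized = (activations - mean) / np.sqrt(var + eps)
    return gamma * normalized + beta  # learnable scale and shift

batch = np.random.randn(32, 64) * 10 + 5  # 32 examples, 64 neurons
out = batch_norm(batch)
print(out.mean(), out.std())  # ~0 and ~1 after normalization
```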
Cross-entropy loss function
a loss function commonly used for classification that measures the difference between the predicted probability distribution and the actual distribution of the target variable. Minimizing cross-entropy is equivalent to maximizing the log-likelihood of the correct labels.
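A minimal sketch for one-hot targets; the example values are illustrative:

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Mean negative log-probability of the true classes
    # (y_true is one-hot; rows of y_pred sum to 1).
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

y_true = np.array([[0, 0, 1], [0, 1, 0]])               # one-hot targets
y_pred = np.array([[0.1, 0.2, 0.7], [0.2, 0.6, 0.2]])   # predicted distributions
print(cross_entropy(y_true, y_pred))  # smaller when predictions match targets
```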
Activation Function
a mathematical function applied to the output of each neuron in a neural network to introduce non-linearity into the model. Without it, a stack of layers would collapse into a single linear transformation; the non-linearity is what lets the network learn complex patterns in the data that cannot be represented by a simple linear model.
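A minimal sketch of two common choices, ReLU and sigmoid, applied elementwise to a neuron's pre-activation output:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)        # passes positives, zeroes negatives

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes to the range (0, 1)

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))     # [0. 0. 2.]
print(sigmoid(z))  # ~[0.12 0.5 0.88]
```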