ML Flashcards
Input Layer
The input layer takes the raw pixel data of an image. In this case, the input is an image of a car, likely represented as a 2D array of pixel intensities.
The input dimensions typically include height, width, and channels (e.g., RGB channels for colored images).
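A minimal sketch of this representation, assuming a hypothetical 32x32 RGB image with pixel intensities scaled to [0, 1]:

```python
import numpy as np

# Hypothetical 32x32 RGB input: height x width x channels,
# with pixel intensities normalized to the range [0, 1].
image = np.random.rand(32, 32, 3)
print(image.shape)  # (32, 32, 3)
```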
Convolution + ReLU Layer (1st Convolution)
Convolution: This layer applies a set of filters (kernels) over the input image to extract features such as edges, textures, or patterns. Each filter produces a feature map.
ReLU (Rectified Linear Unit): An activation function applied element-wise to introduce non-linearity, ensuring the model can learn complex patterns. It replaces negative values with zero.
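The two operations above can be sketched in NumPy. This is an illustrative single-channel, single-filter version (real layers run many filters over many channels); the Sobel-like edge kernel and the toy 5x5 image are assumptions for demonstration:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation of one single-channel image with one kernel."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise multiply the window by the kernel and sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def relu(x):
    """Replace negative values with zero, element-wise."""
    return np.maximum(0, x)

# Toy 5x5 image and a Sobel-like vertical-edge kernel (assumed for the demo).
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)
feature_map = relu(conv2d(image, kernel))
print(feature_map.shape)  # (3, 3): a 3x3 kernel shrinks a 5x5 input by 2 per axis
```

Note that each filter shrinks the spatial dimensions (here 5x5 to 3x3) unless padding is used.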
Pooling Layer (1st Pooling)
Pooling: Typically, max pooling is used. It reduces the spatial dimensions of the feature maps while preserving the most significant information. This reduces computational cost and helps avoid overfitting by downsampling.
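Max pooling with a 2x2 window and stride 2 can be sketched as follows; the sample feature map is made up for illustration:

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Max pooling over a 2D feature map: keep the largest value per window."""
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

# Hypothetical 4x4 feature map; pooling halves each spatial dimension.
fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [4, 8, 3, 1]], dtype=float)
print(max_pool(fmap))
# [[6. 4.]
#  [8. 9.]]
```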
Convolution + ReLU Layer (2nd Convolution)
This stage extracts higher-level features by applying more filters. The patterns detected in this layer are more abstract (e.g., shapes or parts of objects).
Pooling Layer (2nd Pooling)
Downsamples the feature maps again, shrinking the spatial dimensions while retaining the strongest activations.
Flatten Layer
The 2D feature maps from the last pooling layer are reshaped into a 1D vector to prepare for input into the fully connected layers.
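Flattening is a single reshape. The pooled tensor shape (5x5 spatial, 8 feature maps) is a hypothetical example:

```python
import numpy as np

# Hypothetical output of the last pooling layer: 5x5 spatial, 8 feature maps.
pooled = np.random.rand(5, 5, 8)
flat = pooled.reshape(-1)  # 5 * 5 * 8 = 200 values in a 1D vector
print(flat.shape)  # (200,)
```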
Fully Connected Layer
These layers act like a standard neural network. Each neuron in the dense layer connects to every neuron in the previous layer.
The fully connected layers combine the high-level features extracted by the convolutional and pooling layers to make predictions.
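A fully connected layer is a matrix-vector product plus a bias; every output neuron has one weight per input. The sizes (200 inputs, 64 neurons) are assumed for illustration:

```python
import numpy as np

def dense(x, W, b):
    """Fully connected layer: each output neuron weights every input."""
    return W @ x + b

rng = np.random.default_rng(0)
x = rng.standard_normal(200)         # flattened features (hypothetical size)
W = rng.standard_normal((64, 200))   # 64 neurons, each with 200 weights
b = np.zeros(64)
logits = dense(x, W, b)
print(logits.shape)  # (64,)
```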
Softmax Layer (Output Layer)
The output layer applies the softmax function to convert raw scores (logits) into probabilities for each class. For example, the network outputs probabilities for categories like car, truck, van, bicycle, etc.
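A sketch of softmax, with logits for the classes [car, truck, van, bicycle] chosen arbitrarily for the example:

```python
import numpy as np

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1.
    Subtracting the max logit first avoids overflow in exp()."""
    z = logits - np.max(logits)
    exp = np.exp(z)
    return exp / exp.sum()

# Hypothetical logits for the classes [car, truck, van, bicycle].
logits = np.array([4.0, 1.0, 0.5, 0.2])
probs = softmax(logits)
print(probs.argmax())  # 0 -> highest probability goes to "car"
```

The probabilities sum to 1, and the largest logit always gets the largest probability, so the predicted class is simply the argmax.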