CNNs Flashcards
What is a Convolutional Neural Network?
CNNs are not explicitly defined in the provided material. However, based on the concepts, a CNN is a type of neural network primarily used for image and video analysis. It uses convolutional layers to automatically learn spatial hierarchies of features, making it effective for tasks like image classification, object detection, and image segmentation.
Why are CNNs preferred over fully connected networks for images?
The sources do not directly address this, but it can be inferred that CNNs avoid issues of too many parameters in fully connected networks by using convolutional layers, where a neuron doesn’t need to see the whole image to discover patterns. This makes them efficient for image processing.
What is a convolutional layer?
A convolutional layer applies a set of filters (or kernels) to small regions of the input image, extracting local features. The same set of parameters (weights) is used across the entire image, making them efficient and effective at identifying repeating patterns.
How do CNNs detect patterns in different regions of an image?
The sources do not directly address this, but it can be inferred that RNNs require memory to understand the context in sequential data, where the next data point depends on previous data points. For example, in sentences, the meaning of a word often depends on the words before it.
What is max pooling and why is it used?
Max pooling is not directly discussed in the provided sources, however, it is a type of subsampling that reduces the spatial dimensions of the feature maps. It selects the maximum value from each region and helps to make the model more robust to small shifts or changes in the input.
What does flattening do in a CNN?
Flattening is not directly mentioned, but it is a process that takes the multi-dimensional output of the convolutional and pooling layers and transforms it into a one-dimensional vector, which is then fed into a fully connected feedforward network for classification
What does a typical CNN architecture look like?
A typical CNN, inferred from the information in the sources, includes a sequence of convolutional layers, max pooling layers, and fully connected layers at the end, often with a flattening step between pooling and fully connected layers.
How are the parameters in a CNN determined?
The parameters in CNNs (i.e., the filter weights) are learned during the training process using backpropagation. The convolutional and pooling layers share weights between neurons in the same plane to reduce the number of parameters
What is ReLU and why is it used in CNNs?
The Rectified Linear Unit (ReLU) is a common activation function in CNNs defined as f(x) = max(0,x). It is used because it introduces non-linearity to the network and helps to prevent the vanishing gradient problem.
What are some common applications for RNNs?
Some well-known CNN architectures, though not explicitly mentioned, can be inferred, and would include LeNet-5, AlexNet, GoogLeNet, and ResNet.
■LeNet-5 was created by Yann LeCun in 1998 and is widely used for handwritten digit recognition.
■GoogLeNet has 10 times fewer parameters than AlexNet.
■ResNet utilizes skip connections and is an extremely deep CNN composed of 152 layers.
What is the input shape of a CNN?
The sources do not explicitly address this, but it can be inferred that the input shape for a CNN depends on the data. For instance, for black and white images the input shape might be (1, 28, 28), where 1 is the black/white channel and 28 x 28 is the size of the image. For color images the input could be (3, 28, 28) where 3 refers to the RGB channels.