Learning from Data: Images Flashcards
How are images represented within a computer?
Images are represented as numerical matrices, where each pixel is a value indicating its intensity (e.g., 0–255 for grayscale). For color images, three matrices (red, green, blue channels) are used.
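A minimal sketch of this representation, using plain Python lists so it runs without extra libraries. The pixel values and the 2x3 image size are made up for illustration.

```python
# A 2x3 grayscale image: one intensity per pixel, from 0 (black) to 255 (white).
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

# A color image of the same size: one matrix per channel (red, green, blue).
red = [[255, 0, 0], [255, 255, 0]]
green = [[0, 255, 0], [255, 0, 255]]
blue = [[0, 0, 255], [0, 255, 255]]
rgb = [red, green, blue]

height, width = len(gray), len(gray[0])
channels = len(rgb)
print(height, width, channels)  # 2 3 3
```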
What was one of the major issues preventing widespread use of CNNs after LeNet’s introduction?
Limited computational power and lack of large-scale labeled datasets prevented the widespread use of CNNs after LeNet’s introduction.
How do residual (skip) connections help address the vanishing gradient problem?
Residual connections allow gradients to bypass certain layers, preserving their strength during backpropagation and enabling the training of deeper networks.
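A toy illustration of why this helps, assuming each layer is a scalar function f(x) = w * x with a small weight. This is a deliberately simplified model, not how a real ResNet block is built. Without a skip connection the gradient through a stack of layers is the product of the weights; with y = x + f(x) each layer contributes a factor of 1 + w instead.

```python
def plain_gradient(weights):
    """Gradient of the output w.r.t. the input for stacked layers f(x) = w * x."""
    grad = 1.0
    for w in weights:
        grad *= w
    return grad


def residual_gradient(weights):
    """Same stack, but each layer computes x + w * x (a skip connection)."""
    grad = 1.0
    for w in weights:
        grad *= 1.0 + w
    return grad


weights = [0.1] * 10  # hypothetical small weights for 10 layers
print(plain_gradient(weights))     # roughly 1e-10: the gradient has vanished
print(residual_gradient(weights))  # roughly 2.59: the gradient survives
```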
Why can’t traditional dense neural networks effectively process image data?
Dense networks flatten images into vectors, losing the spatial structure (e.g., pixel relationships), and require impractical numbers of parameters to handle high-dimensional image data.
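To make the parameter problem concrete, here is a rough count under assumed sizes: a 224x224 RGB input, a dense layer with 1000 units, and a convolutional layer with 64 filters of size 3x3. These sizes are illustrative choices, not values from the flashcards.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus one bias per filter for a convolutional layer."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters


n_inputs = 224 * 224 * 3  # flattened image length: 150528
dense = dense_params(n_inputs, 1000)
conv = conv_params(3, 3, 3, 64)
print(dense)  # 150529000
print(conv)   # 1792
```

The convolutional layer needs far fewer parameters because each small filter is reused at every position in the image.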
What is fine-tuning in the context of CNN models?
Fine-tuning is adapting a pre-trained CNN model to a new task by adjusting its parameters on a smaller, task-specific dataset.
What is the purpose of applying convolutional filters to an image?
Convolutional filters detect specific patterns and features in the input image, such as edges, corners, and textures.
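As a small sketch of a filter detecting a pattern, the code below slides a hand-written 1x2 difference kernel over a tiny image with a dark left half and a bright right half. The `conv2d` helper is a hypothetical, unpadded, stride-1 implementation written for this example (strictly a cross-correlation, as in most deep learning libraries).

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Dark left half, bright right half: a vertical edge between columns 1 and 2.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
kernel = [[-1, 1]]  # responds where intensity rises from left to right

response = conv2d(image, kernel)
print(response)  # [[0, 255, 0], [0, 255, 0], [0, 255, 0]]
```

The output is nonzero only at the column where the edge sits, which is what "detecting" the edge means here.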
What are the four types of detection tasks that CNN filters can perform according to the slides?
- Edge detection
- Corner detection
- Texture detection
- Object detection
What is the significance of CNNs being ‘equivariant to translations’?
It means that when features shift within an image, the CNN's feature maps shift correspondingly, so the network can still detect those features wherever they appear.
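A minimal check of this property, using a 1D signal instead of a 2D image to keep the example short (an assumption made for brevity; the same idea applies in 2D). Shifting the input by one position shifts the filter output by one position.

```python
def correlate1d(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + d] * kernel[d] for d in range(k))
        for i in range(len(signal) - k + 1)
    ]


kernel = [-1, 1]
signal = [0, 0, 5, 5, 0, 0, 0]
shifted = [0] + signal[:-1]  # same signal moved one step to the right

out = correlate1d(signal, kernel)
out_shifted = correlate1d(shifted, kernel)

print(out)          # [0, 5, 0, -5, 0, 0]
print(out_shifted)  # [0, 0, 5, 0, -5, 0]
print(out_shifted == [0] + out[:-1])  # True: the output shifted by one step too
```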
Why does shifting an image cause problems when using a flattened vector approach to image processing?
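Shifting moves each pixel to a different position in the flattened vector, which disrupts the spatial relationships between pixels and makes patterns like edges or shapes harder to recognize. The network effectively treats the shifted image as a different image.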
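A small sketch of the problem, assuming a single hypothetical dense unit whose weights match a vertical bar at one fixed location. After the bar moves one pixel to the right, the flattened vector no longer lines up with the weights and the unit's response drops to zero.

```python
def flatten(image):
    """Turn a 2D image into a single vector, row by row."""
    return [pixel for row in image for pixel in row]


def dense_unit(vector, weights):
    """Weighted sum of the inputs (bias and activation omitted)."""
    return sum(v * w for v, w in zip(vector, weights))


image = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
shifted = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]

weights = flatten(image)  # hypothetical unit tuned to the original bar position

print(dense_unit(flatten(image), weights))    # 3
print(dense_unit(flatten(shifted), weights))  # 0
```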
What role did GPUs play in advancing CNN technology?
GPUs enabled faster training of CNNs by handling the massive parallel computations required for convolutional operations.
Why might a fine-tuned ResNet model perform better than a model trained from scratch on a specific task?
A fine-tuned ResNet reuses features learned during pre-training, so it needs less training time and less task-specific data, while a model trained from scratch has to learn everything from the beginning.
What are the two reasons that CNNs have become more widely used in research?
- Advances in hardware (GPUs)
- Availability of large datasets