Week 2 - CNNs Flashcards

1
Q

What is a convolutional layer in a CNN?

A

This layer applies convolutional filters (kernels) to the input data (e.g., an image) to detect local patterns such as edges, textures, or more complex features as the network deepens.
Each filter slides (or “convolves”) over the input image, performing a dot product between the filter and a small local region of the input image. This results in a feature map.
The convolutional operation allows the network to focus on local dependencies (spatial hierarchies), making it more efficient than fully connected networks.
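The sliding dot product described above can be sketched in a few lines of pure Python. The 5×5 image and the vertical-edge kernel below are illustrative values, not from the course:

```python
# Minimal sketch of one convolutional filter sliding over a 2D input.
# No padding, stride 1: a 3x3 kernel over a 5x5 image yields a 3x3 feature map.

def conv2d(image, kernel):
    """Valid 2D cross-correlation (what deep-learning libraries call convolution)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = ih - kh + 1, iw - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # dot product between the kernel and the local image patch
            s = sum(kernel[di][dj] * image[i + di][j + dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        feature_map.append(row)
    return feature_map

# A vertical edge: left columns dark (0), right columns bright (1)
image = [[0, 0, 1, 1, 1]] * 5
# A simple vertical-edge detector (hypothetical hand-picked weights)
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

fmap = conv2d(image, kernel)
```

The feature map responds strongly where the edge sits and is zero in the flat region, which is exactly the "local pattern detection" the card describes.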

2
Q

Explain local dependencies and spatial hierarchies in the convolutional layer

A

The convolutional operation focuses on local dependencies by looking at small, localized regions of the image and detecting basic features (e.g., edges, textures). This process builds up spatial hierarchies where simple features combine into more complex ones as the network goes deeper. This local, hierarchical approach makes CNNs more efficient than fully connected networks, as they require fewer parameters and computational resources while still being able to learn complex patterns from data.

3
Q

What is the sparsity aspect of the curse of dimensionality?

A

As the number of dimensions increases, a fixed set of data points becomes increasingly sparse in the space.

This makes it harder for ML algorithms to find meaningful patterns, because any given region of the space contains few examples.

This means that models need more training data to find patterns.

It also means that models are more likely to overfit to the data.
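A quick back-of-the-envelope illustration of this sparsity (the bin count and sample budget are arbitrary choices, not from the course):

```python
# Divide each dimension into 10 bins, so a d-dimensional space has 10**d cells.
# With a fixed budget of 1,000 samples, the best-case fraction of occupied
# cells collapses as the dimensionality d grows.
samples = 1000
occupancy = {}
for d in (1, 2, 3, 6):
    cells = 10 ** d
    occupancy[d] = min(1.0, samples / cells)  # at most one sample per cell
print(occupancy)  # {1: 1.0, 2: 1.0, 3: 1.0, 6: 0.001}
```

At 6 dimensions, even a thousand samples can cover at most 0.1% of the cells, so most of the space is empty.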

4
Q

How can convolution help solve the curse of dimensionality in image classification?

A

Take only a small portion of the image at a time, e.g. a 3×3 patch of 9 pixels. On its own this doesn't work if no single portion of the image contains enough information for a classification.

To get around this, apply the same operation across different portions of the image to produce a feature map.

Then extract features from that first feature map, and repeat the process to produce a second feature map.

Stacking these layers creates a network where each successive layer learns to capture more of the information in the original image.

Because each feature map is smaller than the previous one, each layer condenses information about the original image; however, the smaller maps are also deeper (they have more channels).
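The efficiency gain over a fully connected approach comes from weight sharing: a filter's parameter count is independent of image size. A hypothetical comparison (the layer sizes below are made up for illustration):

```python
# Parameters needed to map a 32x32 grayscale image to 16 output units:
# fully connected layer vs. one convolutional layer with 16 filters.
h, w = 32, 32

# Fully connected: every output unit sees every pixel, plus one bias each
fc_params = (h * w) * 16 + 16

# Convolutional: 16 shared 3x3 filters, plus one bias each,
# regardless of how large the image is
conv_params = 16 * (3 * 3) + 16

print(fc_params, conv_params)  # 16400 160
```

Two orders of magnitude fewer parameters, and the gap widens as the image grows.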

5
Q

Why, in a CNN, is the classifier able to work with a significantly reduced quantity of information?

A

Because the feature maps contain compressed information that is still reflective of the image as a whole.

6
Q

Why do CNNs work with images?

A

Because local neighbourhoods in the image contain similar information.

If the pixels were scrambled, the accuracy of a CNN would drop to that of an MLP. This is because MLPs don't take the ordering of the pixels into account.

7
Q

What else can you use CNNs for?

A

neural style transfer

8
Q

How does neural style transfer work?

A

You train an image classifier on natural images so it learns to recognise patterns in images.

Then you feed it your content image and your style target.

You have a content loss function and a style loss function.

For the content, you extract features at some middle layer, before the final classifier. The content loss ensures that the larger features of the content image are present in the generated image; it is computed as the MSE between the feature activations of the two images.

For the style, correlations between features are computed with a Gram matrix (a correlation matrix). This tells us how different features co-occur in the style image, which captures style/texture, because style is determined by global patterns that are spatially independent. The style loss is then the difference between the Gram matrices of the generated image and the style image.

Pixels are then adjusted to jointly minimise the content loss and the style loss.
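The Gram matrix computation can be sketched in pure Python. The two tiny feature channels below are hypothetical values chosen to make the result easy to check:

```python
# Gram matrix of a layer's feature maps, as used in the style loss.
# Flatten each of the C channels to a vector of H*W activations, then take
# dot products between every pair of channels. Entry (i, j) measures how
# strongly features i and j co-occur, discarding WHERE they occur.

def gram_matrix(features):
    """features: list of C channels, each a flat list of H*W activations."""
    C = len(features)
    return [[sum(a * b for a, b in zip(features[i], features[j]))
             for j in range(C)]
            for i in range(C)]

# Two hypothetical 2x2 channels, flattened, that never fire together
f1 = [1.0, 0.0, 1.0, 0.0]
f2 = [0.0, 1.0, 0.0, 1.0]
G = gram_matrix([f1, f2])
print(G)  # [[2.0, 0.0], [0.0, 2.0]]
```

The zero off-diagonals say the two features are uncorrelated; the style loss then penalises differences between this matrix and the style image's Gram matrix.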

9
Q

Why use the Gram matrix in the style loss function, and not MSE?

A

The Gram matrix is used in the style loss function because it captures the correlations between features (texture patterns) across the image, which is essential for transferring the style of one image to another. It aggregates feature co-occurrences over all spatial positions, so it describes global texture statistics rather than pixel-wise differences.
MSE, on the other hand, suits tasks where the exact values of features (such as pixel intensities or specific activations) matter, which is why it is more commonly used for the content loss (where exact feature similarity is desired).

10
Q

In neural style transfer, what are the parameters of the model?

A

The pixels of the generated image: the network weights stay frozen, so updating the parameters is equivalent to changing the pixels.

11
Q

What is an application of neural style transfer in neuroscience?

A

Super-resolution MRI scanners are expensive.

Hyperfine Swoop scanners are much cheaper; however, their images are worse.

You could use deep-learning U-Net techniques to improve the Hyperfine images.

12
Q

What is transposed convolution?

A

A way to increase the spatial resolution of feature maps, i.e. to upsample them.

It is the reverse operation of convolution, which downsamples.

It lets you go from a compressed representation back to full-size images.

If you add a style loss to the MSE loss when upsampling, you can end up with a much higher-resolution image than before.
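The upsampling mechanics can be sketched in 1D: each input value "stamps" a scaled copy of the kernel into a longer output. The input and kernel values below are illustrative:

```python
# 1D transposed convolution with stride 2: every input element contributes
# a scaled copy of the kernel to the output, spaced `stride` apart, so the
# output is longer than the input (upsampling).

def transposed_conv1d(x, kernel, stride=2):
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for k, w in enumerate(kernel):
            out[i * stride + k] += v * w  # overlapping stamps are summed
    return out

x = [1.0, 2.0]               # compressed representation (2 values)
kernel = [1.0, 0.5, 0.25]
y = transposed_conv1d(x, kernel)
print(y)  # [1.0, 0.5, 2.25, 1.0, 0.5] — 2 inputs became 5 outputs
```

Note the overlap at position 2, where the tail of the first stamp and the head of the second are summed; this overlap-and-add is what distinguishes transposed convolution from simple nearest-neighbour upsampling.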
