11 - The Eyes of a Machine Flashcards
Who are the co-founders of the Department of Neurobiology at Harvard?
David Hubel and Torsten Wiesel
They were awarded the Nobel Prize in Physiology or Medicine in 1981.
What was the main focus of Hubel and Wiesel’s research?
The visual system of cats
Their work involved mapping the visual cortex and understanding neural responses to visual stimuli.
What invention did Hubel create in 1957 for recording neuron activity?
A tungsten electrode
This electrode was preferred over glass micropipettes and steel electrodes for its durability.
What anesthetic was used on cats during Hubel and Wiesel’s experiments?
Intraperitoneal thiopental sodium
This was administered to keep the cats under anesthesia during the experiments.
What is a ‘visual field’?
The region in front of us that our eyes are sensitive to
It changes as we move our eyes and focus on different objects.
Define ‘receptive field’.
The portion of the visual field that triggers a single neuron
The size of the receptive field can vary significantly among different neurons.
What are retinal ganglion cells?
Neurons that monitor the image on the retina
They are the first layer of neurons receiving inputs from the retina.
What is the function of a ‘simple cell’ in the visual cortex?
It fires only when all its connected retinal ganglion cells fire together
Simple cells are sensitive to specific orientations of edges.
What does a ‘complex cell’ do?
It fires in response to an edge regardless of its position in the receptive field
This indicates spatial or translational invariance.
What does ‘invariance’ refer to in the context of vision?
The ability to recognize stimuli regardless of their position or orientation
This includes translational and rotational invariance.
What is a ‘hypercomplex cell’?
A cell that fires for an edge of a specific length and orientation
It can detect complex shapes and patterns.
What was the name of the first neural network-based image recognition system?
Cognitron
Developed by Kunihiko Fukushima in 1975.
How does the learning algorithm in the cognitron work?
It strengthens synaptic connections based on neuron firing activity
This is akin to Hebbian learning principles.
What limitation did the cognitron have?
It was not translation invariant
This means it recognized patterns differently based on their position in the visual field.
What architecture did the neocognitron adopt from Hubel and Wiesel’s work?
It includes S-cells and C-cells
These cells model simple and complex cells found in the visual cortex.
What is the role of S-cells in the neocognitron?
They respond to specific features like vertical edges
S-cells feed into C-cells to indicate edge presence.
What does the term ‘grandmother cell’ refer to?
A hypothetical neuron that fires when seeing a specific complex stimulus
This concept is often used to illustrate the idea of specific neural responses to familiar faces.
What are C-cells in the context of the visual processing model?
C-cells pool the responses of S-cells tuned to the same feature (say, a vertical edge) across different patches of the visual field.
What is the role of S-cells in the visual processing architecture?
S-cells detect specific features within small patches of the visual field; successive layers of S-cells build on the pooled outputs of the C-cells below them, so features are detected anywhere in the visual field.
What does translation invariance refer to in the neocognitron?
The ability to detect patterns regardless of their position or distortions.
Who developed the neocognitron and what was its significance?
Fukushima developed the neocognitron, which advanced pattern recognition in machines.
What was the main limitation of the neocognitron’s training algorithm?
It adjusted only the weights of the S-cells and was cumbersome.
What breakthrough did Yann LeCun achieve in the field of neural networks?
He developed the convolutional neural network (CNN) using the backpropagation algorithm.
How did the meeting between LeCun and Fukushima impact the field?
Fukushima was surprised to learn that LeCun had been working in parallel on the same problem.
Who were the key figures in the debate about cognitive capabilities, and what were their positions?
Piaget believed in learning during development; Chomsky argued for innate capabilities.
What was Papert’s stance during the Piaget-Chomsky debate?
Papert supported Piaget, arguing that Chomsky underestimated the role of learning.
What is a perceptron?
An early type of artificial neural network used to analyze learning and cognition.
What was LeCun’s key realization regarding learning algorithms?
A learning algorithm should minimize an objective function.
What is the difference between a loss function and an objective function?
The objective function includes a regularizer to prevent overfitting, while the loss function does not.
What is the purpose of a regularizer in machine learning?
To prevent overfitting by adding an extra term to the loss function.
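The loss-versus-objective split can be sketched in a few lines of Python. This is a minimal illustration: the quadratic loss, the L2 penalty, and the strength `lam` are common choices assumed here, not details from the text.

```python
# Sketch: objective = loss + regularizer.
# mse_loss measures fit to the data; the extra L2 term penalizes
# large weights to discourage overfitting.

def mse_loss(y_pred, y_true):
    """Mean squared error over a list of predictions."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)

def objective(y_pred, y_true, w, lam=0.1):
    """Objective = loss + lam * sum of squared weights (L2 regularizer)."""
    return mse_loss(y_pred, y_true) + lam * sum(wi ** 2 for wi in w)

loss = mse_loss([1.0, 2.0], [1.0, 0.0])          # loss alone: 2.0
obj = objective([1.0, 2.0], [1.0, 0.0], [3.0])   # loss plus 0.1 * 9 penalty
```

Minimizing `objective` instead of `mse_loss` trades a slightly worse fit for smaller, better-behaved weights.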
What was the focus of LeCun’s doctoral work?
Developing a learning algorithm related to backpropagation for multi-layer neural networks.
What did LeCun present at a conference in 1985?
A poorly written paper in French on his learning algorithm.
What significant software did LeCun and Bottou create?
SN, which later evolved into Lush, an early neural network simulation tool.
What major dataset did LeCun work with at Bell Labs?
Images of handwritten digits from the U.S. Postal Service.
What is the convolution operation in the context of image processing?
An operation that combines two functions to produce a third, particularly for feature extraction.
What are kernels in convolutional neural networks?
Small matrices used to filter images and detect features such as edges.
What is the effect of convolution on the size of an image?
Convolution typically reduces the size of the output image.
What is the stride in convolution operations?
The number of pixels by which the kernel moves across the image during convolution.
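The sliding-kernel operation can be sketched in plain Python. This is a naive illustration (like most CNN libraries, it actually computes cross-correlation); the `conv2d` name and the edge-detecting kernel are assumptions for the example.

```python
# Naive 2D convolution: the kernel slides across the image by `stride`
# pixels, and each output value is the weighted sum of the pixels under it.

def conv2d(image, kernel, stride=1):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(0, ih - kh + 1, stride):
        row = []
        for c in range(0, iw - kw + 1, stride):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# A vertical-edge kernel on a 4x4 image with a sharp left/right boundary:
img = [[0, 0, 9, 9]] * 4
k = [[1, -1], [1, -1]]       # responds where brightness changes left-to-right
print(conv2d(img, k))        # [[0, -18, 0]] * 3: large magnitude at the edge
```

Note that the 4×4 input shrinks to a 3×3 output, and a larger stride would shrink it further.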
What is the formula for calculating the output image size after convolution?
((i - k) / s) + 1, rounded down, where i is the input size, k is the kernel size, and s is the stride.
True or False: The original image size remains the same after applying a convolution operation.
False. Without padding, the output is smaller than the input.
What is the term used for the distance the kernel moves in a convolution operation?
Stride
The stride can affect the size of the output image.
How is the output image size calculated in a convolution operation?
((i - k) / s) + 1
Where i is the input image size, k is the kernel filter size, and s is the stride.
What is padding in the context of convolutional networks?
Dummy pixels added around the input image
Padding can affect the output size and feature extraction.
What is the output size when using a 3×3 kernel on a 28×28 image with a stride of 1?
26×26
This is calculated without padding.
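The size formula, extended with the usual padding term ((i + 2p - k) / s) + 1, can be checked with a small helper. A sketch; `conv_output_size` is an illustrative name, not from the text.

```python
# Output side length after convolving an i x i image with a k x k kernel.
# Integer division implements the "rounded down" in the flashcard formula.

def conv_output_size(i, k, stride=1, padding=0):
    return (i + 2 * padding - k) // stride + 1

print(conv_output_size(28, 3))             # 26: the 28x28, 3x3, stride-1 case
print(conv_output_size(28, 3, padding=1))  # 28: padding preserves the size
print(conv_output_size(5, 2))              # 4: the 5x5, 2x2 case (16 neurons)
```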
What represents the weighted sum of pixel values in a convolution operation?
The output of a neuron
The weights correspond to the kernel values.
How many neurons are needed for a 5×5 image with a 2×2 kernel and a stride of 1?
16 neurons
This produces a 4×4 output image.
Define receptive field in the context of convolutional neural networks.
The specific area of the image that a neuron responds to
Each neuron has its own region of interest.
What are simple cells and complex cells in the hierarchy of visual processing?
Simple cells respond to simple features and complex cells respond to compositions of those features
This hierarchy was posited by Hubel and Wiesel.
What is max pooling in convolutional neural networks?
An operation that outputs the largest pixel value from a region under the filter
Max pooling reduces image size and increases the receptive field.
What is the effect of max pooling on the size of an image?
Reduces the size of the image
For example, a 4×4 image with a 2×2 filter and stride of 2 results in a 2×2 image.
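The 4×4-to-2×2 example can be reproduced with a short sketch (illustrative code; the `max_pool` name is an assumption).

```python
# Max pooling: slide a window over the image and keep only the largest
# value in each region, shrinking the image.

def max_pool(image, size=2, stride=2):
    ih, iw = len(image), len(image[0])
    return [[max(image[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, iw - size + 1, stride)]
            for r in range(0, ih - size + 1, stride)]

img = [[1, 3, 2, 4],
       [5, 6, 7, 8],
       [9, 2, 1, 0],
       [3, 4, 5, 6]]
print(max_pool(img))  # [[6, 8], [9, 6]] -- the 4x4 image becomes 2x2
```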
What is the purpose of backpropagation in training neural networks?
To calculate gradients and update weights
This process minimizes the error between expected and actual outputs.
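The gradient-based update at the heart of backpropagation can be illustrated on a toy one-weight model (an assumed minimal setup, not LeNet itself; names and the learning rate are illustrative).

```python
# One neuron, one weight: y = w * x, squared error (y - target)^2.
# Each step computes the gradient of the error with respect to w
# and nudges w downhill.

def train_step(w, x, target, lr=0.1):
    y = w * x                    # forward pass
    grad = 2 * (y - target) * x  # d/dw of (y - target)^2
    return w - lr * grad         # gradient descent update

w = 0.0
for _ in range(50):
    w = train_step(w, x=1.0, target=3.0)
# w now approaches 3.0, driving the error toward zero
```

In a multi-layer network, backpropagation applies the chain rule to compute this same kind of gradient for every weight in every layer.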
What is the role of activation functions in neural networks?
To introduce non-linearity and enable backpropagation
Activation functions must be differentiable.
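Two common activations and their derivatives can be sketched directly; the sigmoid identity σ′(x) = σ(x)(1 − σ(x)) is standard, and the differentiability is what lets gradients flow backward through the network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # sigma'(x) = sigma(x) * (1 - sigma(x))

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # flat (zero gradient) for negative inputs

print(sigmoid(0.0), sigmoid_grad(0.0))  # 0.5 0.25
print(relu(-2.0), relu_grad(3.0))       # 0.0 1.0
```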
What are hyperparameters in the context of neural networks?
Parameters not learned during training that influence performance
Examples include the number of layers, kernel sizes, and activation functions.
What was LeNet used for in the banking industry?
To read and recognize digits on checks
LeNet was one of the early successful applications of convolutional neural networks.
What does the output layer of a neural network for digit recognition typically consist of?
10 neurons
Each neuron corresponds to a digit from 0 to 9.
What happens during stochastic gradient descent?
A subset of images is used for each pass through the network
This is a method to optimize the learning process.
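The mini-batch idea can be sketched on a toy one-dimensional model (an assumed setup; the function name, batch size, and learning rate are illustrative).

```python
import random

# Stochastic gradient descent sketch: each pass updates the weight using
# only a small random subset (mini-batch) of the data, not the whole
# dataset. Toy model: y = w * x with squared-error loss.

def sgd(data, w=0.0, lr=0.05, batch_size=2, epochs=200, seed=0):
    rng = random.Random(seed)
    for _ in range(epochs):
        batch = rng.sample(data, batch_size)   # random subset per pass
        grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
        w -= lr * grad
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]    # generated by y = 2x
w = sgd(data)                                  # converges near 2.0
```

Averaging the gradient over a small batch makes each update cheap while still pointing roughly downhill.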
What was a significant limitation of convolutional neural networks in the 1990s?
Lack of general-purpose software to build CNNs
AT&T did not allow its software to be released as open source.
What algorithms were outperforming CNNs for low-resolution images in the 1990s?
Conventional techniques
CNNs were still not widely adopted due to skepticism in the computer vision community.
What hardware advancement in the 2000s significantly changed deep learning?
Graphical Processing Units (GPUs)
GPUs were originally designed for rendering 3D graphics but proved vital for deep learning tasks.
What dataset did Jürgen Schmidhuber’s team use to train multi-layer perceptrons?
MNIST images
They achieved low error rates of 0.35 percent using deep learning techniques.
What was the name of the first massive CNN built by Hinton’s lab?
AlexNet
AlexNet demonstrated that conventional methods for image recognition could not compete.
What was the purpose of vector maps in Hinton and Mnih’s research?
To teach neural networks how to label pixels in aerial images
Vector maps provided clear information about road locations.
What programming interface allowed GPUs to be used for general-purpose tasks?
CUDA
CUDA enabled engineers to perform tasks beyond just graphics acceleration.
What breakthrough did Hinton’s lab achieve in speech recognition?
Using CUDAMat to program deep neural networks
This demonstrated the versatility of GPUs in different machine learning tasks.
What was Sutskever’s view on the limitations of support vector machines (SVMs)?
The ceiling for SVMs is low compared to neural networks
He believed neural networks had a higher potential if trained correctly.
What was the ImageNet challenge introduced in 2010?
Train a system to categorize 1.2 million images into 1,000 categories
It aimed to advance the field of computer vision.
What was the top-5 error rate achieved by AlexNet in the ImageNet competition?
17 percent
This was significantly lower than the previous winners’ rates.
What activation function did AlexNet use for its neurons?
Rectified Linear Unit (ReLU)
This was a departure from the traditional sigmoid function.
What impact did deep neural networks have on various fields?
Revolutionized computer vision, natural language processing, machine translation, and more
Their applications are vast and continually growing.
What did Mikhail Belkin compare the current state of machine learning to in physics?
The emergence of quantum mechanics
He suggested that deep neural networks are leading to new theories in machine learning.