Principles of deep learning in artificial networks Flashcards

1
Q

Deep learning approach (1)

A

Learn from experience (machine learning):

  • No formal rules of transformations
  • No ‘knowledge base’
  • No logical inference
2
Q

Deep learning approach (2)

A

Process inputs through a hierarchy of concepts:

  • Each concept defined by its relationship to simpler concepts
  • So, build complicated concepts out of simpler concepts
3
Q

Course goal (1)

A

Explore the relationship between cognitive science and AI

4
Q

Course goal (2)

A

Focus on deep learning in artificial machine learning networks and comparison to biological systems

  • Which biological processes do deep networks imitate?
  • What is missing in artificial networks?
  • What might make AI/machine learning more like biological intelligence/learning?
5
Q

Course goal (3)

A

Become familiar with the use of AI in cognitive science research

6
Q

Course goal (4)

A

Build some deep learning networks to do human-like tasks

7
Q

Why deep learning?

A

AI has made great advances in tasks that are:
- Described by formal mathematical rules
- Relatively simple for computers
- Difficult for humans
AI has been less effective in tasks that are:
- Hard or impossible to describe using formal mathematical rules
- BUT easy for humans to perform (intuitive or automatic)
Deep learning approaches the latter by simulating neural computation

8
Q

Representation & features

Machine learning performance depends on the …

A

representation of the case to be classified

what information the computer is given about the situation

9
Q

Representation & features

Each piece of input information is known as a …

A

feature
(The same feature can be represented in different formats, and it is often easy to convert between them. The chosen format strongly affects the difficulty of the task.)

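As a toy illustration of how the chosen feature format affects task difficulty (this example is invented, not from the source): points inside versus outside a circle are awkward to separate from raw Cartesian coordinates, but trivial once converted to a radius feature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: classify 2-D points as inside vs outside a circle of radius 1.
pts = rng.uniform(-2, 2, size=(200, 2))
labels = np.hypot(pts[:, 0], pts[:, 1]) < 1.0

# Cartesian format: no single threshold on x or y separates the classes.
# Converted format: the radius feature separates them with one threshold.
radius = np.hypot(pts[:, 0], pts[:, 1])
predicted = radius < 1.0

print((predicted == labels).all())  # True: same information, easier format
```

The information content is identical in both formats; only the ease of learning the classification rule changes.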
10
Q

Representation in deep networks

A
  1. Useful features may need to be transformed or extracted first.
  2. So deep networks have multiple representations -> each is built from an earlier representation
  3. This can: transform features to a different format before learning their links to the output AND extract complex features from simpler features
  4. Essentially multiple steps in a program:
    - Each layer can be seen as the computer’s memory state after executing a set of instructions.
    - Deeper networks execute more instructions in sequence.
  5. Just like a computer program, the individual steps are generally very simple.
    - Complex outcomes emerge from interactions between many simple steps
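A minimal sketch of the layers-as-program-steps idea, with made-up instructions (the specific steps are illustrative, not from the source): each "layer" is one trivial operation, and the state after each layer is like a program's memory state mid-execution.

```python
# Three trivial "layers" (hypothetical instructions):
def double(x):
    return x * 2

def add_one(x):
    return x + 1

def square(x):
    return x ** 2

state = 3                        # the input
for layer in (double, add_one, square):
    state = layer(state)         # deeper networks run more steps in sequence

print(state)  # → 49: each step is trivial; the composite mapping is not
```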
11
Q

What is a deep network?

A

A learning network that transforms or extracts features using:

  • Multiple nonlinear processing units
  • Arranged in multiple layers with:
  • Hierarchical organisation
  • Different levels of representation and abstraction
12
Q

20th century view of object recognition

A
  1. Builds a representation of local image features
  2. Builds a representation of larger-scale shapes and surfaces
  3. Matches shapes and surfaces with stored object representations -> recognition
13
Q

Why nonlinear functions?

A

Any operation that can be done with only linear functions of the input can be straightforwardly described by formal mathematical rules, so it is not a good use for deep networks.

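This can be checked directly: stacking linear layers is exactly equivalent to a single linear layer with combined weights, so depth adds nothing without a nonlinearity (a sketch with arbitrary random weights).

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 3))    # weights of a first linear "layer"
W2 = rng.normal(size=(2, 4))    # weights of a second linear "layer"
x = rng.normal(size=3)          # an arbitrary input

deep = W2 @ (W1 @ x)            # two stacked linear layers...
shallow = (W2 @ W1) @ x         # ...equal one layer with weights W2 @ W1

print(np.allclose(deep, shallow))  # True: no extra expressive power
```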
14
Q

Name the complex nonlinear function with four operations or processing steps

A

Filter, threshold, pool and normalize

15
Q

Name 1 issue which arises with ReLU

A

It has no maximum output, while a biological neuron does have a maximum firing rate

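A quick sketch of the issue, plus one common remedy (clipping the output, as in "ReLU6"-style units; the ceiling value here is illustrative):

```python
import numpy as np

def relu(x):
    # Standard ReLU: negative inputs become zero, positive pass through.
    return np.maximum(0.0, x)

def clipped_relu(x, ceiling=6.0):
    # Adding a ceiling gives a maximum output, loosely analogous to a
    # biological neuron's maximum firing rate.
    return np.minimum(relu(x), ceiling)

print(relu(1e6))          # → 1000000.0: no maximum output
print(clipped_relu(1e6))  # → 6.0: output saturates at the ceiling
```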
16
Q

What does the filter operation do especially?

A

The response of each unit depends on several neighbouring inputs. So the units after filtering respond to a certain area of the input image, and the activation of neighbouring units will often be similar. After several filter steps, each integrating inputs over an area, each unit will respond very similarly to an extensive area of the input. So neighbouring units are representing very similar information.
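A back-of-envelope sketch of this growth, assuming stride-1 filters (the formula and widths are illustrative): each filter of width w lets a unit see w − 1 additional input positions, so receptive fields widen with every layer.

```python
def receptive_field(num_layers, filter_width):
    # With stride-1 filters, each layer adds (filter_width - 1) positions.
    return 1 + num_layers * (filter_width - 1)

for depth in (1, 2, 5, 10):
    print(depth, receptive_field(depth, 3))
# After ten 3-wide filters, each unit already integrates 21 input positions,
# so neighbouring units see heavily overlapping areas of the input and
# carry very similar information.
```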

17
Q

What does the pooling operation do?

A

Downsamples the units to improve computational efficiency. Discards some data in favour of computational efficiency.

18
Q

The threshold and pool operations use …

A

max functions. That is why, by the pool stage, we have a mean activation above zero and an arbitrary range.

19
Q

What does the normalisation operation do?

A

It linearly scales the data so that each feature map’s responses, across all images, have a mean activation of zero.

20
Q

Why is normalisation important? Name 4 reasons.

A
  1. Machine learning generally assumes that data reflects measurements of independent and identically distributed (IID) variables. Normalisation forces identical distributions.
  2. If the activation function depends on whether a unit’s response is above or below zero, then with zero-mean inputs and zero-mean filters about half of the units will be active and half inactive. This even split of activation is a very efficient way to store information in a network of limited size.
  3. Having the same range for all feature maps and layers means the same maximum threshold in the activation function can be used throughout the network.
  4. As a result of these and other technical considerations, both training speed and final classification accuracy are far better after normalisation.
21
Q

Filter/convolve:

A

determine how well each group of nearby pixels matches each of a group of filters

22
Q

Threshold/rectify:

A

introduces a nonlinearity by setting negative activations of units to zero (and maybe set a maximum activation)

23
Q

Pool:

A

Downsample the units to improve computational efficiency

24
Q

Normalise:

A

Rescale responses of each feature map to have mean zero and standard deviation one, so each feature map contributes similarly to classification
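The four operations from the preceding cards can be sketched end to end on a toy image (the image, filter values, and sizes here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.normal(size=(8, 8))      # toy greyscale "image"
kernel = np.array([[1.0, -1.0]])     # a tiny hypothetical edge filter

def filter_op(img, k):
    # 1. Filter/convolve: match each pixel neighbourhood against the filter.
    kh, kw = k.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def threshold_op(x):
    # 2. Threshold/rectify: set negative activations to zero (ReLU).
    return np.maximum(0.0, x)

def pool_op(x):
    # 3. Pool: downsample by taking the max of each 2x2 block.
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3))

def normalise_op(x):
    # 4. Normalise: rescale to mean zero and standard deviation one.
    return (x - x.mean()) / x.std()

fmap = normalise_op(pool_op(threshold_op(filter_op(image, kernel))))
print(fmap.shape)                  # → (4, 3): downsampled feature map
print(fmap.mean(), fmap.std())     # approximately 0.0 and 1.0
```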

25
Q

As we get higher up the network, these filters get harder to understand in two important ways. Name them.

A
  1. The filter shape crosses multiple independent feature maps. An edge detector applied to an image is easy enough to conceptualise, but such a high-dimensional filter is much harder to picture.
  2. The input feature maps become more abstract. It gets very hard to conceptualise what feature is represented.
26
Q

Name the reasons why shared weights are used in artificial deep networks (and which do not apply in biological deep networks)?

A

Filters generally have a single set of weights for all positions in the feature map because:

  1. If a feature is useful to compute at one position, it is probably also useful at another position.
  2. The filter values are weights that need to be learned. It is very computationally demanding to do this if the set is too large.
  3. The convolution operation is a very fast matrix function. If the same filter is not applied at every position, this fast operation cannot be used.
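A rough parameter count shows the scale of the computational argument (the image and filter sizes below are illustrative, not from the source):

```python
positions = 224 * 224                  # units in one feature map
filter_weights = 3 * 3                 # one 3x3 filter

shared = filter_weights                # one set of weights for every position
unshared = filter_weights * positions  # a separate filter at each position

print(shared, unshared)  # → 9 451584: sharing shrinks the learning problem
                         # for this map from ~450k weights to just 9
```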
27
Q

The softmax operation …

A

The weights through our network transform each input image into a ‘score’ for each category, reflecting the match between the top layer’s pattern of activation and the patterns produced by previous examples of that category. This score must then be converted to a probability that the input image falls into each category. This is almost always done with the ‘softmax’ function.
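A minimal softmax sketch (the max-subtraction is a standard numerical-stability trick and does not change the result; the scores are hypothetical):

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - np.max(scores))  # stability: shift so max score is 0
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])       # hypothetical per-category scores
probs = softmax(scores)

print(probs.argmax())           # → 0: highest score gets highest probability
print(round(probs.sum(), 6))    # → 1.0: scores become valid probabilities
```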

28
Q

Filter structures are targets for machine learning …

A

The convolution filters are the main links between different layers of our network, so they effectively form the weights of connections between the nodes in a neural network. Here, the nodes are pixels in a feature map, and the connections between them are filters. So, to learn the weights of connections, the network learns the structure of the filters.