Chapter 4, W5: Elements of Connectionist Cognitive Science Flashcards

1
Q

Cartesian rationalism

A

The notion that the products of thought were rational conclusions drawn from the rule-governed manipulation of pre-existing ideas.
- The Cartesian view that thinking is equivalent to performing mental logic—that it is a mental discourse of computation or calculation—has inspired the logicism that serves as the foundation of the classical approach.

2
Q

Empiricism

A

The view that the source of all ideas is experience.
- Locke was one of the pioneers of empiricism, a reaction against Cartesian philosophy.
- Locke argued for experience over innateness, for nurture over nature.

3
Q

Connectionist Elements

A
  • Element 1: Association
  • Element 2: Decision
4
Q

Artificial Neural Network

A

The basic medium of connectionism is a type of model called an artificial neural network (ANN), or a parallel distributed processing (PDP) network. ANNs:
- are “neuronally inspired” networks
- are built from simple processors (artificial neurons)
- learn from experience
- operate in parallel

Artificial neural networks are exposed to environmental stimulation—activation of their input units—which results in changes to connection weights.

5
Q

Laws of Association (Aristotle)

A

Fundamental to connectionism’s empiricism is the key idea
of association: different ideas can be linked together, so that if one arises, then the association between them causes the other to arise as well.
- Contiguity (or habit)
- Similarity
- Contrast

6
Q

The behaviour of a processor in an artificial neural network, which is analogous to a neuron, can be characterized as follows:

A
  1. The processor computes the total signal (its net input) being sent to it by other processors in the network.
  2. The unit uses an activation function to convert this net input into internal activity (usually a continuous number between 0 and 1).
  3. The unit converts its internal activity into an output signal and sends this signal on to other processors.

A network uses parallel processing because many, if not all, of its units perform their operations simultaneously.

The signal sent by one processor to another is a number that is transmitted through a weighted connection, which is analogous to a synapse.
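
A minimal sketch of these three steps in Python, assuming a logistic activation function (one common choice that yields activity between 0 and 1); the function and variable names are illustrative, not from the source:

    import math

    def unit_output(input_signals, weights):
        # Step 1: net input is the weighted sum of incoming signals.
        net = sum(signal * weight for signal, weight in zip(input_signals, weights))
        # Step 2: an activation function converts net input into internal
        # activity between 0 and 1 (here, the logistic function).
        activity = 1.0 / (1.0 + math.exp(-net))
        # Step 3: the internal activity is sent on as the output signal.
        return activity

    # Example: three incoming signals arriving through weighted connections.
    print(unit_output([1.0, 0.5, 0.0], [0.8, -0.3, 0.5]))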

7
Q

Weight

A

The weight is a number that defines the nature and strength of the connection. For example, inhibitory connections have negative weights, and excitatory connections have positive weights. Strong connections have strong weights (i.e., the absolute value of the weight is large), while weak connections have near-zero weights.

8
Q

Tabula Rasa

A

Tabula rasa, or the blank slate: the notion of a mind being blank in the absence of experience. Modern connectionist networks can be described as endorsing the notion of the blank slate.

This is because, prior to learning, the pattern of connections in modern networks has no pre-existing structure. The networks either start literally as blank slates, with all connection weights equal to zero, or they start with all connection weights assigned small, randomly selected values.
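
A small sketch of the two starting points described above; the network dimensions are arbitrary, for illustration only:

    import random

    n_inputs, n_outputs = 3, 2

    # A literal blank slate: every connection weight starts at zero.
    weights_zero = [[0.0] * n_outputs for _ in range(n_inputs)]

    # The alternative blank slate: small, randomly selected starting weights.
    weights_random = [[random.uniform(-0.1, 0.1) for _ in range(n_outputs)]
                      for _ in range(n_inputs)]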

9
Q

Interpret the Figure: Modifiable Connections

A

Figure 4-1 is James’ biological account of association:
- It illustrates two ideas, A and B, each represented as a pattern of activity in its own set of neurons: A by activity in neurons a, b, c, d, and e; B by activity in neurons l, m, n, o, and p.
- The assumption is that A represents an experience that occurred immediately before B.
- When B occurs, activating its neurons, residual activity in the neurons representing A permits the two patterns to be associated by the law of habit. That is, the “tracts” connecting the neurons (the “modifiable connections” in Figure 4-1) have their strengths modified.

10
Q

How does this figure reveal three properties that are common to modern connectionist networks?

A
  • First, the system is parallel: more than one neuron can be operating at the same time.
  • Second, the system is convergent: the activity of one of the output neurons depends upon receiving or summing the signals sent by multiple input neurons.
  • Third, the system is distributed: the association between A and B is the set of states of the many “tracts” illustrated in Figure 4-1; there is not just a single associative link.
11
Q

Hebb Rule

A
  • Connectionists use the Hebb Rule: “neurons that fire together wire together”
    When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.
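
A sketch of the Hebb Rule as a weight update in Python: the change in a connection weight is proportional to the product of the sending and receiving units' activities (the learning rate of 0.1 is an illustrative choice):

    def hebb_update(weight, pre_activity, post_activity, learning_rate=0.1):
        # "Fire together, wire together": the connection strengthens
        # whenever the sending and receiving units are active together.
        return weight + learning_rate * pre_activity * post_activity

    w = 0.0
    for _ in range(5):                # repeated co-activation...
        w = hebb_update(w, 1.0, 1.0)  # ...strengthens the connection
    print(w)                          # 0.5
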
12
Q

Standard Pattern Associator (“Distributed Memory”)

A

The standard pattern associator, which is structurally identical to Figure 4-1, is a memory capable of learning associations between pairs of input patterns, or of learning to associate an input pattern with a categorizing response.
- The standard pattern associator is empiricist in the sense that its knowledge is acquired through experience.
- The memory begins as a blank slate (i.e., all of the connections between processors start with weights equal to zero).
- During a learning phase, pairs of to-be-associated patterns simultaneously activate the input and output units in Figure 4-1.
- With each presented pair, all of the connection weights—the strength of each connection between an input and an output processor—are modified by adding a value to them. The value added follows the Hebb Rule.

The standard pattern associator is called a distributed memory because its knowledge is stored throughout all the connections in the network, and because this one set of connections can store several different associations.
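
A sketch of a standard pattern associator, assuming Hebbian learning with a learning rate of 1 and orthogonal input patterns (so recall is exact); one weight matrix stores both associations:

    def train(weights, input_pattern, output_pattern):
        # Hebbian learning: add the product of input and output activities
        # to the weight of every input-output connection.
        for i, a_in in enumerate(input_pattern):
            for j, a_out in enumerate(output_pattern):
                weights[i][j] += a_in * a_out

    def recall(weights, input_pattern):
        # Each output unit's activity is simply its net input.
        return [sum(input_pattern[i] * weights[i][j] for i in range(len(input_pattern)))
                for j in range(len(weights[0]))]

    weights = [[0.0, 0.0], [0.0, 0.0]]      # blank slate
    train(weights, [1.0, 0.0], [1.0, 0.0])  # first association
    train(weights, [0.0, 1.0], [0.0, 1.0])  # second association
    print(recall(weights, [1.0, 0.0]))      # [1.0, 0.0]: the first association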

13
Q

Classical Conditioning

A

Classical conditioning is an example of the associative law of contiguity at work.
- Initially, the conditioned stimulus is not associated with the unconditioned response.
- After repeated pairings with an unconditioned stimulus (contiguity), the conditioned stimulus becomes associated with the desired response.

14
Q

Emergence

A

Emergence occurs when the properties of a whole (i.e., a complex idea) are more than the sum of the properties of its parts (i.e., a set of associated simple ideas).

Emergent properties are often defined as properties that are not found in any component of a system but are still features of the system as a whole.
- Emergence results from nonlinearity.

15
Q

How is this system considered a linear system?

A

If a system is linear, then its whole behaviour is exactly equal to the sum of the behaviours of its parts. In this system, each output unit’s activity is exactly equal to its net input, the weighted sum of its inputs.
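
In symbols (a minimal statement of this linearity, with a_i the input activities and w_i the connection weights feeding an output unit):

    o = \mathrm{net} = \sum_i w_i a_i

Doubling every input activity exactly doubles the output; there is no threshold or other nonlinear transformation in between.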

16
Q

How do neurons demonstrate nonlinear processing?

A

Through their action potentials.
- The inputs to a neuron are weak electrical signals, called graded potentials, which stimulate the receiving neuron as they travel through its dendrites.
- If enough of these weak graded potentials arrive at the neuron’s soma at roughly the same time, their cumulative effect disrupts the neuron’s resting electrical state, resulting in an action potential.
- A crucial property of the action potential is that it is an all-or-none phenomenon, representing a nonlinear transformation of the summed graded potentials.
- The all-or-none output of neurons is a nonlinear transformation of summed, continuously varying input.

17
Q

Heaviside step function

A

A nonlinear function that, when it replaces the linear activation function, can turn a standard pattern associator into a perceptron.

An example of this function in use: the McCulloch-Pitts neuron.
Like the output units in the standard pattern associator, the neuron first computes its net input by summing all of its incoming signals. However, it then uses a nonlinear activation function to transform net input into internal activity.
- It uses the Heaviside step function, which compares the net input to a threshold.
- If net input > threshold, activity = 1; otherwise, activity = 0.
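
A sketch of a McCulloch-Pitts neuron in Python; the weights and threshold below are arbitrary illustrative values:

    def heaviside(net, threshold):
        # All-or-none: activity is 1 if net input exceeds the threshold, else 0.
        return 1 if net > threshold else 0

    def mcculloch_pitts(inputs, weights, threshold):
        net = sum(s * w for s, w in zip(inputs, weights))  # sum incoming signals
        return heaviside(net, threshold)                   # nonlinear decision

    print(mcculloch_pitts([1, 0, 1], [0.5, -0.4, 0.9], 0.5))  # net = 1.4 -> 1
    print(mcculloch_pitts([0, 1, 0], [0.5, -0.4, 0.9], 0.5))  # net = -0.4 -> 0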

18
Q

Perceptron

A

Artificial neural networks that can be trained to be pattern classifiers: given an input pattern, they use their nonlinear outputs to decide whether or not the pattern belongs to a particular class. However, when the Heaviside step function is used as the activation function, a perceptron defines only a single straight cut through the pattern space and can therefore deal only with linearly separable problems.
- They assign perceptual predicates; the nonlinear activation function used by perceptrons gives them this ability, which standard pattern associators lack.
- Like other ANNs, a perceptron is a “neuronally inspired” network, built from simple processors (artificial neurons) that operate in parallel, and it learns from experience.

19
Q

Predicate

A

A predicate is the part of a sentence, or a clause, that tells what the subject is doing or what the subject is. For example, in the sentence “The cat is sleeping in the sun,” the clause “is sleeping in the sun” is the predicate; it describes what the cat is doing.

20
Q

Nonlinear activation functions (activation functions)

A
  • Neurons do not merely associate, they make decisions based on their incoming signals
  • A second connectionist element is a nonlinear activation function f(x)
  • This function serves to make a decision about incoming signals
  • More commonly called activation functions
21
Q

Explain how (A) in Figure 4-2 is an example of a linearly separable problem:

A

You can create a simple linear boundary between the two decision regions using a linear function.

This is because a single straight cut through the pattern space divides it into two decision regions that generate the correct pattern classifications. The dashed line in Figure 4-2A indicates the location of this straight cut for the AND problem. Note that the one “true” pattern falls on one side of this cut, and that the three “false” patterns fall on the other side of this cut.
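
A sketch of this cut in code: with both weights set to 1 and a threshold of 1.5 (one standard choice, not necessarily the values in the figure), the line x1 + x2 = 1.5 puts (1, 1) on one side and the three “false” patterns on the other:

    def perceptron_and(x1, x2):
        # The single straight cut is the line x1 + x2 = 1.5.
        return 1 if x1 + x2 > 1.5 else 0

    for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(pattern, perceptron_and(*pattern))  # only (1, 1) -> 1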

22
Q

Explain how (B) in Figure 4-2 is an example of a linearly nonseparable problem:

A

An example of a linearly nonseparable problem is the XOR problem, whose pattern space is illustrated in Figure 4-2B. Now it is impossible to separate all of the black points from all of the white points with a single straight cut. Instead, two different cuts are required, as shown by the two dashed lines in Figure 4-2B. This means that XOR is not linearly separable.

23
Q

Perceptron Limitations

A
  • A perceptron can solve a linearly separable problem
  • A perceptron cannot solve a linearly nonseparable problem
24
Q

Hidden Units (Multi-layered perceptrons)

A

Additional processing units within the layers of a perceptron; i.e., they are intermediaries between input and output units.

Hidden units can detect additional features that transform the problem by increasing the dimensionality of the pattern space. As a result, the use of hidden units can convert a linearly nonseparable problem into a linearly separable one, permitting a single binary output unit to generate the correct responses.

Figure 4-4 shows how the AND circuit illustrated in Figure 4-3 can be added as a hidden unit to create a multilayer perceptron that can compute the linearly nonseparable XOR operation.
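
A sketch of this construction, assuming weights of 1 from each input, a strongly inhibitory weight of -2 from the hidden AND unit, and a threshold of 0.5 at the output unit (standard values for this circuit, not necessarily those in Figure 4-4):

    def step(net, threshold):
        return 1 if net > threshold else 0

    def xor(x1, x2):
        # Hidden unit: the AND circuit, detecting when both inputs are on.
        hidden = step(x1 + x2, 1.5)
        # Output unit: excited by the inputs, inhibited by the hidden AND unit.
        return step(x1 + x2 - 2 * hidden, 0.5)

    for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(pattern, xor(*pattern))  # 0, 1, 1, 0: the XOR responses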