Prelim 1 Flashcards
What is a “homunculus”?
A “homunculus” is a little human. In cognitive science, it refers to an argument that accounts for a phenomenon in terms of the very phenomenon that it is supposed to explain, which results in an infinite regress.
Example #1:
Bert: How do eyes project an image to your brain?
Ernie: Think of it as a little guy in your brain watching the movie projected by your eyes.
Bert: Ok, but what is happening in the little guy in your head’s brain?
Ernie: Well, think of it as a little guy in his brain watching a movie…
What is the problem with appealing to a homunculus, as an explanation of mental faculties?
Unilluminating because it simply appeals to another homunculus to explain the problem. Infinite regress.
How do functionalism and decompositionism each contribute to a better explanation than appealing to a homunculus?
Functionalism -> Mental states are constituted by their causal (functional) role, not by their material make-up
Ex: what makes something memory (say) is the function it has in the overall system, not that it is the hippocampus
So if two systems have the same functional organization, they have the same mental capacity
Decompositionism -> Each mental capacity is built up out of other, less intelligent capacities (Reigning solution to the problem of the homunculus)
ex -> language comprehension
Thus, both theories provide an alternative to appealing to a homunculus: a capacity is defined by its functional role and decomposed into less intelligent sub-capacities, rather than being attributed to a little person who controls it
example of decomposition w/ language comprehension
To be able to understand one sentence you have to analyze its grammar and understand each word; comprehension decomposes into these simpler sub-capacities
Everything can be reduced
Classical reductionism
Special sciences capture truths that can ultimately be restated in terms of lower level sciences, terminating at physics.
Ex: Psychology reduces to neuroscience just in case the laws of psychology are derivable from those of neuroscience.
Mind-brain identity theory (MBIT)
The general claim of MBIT is that for every type of mental state (e.g. episodic memory), there will be a corresponding type of brain state (some configuration of neurons with certain pattern of activity or organization)
Types of mental states = types of brain states. (E.g., memories = certain types of neural structures, which can be found in human brains)
Functionalism
Mental states are constituted by their causal (functional) role, not by their material make-up; basic idea of functionalism is that what makes something memory (say) is the function it has in the overall system
Rejection of MBIT
Multiple realizability
When the same type of thing at the higher level (e.g. memory) can be implemented or “realized” in multiple ways at the lower level (e.g. in mammalian brains and in avian brains)
Example:
MBIT implies that every creature in the universe that has memory has to have a hippocampus
At a minimum, it seems rash to conclude that birds don’t have memory just because they don’t have the same brain structure we do.
More strongly, might think that it’s obvious that birds have memory, so any theory that identifies memory with a brain region is a bad theory.
Context sensitivity
Two individuals of the same type at the lower level can be individuals of different types at a higher level, depending on the context.
The same kind of physical object can be a second-hand gear in one context and a minute-hand gear in another context.
Just having a physical description of the gear, without the context, doesn’t tell you whether the gear is a minute-hand gear.
Why might functionalism make reductionism unlikely to be true?
In psychology, we want to characterize mechanisms by the functions they serve.
As a result, according to anti-reductionism, we are bound to end up with physically dissimilar mechanisms doing the same job (as in the case of memory in birds/humans).
But the anti-reductionist wager is that just as there is no simple one-one mapping between the kinds of molecular biology and the kinds of biochemistry, we can expect no simple one-one mapping between the categories of psychology and the categories of neuroscience.
Validity
An argument is valid if and only if it is impossible for its premises to be true and its conclusion false
Truth gets preserved – with a valid argument you never go from true premises to false conclusions
Is this a sound and/or valid argument?
Sound + valid argument (true premises)
Is this argument sound and/or valid?
Truth preserving… valid, even though false premises and conclusion; so not sound
How does the computer model provide an explanation for how a physical system could be rational?
The mind can operate in a truth preserving way because:
- beliefs are represented in symbol sequences
- like a computer, these symbols have physical properties that the mind/brain manipulates into other symbols (that also have physical properties of course)
- the physical manipulations of the symbols are arranged so that they are truth preserving
Naturalism
The demand to explain the mind in physical terms: how can a physical system be organized so that its causal processes ensure that if it believes something true, those causal processes will lead to other true beliefs (and not false beliefs)?
How do Turing machines work?
The symbols are physical objects with physical characteristics (size, shape, etc.). But the symbols can be interpreted as representing various things.
The rules that govern the physical system (the syntax) are designed such that our interpretation of the symbols (the semantics) will be truth preserving.
What the head does is determined by:
1. the token it finds in the square it’s scanning (e.g., 1, 0)
2. the internal state it is currently in (q0, q1, …)
These factors determine:
1. what token to write on the present square
2. the motion of the head (left, right, halt)
3. what internal state (q0, q1, …) to be in for the next step
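A minimal sketch of this step loop (a toy example I made up, in Python): the transition table maps (current state, scanned token) to the three outputs listed above.

```python
# Toy Turing machine step loop. The transition table maps
# (current state, scanned token) -> (token to write, head motion, next state).
def run_turing_machine(tape, transitions, state="q0", head=0):
    while True:
        token = tape[head]
        write, move, next_state = transitions[(state, token)]
        tape[head] = write                  # 1. write a token on the present square
        if move == "halt":                  # 2. move the head left/right, or halt
            return tape
        head += 1 if move == "right" else -1
        state = next_state                  # 3. enter the internal state for the next step

# Example machine: scan rightward, flipping 1s to 0s, and halt at the first 0.
transitions = {
    ("q0", "1"): ("0", "right", "q0"),
    ("q0", "0"): ("0", "halt", "q0"),
}
print(run_turing_machine(list("1110"), transitions))   # ['0', '0', '0', '0']
```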
How can the computer model of the mind explain the productivity of thought?
productivity of thought - there is an infinite number of thoughts that we are capable of thinking
As we’ve seen, computers are symbol manipulating devices. And there are rules that govern how symbols can be manipulated (e.g., replace “Luke” with “someone”)
This paves the way for productivity. For computers can operate on recursive rules
Roughly, a rule is recursive if the category in the “if” part of the rule (e.g., NOUN) also appears in the “then” part of the rule (see the sketch below).
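A minimal sketch (a made-up toy grammar, not the course’s example) of how one recursive rule — NP appearing on both sides — yields an unbounded set of sentences:

```python
import random

# Toy grammar: the recursive rule is NP -> "the friend of" NP, so the category NP
# appears in both the "if" (left-hand) and "then" (right-hand) side of a rule.
grammar = {
    "S":  [["NP", "smiled"]],
    "NP": [["Luke"], ["someone"], ["the friend of", "NP"]],
}

def generate(symbol="S"):
    if symbol not in grammar:                      # terminal word: return it as-is
        return symbol
    expansion = random.choice(grammar[symbol])     # pick one rewrite for this category
    return " ".join(generate(part) for part in expansion)

print(generate())   # e.g. "the friend of the friend of Luke smiled"
```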
What are the elements of neural networks?
Connectionism (emphasizes fact that the knowledge is contained in the connections)
Neural nets (emphasizes similarity to actual neural networks)
Parallel distributed processing (emphasizes fact that much of the processing is not serial but simultaneous)
+ Deep learning when lots of inner layers
McCulloch & Pitts neuron (pre-wired neural nets)
– Inputs binary (1, 0)
– Each input multiplied by associated weight (e.g., x1 * w1)
– Sum the products of each pair of input * weight (xi * wi)
– Threshold: Neuron fires if that sum exceeds threshold (threshold number can be fraction)
– Output binary: If threshold is exceeded, 1; otherwise, 0
Can capture basic logical relations (AND, OR, NOT) using these kinds of neurons.
AND gate
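A minimal sketch (hand-chosen weights and threshold, not from the slides) of a McCulloch–Pitts unit wired up as an AND gate:

```python
# McCulloch-Pitts unit: weighted sum of binary inputs; fire iff the sum exceeds the threshold.
def mp_neuron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))   # sum of input * weight products
    return 1 if total > threshold else 0                  # binary output

# AND gate: only when both inputs are 1 does the sum (1 + 1 = 2) exceed the threshold of 1.5.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mp_neuron((x1, x2), weights=(1, 1), threshold=1.5))
```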
Rosenblatt’s perceptron (learning neural network)
adds concept of changing weights to the M-P model
This requires calculating the error:
1. Error = target ([1]) – actual output ([0]). 1 – 0 = 1.
Updating also requires specifying a learning rate “alpha”
- Alpha = how much are we going to change the weights in response to the error. We’ll say .5.
New weight = old weight + ([learning rate*error] * value of input) =
w1(new) = w1(old) + ([alpha * (target-output)] * x1)
Do the same process to adjust the weight for x2
Repeat this process for each row of the truth table until you get the desired output (see the sketch below)
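A minimal sketch of that procedure on the AND truth table (the fixed threshold of 1.5 and starting weights of 0 are my own illustrative choices):

```python
# Perceptron-style learning: w_new = w_old + ([alpha * (target - output)] * x)
ALPHA, THRESHOLD = 0.5, 1.5
weights = [0.0, 0.0]
truth_table = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]   # AND

for epoch in range(10):                           # repeat over the truth table
    errors = 0
    for (x1, x2), target in truth_table:
        output = 1 if x1 * weights[0] + x2 * weights[1] > THRESHOLD else 0
        error = target - output                   # e.g. 1 - 0 = 1
        weights[0] += (ALPHA * error) * x1        # adjust w1
        weights[1] += (ALPHA * error) * x2        # same process for w2
        errors += abs(error)
    if errors == 0:                               # stop once every row is correct
        break

print(weights)   # e.g. [1.0, 1.0]: the unit now fires only when both inputs are 1
```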
limitations to perceptron model
A single-layer perceptron can only compute certain logic gates (AND, OR, NOT); it cannot compute exclusive-or (XOR); the way to overcome this is to use a multi-layer network
Interactive Activation model
Builds on Hubel & Wiesel’s discovery that certain neurons in visual cortex are highly specialized
Some neurons selectively respond when the organism is presented with a vertical line, while others selectively respond to a diagonal line (feature detectors)
Reproduces the word superiority effect: when a letter appears in a word, the letter unit receives more activation
Because connections between word units and letter units run in both directions, seeing the word feeds activation back down to its letters, making them easier to identify (see the sketch below)
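A highly simplified sketch of that bidirectional flow (my own toy version with made-up weights, not McClelland & Rumelhart’s actual model — no inhibition or decay): letter units pass activation up to word units, and word units feed activation back down to their letters.

```python
# Two-level interactive activation: bottom-up (letters -> words) and top-down (words -> letters).
WORDS = {"WORK": ["W", "O", "R", "K"], "WORD": ["W", "O", "R", "D"]}

def letter_activation(visible_letters, target_letter, cycles=3, up=0.2, down=0.2):
    letters = {l: (1.0 if l in visible_letters else 0.0) for l in "WORKD"}
    words = {w: 0.0 for w in WORDS}
    for _ in range(cycles):
        for w, spelling in WORDS.items():          # bottom-up: letters excite matching words
            words[w] += up * sum(letters[l] for l in spelling)
        for w, spelling in WORDS.items():          # top-down: words excite their letters
            for l in spelling:
                letters[l] += down * words[w]
    return letters[target_letter]

# The "K" unit ends up more active when shown in the word context W-O-R-K
print(letter_activation({"W", "O", "R", "K"}, "K"))   # higher (word superiority)
print(letter_activation({"K"}, "K"))                  # lower (letter in isolation)
```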
What we can learn from the Interactive Activation model
- Neural net models can naturally capture graded performance (relative speed and accuracy in identifying a letter in a word) given multiple factors being processed in parallel.
- The model shows how the computation of a perceptual representation of an input (a word) might involve simultaneous processing at multiple levels of abstraction (feature, letter, word)
- Although one might have thought that people’s recognition of letters depended on rules about the correct spelling, McClelland says that the model shows that this need not be so
This is shown by the fact that non-words (i.e., strings that violate orthographic rules) can facilitate recognition
General intelligence (learned) vs. instinct (innate)
General intelligence allows us to solve problems and draw on lots of different information
On this view, animals don’t have intelligence; they have instinct
“New Look” perception psychology
Top down – intelligence in perception; what you think affects what you perceive
Ex: Context affects perception: you perceive the word as “cat” even though the middle character is not actually the letter “a”
Domain Specificity
a module can only be turned on by certain types of inputs
EX 1 : emotions → a spider produces more fear than an infinity pool, even though statistically the infinity pool is more dangerous
EX 2 : facial recognition → the face-recognition module is not turned on by an upside-down face, THUS face recognition can only be turned on by certain inputs → we need exactly the right format to turn it on
Mandatoriness
you cannot control whether a module applies to a given input; if the input fits, the module fires
→ the flip side of DS: domain specificity says only certain inputs can turn the module on; mandatoriness says that when such an input is present, you can’t stop the module from firing
EX 1 : Stroop effect — it’s hard to name the ink color of a color word (e.g., the word printed in blue) because reading the word is mandatory and automatic
EX 2 : hollow face illusion → since the features of the face are in the right positions, you automatically see a normal (convex) face even though the mask is not structurally correct
Informational Encapsulation
The processing within a module cannot be influenced by information from higher-level cognition. A module can only access information within its own database; it works all by itself.
→ Insulated both from what you want to be true, and from your background beliefs
EX 1 : visual illusions — you know the truth, but you can’t stop yourself from seeing the illusion; background knowledge is not getting in there and changing the way you see it
EX 2 : lexical access — “bug” (insect) vs. “bug” (hidden microphone); the lexical module activates both meanings, and context is only used afterwards to figure out which meaning was intended
Explain the lexical disambiguation example
When a word has multiple meanings, you select the most plausible meaning based on the context
“Rumor had it that, for years, the government building had been plagued with problems. The man was not surprised when he found several spiders, roaches, and other bugs in the corner of his room.”
You would assume bugs in this context refers to the insects, not a microphone
But under the hood, the process itself actually seems to be encapsulated. The fact that the right interpretation is rationally obvious does not get into the mechanism that accesses the lexicon; that mechanism activates both the plausible interpretation and the implausible one
The virtues of vertical (modular) and horizontal faculties
Vertical -> different modules (i.e. early vision module, lexical access module)
Horizontal -> domain-general central processes (e.g., decision making) that draw on the outputs of the vertical modules
Marr’s 1st level
LEVEL 1: Computational/(ecological) descriptions : the what and why
Guiding Questions: What is the goal of the system?
→ “what is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?”
What the system does and why it does that instead of performing other functions → to answer, you appeal to the system’s context and what makes the most sense for it to be doing
EX. cash register, spider, and fly
Ecological examples of Marr’s 1st level
Ecological examples: insect and arachnid vision
Red-backed jumping spider - has a curious retina formed of two diagonal strips arranged in a V
The goal of the visual system in the jumping spider is to identify mates and distinguish them from prey.
If we don’t know this about the spider’s visual system, it will be much harder to figure out how the system works.
The fly - one visual system for landing and one visual system for mating
Marr maintains that we would have an incomplete picture of spider vision or fly vision if we missed these computational (ecological) level descriptions.
Marr’s 2nd level
LEVEL 2 : Representation and algorithm : the how
Guiding questions : how can this computational [ecological] theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?”
Representations and algorithm examples for 2nd level
Basically:
i. representation for the input and the output
Example -> for the number 4, we could have: hash marks (||||), arabic base 10 (4), arabic base 2 (100), roman (IV) - all diff representations for the same thing
*if the representation is fixed this constrains the possible algorithms (hashmarks would just concatenate; arabic numerals can use addition tables)
ii. algorithm for manipulating the representations
Example -> consider the various algorithms for multiplication: (3 * 5)
algorithm 1 -> add the second number to itself as many times as indicated by the first number (5 + 5 + 5)
algorithm 2 -> use a memorized multiplication table (see the sketch below)
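A small sketch of the point (the times table here is built in code purely for illustration): the same computation, 3 * 5, carried out by two different algorithms.

```python
def multiply_by_repeated_addition(a, b):
    """Algorithm 1: add the second number to itself as many times as the first indicates."""
    total = 0
    for _ in range(a):
        total += b              # 5 + 5 + 5
    return total

def multiply_with_table(a, b, table):
    """Algorithm 2: look the answer up in a memorized multiplication table."""
    return table[(a, b)]

times_table = {(a, b): a * b for a in range(10) for b in range(10)}
print(multiply_by_repeated_addition(3, 5))      # 15
print(multiply_with_table(3, 5, times_table))   # 15
```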
Marr’s 3rd level
LEVEL 3 : Hardware implementation
Guiding questions : how can the representation and algorithm be realized physically?
Marr has little to say about this but he basically says that the relation between levels 1 & 2 and the relation between levels 2 & 3 are similar, “the same algorithm may be implemented in quite different technologies”
Again, the main idea is that there is a lot of multiple realizability
Marr’s 1st stage
Level 1: what is the goal of human vision?
Goal : to provide the viewer with a description of her surroundings that is “useful to the viewer and not cluttered with irrelevant information”
To provide a description of the shapes and positions of things from image intensity values as detected by the photoreceptors in the retina
Marr’s 2nd stage
Level 2: how does vision work?
1) Primal sketch → computed from the image intensity values collected by the photoreceptors in the retina (the raw input image, extremely detailed)
2) 2.5-D sketch → represents the distance of each point in the visual field from the perceiver, along with surface orientation
- Steps 1 and 2 are perceiver-oriented (viewer-centered)
3) Full 3-D model → discards much of the extra detail from the input image and represents objects as having a constant shape and size, independent of viewpoint
Marr’s 3rd level of vision
Marr has little to say about how human vision is implemented in neural hardware
Incest avoidance (Westermarck hypothesis)
an evolved mechanism that keeps undesirable (e.g., harmful recessive) alleles and phenotypes from remaining in the population: individuals reared together in early childhood develop a sexual aversion to one another
Cheater detection
if a rule is perceived as a social contract, then a cheater detection algorithm is activated that searches for information that could detect cheaters; looks for people who have intentionally taken the benefit specified in a social exchange rule without satisfying the requirement
Empiricism
caricature: all learning comes from experiences, mind is a blank slate, knowledge comes primarily from sensory experience
empiricists agree there are innate learning mechanisms but think that those processes are domain general
Nativism
caricature: the mind comes into being with ideas of god, triangle, etc. already present, the environment plays no role in acquiring ideas
nativists also believe in innate learning mechanisms, and grant that the environment may play a role, but hold that the processes are domain specific
Rationalism is used as a synonym for nativism - historically it refers to reliance on reason rather than sensory experience as the source of knowledge
domain general
critical thinking can be applied to any topic in any field
(E.g., hypothesis testing - the kind of thing we do in science - associative learning, statistical learning)
domain specific
contribution to learning, easiest to characterize negatively: the capacity is acquired in a way that CAN’T be explained as a product of domain general learning mechanisms.
critical thinking ability is conceptualized as being specific to a particular area
There are specific modules to learn different skills (language, auditory processing, etc.)
Birdsong as a case for nativism
shows evidence for domain-specific learning mechanisms; different bird species produce different songs
- This is a model for how to think about innateness.
- The environment plays a critical role, but can’t explain the behavior without some innate contribution that is specific to the task domain (song, even song-in-this-species)
Birdsong and its relation to POS
- The white crown sparrow produces distinctive songs in a way that goes beyond the input.
- If you gave the same acoustic input to a domain-general statistics program, you would not get the output we see with the sparrow.
- Given the input that the bird gets, there must be some species-specific contribution.
Statistical inference
(empiricist learning) allows you to make predictions about lots of things
-Bulk of science driven by statistical inference
-process of drawing conclusions about an underlying population based on a sample of data
Transitional probabilities
how likely it is for one sound to follow another (Ex: “pee” has a higher transitional probability than “bee” in the context of hearing “hap”)
Saffran & Aslin study
Tested whether 8-month-old infants could use statistical information (transitional probabilities) to segment a stream of artificial words
After two minutes of exposure, infants could tell the difference between the patterns; they listened longer to “part words” that didn’t respect the word boundaries
Conc: Children have general capacities for statistical reasoning (capacity for statistical inference) that can be used to draw conclusions about the world
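A small sketch of how transitional probabilities could be computed from such a stream (a toy word list and random word order I chose for illustration, not the study’s exact stimuli or procedure):

```python
import random
from collections import Counter

random.seed(0)
words = ["bidaku", "padoti", "golabu"]                       # three-syllable artificial words
stream = "".join(random.choice(words) for _ in range(200))   # continuous stream, no pauses
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

pair_counts = Counter(zip(syllables, syllables[1:]))
first_counts = Counter(syllables[:-1])

def transitional_probability(a, b):
    """P(next syllable = b | current syllable = a)."""
    return pair_counts[(a, b)] / first_counts[a]

print(transitional_probability("bi", "da"))   # within a word: 1.0
print(transitional_probability("ku", "pa"))   # across a word boundary: much lower (~1/3)
```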
Empiricist account of how learning works
An empiricist theory of learning capacity X holds that we learn this capacity through a general, all-purpose learning mechanism that applies to anything.
Empiricist learning proceeds by applying domain general learning mechanisms
(e.g., statistical inference, hypothesis testing) to environmental input.
Does the Poverty of stimulus argument conclude that language competence requires a domain-general or domain-specific learning mechanism?
POS argument (Chomsky) concludes that:
The poverty of the stimulus argument is the claim that primary linguistic data (i.e. the linguistic utterances heard by a child) do not contain enough information to uniquely specify the grammar used to produce them.
Contradicts empiricist theory regarding language learning; there are some innate features to language learning
Language competence must require a domain specific learning mechanism
“H1”
Process the declarative from beginning to end (left to right), word by word, until reaching the first occurrence of the words is, will, etc.; transpose this occurrence to the beginning (left), forming the associated interrogative
ignores the structure of the sentence, what the words mean
Domain general: the sentence doesn’t need to make sense; the rule only pays attention to finding the target words (is, will, etc.)
“H2”
BETTER DESCRIBES how statements are converted into question
Process the declarative from beginning to end (left to right), word by word, until reaching the first occurrence of the auxiliary is, will, etc. following the first noun phrase of the declarative; transpose this occurrence to the beginning (left), forming the associated interrogative
pays attention to the structure of the statement (the first noun phrase), not just the linear order of words; more abstract
Domain specific: the rule is stated over grammatical structure, and it is the hypothesis speakers converge on as they grow older (see the toy sketch below)
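A toy sketch of the contrast (my own; the noun-phrase boundary is hand-marked rather than parsed, which is exactly the structural knowledge H2 presupposes):

```python
AUXILIARIES = {"is", "will", "can"}

def h1_question(tokens):
    """H1: move the first auxiliary (scanning left to right) to the front."""
    i = next(i for i, t in enumerate(tokens) if t in AUXILIARIES)
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]

def h2_question(tokens, np_end):
    """H2: move the first auxiliary *following the first noun phrase* to the front."""
    i = next(i for i, t in enumerate(tokens) if i >= np_end and t in AUXILIARIES)
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]

tokens = "the man who is tall is here".split()
# first noun phrase = "the man who is tall" = tokens[0:5], so np_end = 5
print(" ".join(h1_question(tokens)))       # "is the man who tall is here"  (the error children never make)
print(" ".join(h2_question(tokens, 5)))    # "is the man who is tall here"  (correct)
```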
On Chomsky’s picture, language learning depends on a mechanism that is: innate, domain-specific, and constitutes a “universal grammar.” What do each of those claims mean?
Innate = not learned from the environment; part of the child’s biological endowment
domain specific = dedicated specifically to language acquisition, not a general-purpose learning mechanism
universal grammar = children come equipped with knowledge of the structural principles common to all human languages, without being taught
What are the steps in a Poverty of stimulus argument?
1) Specify piece of knowledge (i.e. H2)
2) Identify some indispensable input for acquiring the knowledge via domain general learning (If the child has already settled on a general theory of what kinds of grammars are appropriate, she might not need specific evidence about declarative/interrogative for “The man who is tall is here”)
3) Show that this indispensable input is inaccessible to the learner (i.e. “Where’s the little blue crib that was in the house before?”)
4) Show that the knowledge is acquired at a young age (and in particular, before the indispensable input is accessible)
(i.e. Although children never make this mistake:
“Is the man who tall is here?” they do make this one:
“Is the man who is tall is here?”)
Elements of classical conditioning using Pavlov’s dog experiment
Neutral Stimulus (NS): sound of the bell
Unconditioned Stimulus (UCS): food
Unconditioned Response (UCR): salivation
Conditioned Stimulus (CS): bell
Conditioned Response (CR): salivation
Extinction
Presenting the conditioned stimulus without the US generates a new inhibitory connection. This eventually is strong enough to block the response
Getting used to something, responding less
Generalization
Pavlov found that when he stimulated other parts of the dog’s body, there was still a good deal of salivation for parts that were close to the thigh, and this dropped off significantly as the stimulation occurred to more removed body parts
Discrimination
Pavlov investigated this first by giving the same exact tone paired with food hundreds of times. The dogs still tended to generalize.
Then he tried a more contrastive method where he would present a tone (say, 1000 Hz ) paired with food and he would alternately present a slightly different tone (e.g., 900 Hz) without food. Although the dogs at first show generalization, they gradually restricted their response to the more precise stimulus.
Spontaneous Recovery
But, crucially, the initial excitatory connection remains to some extent. That’s why, according to Pavlov, the extinguished response is easier to reactivate than it was to train in the first place.
Contiguity theory of classical conditioning
Whether an association is formed between two events depends on how close they are in time.
posits that classical conditioning is effective only when the conditioned stimulus and unconditioned stimulus follow one another closely in time
How does blocking pose a problem for the contiguity theory?
Blocking occurs because the new neutral stimulus is irrelevant to the prediction. The unconditioned stimulus is already predicted given the conditioned stimulus. No matter how close in time the new stimulus is presented, it will not form an association because the individual has already associated the original conditioned stimulus with the conditioned response.
What is an intuitive explanation for why blocking doesn’t generate classical conditioning?
First, the animal is trying to identify predictive cues. Second, once a predictive cue is known, there is no need to continue trying to identify other predictors that happen at the same time.
Rescorla Wagner model
Basic idea: R & W proposed that the strength of conditioning depends on the degree of surprise; learning rate differs in different conditions (e.g. depending on the intensity of shock)
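A minimal sketch of the standard Rescorla–Wagner update (the parameter values are illustrative), showing how surprise-driven learning also accounts for blocking: once one cue fully predicts the outcome, there is no surprise left for a newly added cue to absorb.

```python
# Rescorla-Wagner: learning is driven by surprise = outcome (lambda) minus summed prediction.
def rescorla_wagner_trial(V, cues_present, lam, alpha_beta=0.3):
    prediction = sum(V[c] for c in cues_present)
    surprise = lam - prediction                  # prediction error
    for c in cues_present:
        V[c] += alpha_beta * surprise            # every cue present shares the error
    return V

V = {"light": 0.0, "tone": 0.0}
for _ in range(20):                              # phase 1: light alone paired with shock
    rescorla_wagner_trial(V, ["light"], lam=1.0)
for _ in range(20):                              # phase 2: light + tone, shock already predicted
    rescorla_wagner_trial(V, ["light", "tone"], lam=1.0)
print(V)   # light is near 1.0; tone stays near 0 (blocked), since nothing was surprising
```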
Garcia effect
phenomenon in which conditioned taste aversions develop after a specific food becomes associated with a negative reaction, such as nausea or vomiting
tastes were preferentially associated with sickness and auditory and visual cues were preferentially associated with shock
Latent learning
learning that occurs without obvious reinforcement and is not expressed in behavior until there is an incentive to demonstrate it (e.g., Tolman’s rats learned the layout of a maze before any reward was introduced)
Model-Based learning
goal-directed: learning the value of the outcome you are getting and a model of how your actions produce it
The model learns what the effect is going to be of taking a particular action in a particular state.
Model-Free learning
not based on a model of outcomes; more habit-like and automatic
For instance, after getting food from pushing a blue lever several times, the reinforcement learning system might come to assign a positive value to the action of pushing a blue lever, with no foresight.
We do not explicitly learn transition probabilities or reward functions; we only learn the Q-values of actions, or only learn the policy. Essentially, we just learn the mapping from states to actions, perhaps caching how much reward we expect in the long run. The algorithm learns directly when to take what action (see the sketch below).
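A minimal sketch of a model-free, Q-learning-style update (a generic textbook form, not from the slides): the cached value of an action is simply nudged toward the reward that followed it.

```python
Q = {"push_blue_lever": 0.0, "push_red_lever": 0.0}   # cached action values
ALPHA = 0.2                                            # learning rate

def model_free_update(action, reward):
    Q[action] += ALPHA * (reward - Q[action])          # move the cached value toward the outcome

for _ in range(10):                                    # food repeatedly follows the blue lever
    model_free_update("push_blue_lever", reward=1.0)
print(Q)   # the blue lever acquires a positive value, with no model of why it works
```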
The elements of Bayes’ theorem
Posterior Probability -> P(A|B) - probability of A being true given that B is true
Likelihood -> P(B|A) - probability that we would find B given that A were true
Prior Probability -> P(A) - probability that A is true before seeing the evidence; P(B) is the marginal probability of the evidence, which normalizes the result (see the worked example below)
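A worked example putting the three elements together (the numbers are made up for illustration): a rare condition with a fairly accurate test.

```python
prior = 0.01              # P(A): prior probability of the condition
likelihood = 0.9          # P(B|A): probability of a positive test given the condition
false_positive = 0.05     # P(B|not A)

evidence = likelihood * prior + false_positive * (1 - prior)   # P(B), the normalizer
posterior = likelihood * prior / evidence                      # Bayes: P(A|B) = P(B|A)P(A)/P(B)
print(round(posterior, 3))   # ~0.154: the posterior weighs the likelihood against the low prior
```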
Features of human word learning
A few instances often adequate
Easily learn extensions of overlapping categories
Learn without negative evidence
The size principle
“the preference for smaller consistent hypotheses over larger hypotheses increases exponentially with the number of examples, and the most restrictive consistent hypothesis is strongly favored”
This principle follows from Bayesian techniques and provides a way of explaining such learning
As the number of consistent examples increases, the case for the smallest consistent hypothesis gets stronger: a larger hypothesis makes each observed example less likely, so the alternatives lose support
Explain how the size principle predicts the pattern we see in word learning
If the only example of a “fep” is a Dalmatian, that should provide support for interpreting “fep” as Dalmatian but also dog. But if shown 3 examples of “feps” and all are Dalmatians, that should provide stronger support for interpreting “fep” as Dalmatian rather than dog.
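A small sketch of that prediction (the hypothesis sizes and prior are made-up numbers): because the likelihood of n independently sampled examples under a hypothesis is (1/size)^n, the smaller hypothesis wins exponentially as Dalmatian examples accumulate.

```python
sizes = {"dalmatian": 10, "dog": 100}     # hypothetical numbers of possible exemplars

def posterior(n_examples, prior=0.5):
    # Size principle: likelihood of n sampled examples under h is (1 / size(h)) ** n
    likelihoods = {h: (1 / size) ** n_examples for h, size in sizes.items()}
    total = sum(prior * l for l in likelihoods.values())
    return {h: prior * l / total for h, l in likelihoods.items()}

print(posterior(1))   # "fep" = dalmatian ~0.91 vs dog ~0.09 after one Dalmatian example
print(posterior(3))   # dalmatian ~0.999 after three Dalmatian examples
```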
Overhypothesis
after seeing several bags that each contain marbles of only one color, you form the overhypothesis that bags are uniform in color; so when a new bag yields a green marble, you assume that bag contains only green marbles
Episodic memory
memory for particular experiences that actually happened
Semantic memory
memory for general knowledge/facts
Working memory
retention 15-30 sec; capacity is limited (7 +/- 2 novel units); unrehearsed info is lost
Components of Working Memory
central executive (attention control system), visuospatial sketchpad (visual-spatial working memory), phonological loop (verbal working memory)
Each component accommodates a different type of information we encounter: visual, spatial, and verbal
What kind of events interfere with consolidation and reconsolidation?
Stress, new information, emotional experiences
Consolidation (cellular/synaptic)
the process by which short-term memories are converted into long-term memories
Process depends on neurons generating new proteins
Reconsolidation
reactivation of memory
Reactivation turns an inactive memory back into an active, labile state, after which it must be re-stabilized; a form of “updating”