Planning Flashcards
Match the following names with their hypothesis. Note these are in chronological order of publication: Broca, Gross/Fuster, O’Keefe & Nadel, Ungerleider & Mishkin, Desimone & Duncan, Schultz & Dayan, Cohen & Botvinick
Modular brain, hierarchical brain, cognitive map, parallel streams, biased competition theory, dopamine, response conflict
This lecture was called planning and reasoning because planning essentially amounts to reasoning…
Inductively, i.e. inferring probable conclusions from premises
If we argue that reinforcement learning drives all behaviour then we are reducing an agent’s brain down to… Does this agent have knowledge, e.g. of which states follow each other?
A gigantic table of values for different actions in a given state. No: it has no knowledge of which states follow which
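The "gigantic table" can be sketched as tabular Q-learning (a minimal illustration, not from the lecture; the state and action names are hypothetical):

```python
# Minimal tabular Q-learning: the agent's "knowledge" is just a table of
# action values per state -- it stores nothing about which state follows which.
states = ["start", "tunnel_A", "tunnel_B", "goal"]
actions = ["left", "right"]
Q = {(s, a): 0.0 for s in states for a in actions}

alpha, gamma = 0.1, 0.9  # learning rate, discount factor

def update(s, a, reward, s_next):
    """One Q-learning step: nudge Q(s, a) toward reward + discounted best next value."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])

update("start", "right", 0.0, "tunnel_A")
update("tunnel_A", "right", 1.0, "goal")
```

Note that the table holds only values, not transitions: the agent can act on these numbers but cannot infer which state a given action will lead to.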
Tolman (1946) presented evidence that animals do have knowledge in form of cognitive maps. How was this demonstrated?
A rat was trained to enter a tunnel which led to a reward. When this tunnel was blocked at test, the rat chose the most appropriate alternative tunnel given the inferred angle of the rewarding location from the starting position
Give another example from animal cognition in which reinforcement learning does not suffice and knowledge of the order of states is required
Aesop’s thirsty crow places stones in a jug to raise the water level so that he can reach/drink the water in the jug. The crow needs knowledge about the links between states which lead to the reward
Give a third example from animal cognition in which knowledge is used to plan for the future (Raby, 2007)
Western scrub jays experience being given food for breakfast in compartment A but not in C. Then, when given food in B with free access to neighbouring A & C in the evening, they cache (store/save) the food in C not A, so that, should they be placed in C the next morning, breakfast will be available
What is the benefit of having knowledge of the transitions between states e.g. in the form of a cognitive map?
It allows you to learn about rewards offline
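This offline benefit can be sketched as reward revaluation over a learnt map (an illustrative toy, not the lecture's example; the states and reward locations are made up). Once the agent knows the transitions, moving the reward updates its values by mental simulation alone, with no new experience:

```python
# A learnt map of transitions: state -> next state (toy example).
transitions = {"A": "B", "B": "C", "C": "goal"}
gamma = 0.9

def value(state, reward_state):
    """Value of a state = discounted reward reached by following the learnt map."""
    v, s = 1.0, state
    while s != reward_state:
        if s not in transitions:
            return 0.0  # dead end: reward unreachable from here
        s = transitions[s]
        v *= gamma
    return v

value("A", "goal")  # three steps away: 0.9**3
value("A", "B")     # reward "moves" to B: values update offline, instantly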
Which part of the brain do we use to learn about links between states? This was shown by Summerfield (2006), who found that…
MTL. The degree of MTL activity during associative encoding predicted the success of recall of which house went with which scene (= the blue trace on the fMRI signal graphs)
In the real world we often do not intend to learn associations between stimuli. Nevertheless we still learn associations. The neural mechanism for this implicit associative encoding was demonstrated by Schapiro (2012). What were the stimuli?
Participants were presented with a recursively structured series of colourful stimuli. Some stimuli nearly always followed each other (strong pairs), whilst other stimuli followed each other just above chance (weak pairs).
What did Schapiro (2012) find re: the hippocampus and implicit associative encoding? What does the middle bar of the 3 on Schapiro’s result graph depict?
MVPA showed that strong pair members showed greater hippocampal pattern similarity than weak pair members despite no difference in visual similarity. Shuffled pairs (strong to the left & weak to the right)
What was found from training an agent to move from X to Y with vs. without a planning algorithm called _ _ _ _?
DYNA. Without planning the agent becomes stuck. With planning on the basis of learnt transitions between states, agents show a much improved ability to reach the goal location
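DYNA can be sketched as Dyna-Q (Sutton's algorithm): each real step trains both a value table and a one-step model of the world, and extra "planning" steps replay remembered transitions offline. This is a minimal sketch on a made-up chain of states, not the lecture's implementation:

```python
import random

# Dyna-Q on a tiny chain X -> s1 -> s2 -> Y (goal). Each real step updates the
# value table AND a learnt model; planning then replays the model offline.
random.seed(0)
chain = ["X", "s1", "s2", "Y"]
Q = {s: 0.0 for s in chain}            # value of moving forward from each state
model = {}                              # learnt transitions: s -> (s_next, reward)
alpha, gamma, n_planning = 0.5, 0.9, 10

def backup(s, s_next, r):
    Q[s] += alpha * (r + gamma * Q[s_next] - Q[s])

for episode in range(3):
    for i, s in enumerate(chain[:-1]):   # one real pass along the chain
        s_next = chain[i + 1]
        r = 1.0 if s_next == "Y" else 0.0
        backup(s, s_next, r)
        model[s] = (s_next, r)           # remember the transition
    for _ in range(n_planning):          # planning: replay remembered transitions
        s = random.choice(list(model))
        s_next, r = model[s]
        backup(s, s_next, r)
```

Without the planning loop, value spreads back from the goal by only one state per real pass; with it, the start state X acquires value after just a few episodes.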
What evidence is there of offline learning i.e. whilst not engaged in the activity & instead resting?
After rats have run around a circular apparatus, CA1 place cells in the hippocampus fire in the same sequence as during the actual activity, but more rapidly (time-compressed)
What two types of offline learning exist? One relates to planning and the other to “reflecting”. When do they occur?
Replay of cellular activity after the experience. Preplay before the action has been performed. During sleep or quiet resting
As well as occurring in the hippocampus, replay also occurs in _ _ _. Sequences are sometimes replayed ___ as if…
PFC. Backwards as if learning begins at the goal so that the rewarding purpose of an action is propagated backwards through a series of states back to the starting point. This allows us to plan how to reach this goal again in the future
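The computational benefit of backward replay can be sketched in a few lines (an illustrative toy path, not from the lecture): replaying the visited states in reverse order lets a single sweep carry the goal's value all the way back to the start, whereas forward-order updates would need one sweep per state.

```python
# Why backward replay is efficient: one reverse sweep propagates the goal's
# value back through every visited state (toy path; names are made up).
path = ["start", "A", "B", "C", "goal"]
V = {s: 0.0 for s in path}
V["goal"] = 1.0
gamma = 0.9

# Replay the path in reverse: each state inherits its successor's discounted value.
for s, s_next in reversed(list(zip(path[:-1], path[1:]))):
    V[s] = gamma * V[s_next]

V["start"]  # gamma**4: goal value reached the start in a single sweep
```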
Uncertainty-based competition theory posits 2 competing RL (reinforcement learning) mechanisms which link sensation to action - what are they? Where do they likely lie at the neural level?
Habit-based, model-free RL vs. goal-directed, model-based RL. Basal ganglia vs. PFC