Cognition & Learning Flashcards
Schedules of Reinforcement
Schedules of reinforcement belong to operant conditioning, in which an organism learns that the consequence of a response influences the future production of that response. A reinforcement schedule is the pattern by which reinforcement is delivered to an organism. Ratio schedules deliver reinforcement after a certain number of responses, whereas interval schedules deliver reinforcement after a certain amount of time has passed. Unlike behavior trained on continuous reinforcement, behavior trained on a variable-ratio or variable-interval schedule is difficult to extinguish: the organism has learned to keep responding through long stretches without reinforcement, which makes the behavior persistent.
Fixed-ratio -> A reinforcer occurs after every nth response (n > 1). E.g., in a fixed-ratio 5 schedule, every fifth response is reinforced.
Variable-ratio -> A reinforcer occurs after every nth response, but that number varies unpredictably around an average. E.g., in a variable-ratio 5 schedule, reinforcement might come after 7 responses on one trial, after 3 on another, and so on in a random manner, but the average number of responses required for reinforcement would be 5.
Fixed-interval -> A reinforcer occurs after a fixed period of time elapses between one reinforced response and the next. E.g., in a fixed-interval 30-second schedule, the first response that occurs at least 30 seconds after the last reinforcer is reinforced.
Variable-interval -> A reinforcer occurs after an interval of time elapses between reinforced responses, but that time period varies unpredictably around an average. E.g., in a variable-interval 30-second schedule, the average period required before the next response will be reinforced is 30 seconds.
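As an illustration of how these four rules differ, here is a minimal Python sketch (not from the flashcards; the function names, parameters, and the way targets are redrawn are assumptions made for illustration) that decides whether a given response earns a reinforcer under each schedule:

```python
# Minimal sketch: each function answers "is this response reinforced?"
# Responses are numbered 1, 2, 3, ...; times are in seconds.

def fixed_ratio(n, response_count):
    # Reinforce every nth response (FR-5 reinforces responses 5, 10, 15, ...).
    return response_count % n == 0

def variable_ratio(responses_since_reinforcer, target):
    # Reinforce once the current run of responses reaches `target`; the caller
    # redraws `target` around the schedule's average after each reinforcer
    # (e.g., drawn uniformly from 1..9 for a VR-5 schedule).
    return responses_since_reinforcer >= target

def fixed_interval(interval_s, time_of_last_reinforcer, now):
    # Reinforce the first response made at least interval_s seconds
    # after the previous reinforcer.
    return now - time_of_last_reinforcer >= interval_s

def variable_interval(time_of_last_reinforcer, now, target_s):
    # Same idea as fixed-interval, but the required wait varies around an
    # average; the caller redraws target_s after each reinforcer
    # (e.g., drawn uniformly from 0..60 seconds for a VI-30-second schedule).
    return now - time_of_last_reinforcer >= target_s

# Example: on an FR-5 schedule, the 5th, 10th, 15th, and 20th responses are reinforced.
print([r for r in range(1, 21) if fixed_ratio(5, r)])  # [5, 10, 15, 20]
```

Note that on ratio schedules the reinforcer depends only on the count of responses, while on interval schedules it depends on the time elapsed since the last reinforcer.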
Discriminative Stimulus
In operant conditioning (OC), a discriminative stimulus is one that is specifically present when reinforcement occurs, so that subjects learn to respond only when it is present. In OC the discriminative stimulus is presented before the response and reinforcer. For example, Tone -> pigeon presses lever -> seeds. In this case, the tone is the discriminative stimulus that signals the likelihood that a particular response will be reinforced. Subjects may have a difficult time discriminating between closely related stimuli, which parallels generalization; this, however, is what makes discriminative stimuli useful for studying animals' and humans' sensory abilities and their capacity to form cognitive concepts. For instance, infants respond to stories they heard in utero with increased or decreased sucking behavior, indicating they can hear differences between distinct discriminative stimuli despite being unable to describe those differences in words. Likewise, pigeons have been shown to respond similarly to different exemplars of a discriminative stimulus, such as pictures of slightly different trees, indicating that pigeons have formed a mental concept of trees.
Episodic vs. Semantic Memory
Long-term memory contains two types of memory systems: explicit/declarative and implicit/nondeclarative. Explicit memory, which holds the content of conscious thought about events and can be called back to the forefront of one's mind, contains episodic and semantic memories. Episodic memories contain specifics about one's past experiences, including events, the feelings experienced during a specific episode, and its contents. Inherent to episodic memories is one's own derived meaning and perspective on the event, which is why they are referred to as autobiographical memories. Semantic memories, on the other hand, are not tied to specific past experiences but instead consist of general facts and knowledge, including word meanings, ideas and schemas about the world, and general information. Because of this, recalling a semantic memory does not depend on remembering the specific experiences through which a fact was learned. For example, an individual might know from a lifetime of training dogs that they come in many coat colors (e.g., brown, black, gray, tan, white) but is unable to, and does not need to, remember the specific instances in which this information was learned in order to know it. These differences are also why remembering specific episodes is more difficult and less stable than recalling the general knowledge one acquires over time.
Equipotentiality versus Preparedness or Belongingness in Classical Conditioning
Equipotentiality in learning posits that the laws of learning should remain consistent regardless of the reinforcers or situations used. It proposes that all forms of associative learning, whether classical or operant, involve the same underlying principle: the rate of learning is independent of the particular combination of stimuli, reinforcers, or responses involved in conditioning. In classical conditioning, any conditioned stimulus has equal potential to become associated with any unconditioned stimulus. In operant conditioning, any response can be strengthened by any reinforcer. Equipotentiality was important to behavior theorists because their goal was to construct a general process learning theory that could be applied equally across all conditions.
Preparedness is the notion in classical conditioning that certain species find it easier to associate certain stimuli, responses, and reinforcers because of a biological predisposition. For instance, it posits that phobias related to survival are easier to induce than other kinds of fears because of humans' evolutionary past: organisms that learned to fear environmental threats quickly were more likely to survive and reproduce. Certain CS/US pairings are more adaptive for survival and are thus more easily formed. For example, we are prepared to fear bears and snakes, but not necessarily clowns; fear of bears and snakes is therefore conditioned more quickly than fear of clowns.
Classical Conditioning
Classical conditioning was posited by Pavlov as a learning process involving reflexes mediated by the nervous system (e.g., catching oneself when falling). In classical conditioning, an unconditioned stimulus (a stimulus that automatically elicits a response, e.g., food eliciting salivation) elicits an unconditioned response (e.g., salivating). When a neutral stimulus is paired with the unconditioned stimulus and comes to elicit a similar response, it is termed the conditioned stimulus, and the response it elicits is termed the conditioned response. The pairing must occur so that the CS precedes the US, because the CS can then provide information about the availability of the US. The CR often mimics the UR, but they may differ. Generalization can occur, in which stimuli similar to the CS elicit the CR. For example, Little Albert learned to fear (CR) a white rat (CS); eventually, other white fluffy things elicited the fear response (CR) as well. Generalization can be reduced through discrimination training, in which a stimulus similar to the conditioned stimulus is repeatedly presented without the unconditioned stimulus (e.g., a bell that continues to be paired with food and a similar tone that is never paired with food).
Operant Conditioning
Operant conditioning, or instrumental conditioning, a term coined by B. F. Skinner, refers to the learning process by which the consequence of a response influences the future rate of that response based on the favorability of the consequence (i.e., favorable [social approval] = increase in response; unfavorable [social rejection] = decrease in response). Operant conditioning utilizes a reinforcer, which is a stimulus that follows a response and increases the subsequent frequency of that response. Skinner posited that every part of human life can include operant responses that have been previously reinforced, whether conscious, like tipping a waitress to receive better service, or unconscious, like gravitating toward sugary foods because of the subsequent sugar high. Operant responses can also be shaped, as in animal training, by providing reinforcement for successively closer approximations of the desired response until it finally occurs and can be reinforced directly. Reinforcement can be administered on a continuous or partial schedule, and partial reinforcement can follow a specific schedule: fixed interval, fixed ratio, variable interval, or variable ratio.
Declarative vs. Non-Declarative Memory (i.e., explicit vs implicit; BIGGER picture of LTM)
Long-term memory contains two types of memory systems: explicit/declarative and implicit/nondeclarative. Explicit memory holds the content of conscious thought about events and can be called back to the forefront of one's mind, whereas implicit memory holds information that cannot be verbalized (i.e., non-declarative); these memories drive actions and thoughts without the need for conscious attention. Explicit memory contains episodic and semantic memories, while implicit memory contains procedural memory and priming. Explicit memories can be verbally recalled or brought to consciousness; implicit memories cannot, but they can be examined through behavioral responses on tests (e.g., a researcher asks an individual to swim, rather than to explain how they learned to swim).
Episodic memory -> Explicit memory of one's own past experiences; autobiographical. E.g., your memory of what you did on your 16th birthday.
Semantic memory -> Explicit memory that is not tied to a particular past experience. E.g., knowledge of word definitions, facts, ideas, schemas, and general knowledge.
Procedural memory -> Motor skills, habits, and unconsciously learned (tacit) rules. Practice improves the skill even though the memory is unconscious; procedural learning tends to be slower and generalizes more readily to other stimuli. E.g., getting better at riding a bike or hammering a nail. The skills are retained without conscious awareness of the muscular changes involved.
Priming -> The unconscious sensory activation of long-term memory information so that relevant information is easily retrievable into consciousness (e.g., it is easier to identify a picture as a cat if a similar picture of a cat has been presented previously). This process supports a continuous stream of thought and serves as a link between explicit and implicit memory.
Habituation and Sensitization
Habituation is the decline in magnitude of a reflexive response when the stimulus is repeated several times in succession. For example, when a loud noise is presented, one might jump the first time, but each time the sound is repeated and nothing bad happens, the response weakens until soon there is no visible response at all. Habituation is one of the simplest forms of learning: it does not produce a new stimulus-response sequence but only weakens an already existing one. Sensitization, on the other hand, is an increase in the magnitude of a response with repeated stimulus presentation, likely because the organism becomes more sensitive to the stimulus. Research suggests that sensitization typically follows emotionally arousing stimuli and lasts only a short period of time.
Exposure treatments involve both habituation and extinction processes by presenting a feared stimulus for a prolonged period, over repeated occasions, while no harm comes to the client. If the fear being treated is an unconditioned fear reflex (e.g., fear of snakes in someone who has never encountered one), the process involves habituation; if it is a conditioned fear reflex (e.g., fear acquired after being bitten by a deadly snake), the process involves extinction.
Levels or Depths of Processing (Top-Down vs. Bottom-Up)
Top-down processing refers to how one's perception is influenced by expectations and background knowledge (prior sensory experience and previous learning) rather than by the stimulus itself. Top-down processing is theory-driven and shapes cognitive understanding. In other words, it combines incoming sensory information with information from previous experiences of similar contexts in order to reach a conclusion or perception about the stimulus. For example, an individual walking in the forest knows to avoid brightly colored insects and animals because such coloring often signals that they are poisonous or venomous.
Bottom-up processing refers to mental processes that take a stimulus and its features from the senses and build them into an overall perception of the scene. Bottom-up processing is data-driven and directs cognitive awareness. In other words, sensory information is processed and analyzed into something that makes sense. For example, an individual encountering an unfamiliar species might build a perception of it from its visible features alone.
Law of Effect (Thorndike)
The law of effect states that responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, while responses that produce a discomforting effect become less likely to occur again in that situation. Thorndike's law of effect differs from classical conditioning in that the animals he used had some control over their environment and actions, whereas Pavlov's dogs could only rely on the different stimuli to tell them when food would come. Thorndike's cats could reach food through their own effort: he put hungry cats in a puzzle box with food just outside the box. When first placed inside, a cat would engage in many actions and then finally, often by accident, press the lever that opened the door to the food and freedom. Thorndike repeated this process many times and found that in early trials the cats made many useless movements before happening upon the one that released them from the puzzle box, but on average they escaped more quickly with each successive trial. Once released, the satisfying effect, including freedom from the box and access to food, strengthened the response. Therefore, according to the law of effect, the next time the cat was placed in the box, the probability of that response recurring was increased.
Multi-store Architecture of Memory
Cognitive theories of information processing are not all-encompassing but instead based on assumptions about how humans acquire, store, and retrieve information. First, it is assumed that humans have limited mental resources to process information. Second, it is assumed that information moves through a ‘system of stores’ in order to bring in sensory information, manipulate it, store it, and retrieve it when needed. Three storage systems are proposed: sensory memory, short-term/working memory, and long-term memory. All are characterized by their function, capacity, and duration, and all three are broadly governed by control processes including attention, rehearsal, encoding, and retrieval.
Sensory memory – Separate sensory stores are thought to exist for each sense (e.g., visual storage). Regardless of attention, sensory stores likely hold all sensory input briefly in its original form so it can be analyzed unconsciously to decide whether to bring the information into the short-term store. Only sensory information that is selected through attentional processes goes into the short-term store/working memory.
In the short-term store, information fades very quickly, especially when it is no longer attended to or thought about. It can hold approximately seven items, plus or minus two, as only a few pieces of information can be thought about at the same time. This store is referred to as working memory because it is responsible for storing and transforming the information being held, as well as for conscious perceptual processes. Information can enter the short-term store from both the sensory and long-term stores. Ultimately, it is responsible for the flow of cognitive processes in the brain.
In the long-term store, representations of knowledge are held for recognition and recall when day-to-day information is encountered. The capacity of long-term storage is assumed to be large, and people are largely unaware of its specific contents unless they are activated and moved into short-term storage.
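To make the store-and-control-process picture concrete, here is a minimal Python sketch of the three-store flow described above (the class and method names, the seven-item capacity, and the transfer rules are illustrative assumptions, not part of the flashcard):

```python
from collections import deque

class MultiStoreMemory:
    """Toy model: sensory store -> short-term/working memory -> long-term store."""

    def __init__(self, stm_capacity=7):
        self.sensory = []                             # brief, unselected sensory input
        self.short_term = deque(maxlen=stm_capacity)  # limited-capacity working memory
        self.long_term = set()                        # large-capacity, mostly inactive store

    def sense(self, items):
        # All incoming input is held briefly, regardless of attention.
        self.sensory = list(items)

    def attend(self, item):
        # Only attended items move into the short-term store; once capacity is
        # exceeded, the oldest item is displaced automatically.
        if item in self.sensory:
            self.short_term.append(item)

    def encode(self, item):
        # Rehearsal/encoding transfers an item from short-term to long-term storage.
        if item in self.short_term:
            self.long_term.add(item)

    def retrieve(self, item):
        # Retrieval reactivates long-term content by moving it back into short-term storage.
        if item in self.long_term:
            self.short_term.append(item)
            return item
        return None

memory = MultiStoreMemory()
memory.sense(["phone number", "dog barking", "red car"])
memory.attend("phone number")           # attention selects what enters working memory
memory.encode("phone number")           # rehearsal/encoding stores it in long-term memory
print(memory.retrieve("phone number"))  # 'phone number'
```

The sketch only mirrors the assumptions stated above (limited capacity, attentional selection, and retrieval back into the short-term store); the criticisms discussed next also apply to it.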
Some incorrect assumptions are made about the memory stores in this model. First, it is incorrect to assume that the short-term store acts as a gateway between the other two, as long-term storage can provide immediate information before the short-term store has an opportunity to process it. Second, the model is too simplistic; for instance, not all information is consciously perceived and processed in the short-term store (e.g., implicit learning), and not all items held there are equal. Finally, very little of the information stored in long-term memory is actually rehearsed.
Extinction
In classical conditioning, in order to remove the association between the CS and the CR, the CS must be presented repeatedly without the US (e.g., present the bell without the food to extinguish salivation to the bell). Without the CS signaling the US, the CR will cease. Research suggests that the pairing is not lost during extinction but rather inhibited: the CR can reappear when the CS is presented again after time has passed (spontaneous recovery). In operant conditioning, an operant response declines in rate and eventually disappears if it is no longer reinforced. Thus, the absence of reinforcement and the consequent decline in response rate together constitute extinction. However, just as in classical conditioning, the response can return: responding may spontaneously recover after a rest period, and a single reinforced response following extinction can lead the individual to respond again at a rapid rate.
Evidence that classical conditioning depends on predictive value of CS
While Pavlov offered a behaviorist account of classical conditioning, some researchers argue for a stimulus-stimulus theory that is more cognitive in nature. More specifically, expectancy theory explains how a conditioned response can look very different from an unconditioned response. For example, the additional responses to food in the dog example (e.g., salivating while wagging the tail and begging for food) likely occur because the dog expects the food; creatures naturally seek relationships between events and stimuli. In other words, these responses were not part of the original unconditioned response, so some other explanation must be at work. This theory is supported in several ways.
Conditioning is more effective if the conditioned stimulus precedes the unconditioned stimulus; in fact, conditioning does not occur if the two are presented simultaneously or if the conditioned stimulus follows the unconditioned stimulus. This is likely because the organism is trying to predict, and a stimulus that does not precede the unconditioned stimulus (e.g., food) is useless for predicting future events, so the association simply is not made.
The conditioned stimulus must signal a heightened probability that the unconditioned stimulus will occur. This means conditioning depends both on the number of times the two stimuli are paired and on the number of times the conditioned stimulus occurs without the unconditioned stimulus, with conditioning strengthened by the former and weakened by the latter. These probabilities are likely weighed against each other, and a conditioned response develops only if the unconditioned stimulus is more probable when the conditioned stimulus is present than when it is absent.
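Stated as a formula (the flashcard gives none; this is the standard contingency condition, included here only as an illustration), conditioning is expected when

$$ P(\text{US} \mid \text{CS}) > P(\text{US} \mid \text{no CS}) $$

That is, the US must be more likely in the presence of the CS than in its absence; when the two probabilities are equal, the CS carries no predictive value and little or no conditioning occurs.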
Finally, conditioning is ineffective when the animal already has a good predictor. In other words, if one conditioned stimulus already reliably precedes an unconditioned stimulus, a new stimulus presented simultaneously with the original conditioned stimulus generally does not become a conditioned stimulus. Even if presented many times, the new stimulus will not come to elicit a conditioned response on its own, due to the blocking effect. This is likely because the organism can already predict the unconditioned stimulus and has no need for a new signal. Blocking is overcome only when the unconditioned stimulus is larger in magnitude or different from the original (e.g., much larger and fresher dog food), so that the new stimulus provides new information for prediction.
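As a hedged illustration of the blocking effect, here is a toy Python simulation using a simple error-driven associative update (a Rescorla–Wagner-style rule; this particular model, its parameters, and the stimulus names are assumptions made for illustration, not part of the flashcard):

```python
def train(trials, present, strengths, alpha=0.3, lam=1.0):
    """Update associative strengths for the stimuli present on each trial.

    alpha: learning rate; lam: maximum strength the US can support.
    """
    for _ in range(trials):
        prediction = sum(strengths[s] for s in present)  # combined prediction of the US
        error = lam - prediction                         # surprise: how unpredicted the US is
        for s in present:
            strengths[s] += alpha * error                # present stimuli share the update
    return strengths

strengths = {"tone": 0.0, "light": 0.0}
train(50, ["tone"], strengths)           # Phase 1: tone alone is paired with food
train(50, ["tone", "light"], strengths)  # Phase 2: light is added, but food is already predicted
print(strengths)  # tone is near 1.0, light stays near 0.0 -> the light is "blocked"
```

Because the tone already predicts the food, the prediction error in Phase 2 is near zero, so the light acquires almost no associative strength; this mirrors the observation that a redundant stimulus does not become a conditioned stimulus.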
Reinforcement and Punishment
Primary concepts in B. F. Skinner's operant conditioning theory are punishment and reinforcement. Broadly, reinforcement refers to a consequence that increases the likelihood of a response occurring again, while punishment refers to a consequence that decreases the likelihood of the response occurring again. Reinforcement can be either positive or negative. Negative reinforcement refers to the removal of a stimulus after a response to increase the response's likelihood of occurring again (e.g., removing a shock). Positive reinforcement refers to the addition of a stimulus to increase that likelihood (e.g., allowance money). The terms indicate whether the reinforcing stimulus appears (positive) or disappears (negative) as a result of the operant response. Similarly, punishment can be either positive or negative. Positive punishment refers to adding a stimulus to decrease the likelihood of a response occurring again (e.g., spanking). Negative punishment refers to removing a stimulus to decrease that likelihood (e.g., taking away a cellphone). Negative reinforcement is frequently confused with punishment because the word 'negative' suggests that something bad is happening, when in fact negative reinforcement increases the behavior.
Implicit Learning
Implicit learning refers to an individual's ability to learn something 'without realizing it', that is, without being fully conscious that learning is occurring. Essentially, the individual learns and modifies their behavior as a result of interacting with a task and picking it up unintentionally. This process is automatic and quick. For example, learning a first language can be considered an implicit learning process, as the skill gradually improves without awareness. The striatum within the basal ganglia is thought to be responsible for implicit learning processes; this is corroborated by reports of impaired implicit learning in Parkinson's patients, who have damage to this brain region. Because it occurs outside conscious awareness, implicit learning is difficult to assess in research, as participants cannot easily express what they have learned; however, it has been successfully measured with the serial reaction time task.