Book Questions Flashcards
Describe the two ways used to conceptualize light.
One way is to think of it as a wave traveling through space. Another is to think of it as a stream of photons, tiny particles, each consisting of one quantum of energy.
Describe the difference between light that is reflected and light that is transmitted.
Reflected light occurs when a ray of light strikes a light-colored surface and then bounces back towards its point of origin. Transmitted light occurs when light is neither reflected nor absorbed by a surface. An example is a transparent window; light passes through the surface and is transmitted to the other side.
What is the purpose of the cornea?
The cornea is a transparent surface on the exterior of the eye. It protects the eye from the outside world. Being transparent, it allows light to be transmitted through it and into the eye.
What is the purpose of the retina?
The retina is a light-sensitive membrane in the back of the eye that contains rods and cones, which receive an image from the lens and send it to the brain through the optic nerve.
How does the process of accommodation take place in the eye?
Accommodation takes place in the lens of the eye. The lens changes its refractive power by changing its shape. This causes the eye to be able to focus on a given object, whether it is near or far.
What is astigmatism and how can it be fixed?
Astigmatism is a visual defect caused by the unequal curving of one or more of the refractive surfaces of the eye, usually the cornea. It can be fixed by wearing lenses that have two focal points (that provide different amounts of focusing power in the horizontal and vertical planes).
Why are photoreceptors important in the process of seeing?
Photoreceptors are the cells that make up the backmost layer of the retina. They are sensitive to light, and as soon as they sense it, they can cause neurons in the intermediate layers to fire action potentials. Photoreceptors are important in the process of seeing because they transduce the physical energy of light into neural energy that our brains can analyze.
What are rods and cones?
Rods and cones are photoreceptors present in the retina. Rods are specialized for night vision, while cones are specialized for daylight vision, fine visual acuity, and color.
Explain what happens in the process of hyperpolarization.
Hyperpolarization is an increase in membrane potential in which the inner membrane surface becomes more negative than the outer membrane surface. This process is one in a sequence of events that occur once light is sensed by the photoreceptors.
Why can’t rods signal differences in color?
Rods cannot signal differences in color because they only have one type of photopigment. Cones, on the other hand, have three types of photopigments, which help them differentiate between colors.
Describe age-related macular degeneration (AMD) and how it affects one’s vision.
AMD is a disease that affects the macula, gradually destroying high resolution central vision. This degeneration in vision makes it difficult to read, drive, or recognize faces.
What is the difference between wet and dry age-related macular degeneration (AMD)?
In wet AMD, abnormal blood vessels behind the retina start to grow under the macula and leak blood and fluid. The blood and fluid raise the macula and cause a rapid loss of central vision. In dry AMD (the more common form), cones in the macula degenerate slowly over time. Once dry AMD becomes too advanced, vision loss becomes permanent; however, there are treatments to prevent the advancement of dry AMD.
What is the role of horizontal cells?
Horizontal cells are specialized retinal cells that contact both photoreceptors and bipolar cells. They produce lateral inhibition, which allows the signals that reach retinal ganglion cells to be based on differences in activations between nearby photoreceptors rather than absolute levels of activation.
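A minimal sketch of lateral inhibition in Python may make the idea concrete. The function name, the `strength` parameter, and the neighbor-averaging rule are illustrative assumptions, not a model of real horizontal cells: each unit's output is its own activation minus a fraction of the average activation of its two neighbors.

```python
def lateral_inhibition(activations, strength=0.5):
    """Subtract a fraction of the mean of the two neighbors from each value.

    Units at either end reuse their own value as the missing neighbor.
    The rule and the default strength are illustrative assumptions.
    """
    out = []
    n = len(activations)
    for i, a in enumerate(activations):
        left = activations[i - 1] if i > 0 else a
        right = activations[i + 1] if i < n - 1 else a
        out.append(a - strength * (left + right) / 2)
    return out


# A step edge: dim photoreceptors on the left, brighter ones on the right.
step_edge = [10, 10, 10, 10, 20, 20, 20, 20]
inhibited = lateral_inhibition(step_edge)
print(inhibited)  # [5.0, 5.0, 5.0, 2.5, 12.5, 10.0, 10.0, 10.0]

# With full-strength inhibition a uniform field produces no signal at all.
print(lateral_inhibition([7, 7, 7, 7, 7], strength=1.0))  # [0.0, 0.0, 0.0, 0.0, 0.0]
```

In this toy version the outputs on either side of the edge (2.5 and 12.5) differ more than the outputs in the uniform regions (5.0 and 10.0), which is the sense in which the signal reflects differences between neighbors rather than absolute levels.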
What is visual acuity?
Visual acuity is a measure of the finest detail that one can resolve.
What is the difference between an “ON” bipolar cell and an “OFF” bipolar cell?
An “ON” bipolar cell is a cone bipolar cell that depolarizes in response to an increase in light intensity. An “OFF” bipolar cell is a cone bipolar cell that depolarizes in response to a decrease in light intensity. These two cells have opposite reactions to light.
What is a receptive field?
A receptive field is the region on the retina in which stimuli will activate a neuron. Receptive fields vary in size, shape, and complexity.
Why is the center–surround organization of retinal ganglion cells so important?
The center–surround organization of retinal ganglion cells is important because it allows for sensitivity to contrast rather than absolute illumination levels. Ganglion cells are most sensitive to differences in the intensity of light in the center and in the surround, and they are relatively unaffected by the average intensity of light. This is useful because the average intensity of light falling on the retina will be quite variable, depending on whether the observer is indoors, outdoors, etc., but contrasts of light are relatively constant.
What is a filter and how is it important in vision?
A filter is an acoustic, electrical, electronic, biological, or optical device, instrument, or computer program that allows the passage of some frequencies or digital elements and blocks others. Filters are important in vision because they allow the transformation of raw images into representations in the brain. Filters highlight certain important visual information while eliminating other unimportant information. The center–surround receptive fields of retinal ganglion cells are filters.
What are some consequences of the differing sizes of M ganglion cell and P ganglion cell receptive fields?
P ganglion cells have smaller receptive fields than M ganglion cells at all eccentricities. This allows the M ganglion cells to respond to a larger portion of the visual field. In addition, they are much more sensitive to visual stimuli under low lighting conditions than P ganglion cells. P ganglion cells, on the other hand, provide finer resolution (greater acuity) than M ganglion cells, as long as there is enough light for them to operate.
Explain how the pupil adapts to dark and light conditions.
The pupil has the ability to dilate and constrict, depending on the amount of light. For example, under well-lit conditions, the pupil tends to constrict to let less light into the eye. Under dark conditions, the pupil dilates to allow more light into the eye.
Explain why it is that we are generally not bothered by variations in overall light levels.
We are generally not bothered by variations in overall light levels because we have several mechanisms for regulating how much light enters the eye. One mechanism is the pupil size. Another is the regeneration rates of pigments in our photoreceptors. Yet another is the rod/cone dichotomy—cones operate at moderate and high light levels while rods take over for low light levels. Finally, the neural circuitry of the retina itself helps stabilize external light variations by emphasizing contrasts in luminance rather than absolute light levels.
What is visual acuity and how can it be measured?
Visual acuity is the smallest spatial detail that can be seen accurately. It can be measured by doing a visual acuity test, which requires looking at figures from a distance and identifying them.
Explain what happens during the phenomenon of aliasing.
Aliasing is the misperception of a grating due to undersampling. When looking at gratings, the visual system “samples” the grating discretely via the array of receptors at the back of the retina. If the receptors are spaced such that the lightest and darkest parts of the grating fall on separate cones, the observer can detect the grating. However, if the lightest and darkest parts of the grating both fall on the same cones, then the grating will be aliased and appear gray.
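Undersampling can be illustrated with a short Python sketch. It assumes idealized point-like receptors at evenly spaced positions (real cones have finite width, which this ignores), and the function name and the specific numbers are hypothetical. With 10 receptors per degree, a grating of 8 cycles per degree yields exactly the same samples as a grating of 2 cycles per degree, so the receptor array cannot tell them apart.

```python
import math


def sample_grating(cycles_per_degree, receptors_per_degree, n_receptors):
    """Luminance (0..1) of a cosine grating at evenly spaced receptor positions."""
    return [
        0.5 + 0.5 * math.cos(2 * math.pi * cycles_per_degree * i / receptors_per_degree)
        for i in range(n_receptors)
    ]


# 10 receptors per degree can represent at most 5 cycles per degree.
fine = sample_grating(8, 10, 20)    # above that limit: undersampled
coarse = sample_grating(2, 10, 20)  # below that limit: sampled adequately

indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(fine, coarse)
)
print(indistinguishable)  # True
```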
Explain the meaning of being able to see 20/20.
Being able to see 20/20 means that the observer can identify an object at 20 feet as well as a “normal” observer would be able to identify it at 20 feet. If the observer’s vision is 20/40, that means that the observer can see at 20 feet what somebody with normal vision can see at 40 feet (meaning the observer needs glasses!).
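The arithmetic behind the Snellen fraction can be sketched in a few lines of Python. The function names are hypothetical, and the second function relies on the common convention (an assumption not stated in the answer above) that 20/20 corresponds to resolving detail of about 1 minute of arc.

```python
def decimal_acuity(test_distance_ft, normal_distance_ft):
    """Convert a Snellen fraction such as 20/40 to a decimal acuity."""
    return test_distance_ft / normal_distance_ft


def min_angle_of_resolution_arcmin(test_distance_ft, normal_distance_ft):
    """Smallest resolvable detail, assuming 20/20 corresponds to 1 arcmin."""
    return normal_distance_ft / test_distance_ft


print(decimal_acuity(20, 20))                   # 1.0
print(decimal_acuity(20, 40))                   # 0.5
print(min_angle_of_resolution_arcmin(20, 40))   # 2.0
```

Under these assumptions, a 20/40 observer needs details to be twice as large as a 20/20 observer does in order to resolve them.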
What can we infer from the contrast sensitivity function?
The contrast sensitivity function describes our window of visibility. Any object whose spatial frequencies and contrast fall within the region specified by the contrast sensitivity function will be visible. Those objects outside the region are outside our window of visibility. We can infer from this function that sensitivity to contrast depends on the spatial frequency of the stimulus.
Explain how retinal ganglion cells respond to stripes.
Each ganglion cell responds to certain types of stripes or gratings. For instance, an ON ganglion cell responds to gratings with spatial frequencies and phases that make the lightest part of the grating fall on the center of the cell and the darkest part of the grating fall on the surround. When the spatial frequency of the grating is too low, the ganglion cell responds weakly because part of the bar of the grating lands in the inhibitory surround, dampening the cell’s response. Similarly, when the grating’s spatial frequency is too high, the ganglion cell responds weakly because both dark and light stripes fall within the receptive field’s center and surround, washing out the response. When the frequency is just right, the cell responds vigorously.
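The "too low, too high, just right" pattern can be reproduced with a difference-of-Gaussians receptive field, a common textbook approximation of center-surround organization. The sketch below is a one-dimensional toy, not a fitted model: the widths of the center and surround, the spatial units, and the three test frequencies are all assumptions chosen for illustration.

```python
import math


def dog_weights(xs, sigma_center=1.0, sigma_surround=3.0):
    """ON-center receptive field: narrow excitatory Gaussian minus a wide
    inhibitory Gaussian, each normalized so the weights sum to zero."""
    center = [math.exp(-x * x / (2 * sigma_center ** 2)) for x in xs]
    surround = [math.exp(-x * x / (2 * sigma_surround ** 2)) for x in xs]
    c_sum, s_sum = sum(center), sum(surround)
    return [c / c_sum - s / s_sum for c, s in zip(center, surround)]


def grating_response(cycles_per_unit, xs, weights):
    """Response to a cosine grating whose bright bar is centered on the field."""
    return sum(
        w * math.cos(2 * math.pi * cycles_per_unit * x)
        for x, w in zip(xs, weights)
    )


xs = [i * 0.1 for i in range(-150, 151)]  # positions across the receptive field
weights = dog_weights(xs)

low, mid, high = (grating_response(f, xs, weights) for f in (0.01, 0.15, 0.6))
print(low, mid, high)  # mid is much larger than low or high
```

Because the weights sum to zero, adding a constant to the grating (changing its mean luminance) leaves the response unchanged in this sketch.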
What is the role of the lateral geniculate nucleus?
The lateral geniculate nucleus is a nucleus in the thalamus that shares connections with both the retina and visual cortex.
What are the two types of layers in the LGN and how are they different from each other?
The two types of layers in the LGN are the magnocellular layers and the parvocellular layers. The magnocellular layers are the two bottom layers of the LGN, and contain neurons that are physically larger than those in the parvocellular layers. Neurons in these layers respond to large, fast-moving objects. The parvocellular layers are the top four layers of the LGN. They contain neurons that respond to details of stationary objects.
Explain the notion of topographical mapping.
Topographical mapping is the orderly mapping of the world in the lateral geniculate nucleus and the visual cortex. Points of light that are near each other in the world fall on parts of the retina that are near each other and will be processed by neurons that are near each other in the brain. This orderly representation provides us with a neural basis of knowing where things are in space.
What are two important features of the visual cortex? Explain.
One important feature of the visual cortex is topographical mapping, which is the orderly mapping of the world in the brain. The second feature is the dramatic scaling of information from different parts of the visual field. Objects on or near the fovea are processed by neurons in a large part of the striate cortex, whereas objects imaged in the periphery are allocated a much smaller portion of the striate cortex. This feature is known as cortical magnification.
What is orientation tuning?
Orientation tuning is the tendency of neurons in striate cortex to respond optimally to certain orientations, and less to others.
In what way do striate cortex neurons function as filters?
Each striate cortex neuron responds to a particular location and is tuned to a particular spatial frequency, orientation, and phase. These narrow tuning functions mean that each striate cortex neuron functions as a filter for the portion of the image that excites the cell.
What is ocular dominance?
Ocular dominance is the property of the receptive fields of striate cortex neurons by which they respond more vigorously when a stimulus is presented in one eye than when it is presented in the other.
What is the difference between simple and complex cells?
Simple cells are cortical neurons with clearly defined excitatory and inhibitory regions, whereas complex cells are neurons whose receptive field responds to any properly oriented bar of light, regardless of whether it is light or dark.
What is the role of end stopping?
End stopping refers to a property of certain cortical neurons in which they respond vigorously when the end of a bar of light falls within their receptive field. It plays an important role in our ability to detect luminance boundaries and discontinuities.
What does a hypercolumn contain?
A hypercolumn contains at least two sets of columns, each covering every possible orientation, with one set preferring input from the left eye and one set preferring input from the right eye.
What is the enzyme cytochrome oxidase (CO) used for?
This enzyme is used to reveal the regular array of “CO blobs,” which are spaced about 0.5 mm apart in the primary visual cortex. These blobs have been implicated in processing color, motion, and spatial structure.
How can adaptation provide insights into the properties of cortical neurons?
Adaptation is the diminishing response of a sense organ to a sustained stimulus. It is helpful in learning about the properties of cortical neurons because, by exposing an observer to a particular stimulus for an extended period of time, the experimenter can make inferences about the visual system due to the observer’s changing responses. If two stimuli are processed by unrelated sets of neurons, then selectively adapting one set of neurons should have no effect on the other set.
What idea does the tilt aftereffect support?
The tilt aftereffect supports the idea that the human visual system contains individual neurons selective for different orientations.
What are spatial-frequency channels?
Spatial-frequency channels are pattern analyzers, implemented by ensembles of cortical neurons, with each set of neurons tuned to a limited range of spatial frequencies.
How do psychologists study visual processing in infants?
Infants tend to look more at complex scenes than at simple scenes. If presented with the choice of looking at a series of stripes or a uniform gray field, infants will look more often and longer at the stripes. If the stripes are low contrast and the baby cannot see the difference between the stripes and the gray field, he or she will stare equally often at the two stimuli. Thus, through careful observation of infants’ preferential looking, psychologists can tell which stimuli they can see and which they can’t.
What cortical brain structures does visual information pass through as it is processed?
Information first reaches the cortex in a region called striate cortex, so-called because it has a distinctive striped pattern under the microscope. Early vision processes are carried out here, then information is passed to extrastriate cortex, where the tasks of middle vision are carried out (for example, this is where illusory contours are processed). From here, information travels via two separate pathways, one that ends in the parietal lobe, and one that terminates in inferotemporal (IT; lower temporal lobe) cortex. It is in IT cortex that the end-stage processing of face and object recognition is carried out.
What are the receptive field characteristics of cells in IT cortex?
Many neurons in IT have been shown to respond most actively to particular objects or faces. The term “grandmother cell” was coined to describe these neurons, the implication being that a single cell might be ultimately responsible for deciding whether an image was of one’s grandmother’s face.
What methods are used to study the function of brain areas such as IT?
Some labs lesion (surgically remove) parts of the brains in nonhuman subjects to see what functions are impaired following the surgery. The results of such studies are often compared to deficits shown by human patients who have had homologous regions of their brains damaged by accident. Other labs use single-cell recording techniques to determine the responses of individual neurons to different types of stimuli (it was in these labs that grandmother cells were found). Recently, many resources have been poured into laboratories employing noninvasive techniques such as functional magnetic resonance imaging (fMRI), which can take snapshots of neural activity in humans’ brains as they perform different tasks.
Why can’t we apply a simple rule like “homogeneous areas belong to the same object” in order to find an object’s contours?
Because humans sometimes perceive object contours even in areas of an image where there is no physical difference between the object and its background (see Figure 4.9).
Draw a figure that includes an illusory contour.
An illusory contour is one that is perceived even though it is not present in the physical stimulus. The Kanizsa triangle (left) is one famous example; another illusory contour is shown below.
What is the guiding philosophy behind Gestalt psychology? How does it contrast with the earlier approach known as structuralism?
The structuralists believed that perception of a complex scene was simply the sum of the basic “atoms” of perception (color, orientation, etc.) in the scene. Gestalt psychologists reacted to this position, arguing that a perceptual whole was much more than the sum of its elemental parts.
What do the Gestalt grouping principles seek to describe?
The grouping principles provide rules for how different individual elements in an image tend to be combined by the visual system into wholes (i.e., objects).
Why is it important to include the phrase “all else being equal” when stating the Gestalt grouping principles?
Because we can only be absolutely sure that a principle will adequately predict how elements will be grouped if no other principles can also be applied. For example, at right we see a display in which the proximity grouping principle would suggest that we organize the elements into four columns, while the similarity principle suggests we should perceive five rows. Only one principle can “win” (in this case, most people probably see rows rather than columns).
How are the Gestalt grouping principles related to texture segmentation?
A texture is really just a collection of many perceptual elements that are similar to each other and arranged fairly close together. Therefore, stating that areas of an image with different textures are segmented from each other (the definition of texture segmentation) is really the same thing as saying that areas of an image in which elements are similar to each other and/or close together group together.
How is camouflage related to grouping principles?
To camouflage yourself, you have to make your features (that is, the visual elements that are visible to anyone who might observe you) group with the features present in your environment.
What is the basic idea behind the “perception by committee” metaphor?
The visual world is a complicated place, and no one rule for interpreting the world can possibly do an adequate job. But once we introduce multiple rules, conflicts between interpretations will inevitably arise. Various parts of our visual system act like perceptual committees, considering which rules conflict and which agree in a given situation and eventually arriving at a single interpretation for the scene.
What are ambiguous figures, and how do they relate to the perception by committee metaphor?
Ambiguous figures, such as the Necker cube seen at left, have more than one valid interpretation. Our perceptual committees settle on one and only one of these interpretations at a time, but the interpretation may “flip” from time to time.
What are some of the assumptions that perceptual committees make?
First, the committees must “know” something about physics; for example, understanding that opaque objects block light is a prerequisite for perceiving the illusory edges of the triangle in the Kanizsa triangle. Second, the committees assume that we are not viewing a scene from an accidental viewpoint, which would mask the true structure of the objects in the scene.
What is figure–ground assignment?
The process of determining which areas of an image constitute a to-be-recognized object (the “figure”) and which areas form the background (the “ground”).
What is the notion of relatability and why is it important?
Relatability is the notion that line segments on either side of an occluding surface will look like they are part of a single object if they can be connected by a smooth curve that only bends once. This concept is important because it describes the constraints our brains use to fill in edge information that is missing from objects due to occlusion.
What do nonaccidental features tell us about a scene?
Certain arrangements of edges can be interpreted as providing important information about segmenting objects in a scene, provided we are seeing the edges from a nonaccidental viewpoint. For example, a “T-junction” (a place where one edge abuts another straight edge in a T-like fashion; the arrow in the figure at left points to a T-junction) strongly indicates that the two edges are parts of different objects.
What rules do our perceptual committees use to divide objects into parts?
One widely accepted proposal is that we use valleys, rather than bumps, in an object as clues to where to divide the object into parts, cutting the object by connecting pairs of valleys (see figure at left).
What evidence is there that the visual system starts with large objects and then divides them into smaller parts, rather than processing scenes the other way around?
Evidence for this proposition comes from the global superiority effect: In displays like those at the left, it was found that identifying the small (local) letters took longer than identifying the larger (global) letter, indicating that the global information is more readily available than the local information. That is, in the figure at left, you tend to see the E before the Gs.
What is the fundamental goal of object recognition?
To match a representation of a perceived visual stimulus to a representation of a previously-encountered object encoded in memory.
What is a naïve template theory, and why can such theories be rejected as a complete theory of object recognition?
The formal definition of a template is complicated, but template theories essentially follow a “lock and key” principle: The perceived image is the key, and the template is the lock. The naïve template approach says that we store templates for all the images of all the objects we have ever seen. When we perceive an object that we want to recognize, we try to match this perception to all the templates stored in memory until we find a lock in which the key fits exactly. This doesn’t strike most people as being a very efficient process. One of the most important problems is that it seems unlikely that we have enough brain capacity to store templates to match every single object that we are likely to encounter in our lives.
What is the basic idea behind a structural description, and how do structural description theories improve on template theories?
Straightforwardly enough, a structural description describes the structure of an object in a more abstract way than a template. Different theories propose different sets of building blocks with which to create the descriptions, and most structural description theories also propose some way to describe how parts are related to each other. The advantage over templates is that a single structural description can potentially match a large number of slightly different shapes. For example, if an X is described as two oblique lines that cross near their centers, this description will match all the figures at left; however, each would require a different template in a naïve template theory.
Describe the essence of the viewpoint invariance versus viewpoint dependence debate in the object recognition literature.
Many structural description theories, such as recognition by components, predict that in most circumstances, object recognition should be equally efficient (i.e., equally fast) regardless of what viewpoint you see the object from. Such a pattern of performance, in which recognition time does not vary across changes in viewpoints, is known as viewpoint invariance. However, many empirical studies have revealed that object recognition times are in fact dependent on viewpoint: If subjects study a novel object from a single viewpoint, they are usually slower at recognizing the object later when shown from a new viewpoint than when shown from the trained viewpoint. These findings have cast doubt on structural description theories and led to a resurgence of interest in theories that use template-like representations.
What do we mean when we say that objects can be recognized at different levels?
Object recognition is essentially a categorization process: Identifying an object means deciding what category the object belongs in. Most objects actually have a number of categories that they could be placed in. The level of recognition refers to the specificity of the category you use when identifying an object. For instance, you can recognize a chair as a “barber chair,” “chair,” or “furniture,” depending on what category you are using.
What are basic, subordinate, and superordinate categories?
These terms are best described in relation to each other. A subordinate level category is one that is quite specific, referring to a relatively small number of objects. A superordinate level category, on the other hand, is much more general; superordinate categories are often defined by functional or conceptual, rather than shape-based, qualities. Basic level categories are in between. Some examples of subordinate–basic–superordinate triplets are: schnauzer–dog–animal; office chair–chair–furniture; and iMac–computer–machine.
What is the difference between an entry level and a basic level category?
The entry level term for an object is operationally defined as the first word that comes to mind when someone is asked to name the object. The formal definition of the basic level is more complicated and somewhat more vague. Usually, an object’s entry level term is the same as its basic level term, but exceptions occur for strangely-shaped objects, such as penguins and bean bag chairs.
Why is face recognition thought to be accomplished via different mechanisms than the recognition of other objects?
Most objects require considerably more time to recognize at the subordinate than at the basic level. However, recognition of individual faces, which is a subordinate-level task, is a very fast process—so fast that many researchers believe the visual system must use “special” mechanisms to recognize faces. Also, face recognition and object recognition can be doubly dissociated—people with object agnosia can recognize faces but not objects whereas people with prosopagnosia can recognize objects but not faces.
What is the inversion effect, and how does it relate to the special mechanisms thought to be operating when we recognize faces?
Faces are more difficult than other objects to recognize when inverted (turned upside-down). Researchers have proposed that when faces are inverted, the special processes that are usually brought to bear in recognizing faces cannot operate, so we are forced to rely on our “normal” object recognition processes, which are not as efficient for subordinate-level objects, like faces.
What is prosopagnosia, and what does it say about special face recognition processes?
Prosopagnosia is a neuropsychological disorder in which people cannot recognize faces, although they can recognize other objects normally. It is thought that this disorder is due to damage in the portion of the brain where special face recognition processes are carried out.
achromatic
Referring to any color that lacks a chromatic (hue) component. Black, white, or gray.
What is the problem of univariance?
The problem of univariance is the fact that an infinite set of different wavelength–intensity combinations can elicit exactly the same response from a single type of photoreceptor. One photoreceptor type cannot make accurate color discriminations based on wavelength.
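A small Python sketch can show univariance numerically. The bell-shaped sensitivity curve, its peak and width, and the function names are hypothetical stand-ins rather than measured photopigment data. The receptor's response is modeled as intensity times sensitivity, so a dimmer light at the preferred wavelength and a brighter light at a less preferred wavelength can produce the same output.

```python
import math


def sensitivity(wavelength_nm, peak_nm=500.0, width_nm=50.0):
    """Hypothetical bell-shaped sensitivity curve (not measured data)."""
    return math.exp(-((wavelength_nm - peak_nm) ** 2) / (2 * width_nm ** 2))


def response(wavelength_nm, intensity):
    """Single-receptor output: one number, whatever the wavelength."""
    return intensity * sensitivity(wavelength_nm)


# A unit-intensity light at the peak wavelength...
r_peak = response(500, 1.0)

# ...is matched by a brighter light at a wavelength the receptor likes less.
matching_intensity = 1.0 / sensitivity(450)
r_off_peak = response(450, matching_intensity)

print(math.isclose(r_peak, r_off_peak))  # True
```

From the single output value alone, nothing downstream could tell which of the two lights was shown.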
Describe the three types of cones in the human visual system and explain the differences between them.
The three types of cones in the human visual system are: S-cones, M-cones, and L-cones. They are all collectively responsible for discriminating between different colors. The S-cones are preferentially sensitive to short wavelengths, the M-cones are preferentially sensitive to middle wavelengths, and the L-cones are preferentially sensitive to long wavelengths.
What does the trichromatic theory of color vision tell us about color perception?
The trichromatic theory of color vision tells us that the color of any light is defined in our visual system by the relationships between the outputs of the three cone types.
Why do metamers produce the same perceived color?
Metamers are different mixtures of wavelengths that nonetheless look identical. Even though the wavelength mixtures are different, they produce the same response from the cones in our visual system, which in turn causes the colors to appear identical.
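As a rough illustration, the Python sketch below builds a metameric pair from made-up cone sensitivities. The table values, the choice of wavelengths, and the decision to ignore S-cones are all assumptions made to keep the arithmetic simple. A single "yellow" wavelength and a mixture of "red" and "green" wavelengths end up producing the same L-cone and M-cone responses.

```python
# Hypothetical (L-cone, M-cone) sensitivities at three wavelengths in nm.
# Illustrative numbers only; S-cones are ignored for simplicity.
CONE_SENSITIVITY = {
    530: (0.5, 1.0),
    580: (1.0, 0.5),
    650: (0.5, 0.0),
}


def cone_response(spectrum):
    """Sum each cone type's response over a {wavelength: intensity} spectrum."""
    l_total = sum(i * CONE_SENSITIVITY[w][0] for w, i in spectrum.items())
    m_total = sum(i * CONE_SENSITIVITY[w][1] for w, i in spectrum.items())
    return l_total, m_total


single_yellow = {580: 1.0}
red_plus_green = {650: 1.5, 530: 0.5}

print(cone_response(single_yellow))   # (1.0, 0.5)
print(cone_response(red_plus_green))  # (1.0, 0.5)
```

The two spectra are physically different, yet in this toy model the cones report identical values for both, which is what makes them metamers.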
What is an additive color mixture?
An additive color mixture is when two sources of illumination combine to make a new color, as when mixing lights. If light A and light B are both reflected from a surface to the eye, the colors of those two lights add together.
What is a subtractive color mixture?
A subtractive color mixture is when one source of illumination is subtracted from another, as when two color filters are placed in front of a light source or when pigments are mixed. If pigments A and B mix, some of the light shining on the surface will be subtracted by A, and some by B. Only the remainder contributes to the perception of color.
What happens if you shine “blue” and “yellow” lights on the same patch of paper?
If you shine “blue” and “yellow” lights on the same patch of paper, the wavelengths will add, producing an additive color mixture. Since “yellow” is equivalent to a mix of long and medium wavelengths, and “blue” consists of short wavelengths, the two lights will produce a mixture of short, medium, and long wavelengths. The resulting mixture will therefore look white.
Describe the idea of color space.
Color space is a three-dimensional representation of all possible colors. The color space has three dimensions because color perception is based on the outputs of three cone types.
What is the Young–Helmholtz theory?
The Young–Helmholtz theory is the trichromatic theory of color vision, which was developed in the nineteenth century by both Young and Helmholtz. It proposes that any light is defined in our visual system by the relationships between a set of three numbers, which we now know to be the outputs of three receptor types (cones).
Explain how the LGN is important in color perception.
The LGN is a structure in the thalamus of the brain that receives input from retinal ganglion cells and has input and output connections to the visual cortex. Some of its cells are maximally stimulated by spots of light, which are critical to color perception.
What is a color-opponent cell?
A color-opponent cell is a neuron whose output is based on a difference between sets of cones.
What are the opponent color sets in the opponent color theory?
The opponent color sets in the opponent color theory are red versus green, blue versus yellow, and black versus white.
What is a unique hue? Provide an example.
A unique hue is a color that can be described with only a single color term. Red is an example of a unique hue, as opposed to orange, which can be described as a compound (reddish yellow).
What is a negative afterimage?
A negative afterimage is a type of afterimage whose polarity is the opposite of the original stimulus. For instance, light stimuli produce dark negative afterimages. Colors are complementary: red produces green afterimages and yellow produces blue afterimages. The negativity of the afterimages arises from the color-opponent cells.
Describe the method of “hue cancellation.”
The method of hue cancellation is used to demonstrate the opponent color theory. In this method, the experimenter might start with a light that appears to be a yellowish green. The experimenter then cancels the yellowness by adding its opponent color—blue. The experimenter then measures the amount of blue light needed to just remove all traces of the yellow.
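The procedure can be sketched as a loop that adds blue light in small steps until no yellowness remains. The linear yellow-minus-blue signal, the step size, and the function names are assumptions made for illustration only.

```python
def yellow_blue_signal(yellowness, blue_added, blue_gain=1.0):
    """Positive means the light still looks yellowish; zero or below means it does not."""
    return yellowness - blue_gain * blue_added


def cancel_yellow(yellowness, step=0.01, max_steps=10_000):
    """Return the smallest stepped amount of blue that removes all yellowness."""
    for n in range(max_steps + 1):
        blue = n * step
        if yellow_blue_signal(yellowness, blue) <= 0:
            return blue
    raise ValueError("yellowness could not be cancelled within max_steps")


blue_needed = cancel_yellow(0.25)
```

The returned amount of blue is the measurement of interest: the more yellowness the starting light contains, the more blue is needed to cancel it.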
What happens if the red–green and blue–yellow mechanisms are at their neutral points?
If these two sets of opponents are at their neutral points, the stimulus will appear achromatic.
What is achromatopsia?
Achromatopsia is an inability to perceive colors that is due to damage to the central nervous system.
In what way are color-anomalous individuals and cone monochromats color-blind?
Color-anomalous individuals can make discriminations based on wavelength, but these discriminations differ from those of people with normal color vision. Cone monochromats have only one cone type and therefore cannot discriminate different colors, which makes them truly color-blind.
What is cultural relativism?
Cultural relativism is the idea that basic perceptual experiences such as color perception may be determined in part by the cultural environment.
Describe the idea of color constancy.
Color constancy is the tendency of a surface to appear the same color under a fairly wide range of illuminants.
Describe two physical constraints that make constancy possible.
Luminance tends to change abruptly between surfaces and gradually within surfaces, so surface boundaries are an important physical constraint for achieving constancy. The fact that shadow boundaries change the brightness and not the chromatic properties of a surface is also an important physical constraint for constancy.
What is the advantage of binocular summation?
The advantage of binocular summation is that a stimulus can be detected with two eyes rather than just one, which yields more information about the stimulus.
Explain the difference between a monocular depth cue and a binocular depth cue.
A monocular depth cue is available when the world is viewed with only one eye. A binocular depth cue requires information from both eyes.
What is the idea behind positivism?
Positivism is a philosophical position arguing that all you really have to go on is the evidence of your senses, so the world might be nothing more than an elaborate hallucination.
Name three monocular depth cues.
Any three of the following: occlusion, relative size, familiar size, relative height, texture gradients, linear perspective, aerial perspective, motion parallax, accommodation, or convergence.
Explain what a texture gradient is.
A texture gradient is a depth cue based on the geometric fact that items of the same size form smaller images when they are farther away. Thus, an array of items that change in size across the image will appear to form a surface in depth.
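The geometric fact can be sketched with the standard visual-angle formula, angle = 2 * atan(size / (2 * distance)). The tile size and distances below are hypothetical values chosen for illustration.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an item of a given size at a given distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


tile = 0.3  # metres; hypothetical size of one texture element
distances = (1, 2, 4, 8)  # metres; hypothetical viewing distances
angles = [visual_angle_deg(tile, d) for d in distances]
```

The angles shrink steadily as distance grows, which is the change in image size that makes an array of identical items appear to form a surface receding in depth.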
What kind of information does aerial perspective provide about the stimulus?
Aerial perspective is a depth cue that is based on the implicit understanding that light is scattered by the atmosphere. More light is scattered when you look through more atmosphere. Thus, more distant objects are subject to more scatter and appear fainter, bluer, and less distinct. Aerial perspective provides information about the relative distance of objects from the observer.
What is a vanishing point?
A vanishing point is the apparent point at which parallel lines receding in depth converge.
What kind of movement does motion parallax depend on? Explain.
Motion parallax depends on either object movement or head movement. During either type of motion, closer objects move faster across the visual field than farther objects, allowing one to determine the depth of objects relative to each other.
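For the head-movement case, a small sketch: when an observer moves sideways at a given speed, a stationary object directly to the side sweeps across the visual field at an angular speed of roughly speed / distance. The walking speed and object distances are hypothetical.

```python
import math


def angular_speed_deg_per_s(observer_speed, distance):
    """Angular speed of a stationary object directly abeam of a moving observer."""
    return math.degrees(observer_speed / distance)


walking = 1.5  # metres per second; hypothetical observer speed
near_fence = angular_speed_deg_per_s(walking, 2.0)    # object 2 m away
far_tree = angular_speed_deg_per_s(walking, 50.0)     # object 50 m away
```

The nearer object yields the larger angular speed, so relative speeds across the visual field indicate relative depth.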
What is a pictorial depth cue?
A pictorial depth cue is a cue to distance or depth used by artists to depict three-dimensional depth in two-dimensional pictures.
How are convergence and divergence important to depth perception?
Convergence and divergence are important to depth perception because they are used to place the two images of a feature in the world on corresponding locations in the two retinal images (typically on the fovea of each eye). They both reduce the disparity of that feature to zero, or nearly zero.
Explain the concept of corresponding retinal points.
Corresponding retinal points are points on the retina of each eye where the monocular retinal images of a single object are formed at the same distance from the fovea in each eye. The two foveas are also corresponding points.
Explain the concept of the Vieth-Müller circle, and how it relates to the horopter.
The Vieth-Müller circle refers to the location of objects whose images fall on geometrically corresponding points in the two retinas. If the two eyes are looking at one spot, then there will be a surface of zero disparity running through that spot (known as the horopter).
What is the difference between crossed disparity and uncrossed disparity?
Crossed disparity is the sign of disparity created by objects in front of the plane of fixation (the horopter). Images of objects that are located in front of the horopter will appear to be displaced to the left in the right eye, and to the right in the left eye. Uncrossed disparity is the sign of disparity created by objects behind the plane of fixation. Images of objects that are located behind the horopter will appear to be displaced to the right in the right eye, and to the left in the left eye.
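One way to sketch the sign of disparity, for points straight ahead of the observer, is to compare the angle an object subtends at the two eyes with the angle subtended by the fixation point. The 6.5 cm eye separation and the convention that a positive difference means "crossed" are assumptions of this sketch.

```python
import math


def vergence_angle(distance, eye_separation=0.065):
    """Angle (radians) between the two eyes' lines of sight to a point straight ahead."""
    return 2 * math.atan(eye_separation / (2 * distance))


def disparity_sign(object_distance, fixation_distance):
    """Classify disparity relative to fixation (sign convention is an assumption)."""
    difference = vergence_angle(object_distance) - vergence_angle(fixation_distance)
    if difference > 0:
        return "crossed"    # object is in front of the fixation distance
    if difference < 0:
        return "uncrossed"  # object is behind the fixation distance
    return "zero"


nearer = disparity_sign(0.5, 1.0)
farther = disparity_sign(2.0, 1.0)
```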
What is a stereoscope?
A stereoscope is a device for presenting one image to one eye and another image to the other eye. Once these two images are fused by the observer, they create a single three-dimensional image with a strong impression of depth.
When is free fusion used?
Free fusion is a technique of converging or diverging the eyes in order to view a stereogram without a stereoscope.
What does stereoblindness often result from?
Stereoblindness, the inability to make use of binocular disparity as a depth cue, often results from a childhood visual disorder such as strabismus, in which the two eyes are misaligned.
What is a random dot stereogram?
A random dot stereogram is a stereogram made of a large number of randomly placed dots. The random dot stereogram contains no monocular cues to depth.
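A minimal sketch of how such a stereogram can be built: start with a grid of random dots for the left eye, copy it for the right eye, shift a central square sideways, and fill the uncovered strip with fresh random dots. The grid size, square size, and shift are arbitrary choices for illustration.

```python
import random


def random_dot_stereogram(size=20, square=8, shift=2, seed=0):
    """Return (left, right) grids of 0/1 dots that differ only by a shifted square."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    start = (size - square) // 2
    end = start + square
    for r in range(start, end):
        # Shift the central square leftward in the right eye's image.
        for c in range(start, end):
            right[r][c - shift] = left[r][c]
        # Fill the strip uncovered by the shift with new random dots.
        for c in range(end - shift, end):
            right[r][c] = rng.randint(0, 1)
    return left, right


left, right = random_dot_stereogram()
```

Each grid on its own is just random dots; the square is defined only by the horizontal offset between the two grids.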
When does one view cyclopean stimuli?
One views cyclopean stimuli when looking at random dot stereograms. These are stimuli that are defined by binocular disparity alone.
Define the correspondence problem.
The correspondence problem is the problem of figuring out which bit of the image in the left eye should be matched with which bit in the right eye.
Name two ways of solving the correspondence problem.
Any two of the following: 1) blurring the image to remove high spatial frequencies, so that there are not as many dots to analyze; 2) using the uniqueness constraint, which states that a feature in the world will be represented exactly once in each retinal image; 3) using the continuity constraint, which assumes that neighboring points in the world lie at similar distances from the viewer, except at the edges of objects.
What is binocular rivalry?
Binocular rivalry is the competition between the two eyes for control of visual perception, which is evident when completely different stimuli are presented to the two eyes.
In what sense does the Bayesian approach take into account past experience? Explain.
The Bayesian approach is a statistical model which states that prior knowledge could influence one’s estimates of the probability of a current event. In the case of vision, since the retinal images formed on the two retinas could be a result of an infinite number of scenes, this approach helps to narrow down the possible choices to the ones that are the most likely, based on past experiences.
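The narrowing-down step can be sketched with Bayes' rule: the probability of each candidate scene given the image is proportional to how likely that scene is a priori times how well it explains the image. The scene names and numbers below are hypothetical.

```python
def posterior(priors, likelihoods):
    """Combine prior probabilities with likelihoods and normalize (Bayes' rule)."""
    unnormalized = {scene: priors[scene] * likelihoods[scene] for scene in priors}
    total = sum(unnormalized.values())
    return {scene: value / total for scene, value in unnormalized.items()}


# Hypothetical example: both scenes explain the retinal images equally well,
# but past experience makes scene A more probable.
priors = {"scene A": 0.7, "scene B": 0.3}
likelihoods = {"scene A": 0.5, "scene B": 0.5}
beliefs = posterior(priors, likelihoods)
best_guess = max(beliefs, key=beliefs.get)
```

Because the two scenes fit the images equally well in this example, prior experience alone decides which interpretation wins.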
Describe the idea behind the cueing paradigm.
This paradigm measures how fast a subject responds to a target appearing in one of two or more boxes under various cueing conditions in order to infer how attention might affect performance. Generally, there is a “valid cue,” an “invalid cue,” and a “neutral cue.” A valid cue signals the correct location where the target will appear, whereas an invalid cue signals an incorrect location. A neutral cue is uninformative. The idea is that when an invalid cue is given, the observer will take longer to respond to the appearance of the target than when a valid cue is given. A neutral cue will result in a response that is slower than in the valid cue case but quicker than in the invalid cue case. This demonstrates that response to a target depends on prior attention to information in the visual field.
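The analysis behind this paradigm can be sketched as a comparison of mean response times across cue conditions. The reaction times below are made-up numbers for illustration only, not experimental data.

```python
from statistics import mean

# Made-up (cue condition, reaction time in ms) pairs, for illustration only.
trials = [
    ("valid", 310), ("valid", 295),
    ("neutral", 340), ("neutral", 350),
    ("invalid", 390), ("invalid", 405),
]


def mean_rt_by_cue(trials):
    """Group reaction times by cue condition and average each group."""
    by_cue = {}
    for cue, rt in trials:
        by_cue.setdefault(cue, []).append(rt)
    return {cue: mean(rts) for cue, rts in by_cue.items()}


means = mean_rt_by_cue(trials)
# Cost of an invalid cue relative to a valid one.
cueing_effect = means["invalid"] - means["valid"]
```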
What is the “Spotlight” theory of attention?
It is a metaphor for attention based on the idea that attention can be moved from spot to spot in a manner similar to that of a spotlight beam.
How are visual search experiments useful for studying attention?
Visual search experiments provide a closer approximation to the actions of attention in the real world. The typical visual search experiment requires the observer to find a “target” item among some number of “distractor” items. This kind of search occurs regularly in the real world, for instance when looking for faces in a crowd or books on a shelf.
Explain how a search can be inefficient.
When the target and distractors in a visual search task contain the same basic features, the search is inefficient. For example, if all the distractors in the task contain the color blue, and the target also contains the color blue, a search can be inefficient in that the observer may have to go through each item in the display in order to locate the target. This is contrasted with a situation in which the target stands out from the distractors, and as a result the observer does not need to attend to each stimulus in the display.
Describe one type of visual search that is efficient.
Feature search is an example of an efficient kind of visual search. In this case, the search for a target is defined by a single attribute, such as a salient color or orientation. For instance, imagine having to search for a red car in a parking lot full of white cars. The defining feature (the color red) is sufficiently salient, and it does not matter how many cars are in the parking lot. The red car stands out in the display. In this situation, we process the colors of all of the cars at once, or in parallel, making the search efficient.
Why is conjunction search less efficient than feature search?
Conjunction search is less efficient than feature search because conjunction search requires searching for a target defined by the presence of two or more attributes (e.g., a blue, horizontal target among blue vertical and green horizontal distractors), as opposed to feature search, which only requires searching for a target defined by a single attribute.
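The distinction can be sketched by checking whether any single attribute sets the target apart from every distractor. The dictionary representation and the function name are hypothetical choices for this sketch.

```python
def search_type(target, distractors):
    """Return "feature" if one attribute alone separates the target from all distractors."""
    for attribute in target:
        if all(d[attribute] != target[attribute] for d in distractors):
            return "feature"
    return "conjunction"


# The example from the text: a blue horizontal target among
# blue vertical and green horizontal distractors.
target = {"color": "blue", "orientation": "horizontal"}
distractors = [
    {"color": "blue", "orientation": "vertical"},
    {"color": "green", "orientation": "horizontal"},
]
kind = search_type(target, distractors)

# A red target among white items differs on color alone.
red_among_white = search_type(
    {"color": "red", "orientation": "vertical"},
    [{"color": "white", "orientation": "vertical"}],
)
```

In the first case neither color nor orientation alone identifies the target, so both must be combined; in the second, color alone is enough.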
What is guided search?
Guided search is the idea that people find items they are looking for in the real world by restricting their attention to the subset of features that distinguish the target item from the rest of the items around it.
Describe Treisman’s feature integration theory.
This theory of visual attention states that a limited set of basic features can be processed in parallel preattentively, but that other properties, including the correct binding of features to objects, require attention.
What is the binding problem?
The binding problem is the challenge of tying different attributes of visual stimuli (e.g., color, orientation, motion), which are handled by different brain circuits, to the appropriate object so that we perceive a unified object (e.g., red, vertical, moving right).
How are illusory conjunctions a by-product of conjunction search?
An illusory conjunction is an erroneous combination of two features in a visual scene. For instance, an observer might report seeing a red X when the display contains red letters and Xs, but no red Xs. This error can occur during a recognition task that involves conjunction search, when the observer tries to report which objects were present in a display of items. The observer confuses attributes of one object with attributes of another.
What are the two stages of feature integration theory?
The two stages of feature integration theory are: 1) The preattentive stage, which refers to the processing of stimuli that occurs before selective attention is deployed to any particular stimulus. 2) The attentive stage, which refers to processing that requires the deployment of attention to a particular stimulus or location.
Describe one phenomenon where timing is critical to visual attention.
One such phenomenon is known as the “attentional blink.” In this case, there is a difficulty in perceiving and responding to the second of two target stimuli amid a rapid stream of distracting stimuli if the observer has responded to the first target stimulus within 200 to 500 ms before the second stimulus is presented.
What is attentional selection, and why is it important?
Attentional selection is the ability to attend to specific properties of a display, which may require switching attention from previous properties without moving the eyes. This ability is important because it allows one to focus on the relevant information in a display rather than “getting lost” in the entire display. During attentional selection, different aspects of the display appear more prominent as one shifts attention to the property selected.
What do fMRI studies involving the fusiform face area demonstrate about attention?
These studies show that attentional selection can be used to perform one type of specialized processing rather than another. One study showed that the fusiform face area is especially important in the processing of faces and that the parahippocampal place area is especially important in the processing of places. If observers view an image of a face superimposed over an image of a house, the face area becomes more active when the observer is attending to the face, and the place area becomes more active when the observer is attending to the house.