Lec 4/ TB Ch 4 Flashcards

1
Q
  • Idealism vs materialism
  • Reality
  • Ventral stream areas
    • Lateral occipital complex fx
      • Bilateral damage →?
    • Parahippocampal Place Area fx
    • FFA fx
    • Extrastriate Body Area (EBA) fx
    • Dorsal stream fx
      • location
      • Characteristics (2)
  • ventral stream fx
  • Superior temporal sulcus pathway fx
  • 3 things that happen b/w the streams
A
  • Idealism: Reality is inseparable from perception; reality is a mental construct
  • Materialism: The mind can be explained in terms of matter and physical phenomena.
  • In reality, it is both

Some ventral stream areas

  • Lateral occipital complex (LOC): object perception
    • Bilateral damage to LOC → can still see colors, but can’t perceive objects
  • Parahippocampal Place Area (PPA): Scenery, locations
  • Fusiform Face Area (FFA): Faces
    • Extrastriate face Area: activated by sides of faces
  • Extrastriate Body Area (EBA): Bodies
  • Dorsal stream: processes “where”/“how” information
    • Projects into the parietal cortex
    • Fast but colorblind
  • Ventral stream: processes “what” information
  • New stream:
  • Superior temporal sulcus pathway (red): for biological motion and social perception
    • (IOW: body language, non-verbal social signals)
  • These streams are broad generalizations
    • Within and between the streams there are cross-connections, feedback, and feed-forward connections
2
Q
  • object perception
  • object recognition
  • object identification
  • object naming
  • 3 things v1 recognizes
  • 3 challenges
A
  • Object perception: middle vision combines features into objects; the result is object perception (ex. “it’s an object”)
  • Object recognition: we match a perceived object representation to a representation encoded in memory (ex. “it’s a house”)
  • These memory traces can contain information about object categories or about that particular object; the latter is object identification.
    • Object identification: you recognize the same specific object from a different angle (ex. the same house)
  • Object naming: recognized objects have semantic labels and names
  • V1 recognizes lines, edges, and gratings of a specific orientation
  • Challenges:
    • Not all lines are straight; there are curved lines
    • Gaps/occlusion
    • Overlap
  • (Slide: left = reality; middle = what V1 detects → hence the need for Gestalt contour rules)
3
Q
  • Wundt/ Structuralist view
  • Gestalt rx
  • Rules for linking contours
    • #1 law of “good continuation”
      • fx
      • nature scene & this law
      • Kanizsa figures
        • aka
      • Murray et al. (2002) - Kanizsa figures vs inverted pac mans
        • Method
        • which brain areas is activated?
      • When do illusory contours arise?
      • When does early visual processing happen?
A

Rules for linking contours

  • Wundt/Structuralists: believed we can piece elementary sensations together to form objects
  • Scholars’ reaction to structuralism → Gestalt psychology
  • Main motto: “The whole is greater than the sum of its parts”
  • Gestalt laws (grouping rules): set of rules describing which elements in an image will appear to group together
  • #1 law of “good continuation”: two elements tend to group together if they seem to lie on the same smooth contour
    • Good continuation rule helps us fill in gaps in contours (A sudden stop in an edge)
  • Geisler et al. (2001): used natural scene statistics to explain the Gestalt law of good continuation.
    • IOW: in natural scenes (ex. a tree branch), edge elements that lie along a smooth contour usually belong to the same object in the real world, so we group them as one smooth contour
  • Illusory contours/ Kanizsa figures
    • We see several line ends / end-stoppings; we perceive an occluding contour (i.e. an arrow shape blocking the circles and lines)
  • Murray et al. (2002):
    • Showed Kanizsa figures and the same figures w/ the pac-men facing opposite directions (IOW: no illusory contours)
    • Recorded EEG from participants
    • At ~128 ms post stimulus onset, the lateral occipital complex is activated by the illusory contours
      • IOW: processing of illusory contours happens at late (higher-level) visual stages
      • (Early visual processing happens ~50 ms post stimulus onset)
4
Q
  • Texture segmentation
    • Zebra example
    • aka
    • 2 reasons why it is not stable
  • Gestalt law of “proximity”
  • Parallelism/Symmetry
A

Texture segmentation

  • Texture segmentation: carving (parsing) an image into regions of common texture properties
    • Ex. You see zebra stripes -> common texture -> one object
    • Ex. You see grass -> common texture -> another object
    • AKA Gestalt law of similarity: elements group together if they are similar
  • Issue: doesn’t work well all the time; if texture segmentation is programmed into a computer, it is unstable and varies with image quality (ex. regions with similar luminance → seen as one thing); see the toy sketch at the end of this card
  • Camouflage: attempt to trick texture segmentation
  • Gestalt law of “proximity”: two elements group together if they’re close together
    • Ex. you see rows
    • Ex: left side = red diamonds; right side = green diamonds
      • The squares have common color and shape -> difficult to see
      • Conjunction of features -> single-feature grouping laws don’t work

Parallelism/Symmetry

  • Somewhat weaker grouping principles—group parallel elements together (panels 2, 3) and symmetric elements together (panels 7, 8)
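
A tiny, hypothetical sketch of similarity-based texture segmentation in one dimension (the luminance values and thresholds are made up); it also illustrates the instability noted above, since changing the threshold changes the segmentation:

```python
# Toy 1-D "texture segmentation": split a row of luminance values into
# segments wherever adjacent values differ by more than a threshold.
# All numbers here are invented for illustration.

def segment(values, threshold):
    """Group adjacent values whose difference stays within the threshold."""
    segments = [[values[0]]]
    for prev, cur in zip(values, values[1:]):
        if abs(cur - prev) <= threshold:
            segments[-1].append(cur)
        else:
            segments.append([cur])
    return segments

row = [10, 12, 11, 13, 80, 82, 79, 81, 30, 31]   # stripe / bright patch / grass
print(segment(row, threshold=5))    # three segments (zebra-like grouping)
print(segment(row, threshold=70))   # everything merges into one "object": unstable
```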
5
Q
  • Gestalt law of common fate
  • Gestalt law of synchrony
  • Modern Gestalt law of proximity → 2 types
A

Dynamic Grouping Principles

  • Gestalt law of common fate: groups together elements that are moving in the same direction.
    • Ex. lamp
    • When the lamp outline is embedded in a bunch of other lines, it’s difficult to see
    • If the prof switches b/w the 2 images, we see the lamp outline moving back and forth together in the same direction
  • Gestalt law of synchrony: group elements changing at the same time together
    • The circled 4 dots change color at the same time -> grouped together

Modern Gestaltism

  • # 1: Gestalt law of proximity
  • Common region: Elements perceived to be part of a larger region group together (#2)
  • Connectedness: Elements that are connected to each other group together (#3)
6
Q
  • Gestalt laws → how do they work? (separated?)
  • Perceptual committee models
    • Middle vision
  • Selfridge’s pandemonium model / perception committee model
    • 5 steps
    • What do demons represent?
    • What do each level represent?
    • What do low level demons represent?
A

Pandemonium

  • All of these Gestalt laws work in parallel (parallel processing)
  • Perceptual committee models:
    • middle vision: “specialists” for certain features (feature values) vote on their opinions
    • Ex. Gestalt law of proximity competes with Gestalt law of common region
  • Pandemonium model / perception committee model (Selfridge, 1959) (toy sketch below)
    • Letter recognition
      • 1 Ex. person sees “A”
      • 2 The “–”, “/”, “\” feature demons are excited the most
      • 3 -> the “A” cognitive demon is most excited (X and H are less excited)
      • 4 -> the decision demon sees that “A” is the most excited
      • 5 -> we perceive “A”
        • “Demons” loosely represent (sets of) neurons; each level = a different brain area
      • Ex. low level = feature demons (ex. simple cells)
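
A minimal runnable sketch of the pandemonium idea, not Selfridge's actual implementation; the letters, features, and weights are made-up assumptions:

```python
# Toy sketch of Selfridge's (1959) pandemonium model (illustrative only).

# Feature demons: how strongly each feature is present in the input image.
input_features = {"horizontal": 1.0, "left_diag": 1.0, "right_diag": 1.0, "vertical": 0.0}

# Cognitive demons: each letter "shouts" in proportion to how well the
# input features match its preferred feature set.
letter_templates = {
    "A": {"horizontal": 1, "left_diag": 1, "right_diag": 1, "vertical": 0},
    "H": {"horizontal": 1, "left_diag": 0, "right_diag": 0, "vertical": 2},
    "X": {"horizontal": 0, "left_diag": 1, "right_diag": 1, "vertical": 0},
}

def cognitive_demon_excitement(template, features):
    """Sum of feature evidence weighted by the letter's feature template."""
    return sum(template[f] * features.get(f, 0.0) for f in template)

shouts = {letter: cognitive_demon_excitement(t, input_features)
          for letter, t in letter_templates.items()}

# Decision demon: pick the loudest cognitive demon.
decision = max(shouts, key=shouts.get)
print(shouts, "->", decision)   # "A" wins for this input
```

Each “demon” here is just a scoring step, but the structure mirrors the feature → cognitive → decision hierarchy described above.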
7
Q
  • When there is object ambiguity, what rules do perception committee follow? (3 rules)
    • Ex. dented/ bulging circles & physics/stats
    • Ex. smack table
  • How to Resolve ambiguity
    • for (Necker cube)?
    • What are accidental viewpoints
    • Why does our perceptual system Reject accidental viewpoints
A

Object Ambiguity

  • The perceptual committees follow rules based on the laws of physics, statistics, and biology
    • Ex. we perceive some circles as bulging and some as dented
    • This is b/c we know that light usually comes from above
    • Ex. biology
      • Smacking-the-table analogy (we don’t see the hand pass thru the table; our mind sees the hand move around the table)
  • Resolve ambiguity (Necker cube)
    • Ambiguity is not resolved in Necker’s cube
    • Solution: put a bar through the box -> solves ambiguity
  • Reject accidental viewpoints
    • The 3 objects cast the same cube-like shadow/image
    • Our perceptual system will only perceive it as a cube, not as one of the irregular crystals
    • This is b/c our perceptual system rejects accidental viewpoints: an irregular crystal would only project a cube-like image from one very specific (accidental) viewpoint
      • Also, cubes are more common than irregular crystals in our world
8
Q
  • Figure-ground assignment
  • Gestalt figure–ground assignment principles (4 principles)
  • New principle: Extremal edges
    • What is an extremal edge?
    • Ex. earth
    • Ex. bagel vs dark dot
  • How does Rubin figure show that Object recognition starts before figure–ground assignment finishes?
  • Heuristics definition
    • Relatability - define
      • Ex elbow vs S curve
    • 3 main heuristics that provide cues for depth/occlusion
A

Figure–Ground Segmentation and Occlusions

  • Figure-ground assignment: determines that some image regions belong to an object in the foreground while other regions are part of the background.
  • Gestalt figure–ground assignment principles: surroundedness, size, symmetry, parallelism
    • Surroundedness: green is surrounded by blue; green = object
    • Size: smaller regions = objects
    • Symmetry: symmetrical regions are seen as figure (ex. the green staff on the left side); objects with odd notches are unlikely
    • Parallelism: regions with parallel edges = object (ex. the green road in the centre)
  • Extremal edges (NEW principle): horizons of self-occlusion on smooth convex surfaces
    • When we look at planet Earth, we see North America; Asia is occluded by the planet itself
    • Extremal edge = the boundary beyond which we stop seeing the other side of the surface
    • This is a powerful figure-ground cue
    • Ex. grey shading tells you the light part is the bagel/object; the dark part is the background
    • Ex. when the shading is removed, it instead looks like the black circle is the object
  • Object recognition starts before figure–ground assignment finishes!
    • Rubin figure: is brown or white the figure here?
      • We can see both: the vase and the faces
      • IOW: either region can be seen as the foreground (figure)
  • Heuristics for partially occluded figures
  • Heuristics are mental shortcuts that work most of the time, but not always.
    • IOW: Gestalt laws are heuristics b/c they work most of the time
  • We complete edges behind an occluding block when the edges are relatable by an “elbow” curve (Kellman & Shipley, 1991)
  • It is unlikely that there is an S-curve behind the block -> unrelatable
  • Relatability: the degree to which two line segments appear to be part of the same contour.
  • heuristics that serve as cues for depth & occlusion:
    • Non-accidental features provide clues to object structure; they don’t depend on the exact viewing position.
    • T junctions = occlusion
    • Arrow/Y junctions = corners/edges of an object; not occlusion

Parts and Wholes: the Forest and the Trees

  • Global superiority effect (Navon, 1977): properties of the whole object take precedence over properties of its parts.
9
Q
  • Naïve Template theory
    • Lock and key representations
  • Structural Description Theory
    • 3D/ CAD model
  • Marr & Nishihara’s model: based on cylinders
    • 7 steps
  • Biederman (1987): Recognition-By-Components (RBC)
    • Geons
    • How do they work?
  • Cubism in art
A

High Level Vision – Object Recognition

Naïve Template theory

  • Design a machine that recognizes “A”
  • Design 1: “Lock-and-key” representations (toy sketch below)
    • The “A” perfectly fits the template in the visual field
    • Problem: you would need too many templates! (ex. cursive vs block; capital vs lower case; 3D objects at different angles)
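
A toy sketch of lock-and-key template matching, under the made-up assumption that letters are 3x3 binary grids; it shows why a tiny shift already demands another template:

```python
# Minimal sketch of naive "lock-and-key" template matching.

A_TEMPLATE = (
    (0, 1, 0),
    (1, 1, 1),
    (1, 0, 1),
)

def matches_template(image, template):
    """Lock-and-key: every pixel must line up exactly with the template."""
    return all(image[r][c] == template[r][c]
               for r in range(len(template))
               for c in range(len(template[0])))

perfect_A = (
    (0, 1, 0),
    (1, 1, 1),
    (1, 0, 1),
)
shifted_A = (          # same letter, shifted one column to the right
    (0, 0, 1),
    (0, 1, 1),
    (0, 1, 0),
)

print(matches_template(perfect_A, A_TEMPLATE))  # True
print(matches_template(shifted_A, A_TEMPLATE))  # False -> would need yet another template
```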

Structural Description Theory

  • Solution 1: CAD-like 3D (“view-invariant”) models (aka 3D model)
    • Represent the structure of objects, not single views
  • Marr & Nishihara (1978): based on cylinders
    • 1 See an image of a man
    • 2 Early processing: figure–ground segregation; edge extraction
    • 3 Part segmentation: arms, body, head, legs
    • 4 Axis estimation: an axis for each part
    • 5 Volumetric modelling: fit a cylinder to each axis
    • 6 3D object-centered structural description
    • 7 Compare existing 3D models in your memory (ex. man, dog, duck) to the 3D model of what you currently see in the real world
  • Biederman (1987): Recognition-By-Components (RBC) (sketch at the end of this card)
    • Geons: generalized cylinders whose cross section can vary over the length of the axis, which itself might not be straight
      • IOW: instead of only cylinders, we can have blocks, curves, cones, etc.
      • Ex. geon 2 + geon 5 = suitcase
  • Did Picasso inspire structural description theory & RBC?
    • Cubism: Picasso, Braque and others
    • Attempt to depict the visual world with basic shapes
    • IOW: these artists inspired the structural description theory and RBC
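
A rough sketch of the RBC idea, assuming a structural description is simply a set of (geon, relation) pairs; the geon labels and the little object library are invented for illustration:

```python
# Sketch of Biederman-style Recognition-By-Components: objects are described
# as sets of geons plus spatial relations, and recognition matches the
# perceived description to stored ones. (All labels here are assumptions.)

OBJECT_LIBRARY = {
    "suitcase": frozenset({("brick", "body"), ("curved_cylinder", "handle_on_top")}),
    "mug":      frozenset({("cylinder", "body"), ("curved_cylinder", "handle_on_side")}),
    "pail":     frozenset({("truncated_cone", "body"), ("curved_cylinder", "handle_on_top")}),
}

def recognize(perceived_parts):
    """Return the stored object whose geon description overlaps most with the percept."""
    return max(OBJECT_LIBRARY,
               key=lambda name: len(OBJECT_LIBRARY[name] & perceived_parts))

percept = frozenset({("brick", "body"), ("curved_cylinder", "handle_on_top")})
print(recognize(percept))  # 'suitcase'
```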
10
Q
  • Issues w/ Geons/cylinders
    • 3 main reasons
    • Major argument: viewpoint invariance argument
      • View independence & Geons
      • Reality?
  • Viewpoint dependence evidence
    • View point dependence for faces
  • view-dependent models
    • Model 1: Tarr, Buelthoff et al → interpolating b/w views
    • Buelthoff’s student - computer program & interpolating side profile face
      • Example: when we see a new car
        • 3 stages
      • Evidence: Tsunoda et al 2001 - monkeys and the fire extinguisher
        • methods: 5 steps
        • Results
        • Location of activation
A

The Effect of Viewpoint

  • Issues w/ Geons/cylinders
    • 1 They are difficult to extract from real images
    • It is difficult to create geons for
      • 2 subordinate level recognition (ex. peter vs henry individuals)
      • 3 Natural objects that have complex structures (ex. a person tying shoe laces)
  • Most importantly: the total viewpoint-invariance argument
    • We can see the same object from different angles
    • RBC/geons predict view independence
    • View independence: we recognize an object equally well from different viewing positions
    • That is not true; in reality, we recognize objects better when viewed from a familiar angle

View point dependence for faces

  • Ex. we can’t spot the problem w/ the pics as easily when they are upside down

View dependent models

  • Solution 2: view-dependent models (sketch below)
    • Tarr, Buelthoff and others assume that we store a small set of different views of the same object.
    • When we view an object from an unfamiliar angle, we mentally rotate it (interpolate between the stored views). As such, we can limit the number of views that have to be stored
    • Interpolation requires cues that are fairly robust to the vantage point (non-accidental features)
      • Buelthoff’s student
    • 1 Take a pic of a person from the side
    • 2 A computer program has to output what the person looks like head-on
      • The program interpolates between views and uses heuristics
  • E.g., the object can be pieced together from individual parts
    • 1 The brain stores a library of learnt object parts
    • 2 You see a novel image of the car; you check which learnt object parts resemble parts of the novel image (red Xs)
    • 3 You put the parts together to form the object
  • Tsunoda et al. (2001)
    • This was shown in monkeys
    • 1 Show the monkey different pictures (ex. a fire extinguisher)
    • 2 Detect brain activity using optical imaging (≈ microscopic fMRI)
    • 3 The red areas are active
    • 4 Cut the image of the fire extinguisher into parts: the hose, tube, and bottle
    • 5 Show these parts to the monkey; only the corresponding areas light up (ex. a hose-only area)
    • Results: optical imaging shows that different parts of the object activate different feature columns of neurons in IT (inferotemporal cortex)
      • These are the basic building blocks of recognition.
    • Perception/recognition works via distributed processing across many areas
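
A toy sketch in the spirit of view-dependent models: objects are stored as a few learned views (here, made-up 3-number feature vectors) and a novel view is recognized by its nearest stored view:

```python
# View-dependent recognition sketch (all numbers are invented assumptions).
import math

STORED_VIEWS = {
    "car":   [(1.0, 0.2, 0.0),   # e.g., front view
              (0.1, 0.9, 0.3)],  # e.g., side view
    "chair": [(0.0, 0.1, 1.0),
              (0.2, 0.3, 0.8)],
}

def recognize(novel_view):
    """Label the novel view with the object whose stored view is closest."""
    best_label, best_d = None, float("inf")
    for label, views in STORED_VIEWS.items():
        for v in views:
            d = math.dist(novel_view, v)
            if d < best_d:
                best_label, best_d = label, d
    return best_label, best_d

# A view "in between" the car's front and side views (interpolation-like case):
print(recognize((0.55, 0.55, 0.15)))  # -> ('car', ...)
```

A fuller model would interpolate between the nearest stored views rather than simply picking one, which is the point of the Tarr/Buelthoff proposal summarized above.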
11
Q
  • Multiple recognition committees - definition
    • Ex. Sparrow vs ostrich → how we label it
    • hierarchies of categories; 3 levels
    • Which category/level are most objects named at?
    • What are the 2 exceptions?
  • How does “Multiple recognition committees” model challenge geons model? (ex fox sparrow vs robin)
A

Multiple recognition committees

  • Perhaps we use more than one object recognition strategy
    • We see a sparrow -> we say “it’s a bird”
    • We see an ostrich -> we say “it’s an ostrich” (b/c it is an uncommon type of bird)
  • Object recognition often associates a percept with a category of objects.
  • Categories are discrete and hierarchically organized (tiny sketch at the end of this card).
    • 3 levels
      • Superordinate level: animal
      • Entry level: bird, dog
      • Subordinate level: fox sparrow, robin, ostrich
  • Objects are usually named at entry level.
  • Exceptions
    • When we see an atypical category member -> we name it by the subordinate level
    • Experts tend to name things by subordinate level
  • This shows that object recognition relies on different acts of recognition
    • E.g., recognition at the subordinate level might be difficult to perform with Biederman’s geons/RBC
      • A fox sparrow and a robin will have very similar geons
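
A tiny sketch of the entry-level naming rule with its two exceptions (atypical members and experts); the hierarchy and the "atypical" flags are made-up assumptions:

```python
# Hierarchical category naming: entry level by default, subordinate level
# for atypical members or for experts. (Illustrative data only.)

HIERARCHY = {
    # subordinate: (entry_level, superordinate, atypical?)
    "fox sparrow": ("bird", "animal", False),
    "robin":       ("bird", "animal", False),
    "ostrich":     ("bird", "animal", True),   # atypical bird
    "beagle":      ("dog",  "animal", False),
}

def name_object(subordinate, expert=False):
    entry, _, atypical = HIERARCHY[subordinate]
    # Experts and atypical category members get named at the subordinate level.
    return subordinate if (expert or atypical) else entry

print(name_object("robin"))               # 'bird'    (entry level)
print(name_object("ostrich"))             # 'ostrich' (atypical -> subordinate)
print(name_object("robin", expert=True))  # 'robin'   (expert -> subordinate)
```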
12
Q
  • 5 reasons why faces are special
  • Quiroga et al., 2005: Jennifer Aniston neuron
    • Method
    • Results
    • Counterargument from other rs
  • Gauthier et al. (1999) - FFA, greebles, and bird experts
    • Method: 4 steps
    • Conclusion on FFA
  • V1 may be involved in face detection - evidence
    • Proof of feasibility: Viola-Jones filters
      • Viola-Jones filters fx
      • 2 types of Viola-Jones filters
      • Method (1 step)
        • Results
      • How are humans similar
      • How do V1 cells detect faces?
A

Faces are special… are they?

  • “Special” processes may be involved in identifying individual faces
  • Reason 1: evolutionary/learning arguments:
    • Evo: Friend vs foe
    • Learning: learn about new faces
  • Reason 2: cognitive argument
    • Object identification b/w Friend vs foe
  • Reason 3: Patients with prosopagnosia.
  • Reason 4: Special face area
  • Reason 5: Jennifer Aniston cell
  • Quiroga et al. (2005): Jennifer Aniston neuron
  • Recorded from single neurons in human (epilepsy) patients and showed them different pics; one neuron responded most strongly to Jennifer Aniston
  • IOW: there are specific cells for specific faces
  • Other researchers argue that such cells are responsive to different face patches, not different individual faces
  • Counterargument
  • Gauthier et al. (1999):
    • 1 When ppl see a face, the FFA lights up
    • 2 When ppl see birds and cars, the FFA doesn’t light up
    • 3 Ppl study “greebles” and their families
    • 4 After being trained to recognize greebles, subjects showed increased activity in the right FFA
    • Conclusion: the FFA is an area for visual expertise
      • Ex. bird and greeble experts -> FFA lights up when they look at birds/greebles
  • V1 may be involved in face detection (is there a face: yes or no)
    • Proof of feasibility: Viola-Jones filters (toy sketch below)
      • These filters are used in our phones
      • The device draws frames around what it thinks are faces
        • Viola-Jones filter 1: horizontal bar – a dark bar (eyes) surrounded by 2 light bars (forehead, cheeks)
        • Viola-Jones filter 2: vertical bar – a light bar (nose) surrounded by 2 dark bars
      • 1 Viola-Jones applied the filters to a film
        • The algorithm identified the hits
        • It also had some false alarms
        • People make the same mistakes as the Viola-Jones filters
        • In area V1, cells are sensitive to horizontal and vertical bars
      • Early EEG signals suggest early V1 areas can be face sensitive (face detection)
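
A toy version of the horizontal "eye band" filter, assuming a 6x6 grayscale patch and an arbitrary threshold; the real Viola-Jones detector combines many such Haar-like filters in a boosted cascade:

```python
# Haar-like "eye band" filter: a dark horizontal bar (eyes) between two
# lighter bars (forehead and cheeks). The patch and threshold are made up.

face_patch = [          # brighter = larger value
    [200, 200, 200, 200, 200, 200],   # forehead (light)
    [200, 200, 200, 200, 200, 200],
    [ 40,  30,  60,  60,  30,  40],   # eye row (dark)
    [ 50,  40,  70,  70,  40,  50],
    [180, 190, 200, 200, 190, 180],   # cheeks/nose (light)
    [180, 190, 200, 200, 190, 180],
]

def row_mean(img, rows):
    vals = [v for r in rows for v in img[r]]
    return sum(vals) / len(vals)

def eye_band_response(img):
    """Light-above plus light-below minus dark-middle; large = 'face-like'."""
    top    = row_mean(img, [0, 1])
    middle = row_mean(img, [2, 3])
    bottom = row_mean(img, [4, 5])
    return (top + bottom) / 2 - middle

THRESHOLD = 80  # arbitrary assumption
resp = eye_band_response(face_patch)
print(resp, "face-like" if resp > THRESHOLD else "not face-like")
```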
13
Q
  • Bottom up process: what happens?
  • 4 steps
  • As we go down the “what” stream, what happens to the size of receptive fields?
  • Top down processes evidence
    • rotating 2D mask
    • Object parts, context, and perceiving objects relationship
    • Explain B and C blob images
  • Cox et al. (2004): degraded faces activate FFA when presented in the right context.
    • Results for (FFA activity)
      • blurred face → FFA?
      • non-blurred face → FFA?
      • blurred face + context → FFA?
      • What makes FFA active when we see a blob?
A

The bottom-up and top-down of perception

  • Increasingly complex stimuli drive neurons in different parts of the ‘What’ system.
    • Retina photoreceptors: responsive to dots of light
    • V1 simple cells: responsive to lines
    • V4: responsive to concentric gratings
    • Area TE in inferotemporal cortex: responsive to faces
  • This suggests a bottom-up process
  • Also, as we move along the “what” stream, receptive fields become larger

Top-down processes

  • 1 Knowledge that noses are convex overrides the stimulus -> rotating (hollow) mask -> you see the nose as always protruding
  • 2 Object parts => whole objects <= context.
    • Object parts help you perceive the object; the context also helps you perceive the object
  • When you look at B -> it’s a face
  • When you look at C -> WTF?
  • When you give context for C -> you recognize the same blob as a face
  • Cox et al. (2004): degraded faces activate the FFA when presented in the right context.
    • #1 A blurred face alone does not activate the FFA
    • #5 A non-blurred face -> activates the FFA
    • #4 A blurred face + context -> activates the FFA, even more strongly than the non-blurred face (#5)
    • Top-down face perception can affect FFA activation even though we are only seeing a blob
      • Top-down examples: the Dalmatian, the mouse/face figure, the convex/concave face mask, and context effects all show top-down influences on perception
14
Q
  • What is
    • p(S|I)
    • p(I|S)
    • p(S)
    • p(I)
    • p(I|S) × p(S)
    • Why is p(I) not so important?
    • This suggests that perception is governed by???
A
  • Bayesian inference
    • S = stimulus, reality
    • I = image, what is on your retina
    • p(S|I): the probability of a given stimulus, given the image on the retina
      • This is unknown
      • Ex. center pic
    • p(I|S) is known: the probability of an image, given a stimulus
      • It can be learnt thru experience
      • Ex. we know from experience that a soccer ball never casts a shadow with straight edges
    • p(S) is known
      • A-priori probability of stimuli (our prior knowledge of what exists, e.g., that there are dogs)
      • IOW: how likely certain stimuli are in our world
      • (ex. seeing dogs vs aliens)
  • Bayes
    • Our image is on the retina
      • What is p(S|I)?
        • We cannot directly measure it
    • Fortunately, we know (implicitly):
      • p(I|S): a library of known objects that could cast this image
      • p(S): all the objects you have seen in your life, and how probable these stimuli are
    • We can combine p(I|S) and p(S)
      • All possible things that could create the image -> each option gets a probability
      • Option #3 has the highest probability -> so this is most likely what I am seeing right now (99% I’m right); this is your perception
  • Summary
  • p(S|I): posterior probability
    • i.e., given what we see, what is really out there in the world? Can’t be known directly.
  • p(I|S): likelihood; given a certain stimulus, how likely is this image
    • Ex. given that there is a soccer ball, how likely is it that its projection looks like a Necker cube? (VERY unlikely), etc.
  • p(S): a-priori probability of the things in the world
    • IOW: what we know exists, and how likely it is.
  • p(I): a-priori probability of the stimulation itself
    • e.g., how likely it is that light falls on the fovea vs the periphery
    • Since we move our eyes constantly, this probability is roughly uniform
    • So it is usually not so important…
  • Simplification: p(S|I) ∝ p(I|S) p(S)
  • Thus, the idea that perception can be described as Bayesian inference means that perception is governed by learned expectations.
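
A minimal numeric example of p(S|I) ∝ p(I|S) p(S), with made-up priors and likelihoods for three candidate interpretations of a cube-like retinal image:

```python
# Perception as Bayesian inference: posterior ∝ likelihood × prior.
# The candidate "stimuli" and all numbers are invented assumptions.

# Prior p(S): how common each interpretation is in our world.
prior = {"cube": 0.80, "irregular_crystal": 0.15, "flat_drawing": 0.05}

# Likelihood p(I|S): how likely each stimulus is to cast the observed image.
likelihood = {"cube": 0.60, "irregular_crystal": 0.02, "flat_drawing": 0.50}

# Unnormalized posterior, then normalize (p(I) is just the normalizer).
unnorm = {s: likelihood[s] * prior[s] for s in prior}
total = sum(unnorm.values())
posterior = {s: v / total for s, v in unnorm.items()}

print(posterior)                          # cube dominates
print(max(posterior, key=posterior.get))  # -> 'cube' (what we perceive)
```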
15
Q
  • Teufel et al. (2015)
    • Studied Early-stage psychosis patients vs. healthy control ppl
    • Method - used half tone images ( 3 steps)
    • Results
  • Controlled hallucination: an interaction b/w 3 things
  • Psychosis = ??
  • generative models
    • Model of the world
    • Sensations
    • When are we surprised?
A
  • Thus, the idea that perception can be described as Bayesian inference means that perception is governed by learned expectations.
  • These expectations might be more powerful in psychotic individuals (psychosis)
  • Teufel et al. (2015)
    • Studied early-stage psychosis patients vs. healthy control ppl
    • 1 Before: “Which half-tone image shows a person?”
      • Half-tone image: only black or white; very difficult to see what is in the pic
    • 2 Presentation of the corresponding colour images
    • 3 After: the first test is repeated (w/ the half-tone images)
    • Results: we do better in the “after” task due to perceptual priming
    • Patients improved more than controls!
      • “[Perception can be considered a form of] controlled hallucination [that depends on the…] interaction between top-down, brain-based predictions and bottom-up sensory data.”
    • What we see is an interaction b/w top-down predictions, bottom-up sensory data, and our interpretation; it is a logical, controlled hallucination
  • “[This is in contrast to a] hallucination as a kind of false perception.”
    • Those w/ psychosis experience hallucinations that are a false sense of reality
  • In sum… perception generates models/representations of the world to understand how the world creates our sensations.
  • These are called generative models; this integrates idealism and materialism (sketch below)
  • Model of the world: an idea of how the world works (idealism)
    • p(S): a representation of the objects that exist in the world
    • p(I|S): models how objects create images on the retina
  • Sensations: the senses are checked against your expectations (materialism)
    • When sensations violate our expectations (ex. there are aliens) -> surprise
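
A rough sketch of the generative-model idea with invented pieces (a tiny prior over stimuli, a deterministic "renderer", and surprise measured as the fraction of mismatched predictions):

```python
# Generative-model sketch: the brain keeps a model of the world (prior p(S)
# plus a renderer standing in for p(I|S)), predicts the incoming sensation,
# and registers surprise when predictions mismatch the actual input.
import random

random.seed(0)

WORLD_MODEL = {"dog": 0.70, "cat": 0.29, "alien": 0.01}       # prior p(S)
RENDERER = {"dog": "four_legs_fur", "cat": "four_legs_fur",   # stand-in for p(I|S)
            "alien": "glowing_tentacles"}

def predict_sensation():
    """Sample an expected stimulus from the prior and render its image."""
    stimuli, probs = zip(*WORLD_MODEL.items())
    expected = random.choices(stimuli, weights=probs, k=1)[0]
    return RENDERER[expected]

def surprise(actual_sensation, n_samples=100):
    """Fraction of predictions that fail to match what the senses report."""
    misses = sum(predict_sensation() != actual_sensation for _ in range(n_samples))
    return misses / n_samples

print(surprise("four_legs_fur"))      # low surprise: matches expectations
print(surprise("glowing_tentacles"))  # high surprise: "aliens" are improbable
```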