Unstructured Categorisation Flashcards
similarity and theories of mental representation
spatial models
feature models
structured models
- analogy
spatial models of similarity
claim: we represent items in a mental space; similarity is an inverse function of distance
multi-dimensional scaling (MDS) of animals: take pairwise similarity ratings for all items, then construct a space whose distances respect those similarity relations
reaction time to confirm that goose and eagle are in the same category (bird) correlates with each word's distance from "bird" in the space
latent-semantic analysis
giant matrix: columns are entries in encyclopedia, rows are every word that appears in encyclopedia
every cell in matrix gets 0/1 depending on whether that word appears in the entry or not
every word is represented by a 10,000-place vector of 0s and 1s
similarity of two words is, conceptually, the correlation of their two 0/1 vectors
does well on ESL synonym tests; this kind of representation underlies much computational language processing
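The word-by-entry matrix idea can be sketched as follows. This is a toy illustration, not real LSA: the entries and words are invented, and real latent-semantic analysis also applies dimensionality reduction, which is omitted here.

```python
# Toy LSA-style word vectors: columns = encyclopedia entries, rows = words;
# a cell is 1 if the word appears in that entry (entries/words are invented).
occurrence = {
    "feather": [1, 0, 0, 0],
    "wing":    [1, 0, 0, 0],
    "fur":     [0, 1, 0, 0],
    "sail":    [0, 0, 1, 0],
}

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def word_similarity(w1, w2):
    """Similarity of two words = correlation of their 0/1 entry vectors."""
    return pearson(occurrence[w1], occurrence[w2])

print(word_similarity("feather", "wing"))  # appear in the same entries -> 1.0
print(word_similarity("feather", "sail"))  # appear in different entries -> negative
```

Words that occur in the same entries get highly correlated vectors, which is why this picks out synonyms without any explicit definition of meaning.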
violation of spatial models (Tversky)
- in space, distance from A to B = distance from B to A, but similarity judgments are asymmetrical: e.g., "Canada is like the USA" is endorsed more readily than "the USA is like Canada"
- metric spaces obey the triangle inequality: the distance between A & C cannot be greater than the sum of the distances between A & B and B & C
- similarity violates this axiom: the USSR and Jamaica are more dissimilar than would be expected from comparing the USSR to Cuba, and Cuba to Jamaica
- similarity and difference should be metrical inverses, but East and West Germany are rated both more similar to each other and more different from each other than Sri Lanka & Nepal are
feature models of similarity
Similarity of A and B is the sum of features common to A and B, minus the features A has that B does not, minus the features that B has that A does not
Features, as in: just list features of the things being compared, e.g., the USSR and Cuba
Each set can be weighted as more or less important according to context (weights θ, α, β)
e.g., similarity judgments highlight common features, while difference judgments highlight distinctive features
– East and West Germany would have both more common features and more distinctive features listed than Sri Lanka & Nepal
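The contrast model above can be sketched with feature sets. This is a minimal illustration: the feature lists and weight values are invented, and all features are weighted equally within each set.

```python
# Minimal sketch of Tversky's contrast model:
# sim(A, B) = theta*|A ∩ B| - alpha*|A - B| - beta*|B - A|
# (feature sets and weights below are illustrative, not Tversky's stimuli)
def tversky_similarity(a, b, theta=1.0, alpha=1.0, beta=1.0):
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)

usa    = {"north_american", "large", "english_speaking", "superpower"}
canada = {"north_american", "large", "english_speaking"}

# Weighting the two distinctive-feature sets unequally makes similarity
# directional, capturing the "Canada is like the USA" asymmetry:
print(tversky_similarity(canada, usa, alpha=1.0, beta=0.5))  # 2.5
print(tversky_similarity(usa, canada, alpha=1.0, beta=0.5))  # 2.0
```

With equal weights (α = β) the asymmetry disappears, which is exactly the contextual weighting the model relies on.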
categories vs. concepts
category: sets of things in the world that we represent as alike in some way, or treat as equivalent for some purpose.
concept: the representation of a category
how are categories represented in spatial and feature-based approaches
assume categories are represented by unstructured collections of features, describing the properties of individual objects
– But these models also don't draw a meaningful distinction between feature-based and spatial representations
why "unstructured": a category is just a big bag of features, e.g., dog: four-legged, furry, barks
but there is no account of why four-legged, furry, and barks cohere (go together)
Many variants on this theme:
- classical rule-based view
- prototype models
- exemplar models
- cluster models
- category boundaries
classical rule-based view
categories are represented by a set of defining necessary and sufficient features distinguishing members from non-members
e.g. bachelor: unmarried man
people learn categories by holding candidate rules in mind and testing each rule's ability to predict membership
implies binary membership: an item is either in the category or not
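The classical view reduces to a membership test over necessary-and-sufficient features; a minimal sketch (the bachelor rule and the person records are illustrative):

```python
# Classical rule-based view: a category is defined by necessary and
# sufficient features, so membership is strictly binary.
def is_bachelor(person):
    """bachelor = unmarried AND man; nothing else matters on this view."""
    return (not person["married"]) and person["sex"] == "male"

print(is_bachelor({"married": False, "sex": "male"}))   # True
print(is_bachelor({"married": True,  "sex": "male"}))   # False
```

Note there is no notion of "more or less" a bachelor here, which is exactly what the criticisms below target.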
criticism of classical view
- There are often not necessary and sufficient conditions
– e.g. for bachelor: the pope, a widower, a man in a monogamous long-term but not legally binding relationship
- Wittgenstein's example of "game": no defining rules; categories have a family-resemblance structure in which examples share some features with other examples, but no single feature is common to every example
prototype theory (Rosch & Mervis)
Rosch: prototypes are the collection of the average (mean or mode) features across examples (central members of the category)
– Graded category membership: how similar is any given example to the prototype? e.g., robin vs. penguin for bird
– Classification is not just testing rules, but seeing how similar a new exemplar is to the prototype
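The prototype account above can be sketched as: average the category's exemplars, then classify a new item by the nearest prototype. A minimal sketch; the two-dimensional stimuli and category names are invented.

```python
# Prototype classification sketch: prototype = mean of exemplars on each
# dimension; a new item goes to the category with the nearest prototype.
import math

def prototype(exemplars):
    """Mean value on each dimension across a category's exemplars."""
    dims = len(exemplars[0])
    return [sum(e[d] for e in exemplars) / len(exemplars) for d in range(dims)]

def classify(item, categories):
    """Return the category whose prototype is closest to the item."""
    protos = {name: prototype(exs) for name, exs in categories.items()}
    return min(protos, key=lambda name: math.dist(item, protos[name]))

categories = {
    "A": [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)],
    "B": [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)],
}
print(classify((2.0, 2.0), categories))  # "A": nearer to A's prototype
```

Graded membership falls out naturally: distance to the prototype can serve as a typicality measure (robin close, penguin far).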
experiment with natural categories for prototype theory
Subjects were given examples of categories and rated their typicality
– “how typical is the exemplar of the category?” or “how representative…”
– e.g., robin, eagle for bird; gun, sword, axe for weapon etc.
Other subjects listed properties of exemplars of categories and contrast categories
– e.g., fruits (kiwi, orange) vs. vegetables
The more features any given exemplar had in common with other exemplars, and the fewer with exemplars of contrast categories, the higher the typicality rating
• Typicality is a function of overall “cue validity”
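Cue validity can be made concrete as the conditional probability of a category given a feature; a minimal sketch with an invented set of feature-listed exemplars:

```python
# Cue validity sketch: P(category | feature), estimated from a tiny
# invented set of exemplars and their listed features.
exemplars = [
    ("bird", {"wings", "feathers", "flies"}),
    ("bird", {"wings", "feathers"}),          # a flightless bird
    ("bat",  {"wings", "fur", "flies"}),
]

def cue_validity(feature, category):
    """Of the exemplars that have the feature, what fraction are in the category?"""
    having = [cat for cat, feats in exemplars if feature in feats]
    return having.count(category) / len(having)

print(cue_validity("feathers", "bird"))  # 1.0: a perfectly valid cue
print(cue_validity("wings", "bird"))     # 2/3: shared with bats, less valid
```

Typical category members carry many high-validity cues; members of contrast categories dilute a cue's validity, matching the rating results above.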
experiment with artificial categories for prototype theory
Subjects learned to classify letter strings as members of two categories
– 6 letter strings per category
– e.g., HDFTG, GYHJL
Exemplars of categories had variable number of letters in common with other members and with members of the other category
More features in common with other members and fewer in common with non-members predicted fewer trials to learn, faster classification, and higher typicality ratings after learning
– That is, higher cue validity → better learning, etc.
people could learn categories with family-resemblance structures: members overlap in some features, but no single feature is defining
Posner & Keele: abstraction of prototype
Categorise dot-patterns distorted from prototype
During learning, subjects never see actual prototype
After learning, just as fast/accurate or faster and more accurate to classify prototype than many seen exemplars.
we abstract the underlying commonality even if we never saw it, and classify based on how close an item is to that abstraction
exemplar theory
Agrees with prototype theory in main advances beyond the classical view
– graded membership, classification about similarity not rules
Challenges that abstractions are ever made. Categories are represented as the collections of encoded exemplars.
– or partially encoded exemplars, based on attention
i.e. every time you classify something, it's based not on similarity to a prototype but on similarity to all previously stored exemplars
Nosofsky & Shin: re-explain unseen-prototype advantage
Posner & Keele (1968) unseen-prototype advantage re-explained as summed similarity to all exemplars
– And can explain that experienced distortions classified more easily than equally similar to the prototype non-experienced distortions (Shin & Nosofsky, 1992)
e.g., items 1 and 2 are the same distance from the prototype, but people are better at classifying the one that is closer to previously classified exemplars
Assuming equal similarity to prototype: experienced exemplars classified easier than novel; and novel near other experienced exemplars easier than far away
Novel atypical example: does having seen an emu help you classify an ostrich?
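The summed-similarity idea above can be sketched in the style of Nosofsky's generalized context model: similarity to each stored exemplar decays exponentially with distance, and an item goes to the category with the larger sum. Stimuli, the distance metric, and the sensitivity value c are invented.

```python
# Exemplar (GCM-style) classification sketch: no prototype is stored;
# a probe is compared to every stored exemplar, sim = exp(-c * distance),
# and the category with the larger summed similarity wins.
import math

def summed_similarity(item, exemplars, c=1.0):
    return sum(math.exp(-c * math.dist(item, e)) for e in exemplars)

def classify(item, categories, c=1.0):
    sums = {name: summed_similarity(item, exs, c)
            for name, exs in categories.items()}
    return max(sums, key=sums.get)

categories = {
    "A": [(1.0, 1.0), (1.5, 1.5), (2.0, 1.0)],
    "B": [(7.0, 7.0), (7.5, 7.5), (8.0, 7.0)],
}
print(classify((2.0, 2.0), categories))  # "A": close to A's stored exemplars
```

Because similarity falls off steeply with distance, a novel item near experienced exemplars accumulates more summed similarity than an equally prototype-distant item far from them, which is the re-explanation of the unseen-prototype advantage.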
Knowlton & Squire (1993) study: categorization vs recognition
learning mechanism of category learning fundamentally different neural system than memory
control vs amnesia, classification and recognition tasks
amnesiacs: can tell whether an item is similar to the learned category, but do much worse at answering "is this the same one you've seen before?"
bad at remembering exemplars, but good at extracting the prototype
Nosofsky: what if all you have is exemplars in both categorization and recognition, just used differently?
i.e., how similar to past experiences does an item need to be?
amnesia leads to reduced sensitivity to similarity, which hurts old/new recognition more than categorization
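This single-system account can be illustrated with the same exponential-decay similarity: model amnesia as a lower sensitivity parameter c and compare how well an old item stands out from a similar new one. The studied items, probes, and c values are invented for illustration.

```python
# Sketch of the reduced-sensitivity account: recognition = summed
# similarity to the studied exemplars; amnesia = low sensitivity c.
import math

studied = [(1.0, 1.0), (1.2, 1.0), (0.9, 1.1)]

def summed_similarity(item, exemplars, c):
    return sum(math.exp(-c * math.dist(item, e)) for e in exemplars)

old_probe, new_probe = (1.0, 1.0), (1.5, 1.4)

for c in (5.0, 0.5):  # high sensitivity (control) vs low (amnesic)
    ratio = (summed_similarity(old_probe, studied, c)
             / summed_similarity(new_probe, studied, c))
    print(f"c={c}: old/new familiarity ratio = {ratio:.2f}")
# High c: old items stand out sharply (good old/new recognition).
# Low c: the ratio shrinks toward 1, so old/new discrimination degrades,
# yet anything near the studied set still registers as category-typical.
```

So one exemplar store plus a damped sensitivity parameter reproduces intact categorization alongside impaired recognition, with no separate abstraction system.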
summary to exemplar theory
People generalize to things that are superficially quite similar to what they have previously experienced
Doctors’ diagnoses of skin disorders are facilitated when they are similar to previously presented cases, even when the similarity is based on attributes that are known to be irrelevant for the diagnosis (Brooks, Norman, & Allen, 1991)
Why would a system be designed in such a way? Why not just store what is useful?
But how would one know what will be useful later? Storing as much as possible allows a greater variety of information to be used if it turns out to be important.
cluster models
Can make abstractions, can store exemplars
– Based on task and feedback
For example:
As exemplars are encoded, system predicts their category membership
Forms prototype-esque summary representations (clusters) of highly similar exemplars that all lead to the same accurate classification
– e.g., keeps on seeing bears while learning about mammals → a bear cluster
– A series of small metal spoons..
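The recruitment idea can be sketched as a heavily simplified cluster model in the spirit of SUSTAIN (Love et al., 2004): an incoming exemplar folds into its nearest cluster when that cluster already predicts its label, and a prediction error recruits a new cluster. Data, dimensions, and the merge rule are invented simplifications.

```python
# Minimal cluster-model sketch: clusters are running means of exemplars;
# a label mismatch (prediction error) recruits a new cluster.
import math

clusters = []  # each cluster: {"center": [x, y], "label": str, "n": int}

def nearest(item):
    return min(clusters, key=lambda cl: math.dist(item, cl["center"]),
               default=None)

def encode(item, label):
    cl = nearest(item)
    if cl is not None and cl["label"] == label:
        # correct prediction: fold the exemplar into the cluster's mean
        cl["n"] += 1
        cl["center"] = [c + (x - c) / cl["n"]
                        for c, x in zip(cl["center"], item)]
    else:
        # prediction error: recruit a new cluster for this exemplar
        clusters.append({"center": list(item), "label": label, "n": 1})

for item, label in [((1.0, 1.0), "bear"), ((1.2, 0.9), "bear"),
                    ((8.0, 8.0), "whale"), ((7.8, 8.1), "whale")]:
    encode(item, label)

print(len(clusters))  # 2: one bear cluster, one whale cluster
```

Seeing only bears yields a single bear-shaped cluster (a local prototype); a surprising whale forces a new cluster, so the model spans the prototype-to-exemplar continuum depending on the data.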
Category boundaries
As opposed to being concerned with the summary representation of the middle of the category, some theories focus on the importance of the borders between categories
– e.g., Ashby & Maddox (2005)
– Not mutually exclusive, e.g., Love et al., (2004).
Many categories are represented in opposition/contrast to each other, and so the border is highlighted
– e.g., conservative vs. liberal; fruit vs. vegetable
Idealized members / "caricatures" are seen as critical: they are like prototypes, but with features exaggerated away from the category boundary
– Predicted by error-based learning mechanisms
– Same error-based learning that can lead to new cluster recruitment (or at least in Love et al., 2004).
Davis & Love (2010) study
learned 4 categories of energy/political leaders; the 4 categories all differed along 2 dimensions
while learning, any given trial required a choice between only one pair of categories
exemplars were represented as values along 2 dimensions; each choice between 2 categories contrasted a single dimension
subjects were then asked to indicate the average value on each dimension for each category
finding:
on the dimension of contrast, the average value idealized away from category boundary
on the dimension not contrasted, average value was accurate
point: category learning distorts our understanding/memory
summary of feature-based models of categorisation
Classical view: categories are represented by necessary and sufficient conditions; people learn categories via the testing of hypotheses of category-defining rules
Prototypes: family resemblance structures, graded membership, form abstract representations of category average (either mean or mode); classification via similarity to prototype
Exemplar: no abstraction; classification via summed similarity to all exemplars, or similarity to individual exemplars
Clusters: concepts composed of multiple clusters picking out smaller-order generalizations or even exemplars
Boundaries: focus on the dividing lines between categories
– Leads to ideals/caricatures, also predicted by the error-driven learning of cluster models
Markman & Ross: Category Use and Category Learning
After classification or inference training: classify exemplars given either a single feature or all features
Prediction: inference training teaches the prototypical value for all features, and so will lead to superior single-feature classification
classification training only teaches the part of the puzzle that's useful (some specific feature that defines the category), whereas inference training requires learning the internal relationships among the parts and all features