Instance-Based Learning (IBL) Flashcards
What is the inductive bias (underlying assumption) of k-Nearest Neighbours?
The class of a query instance Xq is most likely the same as the class of the other instances that are “nearby”
What are the components of an IBL algorithm?
Distance function
Classification method
Memory updating
What is the distance function?
It returns a measure of the distance between two instances (the smaller the distance, the more similar the instances)
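As a minimal sketch, one common choice of distance function for numeric attributes is the Euclidean distance. The function name `euclidean_distance` and the representation of an instance as a tuple of numbers are assumptions made for illustration; the flashcards do not prescribe a particular distance.

```python
import math

def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length numeric instances."""
    if len(a) != len(b):
        raise ValueError("instances must have the same number of attributes")
    # Sum the squared per-attribute differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))  # 5.0
```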
What is the classification method component?
1-nearest neighbour (1-NN): the new instance (from test set) is assigned the class of its nearest neighbour (from the training set)
K-nearest neighbour (K-NN): the new instance is assigned the majority class among its K nearest neighbours (K is typically a small odd number such as 3 or 5, which avoids tied votes in two-class problems)
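The two classification methods above can be sketched with a single function, since 1-NN is just K-NN with K = 1. This is an illustrative sketch, not a reference implementation: the name `knn_classify`, the `(attributes, label)` pair layout, and the use of Euclidean distance via `math.dist` (Python 3.8+) are all assumptions.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training instances.

    training: list of (attributes, label) pairs (hypothetical layout).
    """
    # Sort stored instances by distance to the query and keep the k closest.
    neighbours = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbours and return the most frequent one.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

training = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.5, 4.5), "b"),
]
print(knn_classify(training, (1.1, 1.0), k=1))  # a
print(knn_classify(training, (4.8, 5.1), k=3))  # b
```

In the second call the three nearest neighbours are two "b" instances and one "a" instance, so the majority class is "b".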
What is the memory updating component?
Simplest approach: store all training examples (but this takes a lot of memory and processing time)
“Intelligent” methods: store only the most relevant instances
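The flashcards do not say which selection method is meant. One possible rule, in the spirit of the IB2 algorithm, is to store a training instance only when the instances already in memory would misclassify it. The sketch below assumes 1-NN with Euclidean distance (`math.dist`, Python 3.8+); the names `nearest_label` and `build_memory` are hypothetical.

```python
import math

def nearest_label(memory, query):
    """Return the label of the stored instance closest to `query` (1-NN)."""
    _, label = min(memory, key=lambda item: math.dist(item[0], query))
    return label

def build_memory(training):
    """Keep only the instances that the current memory gets wrong."""
    memory = []
    for attrs, label in training:
        # Store the first instance, and any later one that is misclassified.
        if not memory or nearest_label(memory, attrs) != label:
            memory.append((attrs, label))
    return memory

training = [
    ((1.0, 1.0), "a"), ((1.1, 0.9), "a"), ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"), ((0.8, 1.2), "a"),
]
memory = build_memory(training)
print(len(training), len(memory))  # 5 2
```

Here only one instance per cluster is kept, because every later instance is already classified correctly by what is in memory. The result depends on the order in which the training instances are presented.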
What is the motivation for attribute weighting?
Different attributes have different degrees of relevance
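A minimal sketch of how attribute weights can enter the distance function, assuming a weighted Euclidean distance. The function name `weighted_distance` and the weight values are illustrative assumptions; how the weights are chosen is not covered here.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute's squared difference is scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

query = (0.0, 0.0)
p1 = (1.0, 0.0)  # differs from the query on the first attribute
p2 = (0.0, 2.0)  # differs from the query on the second attribute

# With equal weights, p1 is the nearer instance.
print(weighted_distance(query, p1, (1.0, 1.0)) < weighted_distance(query, p2, (1.0, 1.0)))  # True
# If the second attribute is judged much less relevant, p2 becomes the nearer one.
print(weighted_distance(query, p2, (1.0, 0.1)) < weighted_distance(query, p1, (1.0, 0.1)))  # True
```

Down-weighting an irrelevant attribute changes which instance counts as the nearest neighbour, and therefore can change the predicted class.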
What are the advantages of IBL?
Simplicity
Suitable for complex problems, with strong attribute interaction
Incremental (new data can be added immediately, with no need to rebuild the model)
What are the disadvantages of IBL?
Classification of test instances is slow (each test instance has to be compared with the stored training instances)
In the context of data mining, what kind of explanation does K-NN return for a classification?
It returns a very specific explanation for the classification, i.e. the nearest neighbour(s), but not a generalised explanation (rule)