Week 4 - Segmentation and Clustering p3,4 Flashcards
What is model-free clustering
“non-parametric clustering”
clustering techniques that do not assume a specific parametric form for the underlying data distribution
What is mean-shift clustering
An advanced technique for clustering based segmentation
“model-free”: makes no assumptions about the data shape or the number of clusters
Essentially smooths a distribution and finds its peaks
What is a mode
Where the density reaches a peak (local maximum) - highest frequency
easy to see, hard to compute
How does iterative mode search work
- 1) Initialise a random seed point and a window W
- 2) Calculate the centre of gravity (the “mean”) of W:
- mean(W) = Σ_{x∈W} x H(x) / Σ_{x∈W} H(x), where H(x) is the kernel weight of point x
- 3) Shift the search window to the “mean”
- 4) Repeat steps 2-3 until convergence
We can use the derivative f’(x) of the kernel density estimate f(x) to find the direction in which to shift the window toward the local maximum
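The steps above can be sketched in plain Python. This is a minimal illustration using a flat kernel (H(x) = 1 inside the window, so the mean reduces to a simple average) on 1D data; the function name, bandwidth and data values are made up for the example:

```python
# Minimal sketch of iterative mode search (mean shift) with a flat
# kernel in 1D. With H(x) = 1 inside the window, the centre of
# gravity is just the average of the points inside the window.

def mean_shift_mode(points, seed, bandwidth=2.0, tol=1e-6, max_iter=100):
    """Shift `seed` toward the nearest mode of the point density."""
    x = seed
    for _ in range(max_iter):
        # Step 2: centre of gravity (the "mean") of the window W
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break
        mean = sum(window) / len(window)
        if abs(mean - x) < tol:  # converged: mean-shift vector is ~0
            break
        x = mean                 # Step 3: shift the window to the mean
    return x

# Two clumps of 1D points, so there are modes near 0 and near 10
data = [-0.5, 0.0, 0.3, 0.5, 9.5, 10.0, 10.2, 10.5]
print(mean_shift_mode(data, seed=1.0))  # converges near 0
print(mean_shift_mode(data, seed=9.0))  # converges near 10
```

Different seeds converge to different modes, which is how mean shift finds a variable number of clusters without fixing it in advance.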
Mean shift visualisation using dot clusters
1) choose a random point (not necessarily a data point) and a region of interest around it
2) calculate centre of mass for this region
3) calculate mean shift vector pointing us towards this centre of mass
4) shift along vector and repeat
We repeat this from every single point, which is very laborious
How do we make mean shift more efficient
Record the trajectories of all regions that move to the same mode (attraction basin)
We can then cluster all data points in the same attraction basin of a mode
What is an attraction basin
The attraction basin of an attractor is the region of the state space from which trajectories lead to that attractor
Pros of mean shift clustering
model-free
A single parameter
potentially finds global maxima
finds multiple modes
robust to outliers
Cons of mean shift clustering
output heavily depends on window size
(large window → fewer, larger clusters; small window → more, smaller clusters)
window size selection is not trivial
computationally expensive
does not scale well with dimension of feature space
What is Graph theoretic segmentation
The approach in which an image is represented as a graph
- Node for every pixel
- Edge between every pair of pixels (p,q)
- Affinity weight (cost) w_pq for each edge
How does segmenting via graphs work
Delete links that cross between segments
Easiest to break links that have low similarity (low weight)
- Similar pixels should be in the same segments
- Dissimilar pixels should be in different segments
How is a weighted graph represented by a square matrix
Row and column for each vertex
The ij-th element of the matrix is the weight of the edge from vertex i to vertex j
The diagonal entries represent edges from a node to itself (self-loops)
What sort of matrix is used for an undirected graph
A symmetric matrix
The same weight appears in both the ij-th and ji-th elements
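A small sketch of this representation, with a made-up edge list and plain Python lists standing in for the matrix:

```python
# Sketch: weight matrix of a small undirected graph.
# Vertices are 0..n-1; edges are (i, j, weight) triples (illustrative values).

def weight_matrix(n, edges):
    W = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        # undirected graph: the same weight goes in the
        # ij-th and ji-th entries, so W is symmetric
        W[i][j] = w
        W[j][i] = w
    return W

edges = [(0, 1, 0.9), (1, 2, 0.2), (2, 3, 0.8)]
W = weight_matrix(4, edges)
print(W[0][1], W[1][0])  # same weight in both entries
```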
How can we measure affinity
denoted: aff(x,y)
- Distance
- Intensity difference |I(x) − I(y)|
- Colour
We place the similarity weight calculated on each edge of the graph
What is affinity
similarity
What is the ideal weight between similar nodes
Large
(different nodes = small)
How does scale affect affinity
small σ: groups only nearby points
large σ (broader Gaussian): groups far-away points
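A common choice (assumed here, not stated on the card) is the Gaussian affinity aff(x,y) = exp(−‖x−y‖² / (2σ²)). A minimal sketch showing the effect of σ, with illustrative points:

```python
import math

def gaussian_affinity(x, y, sigma):
    """aff(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for points x, y."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))

p, q = (0.0, 0.0), (3.0, 4.0)               # Euclidean distance 5
print(gaussian_affinity(p, q, sigma=1.0))   # tiny: too far apart to group
print(gaussian_affinity(p, q, sigma=10.0))  # near 1: grouped together
```

The same pair of points gets a near-zero affinity under a small σ and a high affinity under a large σ, which is exactly the scale effect described above.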
What is a graph cut
The set of edges whose removal makes a graph disconnected
What is the cost of a graph cut
Sum of weights of cut edges
cut(A,B) = Σ_{p∈A, q∈B} w_pq
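The cut cost can be read straight off the weight matrix. A toy sketch (the 4-node graph and its weights are made up): two tightly connected pairs joined by one weak edge, so cutting between the pairs only pays for the weak link:

```python
def cut_cost(W, A, B):
    """Sum of the weights of all edges crossing between vertex sets A and B."""
    return sum(W[p][q] for p in A for q in B)

# Strong edges inside {0,1} and {2,3}, one weak edge (0.1) across
W = [[0.0, 0.9, 0.1, 0.0],
     [0.9, 0.0, 0.0, 0.0],
     [0.1, 0.0, 0.0, 0.8],
     [0.0, 0.0, 0.8, 0.0]]
print(cut_cost(W, {0, 1}, {2, 3}))  # only the weak cross link is cut
```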
What is a minimum cut
The least costly cut
Not always the best cut
It is biased toward cutting off low total weight, i.e. small, isolated sections
What is the best cut?
One that minimises the edges crossing between the partitions while still producing meaningful, balanced segments
What is the normalised cut (NCut)
The goal is to remove the minimum cut’s bias toward small, isolated segments by normalising each cut by the total association of its segment
What is the normalised cut equation
NCut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)
What is Assoc(A,A)
Sum of all weights (association) within a cluster
What is Assoc(A,V)
assoc(A,V) = assoc(A,A) + cut(A,B): the sum of the weights of all edges attached to nodes in A
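Putting cut and assoc together gives NCut. A toy sketch on a made-up 4-node graph (two tight pairs joined by one weak edge), showing why the balanced cut scores lower than isolating a single node:

```python
def cut(W, A, B):
    """Sum of edge weights crossing between vertex sets A and B."""
    return sum(W[p][q] for p in A for q in B)

def assoc(W, A, V):
    """Total weight of all edges attached to nodes in A (within V)."""
    return sum(W[p][q] for p in A for q in V)

def ncut(W, A, B):
    V = A | B
    return cut(W, A, B) / assoc(W, A, V) + cut(W, A, B) / assoc(W, B, V)

W = [[0.0, 0.9, 0.1, 0.0],
     [0.9, 0.0, 0.0, 0.0],
     [0.1, 0.0, 0.0, 0.8],
     [0.0, 0.0, 0.8, 0.0]]
balanced = ncut(W, {0, 1}, {2, 3})   # cuts only the weak 0.1 link
isolating = ncut(W, {0}, {1, 2, 3})  # lops off a single node
print(balanced, isolating)           # balanced cut gets the smaller NCut
```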
What assoc do big segments have
Large assoc(A,V)
Because they contain many weighted edges
Thus decreasing Ncut(A,B)
How does NCut reduce bias to smaller cuts
Want smaller NCut
Because normalisation gives big segments a smaller NCut, we have more chance of choosing them over small ones
What can be said about finding the globally optimal cut
NP-complete
a relaxed version can be solved using a generalised eigenvalue problem
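A sketch of that relaxation, assuming NumPy is available and using a made-up 4-node graph: the generalised problem (D − W)y = λDy is solved via the symmetrically normalised Laplacian, and the second-smallest eigenvector is thresholded to bipartition the graph:

```python
import numpy as np

# Relaxed NCut: solve (D - W) y = lambda * D y. An equivalent symmetric
# form uses N = D^{-1/2} (D - W) D^{-1/2}; its eigenvectors z give
# y = D^{-1/2} z. The second-smallest eigenvector splits the graph.

W = np.array([[0.0, 0.9, 0.1, 0.0],   # two tight pairs {0,1}, {2,3}
              [0.9, 0.0, 0.0, 0.0],   # joined by one weak 0.1 edge
              [0.1, 0.0, 0.0, 0.8],
              [0.0, 0.0, 0.8, 0.0]])
d = W.sum(axis=1)                      # degrees (row sums)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
N = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
vals, vecs = np.linalg.eigh(N)         # eigenvalues in ascending order
y = D_inv_sqrt @ vecs[:, 1]            # second-smallest eigenvector
labels = (y > 0).astype(int)           # threshold at 0 to bipartition
print(labels)                          # {0,1} in one segment, {2,3} in the other
```

This is only the two-way case; recursive or multi-eigenvector versions handle more segments.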
Pros of normalised cut
Generic framework, flexible to choice of affinity function
“model-free”
cons of normalised cut
Time and memory complexity can be high, e.g. dense graphs require many affinity computations
What is a GrabCut
Segmentation method that takes user input into account
User input = a red box drawn around the desired object
How do we evaluate segmentation
f1 score
How do we calculate F1 Score
F = 2PR / (P+R)
How does grabcut work
Initially treats all pixels inside the red box as foreground, then iteratively improves
Iterates between 2 steps:
- Segmentation using graph cuts
- decides if each pixel is foreground or background (binary)
- (requires having a foreground model)
- Model update, once we have this segmentation
- uses unsupervised clustering to learn two different models for foreground vs background
- feeds these back into the segmentation step for improvement
Has very successful results
How do we improve efficiency of segmentation
Superpixels
Group together similar-looking pixels
cheap, local over-segmentation
What do we need to ensure about chosen superpixels
- Do not cross boundaries
- have similar size
- have regular shape
- algorithm has low complexity
What is precision P
percentage of marked boundary points that are real ones
What is recall R
percentage of real boundary points that were marked
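Precision, recall and the F1 score can be computed together from sets of boundary points. A toy sketch (the point coordinates are made up; real evaluation usually also allows a small distance tolerance when matching boundary points):

```python
def boundary_scores(marked, real):
    """Precision, recall and F1 for marked vs real (ground-truth) boundary points."""
    marked, real = set(marked), set(real)
    tp = len(marked & real)       # marked points that are real ones
    P = tp / len(marked)          # fraction of marked points that are real
    R = tp / len(real)            # fraction of real points that were marked
    F = 2 * P * R / (P + R)       # F = 2PR / (P + R)
    return P, R, F

marked = {(0, 1), (0, 2), (5, 5)}        # detector output: one false alarm
real = {(0, 1), (0, 2), (0, 3), (0, 4)}  # ground truth: two points missed
P, R, F = boundary_scores(marked, real)
print(P, R, F)
```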