Week 4 - Segmentation and Clustering p1,2 Flashcards
What is the goal of Grouping
Gather features that belong together for efficiency of further processing
Obtain an intermediate representation that compactly describes key parts
What is the Gestalt principle
That grouping is key to visual perception
Elements in a collection have properties that result from relationships
(negative space between features)
What are the gestalt factors
Not grouped (the baseline case: a line of identical, evenly spaced dots)
Proximity
Similarity
Common fate (Elements that move in the same direction belong together)
Common region
Symmetry
Closure
Continuity
Parallelism
How do gestalt principles work with computer vision
It is difficult to know how best to apply these ideas via algorithms
What is Top-down segmentation
pixels belong together because they are from the same object
What is bottom-up segmentation
pixels belong together because they look ‘similar’
We define this similarity: colour, texture, gradient
How do we measure segmentation success
It is very hard to measure the success of segmentation; it depends on the application
Compare to human segmentation or to “ground truth”
- But often there is no objective definition of segmentation
- Not even agreement between humans
What are superpixels
Compact and perceptually meaningful groups of pixels in an image
The idea is to not even try to compute a “correct” segmentation
Let's be content with over-segmentation (more, smaller superpixels rather than fewer large regions)
- the superpixel boundaries will still include the lines that follow the correct segmentation
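A minimal sketch of over-segmentation into superpixels, assuming scikit-image is available; the example image and the n_segments/compactness values are illustrative, not from the lecture.
```python
import numpy as np
from skimage import data
from skimage.segmentation import slic

image = data.astronaut()                                  # example RGB image, shape (H, W, 3)
labels = slic(image, n_segments=200, compactness=10, start_label=1)
print(labels.shape, labels.max())                         # one superpixel label per pixel
```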
How do we segment via clustering
Using a pixel Intensity histogram
The histogram often shows peaks at the grey levels of the different objects
We use clustering to determine the main intensities
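A small sketch of the histogram idea, assuming a placeholder grayscale image; the tallest bins hint at the main intensities that clustering would then make explicit.
```python
import numpy as np

gray = np.random.randint(0, 256, size=(100, 100))          # placeholder grayscale image
hist, bin_edges = np.histogram(gray, bins=256, range=(0, 256))
peaks = np.argsort(hist)[-3:]                               # grey levels of the 3 tallest bins
```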
How do we use sum of squared differences in clustering
SSD = Σ_clusters i Σ_points p in cluster i ||p − ci||²
We want to minimise SSD between all points p and their nearest centre ci
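A minimal sketch of computing this objective in NumPy; the array shapes are assumptions for illustration.
```python
import numpy as np

def ssd(points, centres, assignments):
    """points: (N, D), centres: (k, D), assignments: (N,) cluster index per point."""
    diffs = points - centres[assignments]      # difference of each point to its own centre
    return np.sum(diffs ** 2)                  # sum of squared distances over all points
```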
What is the chicken and egg problem in clustering
Know the cluster centre → can allocate correct points to groups
know the group memberships → can get the centres of each cluster
What is k-means clustering
1) Pick random k cluster centres
2) determine group members using distance to centre
3) calculate average of each group and update ci
Will always converge to a local minimum of the SSD, but not necessarily its global minimum
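A minimal k-means sketch in NumPy following the three steps above; data shapes and the stopping rule are assumptions, not the lecture's code.
```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]   # 1) random centres
    for _ in range(n_iters):
        # 2) assign each point to its nearest centre
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        assignments = np.argmin(dists, axis=1)
        # 3) move each centre to the average of its members
        new_centres = np.array([points[assignments == i].mean(axis=0)
                                if np.any(assignments == i) else centres[i]
                                for i in range(k)])
        if np.allclose(new_centres, centres):
            break                               # converged to a local minimum of the SSD
        centres = new_centres
    return centres, assignments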
What is Feature Space
A very important factor
Depending on what we choose, we will group pixels in different ways
For example,
Intensity similarity in 1D (grayscale)
Or 3D: (r,g,b)
Or 5D: (r,g,b,x,y)
This encodes both similarity and proximity
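A sketch of building the per-pixel feature vectors described above, using a placeholder image; the coordinate scaling is an assumption to keep colour and position on comparable ranges.
```python
import numpy as np

img = np.random.rand(64, 64, 3)                            # placeholder RGB image in [0, 1]
h, w, _ = img.shape
ys, xs = np.mgrid[0:h, 0:w]

f1 = img.mean(axis=2).reshape(-1, 1)                       # 1D: intensity
f3 = img.reshape(-1, 3)                                    # 3D: (r, g, b)
f5 = np.concatenate([f3, xs.reshape(-1, 1) / w,            # 5D: (r, g, b, x, y),
                     ys.reshape(-1, 1) / h], axis=1)       # encodes similarity and proximity
```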
What are the pros of K means
Simple, fast to compute
Converges to local minimum of within-cluster squared error
What are the cons of k means
Have to choose k
Sensitive to initial centres
Sensitive to outliers
Detects spherical clusters only
What is a ‘hard assignment’
Assigning each data point to exactly one cluster/group
It is something we want to avoid
What is a soft assignment
Data points can have multiple memberships “fuzzy clustering”
Eg Gaussian mixture model using probability
What is the Gaussian Mixture model
Clusters are modelled as gaussians
not just by their mean (k-means)
Gives a generative probability model of x
Expectation-Maximisation (EM) algorithm assigns data to cluster with some soft probability
What is the Univariate Gaussian (normal) distribution
Describes the likelihood of observing a continuous random variable along a single dimension
Takes two parameters: mean μ and variance σ² > 0
How do we calculate μ (MLE)
μ^ = 1/N Σ x^(i)
How do we calculate σ² (MLE)
σ^² = 1/N Σ (x^(i) - μ^)²
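A minimal sketch of these two MLE formulas in NumPy, using a synthetic sample as an assumption.
```python
import numpy as np

x = np.random.normal(loc=2.0, scale=1.5, size=1000)   # synthetic 1-D sample
mu_hat = x.mean()                                     # 1/N Σ x^(i)
sigma2_hat = np.mean((x - mu_hat) ** 2)               # 1/N Σ (x^(i) - μ^)²
```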
What is the Multivariate normal distribution
Describes the joint distribution of multiple random variables along multiple dimensions
How do we calculate μ^ (multivariate)
μ^ = 1/m Σ x^(j)
How do we calculate Σ^ (multivariate)
Σ^ = 1/m Σ (x^(j) - μ^)(x^(j) - μ^)^T
T = transpose
m = number of data points
x^(j) = j-th data point (represented as a column vector)
Essentially, each difference vector is multiplied by its own transpose (an outer product), and averaging these gives the square covariance matrix
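A sketch of the multivariate MLE in NumPy; the data and dimensionality are assumptions, and the rows-as-points layout means the outer products appear as diffs.T @ diffs.
```python
import numpy as np

X = np.random.randn(500, 2)                  # m data points, d = 2, one point per row
mu_hat = X.mean(axis=0)                      # 1/m Σ x^(j)
diffs = X - mu_hat
Sigma_hat = (diffs.T @ diffs) / len(X)       # (d, d) covariance: average of outer products
# equivalently: np.cov(X, rowvar=False, bias=True)
```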
What are the three forms of covariance
Spherical, diagonal, full
What is Spherical covariance
[σ² 0
0 σ²]
All variables have the same variance σ²
All have equal importance
visually, a sphere
What is diagonal covariance
[σ1² 0
0 σ2²]
Each variable has its own variance
Unique variance but no dependencies between variables
Visually: a vertical or horizontal ellipse
What is full covariance
[σ1² σ12
σ12 σ2²]
Each variable has its own variance and there can be covariances between different variables (dependent)
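A small sketch of the three covariance forms in 2D; all numeric values are illustrative assumptions.
```python
import numpy as np

spherical = np.array([[2.0, 0.0],
                      [0.0, 2.0]])     # same variance in every direction
diagonal  = np.array([[2.0, 0.0],
                      [0.0, 0.5]])     # per-variable variance, no dependence between variables
full      = np.array([[2.0, 0.8],
                      [0.8, 0.5]])     # off-diagonal covariances encode dependence
```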
What is covariance
A measure of how two random variables change together
What is a generative model
Generates new data points via a probabilistic distribution
defined by a vector of parameters θ
What is Mixture of Gaussians (MoG)
A generative model that represents the data as a weighted sum of multiple Gaussian distributions
How does MoG work
1) Start with parameters describing each cluster:
- mean μ, variance σ² and size (weight) π
- each cluster is described by a Gaussian probability distribution
2) Each Gaussian component has probability πc of being selected
For instance, if π1 = 0.6, component 1 is selected 60% of the time
3) Then we generate a data point x using the mean and covariance of that Gaussian component
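A minimal sketch of this generative process; the weights, means and covariances are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(0)
pis   = np.array([0.6, 0.4])                             # component weights π
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
covs  = [np.eye(2), 0.5 * np.eye(2)]

c = rng.choice(len(pis), p=pis)                          # 2) select a component with prob. πc
x = rng.multivariate_normal(means[c], covs[c])           # 3) generate a data point from it
```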
What is a blob
Component in an image that exhibits some common characteristic
mean μb, covariance matrices Vb, dimension d
Each blob has a probability (weight) αb
What is the likelihood of observing x in a MoG
A weighted mixture of gaussian blobs
P(x|θ) = Σ_{b=1..K} αb P(x | θb)
The relative size of each blob's term (normalised by P(x|θ)) gives the soft assignment of x to the blobs
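A sketch of evaluating this mixture likelihood, assuming SciPy is available for the Gaussian density.
```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_likelihood(x, alphas, means, covs):
    # P(x | θ) = Σ_b αb P(x | θb), a weighted sum of Gaussian blob densities
    return sum(a * multivariate_normal.pdf(x, mean=m, cov=c)
               for a, m, c in zip(alphas, means, covs))
```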
EM for blob
Find the blob parameters θ that maximise the likelihood function
P(data | θ) = Π_x P(x | θ)
θ = mean and covariance
Probability of observing the whole dataset given the parameters = product over all data points x of the mixture probability of x given the parameters
What are the steps of EM
1) Expectation step
Use the current parameter guess θ
Compute probability that point x is in blob b
2) Maximisation step
Use the computed expected values
The parameters of each blob (mean and covariance) are updated based on E-step
Update the blobs to maximise the likelihood function
3)Repeat until convergence - parameters are no longer significantly changing
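A minimal EM sketch for a Mixture of Gaussians following these steps; shapes, initialisation and the fixed iteration count are assumptions (starting from k-means, as noted later, is common but omitted here).
```python
import numpy as np
from scipy.stats import multivariate_normal

def em_mog(X, k, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alphas = np.full(k, 1.0 / k)
    means = X[rng.choice(n, size=k, replace=False)]
    covs = np.array([np.eye(d)] * k)
    for _ in range(n_iters):
        # E-step: responsibility of blob b for each point x, given current θ
        resp = np.stack([a * multivariate_normal.pdf(X, mean=m, cov=c)
                         for a, m, c in zip(alphas, means, covs)], axis=1)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means and covariances from the responsibilities
        Nk = resp.sum(axis=0)
        alphas = Nk / n
        means = (resp.T @ X) / Nk[:, None]
        for b in range(k):
            diff = X - means[b]
            covs[b] = (resp[:, b, None] * diff).T @ diff / Nk[b]
    return alphas, means, covs
```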
How does EM compare to k-means
very similar but EM is using soft assignments
What are the applications of EM
Clustering
Segmentation
Model estimation
Missing data problems
finding outliers
What are the pros of MoG
Probabilistic interpretation
Soft assignments between data points and clusters
Generative model, can predict novel data points
Relatively compact storage (mean and shape)
What are the cons of MoG
Local minima
Initialisation affects the result - a good idea is to start with k-means
Need to choose the number of components (model-selection methods from ML can help)
Need to choose generative model
Numerical problems are often a nuisance
Gestalt: whole or group
Whole is greater than sum of its parts
Relationships among parts can yield new properties/features
What 2 parameters does the multivariate guassian distribution take
- A vector containing mean position, μ^
- A symmetrical “positive definite” covariance matrix Σ^