Week 5 - Model based Vision Flashcards
What is model-based vision
Visual understanding and interpretation are achieved through the use of explicit models of objects and scenes
What is Principal Component Analysis
Used for dimensionality reduction
Aims to find the directions (or principal components) that capture the maximum variance in the data
What is dimensionality reduction
(2D → 1D)
Given some data
Find a line of best fit v1 which contains the largest variation in values
If v2 (perpendicular) variation is much smaller compared to v1
We: project all points onto v1
approximately reduce the data by only saving the v1 coordinate (each point is now represented by a single value)
Can also do: 3D → 2D, 4D → 3D
When does dimensionality reduction create a better representation
When the ellipse along v1 is narrower (the variation along v2 is small compared to the variation along v1)
What is the PCA Algorithm
1) Assemble data into matrix
2) Compute the covariance matrix C
3) Find the Eigenvalues (λi) and Eigenvectors (vi) of C
4) Choose the K largest eigenvalues to account for a proportion p of the total variance T (T = Σ λi)
(because we want to reduce the number of dimensions)
For example, we might choose p = 0.95
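The four steps above can be sketched in NumPy. This is a minimal illustration rather than a reference implementation: the function name pca, the example data, and the use of np.linalg.eigh (suitable because C is symmetric) are assumptions made for the example.

```python
import numpy as np

def pca(data, p=0.95):
    """PCA on a (samples x variables) matrix; returns mean, eigenvalues, eigenvectors, K."""
    data = np.asarray(data, dtype=float)          # Step 1: data assembled as a matrix
    mean = data.mean(axis=0)
    C = np.cov(data, rowvar=False)                # Step 2: covariance (variables x variables)
    eigvals, eigvecs = np.linalg.eigh(C)          # Step 3: eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    T = eigvals.sum()                             # total variance
    cumulative = np.cumsum(eigvals) / T
    # Step 4: smallest K whose eigenvalues account for at least proportion p of T
    K = min(int(np.searchsorted(cumulative, p)) + 1, len(eigvals))
    return mean, eigvals, eigvecs, K

# Hypothetical 2D data lying close to the line y = x
data = [[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
mean, eigvals, eigvecs, K = pca(data, p=0.95)
print(K)  # one direction is enough for this data
```

Because the points lie almost on a line, a single eigenvalue already accounts for more than 95% of T, so K is 1.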
What are the dimensions of the initial PCA matrix
number of samples x number of variables
row x column
What is an Eigenvalue (λi)
Represents the magnitude of variance along each direction
What is an Eigenvector (vi)
The directions of maximum variance in the dataset
How do you calculate the total variance T
T = Σ λi (the sum of all the eigenvalues)
What are the dimensions of the PCA covariance matrix
number of var x number of var
(symmetrical)
What is in each segment of a covariance matrix, given 2 variables x, y
cov(x,x) cov(x,y)
cov(y,x) cov(y,y)
What is cov(x,x) the same as
var(x)
How do you calculate cov(x,y)
cov(x,y) = E[xy] - E[x]E[y]
E[] = average
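As a small worked sketch of the formula above, the covariance matrix for two variables can be built directly from averages. The data values are made up for illustration. Note this divides by N (a plain average), whereas library routines such as np.cov divide by N - 1 by default.

```python
def mean(values):
    """E[]: the plain average of a list of numbers."""
    return sum(values) / len(values)

def cov(x, y):
    """cov(x, y) = E[xy] - E[x]E[y]."""
    xy = [a * b for a, b in zip(x, y)]
    return mean(xy) - mean(x) * mean(y)

# Hypothetical data: y is exactly twice x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

C = [[cov(x, x), cov(x, y)],
     [cov(y, x), cov(y, y)]]
print(C)  # [[1.25, 2.5], [2.5, 5.0]]
```

The diagonal entries are var(x) and var(y), and the matrix is symmetric because cov(x,y) = cov(y,x).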
What equation do eigenvalues and eigenvectors satisfy
Av = λv
To find the eigenvalues we solve
det( A - λI) = 0
where I is an identity matrix
What is the formula for the determinant of a 2x2 matrix
For the matrix with rows (a, b) and (c, d): determinant = ad - bc
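For a 2x2 matrix, expanding det(A - λI) = 0 with the ad - bc formula gives the quadratic λ² - (a + d)λ + (ad - bc) = 0. The sketch below solves it directly; the function name eigenvalues_2x2 and the example matrix are assumptions for illustration.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix with rows (a, b) and (c, d), from det(A - λI) = 0."""
    trace = a + d
    det = a * d - b * c                      # the 2x2 determinant
    disc = trace * trace - 4 * det           # discriminant of λ² - trace·λ + det = 0
    if disc < 0:
        raise ValueError("eigenvalues are complex for this matrix")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

lam1, lam2 = eigenvalues_2x2(2.0, 1.0, 1.0, 2.0)
print(lam1, lam2)  # 3.0 1.0
```

A covariance matrix is symmetric, so its discriminant is never negative and both eigenvalues are real.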
What do the two eigenvalues represent
the magnitude of the variance in direction v1 and v2
(eigenvector v1 is perpendicular to eigenvector v2)
How do you calculate the proportion of variance in the v1 direction
eigenvalue(v1) / (eigenvalue(v1) + eigenvalue(v2))
eg 0.86
means if we projected the data points onto v1 we would still retain 86% of the variation in the data
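The claim above can be checked numerically: after projecting the centred points onto v1, the variance of the projected values divided by the total variance should equal the eigenvalue ratio. This is a sketch on made-up data, using NumPy's np.cov and np.linalg.eigh.

```python
import numpy as np

# Hypothetical 2D data lying close to the line y = x
data = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]])

C = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)        # ascending order, so v1 is the last column
lam1, v1 = eigvals[-1], eigvecs[:, -1]

# Proportion of variance in the v1 direction
proportion = lam1 / eigvals.sum()

# Project the centred points onto v1: each 2D point becomes a single value
projected = (data - data.mean(axis=0)) @ v1
retained = projected.var(ddof=1) / eigvals.sum()

print(np.isclose(proportion, retained))  # True
```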
What are Active Shape models
Allow for non-rigid shape matching
eg annotated dataset of female faces
Are able to generate different (valid) shapes based on supplied shapes in training dataset
What do active shape models not allow
shapes to be scaled, rotated, translated etc
How do we use PCA in ASMs
Assemble data in matrix (number of variables x number of data samples)
Compute covariance matrix (number of var x number of var)
Find eigenvalues by solving det(C - λI) = 0
so there will be (number of var) eigenvalues
Which can be a lot so we choose the K largest eigenvalues
How do we then generate new shapes using the eigenvectors
x = x̄ + Vb
Where x̄ is the mean shape
Eigenvectors vi are the column vectors in V
and b is the shape parameter vector
the mean shape is calculated from the original matrix of data D
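A minimal sketch of shape generation with x = x̄ + Vb follows. The training matrix D, the landmark layout and the choice K = 1 are all hypothetical; here D is stored as samples x variables, with each shape flattened to (x1, y1, x2, y2, x3, y3).

```python
import numpy as np

# Hypothetical training set: 4 triangles whose top landmark moves vertically
D = np.array([
    [0.0, 0.0, 1.0, 0.0, 0.5, 1.0],
    [0.0, 0.0, 1.0, 0.0, 0.5, 1.2],
    [0.0, 0.0, 1.0, 0.0, 0.5, 0.8],
    [0.0, 0.0, 1.0, 0.0, 0.5, 1.4],
])

x_bar = D.mean(axis=0)                       # mean shape, computed from D
C = np.cov(D, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
K = 1
V = eigvecs[:, order[:K]]                    # columns are the K leading eigenvectors

def generate_shape(b):
    """x = x_bar + V b, where b has one coefficient per retained eigenvector."""
    return x_bar + V @ np.asarray(b, dtype=float)

mean_shape = generate_shape([0.0])           # b = 0 gives the mean shape
varied = generate_shape([0.3])               # moves along the first mode of variation
```

In this toy set only the last coordinate varies, so the single mode moves the top landmark up or down. The sign of an eigenvector is arbitrary, so a positive b may move it in either direction.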
What is the shape parameter vector b
Size (K x 1): one coefficient per retained eigenvector (number of var x 1 if every eigenvector is kept)
A column vector of coefficients that scales the eigenvectors (or eigenmodes) of shape variation
What are the modes of variation of a shape
The effect of changing b on the shape
PCA finds these automatically
eg b1 changes the vertical position
How do we use shape fitting to find a shape in a new image
We can use an iterative localised search in the image:
- Place the shape into the image
- Search in the neighbourhood of current feature points for better locations
- Fit the model to the new suggested shape
- Repeat until convergence occurs
What parameters do we need to be able to change when placing a model in an image
(s, θ, r, b)
As well as changing the values in b, we also need to allow for the “pose parameters”:
scaling (s), rotation (θ) and translation (r)
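The pose parameters can be illustrated by applying them to a set of 2D points. This sketch assumes the order scale, then rotate, then translate, and an angle in radians; the function name apply_pose and the square shape are made up for the example.

```python
import numpy as np

def apply_pose(points, s, theta, r):
    """Scale by s, rotate by theta (radians), then translate by r. points has shape (n, 2)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    # Each row p becomes s * R p + r
    return s * points @ R.T + np.asarray(r, dtype=float)

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
placed = apply_pose(square, s=2.0, theta=np.pi / 2, r=(10.0, 5.0))
```

With these values the unit square is doubled in size, turned by 90 degrees and moved so that its first corner sits at (10, 5).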
What does b=0 mean about the shape
It is the mean shape
How do we fit a model to the image
- Calculate normals to the model curve at each point (current points are x)
- search along the normal for the strongest edge (trying to find the edge of the actual shape)
- This gives a set of suggested points x’
- But we do not just use x’ as this may not give a valid face shape
- First, (s, θ, r) are found to best fit x to x’
- rigid transformation to get closer, gives x’’
- Use the model backwards: b = V⁻¹ (x’’ − x̄)
- x̄ is the mean face
- New b generates a new valid shape
- Iterate until convergence
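The notes do not say how (s, θ, r) are found. One common choice is a least-squares fit of a scaled rotation plus translation between corresponding points, sketched below. The function name fit_pose and the test shape are assumptions for illustration, and this is not necessarily the method used in the course.

```python
import numpy as np

def fit_pose(x, x_target):
    """Least-squares s, theta, r such that s * R(theta) x + r approximates x_target.

    x and x_target are (n, 2) arrays of corresponding points.
    """
    mx, mt = x.mean(axis=0), x_target.mean(axis=0)
    xc, tc = x - mx, x_target - mt                 # centre both point sets
    norm = (xc ** 2).sum()
    a = (xc * tc).sum() / norm                                   # s * cos(theta)
    b = (xc[:, 0] * tc[:, 1] - xc[:, 1] * tc[:, 0]).sum() / norm # s * sin(theta)
    s = np.hypot(a, b)
    theta = np.arctan2(b, a)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    r = mt - s * R @ mx                            # translation that aligns the centroids
    return s, theta, r

# Hypothetical check: current points x and suggested points x' that differ by a known pose
x = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
x_suggested = np.array([[10.0, 5.0], [10.0, 7.0], [8.0, 7.0], [8.0, 5.0]])
s, theta, r = fit_pose(x, x_suggested)
```

Here the suggested points are the square scaled by 2, rotated by 90 degrees and translated by (10, 5), so the fit recovers those values. With noisy edge points the fit gives the closest pose in the least-squares sense, and the remaining difference is left for b to explain.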
How is using the model backwards useful
originally: x = x̄ + Vb
When we calculate x’’ using rigid transformations and normals
we can calculate b using:
b = V⁻¹ (x’’ − x̄)
And then find a new set of suggested points using
x̄ + Vb (with updated b) to get closer to actual shape
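A sketch of using the model backwards and then forwards again follows. It relies on the eigenvectors being orthonormal, so the transpose of V plays the role of V⁻¹ (and still works when only K columns are kept). The mean shape, the modes and the noise level are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: a mean shape with 6 variables and K = 2 orthonormal modes
x_bar = np.array([0.0, 0.0, 1.0, 0.0, 0.5, 1.0])
V, _ = np.linalg.qr(rng.normal(size=(6, 2)))     # columns of V are orthonormal

# Hypothetical aligned suggested points x'': a valid shape plus a little noise
b_true = np.array([0.4, -0.2])
x_aligned = x_bar + V @ b_true + 0.01 * rng.normal(size=6)

b = V.T @ (x_aligned - x_bar)    # model backwards: shape parameters for x''
x_new = x_bar + V @ b            # model forwards: a shape the model can generate
```

The regenerated x_new is the closest shape to x’’ that the retained modes can produce, which is what keeps each iteration on a valid shape.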
How is ASM used in the real world
Medical application: ASMs are used a lot in the medical field
eg positioning an artificial hip
What is the Active Appearance Model
- face is split into triangles
- texture within triangles is warped to fit the model
- imagine face printed on a rubber sheet and stretched about → used to change expressions
How are different shapes created in an ASM
By changing b