Lecture 6: Object Recognition & Line Extraction Flashcards
Concerning the Bag of Words (BoW) approach, what are the 2 questions asked?
Is this image in my database?
Robot: Have I been to this place before?
Concerning the Bag of Words (BoW) approach, what 3 analogies can be used from text retrieval?
Visual words, Vocabulary of Visual Words, "Bag of Words" (BoW) approach
What is the succession order in image retrieval?
Image collection -> Extract features -> Cluster descriptors
What is image retrieval in BoW?
represents a more general problem of object or place recognition
What can we do with Image Retrieval?
Describe a scene as a collection of words and search the database for images with a similar collection of words
What if we need to find an object/scene in a database of millions of images? (2 things)
- Build Vocabulary Tree via hierarchical clustering
- Use the Inverted File system: a way of efficient indexing
What is the format of the Inverted File System?
Each node in the tree is associated with a list of images containing an instance of this node.
Concerning Vocabulary trees, what is K-means clustering? What is minimized?
partitioning a point cloud into k clusters, such that each point belongs to one cluster.
the Sum of Squared Euclidean Distances between points and their nearest cluster-centers
What is the algorithm for Vocabulary trees?
- Randomly initialize k cluster centers
Until convergence, what 2 things do you do in the algorithm for vocabulary trees (k-means)?
- Assign each data point to its nearest cluster center
- Re-compute each cluster center as the mean of all points assigned to that cluster
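A minimal sketch of the k-means loop from the cards above, in Python with NumPy. The function name `kmeans`, the parameters `max_iter` and `seed`, and the toy data are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means; all names here are illustrative."""
    rng = np.random.default_rng(seed)
    # Randomly initialize k cluster centers by picking k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each data point to its nearest cluster center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Re-compute each center as the mean of the points assigned to it
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):  # assignments stopped changing
            break
        centers = new_centers
    return centers, labels

# Toy data: two well-separated groups of descriptors.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
                [10.0, 0.0], [11.0, 0.0], [12.0, 0.0]])
centers, labels = kmeans(pts, k=2)
```

For a vocabulary tree, the same routine would be applied recursively to the descriptors inside each cluster (hierarchical clustering), which this sketch does not show.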
In Vocabulary tree construction, what does each node represent?
a cluster of descriptors
In Vocabulary tree construction, what does each leaf represent?
a visual word
What does an inverted file DB list? Inside it, what does each word point to?
all possible visual words.
a list of images where this word occurs.
What does a Voting array have?
as many cells as the images in the DB - each word in query image votes for an image
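The inverted file and voting array can be sketched as follows. This is a simplified illustration that assumes visual words are integer IDs and that database images are numbered 0..n-1; the function names `build_inverted_file` and `vote` are hypothetical.

```python
from collections import defaultdict

def build_inverted_file(db_images):
    """Map each visual word ID to the list of image IDs that contain it."""
    inverted = defaultdict(list)
    for image_id, words in db_images.items():
        for word in set(words):  # list each image once per word
            inverted[word].append(image_id)
    return inverted

def vote(inverted, query_words, num_images):
    """Voting array: one cell per DB image; each query word casts votes."""
    votes = [0] * num_images
    for word in query_words:
        for image_id in inverted.get(word, []):
            votes[image_id] += 1
    return votes

# Toy database: image ID -> visual words found in that image.
db = {0: [3, 7, 7, 9], 1: [1, 3], 2: [7, 9, 12]}
inverted = build_inverted_file(db)
votes = vote(inverted, query_words=[7, 9, 12], num_images=len(db))
best = max(range(len(votes)), key=votes.__getitem__)  # most-voted image
```

Here every vote counts equally; weighting each vote by word importance is a separate refinement.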
What is tf-idf an acronym for?
Term Frequency - Inverse Document Frequency
what does tf-idf measure?
the importance of a visual word inside a document(as part of a document DB)
what is term frequency?
frequency of word in image
Why compute the tf-idf of a word in image j?
To weigh the importance of each word when voting for the corresponding image
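A small sketch of how such a weight could be computed. The cards do not give a formula, so this assumes the commonly used form tf-idf = (n_ij / n_j) * log(N / N_i), where n_ij is the count of word i in image j, n_j the number of words in image j, N the number of database images, and N_i the number of images containing word i. The function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter

def tf_idf(image_words, db_images):
    """tf-idf weight of every visual word in one image.

    Assumes the image itself is part of db_images, so every word
    occurs in at least one database image.
    """
    counts = Counter(image_words)
    n_total = len(image_words)
    n_images = len(db_images)
    weights = {}
    for word, n in counts.items():
        n_containing = sum(1 for words in db_images if word in words)
        tf = n / n_total                          # frequency of word in image
        idf = math.log(n_images / n_containing)   # rarity across the database
        weights[word] = tf * idf
    return weights

db = [[3, 7, 7, 9], [1, 3], [3, 9, 12]]
weights = tf_idf(db[0], db)
```

In this toy database, word 3 appears in every image and gets weight 0, while word 7 appears in only one image and gets the largest weight.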
What are the 3 main problems in line extraction from a point cloud?
How many lines are there?
Which points belong to which line? (Segmentation)
Given points that belong to a line, how to estimate the line parameters? (Line Fitting/Extraction)
What are 4 algorithms used in line extraction from a point cloud?
(S-a-M, LR, R, H-T)
Split-and-merge,
Line regression,
RANSAC,
Hough-Transform
When it comes to split-and-merge, how does it work?
Iterative end-point-fit: simply connects the end points for line fitting
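A hedged sketch of the split step of iterative end-point fit. The card only says the end points are connected; the splitting rule used here (split at the point farthest from that line whenever its distance exceeds a threshold) is the usual formulation and is an assumption relative to the card. The merge step is omitted, and the names `point_line_distance`, `iterative_end_point_fit`, and `threshold` are illustrative.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length

def iterative_end_point_fit(points, threshold):
    """Split an ordered point list into segments, returned as index pairs."""
    def split(first, last):
        # Candidate line: simply connect the two end points.
        worst, worst_dist = None, 0.0
        for i in range(first + 1, last):
            d = point_line_distance(points[i], points[first], points[last])
            if d > worst_dist:
                worst, worst_dist = i, d
        # Split at the farthest point if it is too far from the line.
        if worst is not None and worst_dist > threshold:
            return split(first, worst) + split(worst, last)
        return [(first, last)]
    return split(0, len(points) - 1)

# An L-shaped scan: one horizontal wall followed by one vertical wall.
scan = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
segments = iterative_end_point_fit(scan, threshold=0.1)
```

On this scan the corner at index 3 is the farthest point from the line joining the two end points, so the list is split there into two segments.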
When it comes to line-regression, how does it work?
“Sliding window” of size N points; fit line-segment to all points in each window; merge overlapping line segments + re-compute line parameters for each segment.
What does RANSAC stand for?
RANSAC = RANdom SAmple Consensus
What is RANSAC?
A generic & robust fitting algorithm of models in the presence of outliers (i.e., points which do not satisfy a model)
Where can RANSAC be applied?
in general to any problems, where the goal is to identify the inliers which satisfy a predefined model
What are the 6+ typical RANSAC applications?
line extraction from 2D range data,
plane extraction from 3D data,
feature matching,
structure from motion,
camera calibration,
homography estimation, etc.
How does RANSAC work? (2 adjectives)
It is iterative and non-deterministic: the probability of finding a set free of outliers increases as more iterations are used.
What is the drawback of RANSAC?
It is a non-deterministic method, so results differ between runs
What are the 5 steps of RANSAC?
(S.C.C.S.R)
- Select sample of 2 points at random
- Calculate model parameters that fit the data in the sample
- Calculate error function for each data point
- Select data that supports current hypothesis
- Repeat sampling
Result: the set with the maximum number of inliers obtained after k iterations
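The steps above can be sketched for 2D line fitting as follows. This is an illustrative sketch, not the lecture's implementation: the function name `ransac_line`, the distance `threshold`, and the fixed `seed` (used only to make the run repeatable) are assumptions.

```python
import math
import random

def ransac_line(points, k, threshold, seed=0):
    """Return the largest inlier set found in k random 2-point samples."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(k):  # repeat sampling
        # Select a sample of 2 points at random.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # Model parameters: the line a*x + b*y + c = 0 through the sample.
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0.0:
            continue  # coincident points define no line
        c = -(a * x1 + b * y1)
        # Error function: point-to-line distance for each data point;
        # points within the threshold support the current hypothesis.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Ten points on y = 2x + 1 plus three outliers.
line_pts = [(x, 2 * x + 1) for x in range(10)]
outliers = [(2, 20), (7, -5), (4, 30)]
best = ransac_line(line_pts + outliers, k=50, threshold=0.01)
```

A common follow-up, not shown here, is to refit the line to all points in the winning inlier set.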
How many iterations does RANSAC need?
k = number of iterations
* Ideally: check all possible combinations of 2 points in a dataset of N points.
* Number of all pairwise combinations: N(N-1)/2
* Computationally infeasible if N is too large.
* Example: 10,000 points to fit a line through -> need to check all 10,000 * 9,999 / 2 ≈ 50 million combinations!
In RANSAC, do we really need to check all combinations, or can we stop after some iterations?
- Checking a subset of combinations is enough if we have a rough estimate of the percentage of inliers in our dataset.
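As a hedged illustration of how an inlier estimate turns into an iteration count, the sketch below uses the standard relation k = log(1 - p) / log(1 - w^s), where w is the estimated fraction of inliers, s the sample size (2 for a line), and p the desired probability of drawing at least one outlier-free sample. The symbols p and w and the function name `ransac_iterations` are assumptions; the cards do not state this formula.

```python
import math

def ransac_iterations(p_success, inlier_ratio, sample_size=2):
    """Iterations needed to draw at least one all-inlier sample with
    probability p_success. Assumes 0 < p_success < 1 and 0 < inlier_ratio < 1.
    """
    p_good_sample = inlier_ratio ** sample_size  # all sampled points are inliers
    return math.ceil(math.log(1.0 - p_success) / math.log(1.0 - p_good_sample))

k = ransac_iterations(p_success=0.99, inlier_ratio=0.5)  # 17
all_pairs = math.comb(10_000, 2)                         # 49995000
```

Under these assumptions the iteration count depends only on the inlier ratio, not on the number of points, which is why it stays tiny compared with checking every pair.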
In Hough-Transform, what do points vote for?
plausible line parameters
What does Hough-Transform do?
maps image-space into Hough-space
What is Hough-space?
A voting accumulator, parameterized w.r.t. line characteristics
What does a line in image space correspond to in Hough space?
a point
What does a point in image space correspond to in Hough space?
a line: b = -x0 m + y0 (downward-sloping when x0 > 0)
In Hough-Transform, where in Hough space is the line that contains both (x0, y0) and (x1, y1)?
At the intersection of the lines b = -x0 m + y0 and b = -x1 m + y1
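A short worked sketch of that intersection, assuming the slope-intercept parameterization y = m x + b used in the card. Setting -x0 m + y0 = -x1 m + y1 gives m = (y1 - y0) / (x1 - x0) and b = y0 - x0 m. The function name `hough_intersection` is hypothetical.

```python
def hough_intersection(p0, p1):
    """Intersect the Hough-space lines b = -x0*m + y0 and b = -x1*m + y1."""
    (x0, y0), (x1, y1) = p0, p1
    if x1 == x0:
        # A vertical image line has no finite slope in (m, b) form.
        raise ValueError("vertical line cannot be represented as y = m*x + b")
    m = (y1 - y0) / (x1 - x0)
    b = y0 - x0 * m
    return m, b

m, b = hough_intersection((1, 3), (3, 7))  # both points lie on y = 2x + 1
```

The vertical-line case shows a limitation of the (m, b) form; practical implementations usually avoid it by using a polar parameterization instead.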