Lecture 5 - Segmentation & Recognition Flashcards
What is image segmentation?
Image segmentation is the process of partitioning an image into multiple segments, or groups of pixels, that represent objects or regions within the image.
Describe the K-means clustering algorithm used in segmentation.
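K-means clustering partitions an image into K clusters by randomly initializing K cluster centers, assigning each pixel to the nearest center in feature space (for example intensity or color), and updating each center to the mean of its assigned pixels. The assignment and update steps repeat until the centers stop changing.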
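The assign-and-update loop can be sketched in a few lines. This is a minimal illustration on scalar intensities, not a production implementation; the function name `kmeans_1d`, the fixed seed, and the toy pixel values are assumptions made for the example.

```python
import random

def kmeans_1d(values, k, iters=100, seed=0):
    """Cluster scalar pixel intensities into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Random initialization: pick k of the data values as starting centers.
    centers = [float(c) for c in rng.sample(values, k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest center.
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        # Update step: each center becomes the mean of its assigned pixels.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    return centers, labels

# Toy "image": four dark pixels and four bright pixels (hypothetical values).
pixels = [10, 12, 11, 200, 205, 198, 13, 202]
centers, labels = kmeans_1d(pixels, k=2)
```

On real images the same loop runs on color or texture feature vectors, with squared Euclidean distance in place of the scalar difference.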
Explain the concept of Mixture of Gaussians (MoG) in image segmentation.
MoG models the distribution of pixel intensities as a mixture of several Gaussian distributions, using the Expectation-Maximization (EM) algorithm to estimate the parameters of each Gaussian component and assign pixels to clusters probabilistically.
What is the role of the Expectation-Maximization (EM) algorithm in probabilistic clustering?
The EM algorithm iteratively estimates the parameters of the Gaussian mixture by alternating between the expectation step (computing the posterior probability that each pixel belongs to each Gaussian) and the maximization step (updating each Gaussian's mean, variance, and mixing weight using those probabilities).
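The two alternating steps can be sketched for a one-dimensional mixture as follows. The function names, the initial parameters, the fixed iteration count, and the small variance floor are assumptions chosen for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_mog_1d(xs, means, variances, weights, iters=50):
    """Fit a 1-D mixture of Gaussians with EM (illustrative sketch)."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each pixel.
        resp = []
        for x in xs:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate mean, variance, and mixing weight per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # assumed floor to avoid a zero variance
            weights[j] = nj / len(xs)
    return means, variances, weights, resp

# Hypothetical pixel intensities and a deliberately rough initial guess.
pixels = [10, 12, 11, 200, 205, 198, 13, 202]
means, variances, weights, resp = em_mog_1d(
    pixels, means=[0.0, 255.0], variances=[1000.0, 1000.0], weights=[0.5, 0.5]
)
# Hard segmentation: take the most probable component for each pixel.
hard_labels = [r.index(max(r)) for r in resp]
```

Unlike K-means, each pixel keeps a soft membership in `resp`; the hard labels are only taken at the end.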
Describe the GraphCuts method for interactive image segmentation.
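GraphCuts models the image as a graph with pixels as nodes. Edges from each pixel to the two terminal nodes (foreground and background) carry the cost of assigning that pixel to each segment, and edges between neighboring pixels carry the cost of giving them different labels. In the interactive setting, pixels marked by the user act as fixed foreground or background seeds. The minimum cut on the graph, computed using max-flow algorithms, determines the optimal segmentation.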
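A toy version of the construction, on a single row of pixels, may make the graph concrete. The squared-difference data costs, the constant smoothness cost, and the Edmonds-Karp max-flow routine are assumptions chosen for brevity; practical systems use specialized max-flow solvers on 2-D grids.

```python
from collections import deque

def min_cut_segment(pixels, fg_mean, bg_mean, smoothness):
    """Label a 1-D row of pixels as foreground (1) or background (0) via min cut."""
    n = len(pixels)
    S, T = n, n + 1  # source = foreground terminal, sink = background terminal
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i, p in enumerate(pixels):
        # Cutting S->i puts pixel i on the background side, so that edge
        # carries the cost of the background label, and vice versa for i->T.
        cap[S][i] = (p - bg_mean) ** 2
        cap[i][T] = (p - fg_mean) ** 2
    for i in range(n - 1):
        # Neighbor edges: cost paid when adjacent pixels get different labels.
        cap[i][i + 1] = cap[i + 1][i] = smoothness

    def residual_bfs():
        """Breadth-first search over edges with remaining capacity."""
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = residual_bfs()
        if T not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= flow
            cap[w][u] += flow
    # Pixels still reachable from the source lie on the foreground side of the cut.
    return [1 if i in parent else 0 for i in range(n)]

row = [9, 9, 4, 9, 1, 1, 6, 1]  # hypothetical intensities
no_smoothing = min_cut_segment(row, fg_mean=9, bg_mean=1, smoothness=0)
smoothed = min_cut_segment(row, fg_mean=9, bg_mean=1, smoothness=10)
```

With `smoothness=0` every pixel is labeled independently by its cheaper data cost; raising it makes isolated label flips more expensive than following the neighbors.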
What is a Markov Random Field (MRF), and how is it applied in image segmentation?
An MRF is a probabilistic model that represents the spatial dependencies between pixels in an image. It is used in segmentation to enforce spatial coherence by modeling the interactions between neighboring pixels.
Explain the K-nearest neighbor (KNN) algorithm in the context of image recognition.
KNN classifies a pixel or region by finding the K nearest training examples in the feature space and assigning the most common label among them. It is simple, but because each query is compared against the stored training examples, it can be computationally intensive for large datasets.
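A brute-force version of this vote fits in a few lines. The function name `knn_classify`, the Euclidean distance, and the toy two-dimensional features and labels are assumptions for illustration.

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    """Classify `query` by majority vote among its k nearest training examples.

    `examples` is a list of (feature_vector, label) pairs.
    """
    # Brute force: measure the distance to every stored example, keep the k closest.
    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features for two classes.
training = [
    ((1, 1), "sky"), ((1, 2), "sky"), ((2, 1), "sky"),
    ((8, 8), "grass"), ((9, 8), "grass"), ((8, 9), "grass"),
]
label_a = knn_classify((2, 2), training)
label_b = knn_classify((7, 9), training)
```

The full sort over the training set is what makes each query expensive; spatial indexes or approximate search are common ways to reduce that cost.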
What are the challenges in specific object recognition?
Challenges include variations in viewpoint, illumination, occlusion, clutter, and intra-class variation. Effective recognition algorithms must handle these variations to accurately identify objects.
Describe the concept of visual words in object recognition.
Visual words are formed by quantizing local image descriptors into a discrete vocabulary, analogous to words in a text. This representation allows efficient matching and recognition of objects by comparing histograms of visual word occurrences.
What is the Bag of Words (BoW) model in object category recognition?
The BoW model represents an image by the distribution of visual words it contains. It ignores spatial information but allows efficient and scalable classification by comparing histograms of word occurrences.
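The quantize-then-count pipeline can be sketched as below. The tiny hand-written vocabulary, the two-dimensional "descriptors", and histogram intersection as the similarity measure are all assumptions for illustration; in practice the vocabulary is usually learned by clustering descriptors from training images.

```python
import math
from collections import Counter

def quantize(descriptor, vocabulary):
    """Return the index of the nearest visual word in the vocabulary."""
    return min(range(len(vocabulary)), key=lambda w: math.dist(descriptor, vocabulary[w]))

def bow_histogram(descriptors, vocabulary):
    """Normalized histogram of visual word occurrences for one image."""
    counts = Counter(quantize(d, vocabulary) for d in descriptors)
    return [counts[w] / len(descriptors) for w in range(len(vocabulary))]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical vocabulary of three visual words and descriptors from three images.
vocabulary = [(0, 0), (10, 0), (0, 10)]
image_a = [(1, 0), (0, 1), (9, 1), (1, 9)]
image_b = [(0, 2), (1, 1), (11, 0), (2, 10)]
image_c = [(10, 1), (9, 0), (11, 1), (1, 10)]

hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
hist_c = bow_histogram(image_c, vocabulary)
sim_ab = histogram_intersection(hist_a, hist_b)
sim_ac = histogram_intersection(hist_a, hist_c)
```

Images A and B contain the same mix of words and so match perfectly even though their descriptors differ, which shows both the robustness of the representation and its disregard for where the words occur.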
Explain the sliding window approach in object detection.
The sliding window approach scans the image with a window at different scales and positions, applying a classifier to each window to decide whether it contains an object. It is computationally expensive, since the classifier runs once per window, but widely used in practice.
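The scan can be sketched as a generator of windows plus a loop that applies a classifier to each one. The square windows, the stride, the scale set, and the toy "all pixels bright" classifier are assumptions for illustration; a real detector would plug in a trained classifier.

```python
def sliding_windows(width, height, base=4, scales=(1, 2), stride=2):
    """Yield (x, y, size) for square windows over a width x height image."""
    for s in scales:
        size = base * s
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield x, y, size

def detect(image, classifier, **window_args):
    """Run `classifier` on every window and collect the positive ones."""
    height, width = len(image), len(image[0])
    hits = []
    for x, y, size in sliding_windows(width, height, **window_args):
        patch = [row[x:x + size] for row in image[y:y + size]]
        if classifier(patch):
            hits.append((x, y, size))
    return hits

# Hypothetical 8x8 binary image with a bright 4x4 square at (2, 2).
image = [[1 if 2 <= x < 6 and 2 <= y < 6 else 0 for x in range(8)] for y in range(8)]

def all_bright(patch):
    """Stand-in classifier: fires only when every pixel in the patch is 1."""
    return all(v == 1 for row in patch for v in row)

hits = detect(image, all_bright)
num_windows = sum(1 for _ in sliding_windows(8, 8))
```

Even this tiny image produces ten windows for one object; the count grows quickly with image size, finer strides, and more scales.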
Describe the Random Sample Consensus (RANSAC) algorithm and its role in robust matching.
RANSAC repeatedly selects a random subset of data points, fits a model to it, and counts the inliers, that is, the points that fit the model within a tolerance. The model with the most inliers is kept, so the parameters are estimated robustly while outliers are discarded.
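A minimal sketch of the sample, fit, and count loop, here fitting a line y = a*x + b to points contaminated with outliers. The function name `ransac_line`, the iteration count, the tolerance, and the toy data are assumptions for illustration; in matching applications the model would typically be a geometric transformation between images.

```python
import random

def ransac_line(points, iters=200, tol=0.5, seed=0):
    """Robustly fit y = a*x + b by keeping the model with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        # Sample the minimal subset needed to define a line: two points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Count the points that agree with this candidate within the tolerance.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# Ten points on y = 2x + 1 plus three gross outliers (hypothetical data).
points = [(x, 2 * x + 1) for x in range(10)] + [(3, 20), (7, -5), (5, 30)]
model, inliers = ransac_line(points)
```

A common follow-up, omitted here, is to refit the model to all inliers of the best candidate with least squares.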
Write the formula for updating cluster centers in K-means clustering.
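A standard form of this update, in notation assumed here rather than taken from the lecture ($C_k$ is the set of pixels currently assigned to cluster $k$, $x_i$ is the feature vector of pixel $i$, and $\mu_k$ is the cluster center):

```latex
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```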
Provide the formula for the probability of a pixel belonging to a Gaussian component in MoG.
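A standard form of this posterior, in notation assumed here rather than taken from the lecture ($\pi_k$, $\mu_k$, and $\Sigma_k$ are the mixing weight, mean, and covariance of component $k$, and $K$ is the number of components):

```latex
P(k \mid x_i) = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                     {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
```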
Write the formula for the cost function in GraphCuts segmentation.
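A commonly used form of this energy, in notation assumed here rather than taken from the lecture ($l_p$ is the label of pixel $p$, $D_p$ is the data cost of that label, $V_{pq}$ is the smoothness cost between neighboring pixels $p$ and $q$, $\mathcal{N}$ is the set of neighboring pairs, and $\lambda$ weights the two terms):

```latex
E(l) = \sum_{p} D_p(l_p) + \lambda \sum_{(p,q) \in \mathcal{N}} V_{pq}(l_p, l_q)
```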