From Object Detection to Instance Segmentation Flashcards

1
Q

What are the levels of computer vision tasks?

A

Low-level (e.g., edge detection), mid-level (e.g., segmentation), and high-level tasks (e.g., object detection).

2
Q

What is the difference between semantic and instance segmentation?

A

Semantic segmentation assigns class labels to each pixel, while instance segmentation also distinguishes between different objects of the same class.
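The distinction can be made concrete with a toy example. The sketch below is illustrative only: the 4x4 scene, the class id, and the helper names `semantic_map` and `instance_map` are all made up here, and plain Python lists are used to keep it dependency-free.

```python
# Toy 4x4 scene: two objects of the same class (class id 1) on background (0).
# Each instance is given as a set of (row, col) pixels; all values are made up.
instances = [
    {"class_id": 1, "pixels": {(0, 0), (0, 1), (1, 0), (1, 1)}},
    {"class_id": 1, "pixels": {(2, 2), (2, 3), (3, 2), (3, 3)}},
]
H, W = 4, 4


def semantic_map(instances, height, width):
    """Per-pixel class ids: both objects end up with the same label."""
    grid = [[0] * width for _ in range(height)]
    for inst in instances:
        for r, c in inst["pixels"]:
            grid[r][c] = inst["class_id"]
    return grid


def instance_map(instances, height, width):
    """Per-pixel instance ids (1, 2, ...): each object keeps its own label."""
    grid = [[0] * width for _ in range(height)]
    for idx, inst in enumerate(instances, start=1):
        for r, c in inst["pixels"]:
            grid[r][c] = idx
    return grid


sem = semantic_map(instances, H, W)
ins = instance_map(instances, H, W)
print(sem[0][0], sem[3][3])  # same class id for both objects
print(ins[0][0], ins[3][3])  # different instance ids
```

In the semantic map the two objects are indistinguishable; only the instance map tells them apart.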

3
Q

What is the goal of R-CNN?

A

To detect and classify objects in an image by drawing bounding boxes and assigning labels.

4
Q

What are the steps involved in R-CNN?

A

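Generate region proposals (Selective Search)

Warp each proposal to the fixed input size the CNN expects

Extract features from each warped proposal using a CNN

Classify each proposal using class-specific SVMs

Refine bounding boxes with a regression step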

5
Q

What are the main drawbacks of R-CNN?

A

It is computationally expensive because the CNN runs separately on each of roughly 2000 region proposals per image, and Selective Search is a fixed, non-learnable algorithm.

6
Q

How does Fast R-CNN improve on R-CNN?

A

It runs the CNN once per image to create a shared feature map, then applies RoI pooling to extract features for each region proposal from that map, which avoids recomputing convolutions per proposal.

7
Q

What is RoI Pooling?

A

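A method that extracts a fixed-size feature map from an arbitrarily sized region of the shared feature map, by dividing the region into a fixed grid of bins and max-pooling each bin.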
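A minimal sketch of the idea, assuming a single-channel feature map stored as a list of rows and an RoI already expressed in feature-map coordinates. The function name `roi_max_pool` and the integer bin-edge rule are choices made for this illustration, not a reference implementation.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one RoI of a 2D feature map into an out_h x out_w grid.

    feature_map: list of rows (2D, single channel).
    roi: (r0, c0, r1, c1) in feature-map cells, end-exclusive.
    """
    r0, c0, r1, c1 = roi
    roi_h, roi_w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_h):
        # Integer bin edges; every bin covers at least one cell.
        rs = r0 + (i * roi_h) // out_h
        re = max(rs + 1, r0 + ((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            cs = c0 + (j * roi_w) // out_w
            ce = max(cs + 1, c0 + ((j + 1) * roi_w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# Toy 6x6 feature map whose value at (r, c) is r * 6 + c.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]

# Two RoIs of different shapes both come out as 2x2.
print(roi_max_pool(fmap, (0, 0, 4, 6), 2, 2))  # [[8, 11], [20, 23]]
print(roi_max_pool(fmap, (1, 1, 6, 4), 2, 2))  # [[13, 15], [31, 33]]
```

Because the output shape depends only on `out_h` and `out_w`, the layers that follow can have a fixed input size regardless of the proposal's shape.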

8
Q

What are the advantages and disadvantages of Fast R-CNN?

A

Pros: Faster than R-CNN by sharing computations.
Cons: Still relies on slow Selective Search for proposals.

9
Q

What is the innovation in Faster R-CNN?

A

It introduces a Region Proposal Network (RPN) to generate region proposals directly from feature maps.

10
Q

How does the RPN work?

A

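It slides a small network over the feature map. At each location it places anchors of different scales and aspect ratios and predicts an objectness score and box offsets for each. During training, anchors are labeled by their overlap (IoU) with ground-truth boxes.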
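The anchor and overlap mechanics can be sketched as follows. This is a simplified illustration, not the full RPN: there is no learned network, only anchor generation at one location and overlap-based labeling. The helper names and the 0.7 / 0.3 thresholds are assumptions for this example (those values are commonly cited for Faster R-CNN training).

```python
def make_anchors(cx, cy, scales, ratios):
    """Return (x1, y1, x2, y2) anchors centered at (cx, cy).

    Each anchor has area scale**2 and aspect ratio width / height = ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * r ** 0.5
            h = s / r ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """1 = object, 0 = background, -1 = ignored during training."""
    labels = []
    for a in anchors:
        best = max((iou(a, g) for g in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best <= neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)
    return labels


# One sliding-window location, 2 scales x 3 aspect ratios = 6 anchors.
anchors = make_anchors(50, 50, scales=[32, 64], ratios=[0.5, 1.0, 2.0])
gt = [(18, 18, 82, 82)]  # made-up 64x64 ground-truth box
labels = label_anchors(anchors, gt)
print(labels)
```

Only the anchor whose scale and shape match the ground-truth box is labeled as an object; the smaller anchors become background, and the partially overlapping ones are ignored.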

11
Q

What happens after the RPN stage?

A

RoI pooling extracts features for each proposal, and the Fast R-CNN detection head classifies each proposal into an object class and refines its bounding box.

12
Q

What is U-Net?

A

A convolutional neural network architecture for semantic segmentation that combines a downsampling (contracting) path and an upsampling (expanding) path, linked by skip connections.

13
Q

What are the benefits of U-Net?

A

It preserves location information by passing high-resolution encoder features to the decoder, and it supports variable-sized inputs due to the absence of dense layers.

14
Q

What is Mask R-CNN?

A

An extension of Faster R-CNN that adds a mask prediction branch for each RoI, enabling pixel-level segmentation of object instances.

15
Q

What is the output of Mask R-CNN?

A

Bounding boxes, class labels, and binary masks for each object instance.
