03 - Datasets Flashcards
1
Q
Famous datasets
A
- MNIST Digits
- ImageNet (organized by the WordNet hierarchy)
- BraTS (brain tumor segmentation)
2
Q
Types of datasets
A
- classification: 1 image = 1 class label
- regression: 1 image = 1 or more predicted decimal (continuous) values
- segmentation: 1 pixel = 1 label
- detection: group of pixels = 1 object
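A minimal sketch of what a label could look like for each dataset type, assuming a toy 4x4 grayscale image stored as a NumPy array. All variable names, values and the box format (x_min, y_min, x_max, y_max) are made up for illustration.

```python
import numpy as np

# Hypothetical toy image: 4x4 grayscale
image = np.zeros((4, 4), dtype=np.float32)

# classification: one class index for the whole image
cls_label = 3

# regression: one or more continuous values for the whole image
reg_label = np.array([0.25, 1.5], dtype=np.float32)

# segmentation: one class index per pixel, same height and width as the image
seg_label = np.zeros(image.shape, dtype=np.int64)
seg_label[1:3, 1:3] = 1  # a 2x2 region belongs to class 1

# detection: one entry per object, here a bounding box plus a class index
det_label = [{"box": (1, 1, 3, 3), "cls": 1}]
```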
3
Q
Common issues with datasets
A
- imbalance
- too homogeneous
- bias
- too few examples
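One possible way to spot imbalance and too few examples is to count samples per class. This is a sketch with made-up labels; the helper name class_report and the ratio it returns are illustrative choices, not a standard metric.

```python
import numpy as np

def class_report(labels):
    """Return per-class counts and the largest/smallest count ratio."""
    classes, counts = np.unique(labels, return_counts=True)
    per_class = dict(zip(classes.tolist(), counts.tolist()))
    ratio = counts.max() / counts.min()
    return per_class, float(ratio)

# Made-up labels: 6 samples of class 0, 2 samples of class 1
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1])
per_class, ratio = class_report(labels)
```

A ratio far above 1 hints at imbalance; very small counts hint at too few examples for that class.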
4
Q
Augmenting datasets
A
- transform = shift, zoom, rotate, intensity, color, shear
- other = noise, blurring
- /!\ decide how much distortion is acceptable (the label must still be valid after the transform)
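A minimal sketch of three of the augmentations above (shift, intensity, noise) using NumPy only. It assumes a grayscale image with values in [0, 1]; the function name augment and the parameters max_shift, max_gain and noise_std are hypothetical knobs that control how much distortion is applied. The shift wraps pixels around the border, which is a simplification.

```python
import numpy as np

def augment(image, rng, max_shift=2, max_gain=0.2, noise_std=0.05):
    """Apply a random shift, intensity scaling and Gaussian noise."""
    # shift: move the image by up to max_shift pixels (wraps around edges)
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))
    # intensity: scale brightness by a factor in [1 - max_gain, 1 + max_gain]
    out = out * rng.uniform(1.0 - max_gain, 1.0 + max_gain)
    # noise: add zero-mean Gaussian noise to every pixel
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    # keep values in the assumed [0, 1] range
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((8, 8))  # made-up image
augmented = augment(image, rng)
```

Smaller parameter values give milder distortion; what is acceptable depends on the task.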
5
Q
Baseline
A
= a simple, easily implemented and understood model that illustrates the problem and the ‘worst case scenario’ for a model that learns nothing
- nearest neighbor (1 or k) is usually a good baseline
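A minimal sketch of a nearest neighbor baseline (1 or k) in NumPy, assuming flattened feature vectors, squared Euclidean distance and non-negative integer class labels. The function name knn_predict and the toy data are made up for illustration.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=1):
    """Predict the majority label among the k closest training samples."""
    # squared Euclidean distances, shape (n_test, n_train)
    diffs = test_x[:, None, :] - train_x[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # indices of the k nearest training samples for each test sample
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]  # shape (n_test, k)
    # majority vote per test sample
    return np.array([np.bincount(row).argmax() for row in votes])

# Made-up 2D points with two classes
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.05, 0.1], [1.0, 0.9]])
pred = knn_predict(train_x, train_y, test_x, k=1)
```

Any model worth keeping should score better than this baseline on the same test data.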