Medical Image Computing Flashcards

1
Q

What are the main steps in medical image formation?

A

Generate energy → Energy transfer (focus, collimate, …) → Tissue interaction → Energy collection → Image formation/reconstruction

2
Q

Explain the main steps for medical image formation based on a modality of your choice.

A

US: Ultrasonic waves are generated, penetrate the body and interact with the tissue (reflection); the device collects the reflected waves and generates the US image
X-ray: Electromagnetic waves (in the tube, the electron beam is focused magnetically; the X-ray beam is shaped with shutters/collimators) pass through the patient; interaction with tissue: different tissue types have different absorption coefficients (Beer–Lambert law); the transmitted radiation is collected with a detector (e.g. flat-panel detector) and the image is formed by processing the sensor data

3
Q

What are the characteristics of the electromagnetic energy spectrum? Which bands are used for medical imaging?

A

Two perpendicular oscillating fields (electric field and magnetic field)
Bands: 1. Radio (MRI, ~50 MHz), 2. Visible (optical imaging, ~600 THz), 3. X-ray (~300 PHz), 4. Gamma (~30 EHz; PET, SPECT)

4
Q

What other energy sources are used for medical imaging? Give an example.

A

Acoustic: aeroacoustics, microphones measure sound pressure at different locations
Ultrasonic: US waves
Electronic: sensor placed behind a camera lens to translate an image into an electronic signal

5
Q

What are usual imaging resolution ranges in medical imaging?

A

Imaging resolution: MRI > CT > US > OCT
Imaging penetration depth: MRI = CT > US > OCT
For lungs of Covid-19 patients: CT, due to depth and resolution

6
Q

Which modalities are useful for full-body imaging?

A

Radiography/X-ray
Flat-panel detectors are used to convert X-rays to light photons, which are then measured by CCD/CMOS sensors (indirect measurement)
+ Radiography: relatively low cost, deep penetration with good resolution, wide availability and standards
- CT: higher radiation exposure than plain X-ray (images taken from different angles): carcinogenic
- MR: expensive, takes very long + noisy -> discomfort
- US: cumbersome, since body contact would be necessary everywhere; time-consuming

7
Q

What is functional imaging? Give an example.

A

= Imaging methods to visualize the function of organs/tissue
* Information on changes, e.g. metabolism, blood flow (e.g. detection of cancer)
* As opposed to structural imaging, which depicts specific areas within a certain tissue or organ
* Often use tracers or probes whose spatial distribution within the body reflects binding
Examples: Functional MRI, PET/SPECT, EEG, Optical imaging, Angiography

8
Q

Which imaging modality would you use for imaging the lungs, e.g. of Covid-19 patients? Give a rationale.

A

CT
+ Good penetration depth to resolution ratio
+ Series of X-ray images from different angles
+ Imaging time in CT is less than 5 minutes
+ CT to visualize moving objects, e.g. heart, lung
+ Increase in detector elements, results in higher imaging speeds / higher resolution
+ In modern CT scanners, images consist of 512 × 512 pixels representing the CT number in Hounsfield units (HU); lung: -600 to -900 HU, allowing differentiation between air and water in the lungs

9
Q

What is multi-modal imaging and how is it used for nuclear imaging?

A

= Combination of different modalities
* Each modality is suited to certain tasks / body parts
* Provides complementary information
Examples ▪ CT/PET ▪ MR/PET ▪ OCT/PAM
E.g. a detailed CT image of the area with a PET image (tracer binds to tumour cells for representation) overlaid on top

10
Q

What are the main characteristics of optical imaging methods? When would you use them?

A

Light and special properties of photons are used to probe tissue; similar to ultrasound but using electromagnetic waves
Measurement of reflection, e.g. using interferometry
+ Non-ionizing radiation (visible, ultraviolet, and infrared light) => no exposure to harmful radiation => used for repeated procedures to monitor progression of disease
+ Structural and functional imaging early (detect metabolism changes => markers of abnormal functioning)
+ Usually, non-invasive diagnostic tool
+ Combination with other imaging techniques
Examples: OCT (Optical Coherence Tomography), Endoscopy, Photoacoustic imaging…

11
Q

What are the main properties of a digitized image or volume?

A
  1. Image elements: pixel (2D), voxel (3D)
  2. Image depth = range of values for image function: 8, 12 or 16 bit (2^8 …)
  3. Dimension
    Representation as: Surface plot, numerical array, visual intensity array
12
Q

What makes medical image formats special compared to conventional consumer image formats?

A

Compressed formats (JPEG) are not very popular due to possible loss of diagnostic information
* Large images (>8 bit): need to be handled by special image-processing programs
* 3D imaging formats need details: slice thickness, distance between slices
* Information on relevant patient data/image acquisition is not provided in conventional image header formats
* Structured link from patient to image
Example: .tif

13
Q

Explain the DICOM format: Which information can be stored? How is the information organized? What are application entities and how do they communicate?

A

“Digital Imaging and Communications in Medicine” = container for medical imaging data (image exchange, visualization and acquisition)
* 3D images of digital radiography, CT, MRI, US, …
* Link image data to medical procedure, image-acquisition parameters (Patient 3D position, sizes, orientations, slice thickness, radiation doses and exposures, image processing filters)
* Encoding of medical data (e.g. patient name, current patient diagnosis)
* Real-world data are viewed by DICOM as objects with attributes
▪ Image height – „Rows“ attribute
▪ Image width – „Columns“ attribute
▪ Image pixel data – „Pixel data“ attribute
▪ Unique identifier
Communication: Captured data can be transmitted and processed between DICOM devices and software – Application Entities (AE)
▪ AEs provide services to each other
▪ Entity: Clearly identifiable object through which information is stored or processed
▪ Relationship between AEs: Service Class Provider (SCP) → Service Class User (SCU)

14
Q

Which information systems are usually involved in a hospital radiological workflow?

A
  • PACS: Picture archiving and communication system: consists of medical image and data acquisition, storage, and display subsystems
  • HIS: Hospital information system
  • RIS: Radiology information system
  • MDR: Medical data record software
    Tasks: Patient scheduling, billing, financials, reports, worklists, labs, …
  • HL7 standard: Messaging and document implementation in XML
15
Q

How can telemedicine be used in the area of medical imaging? Give a scenario.

A

People living in rural areas need a CT scan of the lung
A facility is available but no expert for image evaluation/diagnosis/therapy
* Tele-diagnosis: „Point-of-care“ diagnosis and treatment: Ability to test and treat patients rapidly at sites close to where they live
* Tele-consultations: Physician in rural area sends medical information such as CT, MRI scans to specialist in a distant location and receives advice
* Tele-management: Immediate management care to the patient: receive information, monitoring of advised therapy (over two instances) …

16
Q

What types of digital images are usually used in medical image computing? Give an example for each type.

A
  • Not used in medical imaging: binary image (1 bit)
  • Greyscale image (N bit): usually 8 bits: breast X-ray
  • Colour image (n x N bit): usually 24 bits (8 bits for each colour): PET scan of the brain
  • Multichannel image (n x N bit): information outside the normal human perception range (infrared, ultraviolet, X-ray) => processed, used in medical research: e.g. dividing the intensity range of brain CT images
17
Q

What are point and neighbourhood image transformations and what are they used for?

A

Transform an image, e.g. for extension or compression of intensity levels
Point image transformations: only the intensity and location of a single pixel are processed; surrounding pixels are irrelevant
* E.g. histogram, thresholding, greyscale transformation, contrast stretching
Neighbourhood image transformations: an area is processed, usually the centre pixel and the surrounding pixels within a specific distance
* E.g. correlation, convolution, morphological operators, spatial filtering

18
Q

What are image histograms?

A

Provide the frequency of grey values
Horizontal axis → grey values
Vertical axis → number of pixels with that grey value
Give information about contrast and brightness (dark or bright). Aim of equalization: create an image with equally distributed brightness levels over the whole brightness scale. ➔ Enhance the contrast!
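As a sketch (assuming NumPy and a toy 2-bit image, not lecture data), the histogram is just a count per grey value:

```python
import numpy as np

# Toy 3x3 image with grey values 0..3.
img = np.array([[0, 0, 1],
                [1, 1, 2],
                [2, 3, 3]], dtype=np.uint8)

# Horizontal axis: grey values 0..3; vertical axis: pixel counts.
hist = np.bincount(img.ravel(), minlength=4)
print(hist)  # [2 3 2 2]
```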

19
Q

What is spatial filtering and how is a spatial filter defined? Give an example of a spatial filter and what is it used for?

A

= Neighbourhood image transformation
* Compute a function of the local neighbourhood of each pixel in the image
* Function = filter kernel (called mask, template, window) saying how to combine values from neighbours
Used for:
▪ Image enhancement (denoise, resize, etc.)
▪ Extract information (texture, edges, etc.)
▪ Pattern recognition (template matching)
Example: Moving average 2D = smoothing (compressing data)
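A minimal NumPy sketch of the 2D moving average (toy data; edge padding is one of several possible border choices):

```python
import numpy as np

def moving_average_2d(img, k=3):
    """Neighbourhood transformation: replace each pixel by the mean of its k x k window."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

# A single bright spike gets spread over its 3x3 neighbourhood (smoothing).
spike = np.zeros((5, 5))
spike[2, 2] = 9.0
out = moving_average_2d(spike)
print(out[2, 2])  # 1.0
```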

20
Q

How can the Fourier transform be applied to medical images and what is it used for?

A

Fourier transform = weighting function: represents a (not necessarily periodic) function as an integral of weighted sines/cosines
Convolution and correlation can be done efficiently in the Fourier domain
* The transform takes a complex-valued function (images are complex-valued functions with zero imaginary component) and returns a complex-valued function
* Represents the image as a sum of sinusoids
* The Fourier transform considers the phase and magnitude of the complex function
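The claim that convolution can be done in the Fourier domain can be checked with a toy NumPy sketch (note that `fft2`-based filtering implements *circular* convolution):

```python
import numpy as np

# Random toy image and a simple 2-tap horizontal averaging kernel.
img = np.random.default_rng(0).random((8, 8))
kernel = np.zeros((8, 8))
kernel[0, 0] = 0.5
kernel[0, 1] = 0.5

# Convolution theorem: multiply spectra, transform back.
spatial = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel)))

# The same circular convolution computed directly in the spatial domain.
direct = 0.5 * img + 0.5 * np.roll(img, 1, axis=1)
print(np.allclose(spatial, direct))  # True
```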

21
Q

What are image artifacts?

A

= false images, or parts of images, that do not represent true anatomic structures
* Noise
* Intensity inhomogeneity
* Motion
But: “Higher noise, higher contrast – lower noise, lower contrast”

22
Q

How can you find edges in an image?

A

Edges = changes in pixel intensity
* Rate of change found by the derivative
* Usually: neighbourhood processing
* Example algorithms: Sobel, Canny edge detection

Find edges in 2D images:
* Smooth first to remove noise
* Calculate image gradient
* Gradient points in direction of most rapid change in intensity
* First derivatives show where the edges are, Zero crossings of second derivatives can be used to extract them
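A sketch of the gradient computation with a Sobel kernel on a synthetic step edge (assuming NumPy; the mask is applied as correlation, as usual for gradient filters):

```python
import numpy as np

# Vertical step edge: intensity jumps from 0 to 1 at column 3.
img = np.zeros((5, 5))
img[:, 3:] = 1.0

# Sobel kernel for the horizontal derivative.
sx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)

def correlate_valid(img, k):
    """Slide the mask over the image ('valid' region only, no padding)."""
    h, w = k.shape
    out = np.zeros((img.shape[0] - h + 1, img.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * k)
    return out

gx = correlate_valid(img, sx)
print(gx[1])  # large positive response where intensity rises
```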

23
Q

Explain the main steps in the Canny Edge detection algorithm.

A

(1) Smooth image with Gaussian filter to remove noise
(2) Compute gradient magnitude and angle images
(3) Apply non-maxima suppression to the gradient images
(4) Use double thresholding (hysteresis) and connectivity analysis to detect and link edges

24
Q

What is the Hough transform and what is it used for?

A

= feature extraction technique (edge detection)
* Locate shapes by linking edge segments in images
* Extract lines, circles and ellipses (or conic sections)

25
Q

You have to binarize an image based on intensity only. How do you proceed to perform the task by setting a manual threshold? Which approach would you choose for an automated segmentation for this problem?

A

Manual approach for the threshold: histogram -> minimum between the modes = threshold
Possibly smoothing first to remove noise

Automated approach: Otsu method: Uses grey-value histogram of the given image as input and aims at providing the best threshold in a sense that the overlap between two classes (set of object and background pixels) is minimized
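A minimal NumPy implementation of the Otsu idea, on toy bimodal data (maximizing between-class variance, which is equivalent to minimizing the within-class overlap described above):

```python
import numpy as np

def otsu_threshold(img):
    """Pick the threshold maximizing between-class variance of the grey-value histogram."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()          # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        m0 = (np.arange(t) * p[:t]).sum() / w0     # class means
        m1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Toy bimodal image: half the pixels at grey value 10, half at 200.
img = np.array([10] * 50 + [200] * 50, dtype=np.uint8)
print(otsu_threshold(img))  # 11
```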

26
Q

How do active contours work? What are they used for?

A

= Allow model-based segmentation based on regional parameters
To segment an image where the data are distorted by noise or artefacts

  • Represent boundary as a parametric curve
  • The curve is associated to an energy function E
  • Snake smoothly follows high intensity gradients if they reliably reflect the object boundary
  • Internal forces: Designed to keep the model smooth during deformation
  • External forces: Move the model towards an object boundary or other desired feature within an image
  • Energy minimization based on internal (contour properties) and external energy (image)
27
Q

What is k-means clustering?

A

= Method for seeking clusters in data based on observations, e.g. intensity

28
Q

How can images be represented as graphs? Which graph-based segmentation methods do you know?

A

Representation can help to find features (i.e. edges)
Image graph:
* Segmentation of the image as weighted, (un)directed graph
* Pixels are nodes
* Edge is connection between pairs of nearby pixels
* Every node is connected to its 4 (or 8) x-y neighbours
Goal of segmentation using graphs: partition the nodes into subsets, where similarity within a subset is high and between subsets is low

Examples: Graph cut image segmentation, Watershed segmentation, Cell segmentation, Dijkstra‘s shortest path algorithm (Compute minimum cost path from one seed to all other pixels)

29
Q

How are segmentation algorithms evaluated? What are useful metrics?

A

Based on execution time and correctness of the result:
* Accuracy: ability of the method to mirror standard of reference measurements or diagnosis.
* Precision: ability of the method to provide repeatable measurements i.e. low variability due to noise, different acquisition procedure
* Specificity: Ratio of negative cases correctly classified as negative
* Sensitivity: Ratio of true cases correctly classified as true
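The last two metrics follow directly from confusion-matrix counts; toy numbers for illustration:

```python
# Hypothetical confusion-matrix counts for a segmentation/classification result.
tp, fn, fp, tn = 40, 10, 5, 45

sensitivity = tp / (tp + fn)   # true cases correctly classified as true
specificity = tn / (tn + fp)   # negative cases correctly classified as negative
accuracy = (tp + tn) / (tp + fn + fp + tn)
print(sensitivity, specificity, accuracy)  # 0.8 0.9 0.85
```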

30
Q

What are typical applications for image registration?

A

Stitching: MR/MR, … of separate times, Photo stitching
Multi-modal imaging: PET/CT, Augmented reality / image guided surgery

31
Q

What are geometric transformations and what kinds of transformations are usually defined for image registration?

A

= Modify the spatial arrangement of pixels in an image, e.g. map features in one image to another
Two basic operations:
▪ Spatial transformation of coordinates
▪ Intensity interpolation
Kinds of transformations:
▪ Rigid: translation, rotation
▪ Affine (combined in a single matrix): translation (along a translation vector), rotation (around the origin), scaling (isotropic or anisotropic: different for x & y), shear (tilting)
▪ Deformable/non-rigid: all transformations possible (deformation)
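Rigid/affine transformations are conveniently composed as homogeneous matrices; a NumPy sketch with a hypothetical rotation followed by a translation:

```python
import numpy as np

# Rigid transform in homogeneous coordinates: rotate 90 degrees about the
# origin, then translate by (2, 0) -- both combined into one matrix.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
T = np.array([[1, 0, 2],
              [0, 1, 0],
              [0, 0, 1]])

M = T @ R                      # single matrix combining both operations
p = np.array([1.0, 0.0, 1.0])  # point (1, 0) in homogeneous coordinates
print(np.round(M @ p, 6))      # (1,0) -> rotated to (0,1) -> shifted to (2,1)
```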

32
Q

Which two general approaches exist for finding a geometric transformation? What are their pros and cons?

A

Feature-based registration:
* Find correspondence between image features: points, lines, contours
* Algorithms based on:
▪ Distance between corresponding points
▪ Similarity metric between feature values, e.g. curvature-based registration

Intensity-based registration:
* Align the entire images to match up the colours/grey value of as many pixels as possible
* No landmark or feature selection is necessary
* Algorithms: registration by minimizing/maximizing a measure such as:
▪ Intensity difference
▪ Correlation techniques
▪ Ratio image uniformity
▪ Partitioned intensity uniformity
* Needed: Pixel-by-pixel error measure, Mapping technique (transformation), Minimization technique

33
Q

What are 2D (joint) histograms?

A

Images can be viewed as probability distributions; calculated via joint histograms
Frequency of corresponding intensity pairs can be interpreted in terms of probabilities
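A toy NumPy sketch: `np.histogram2d` over corresponding pixel pairs, normalized to a joint probability distribution:

```python
import numpy as np

# Corresponding intensity values from two aligned toy "images".
a = np.array([0, 0, 1, 1])
b = np.array([0, 0, 1, 2])

# Joint histogram: count how often each intensity pair occurs.
joint, _, _ = np.histogram2d(a, b, bins=[2, 3], range=[[0, 2], [0, 3]])

# Normalizing turns counts into joint probabilities.
p = joint / joint.sum()
print(p)  # [[0.5 0. 0.] [0. 0.25 0.25]]
```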

34
Q

Which similarity measures do you know for image registration?

A

Similarity measure = quantifies degree of similarity between intensity patterns in two images
Multimodal
* Mutual information
* Normalized mutual information
Same modality
* Joint entropy
* Cross-correlation
* Sum of squared intensity differences
* Ratio image uniformity
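Mutual information can be estimated from the joint histogram; a sketch assuming NumPy (the bin count is an arbitrary choice here):

```python
import numpy as np

def mutual_information(a, b, bins=8):
    """MI from the joint histogram: sum of p(x,y) * log(p(x,y) / (p(x) p(y)))."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of image a
    py = pxy.sum(axis=0, keepdims=True)   # marginal of image b
    nz = pxy > 0                          # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
img = rng.random(1000)
mi_same = mutual_information(img, img)            # perfectly registered
mi_diff = mutual_information(img, rng.random(1000))  # unrelated "images"
print(mi_same > mi_diff)  # True: aligned images share more information
```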

35
Q

Given a defined similarity measure, how can the optimal parameters be found to minimize the criterion?

A

Using gradient descent
Repeat:
▪ Compute the gradient
▪ Make a step in the negative gradient direction (not too small, not too big)
▪ Update the mapping equation
▪ Remap the image
Until convergence
▪ Ideally when the gradient = 0
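The loop above, sketched on a hypothetical 1-D criterion E(t) = (t - 3)^2 standing in for a similarity measure as a function of a translation t:

```python
# Gradient descent sketch: minimize E(t) = (t - 3)^2, whose minimum is at t = 3.
def grad(t):
    return 2 * (t - 3)

t, step = 0.0, 0.1
for _ in range(200):
    t -= step * grad(t)   # step in the negative gradient direction
print(round(t, 4))  # 3.0
```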

36
Q

Give an application for when to use free-form deformation image registration.

A

= Transformation of the moving image based on a non-rigid deformation
Example: contrast-enhanced breast MR
Transformation in 3D:
* Motion of the breast is non-rigid
* Approach: develop a combined transformation which consists of a global and a local transformation
* Global part captures the overall motion of the breast: affine transformation in 3D: rigid (6 degrees of freedom) + scaling and shearing (6 degrees of freedom)

37
Q

What is feature extraction and which steps are involved? Give an example.

A

= process of defining a set of features, or image characteristics, which will most efficiently or meaningfully represent the information that is important for analysis
Consists of
▪ Feature detection, i.e. finding features such as corners
▪ Feature description, i.e. quantification of attributes such as corner orientation
Example: extract a malignant lesion from a CT scan of the lung

38
Q

What is a feature vector and a feature space? What are invariance and co-variance in the context of features?

A
  • Feature vectors
    ▪ Package/container for numerical descriptors
    ▪ Column vector (d x 1) or row vector (1 x d)
    E.g. each pixel is a 3D (RGB) colour feature vector (R value, G value, B value)
    Or (length; mean; variance; intensity) = 4x1 = f_B1
  • Feature space
    ▪ Contains a point cloud of feature vectors
    ▪ d-dimensional
    Example: (f_B1, f_B2, f_B3, f_B4, f_B5) = 4x5 Matrix
  • Invariance
    = Features should be invariant (insensitive) to variations (“invariance to intensity transformations”)
    ▪ in: scale, translation, rotation, illumination and viewpoint
    The value of the feature does not change after the application of the given transformation family => features(transform(image)) = features(image)

Co-variance:
= Feature changes by the same amount (“Covariance with geometric transformations”)
▪ E.g. Feature Area is covariant with scaling
▪ E.g. Feature Direction is covariant with rotation
Covariant detection: If we have two transformed versions of the same image, features should be detected in corresponding locations => features(transform(image)) = transform(features(image))

39
Q

What are chain codes and how can they be extracted?

A

= Representation of boundary by a set of connected straight lines with defined direction and length
* Offers a unified way to analyse the shape of a boundary
* Separately encodes each connected component
* Application: compression (lossless), recognition
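A minimal sketch of 4-connectivity chain coding (the direction numbering here is one common convention, not necessarily the lecture's):

```python
# Chain code sketch, 4-connectivity: 0 = right, 1 = up, 2 = left, 3 = down.
DIRS = {(0, 1): 0, (-1, 0): 1, (0, -1): 2, (1, 0): 3}  # (row, col) steps

def chain_code(points):
    """Encode consecutive boundary steps as direction numbers."""
    return [DIRS[(r2 - r1, c2 - c1)]
            for (r1, c1), (r2, c2) in zip(points, points[1:])]

# Walk clockwise around a tiny square boundary, back to the start.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(chain_code(square))  # [0, 3, 2, 1]
```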

40
Q

Your task is to analyse a network of blood vessels. How would you extract relevant information?

A

Method: Medial representations/Skeletonization:
Thinning (Medial axis transform):
* Original shape can be fully reconstructed
* Set of points that are equidistant from the boundary
* Like boundaries, skeletons/blood vessels are related to the shape of a region
* Reduce a region to a tree or a graph

41
Q

What are geometrical features and how are they extracted? Give an example.

A
  • Length: number of pixels along the boundary
  • Diameter = max distance where p1 and p2 are points on the boundary
  • Major axis = line segment of length equal to the diameter and connecting two points on the boundary
  • Minor axis = line perpendicular to the major axis and of such length until intersection of the boundary
  • Basic rectangle (bounding box): rectangle aligned with the major and minor axes that completely encloses the boundary
  • Eccentricity is the ratio of major to minor axis
42
Q

Which basic regional descriptors do you know?

A

Basic descriptors: area, perimeter, compactness, mean value, circularity, effective diameter, eccentricity, normalized area

43
Q

Which approaches do you know to describe image/region texture?

A

Statistical approaches
▪ Smooth, coarse, grainy, regularity …
Spectral approaches
▪ Global periodicity
▪ Fourier spectrum: High energy, narrow peaks, …

44
Q

What is the biggest limitation of textural features based on histograms only? How can this limitation be overcome?

A

The histogram does not carry information about the spatial relationship of pixels

Need to take into account relative positions of pixels => Haralick features
➔ Grey-level co-occurrence matrix (GLCM)

  • Matrix G that records the number of times intensity pairs occur in a specified relative position
  • Position specified with a position operator Q
    ▪ E.g. Q = “one pixel immediately to the right”
    ▪ E.g. Q = “one pixel to the right and one pixel above”
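A minimal NumPy sketch of building G for Q = "one pixel to the right" (the position operator and grey-level count are illustrative):

```python
import numpy as np

def glcm(img, levels, dr=0, dc=1):
    """Grey-level co-occurrence matrix for position operator Q = (dr, dc)."""
    G = np.zeros((levels, levels), dtype=int)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                G[img[r, c], img[r2, c2]] += 1  # count the intensity pair
    return G

# Toy 2-level image.
img = np.array([[0, 0, 1],
                [0, 1, 1]])
print(glcm(img, 2))  # [[1 2] [0 1]]
```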
45
Q

What are Haralick features?

A

= Set of 14 textural features which can be extracted from the co-occurrence matrix
* Contain information about image texture characteristics such as homogeneity, linearity, contrast
* Four directions of adjacency, calculate GLCM for each direction

46
Q

How do local binary patterns work?

A
  • Operator returns a discrete value at each pixel that characterizes the local texture, partially invariant to luminance changes
  • Compares the eight neighbourhood pixel intensities to the centre pixel intensity:
    ▪ 0 if the intensity is less than the centre pixel
    ▪ 1 if the intensity is greater than or equal to the centre pixel
  • Orientation invariant
    ▪ Shift to minimum
  • Reduce classes by aggregation
    ▪ E.g. uniform vs. non-uniform: 00000000 vs. 01010101
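A sketch of the LBP code for the centre pixel of a 3x3 patch (assuming NumPy; the clockwise bit ordering is one convention, and the rotation-invariant "shift to minimum" step is omitted):

```python
import numpy as np

def lbp_code(patch):
    """LBP of the centre pixel: 1 where neighbour >= centre, read clockwise from top-left."""
    c = patch[1, 1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2),
             (2, 2), (2, 1), (2, 0), (1, 0)]  # clockwise neighbours
    bits = [1 if patch[r, cc] >= c else 0 for r, cc in order]
    return int("".join(map(str, bits)), 2)

patch = np.array([[9, 1, 1],
                  [1, 5, 1],
                  [1, 1, 9]])
print(lbp_code(patch))  # 0b10001000 = 136
```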
47
Q

What is a design matrix?

A

A design matrix is a matrix containing data about multiple characteristics of several individuals or objects. Each row corresponds to an individual and each column to a characteristic.
Gather the feature descriptors for a single image, which then can be used for further processing (e.g. k-means) => here just one feature vector for each object (4x5x1)
Features on one axis, classes/samples on other axis

48
Q

What is the principal component analysis? What is it used for?

A

1. Given example vectors (design matrix)
2. Calculate the covariance matrix and its eigenvector decomposition
3. Represent the data in the new space with reduced dimensionality (also possible with an autoencoder)

➔ Reduce the dimensions of data in space => large feature vectors (get rid of useless/redundant information)
▪ Feature selection (visualization)
▪ Shape normalization
▪ Shape modelling
▪ Classification
* To visualize high-dimensional datasets
* Each example is represented by a vector containing the values for all features (e.g. 3 for colour vectors)
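The three steps above as a NumPy sketch on synthetic, nearly one-dimensional 2-D data:

```python
import numpy as np

# Design matrix: rows are examples, columns are features.
rng = np.random.default_rng(0)
t = rng.random(100)
X = np.column_stack([t, 2 * t + 0.01 * rng.standard_normal(100)])  # nearly 1-D

Xc = X - X.mean(axis=0)            # centre the data
C = np.cov(Xc, rowvar=False)       # covariance matrix ...
vals, vecs = np.linalg.eigh(C)     # ... and its eigendecomposition (ascending)
proj = Xc @ vecs[:, -1:]           # project onto the top eigenvector: 2-D -> 1-D

print(vals[-1] / vals.sum() > 0.99)  # one component explains almost all variance
```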

49
Q

What are whole-image features? What are their typical applications?

A

Features extracted from whole images
* Corner detection
Corners = junctions of contours / large local gradient in multiple directions
Change of intensity for the displacement of a patch
Harris corner detection:
▪ Look for a local patch that produces a noticeable difference when moved around
▪ Change of intensity for the displacement [u,v]
* SIFT features
Applications
▪ Image matching (registration, mosaicking): patches containing a corner have distinct local features
▪ Seed point selection

50
Q

What are SIFT features? What are their properties?

A

= Scale-Invariant Feature Transform
= Local histogram-based descriptor
Outline: Input = Image, Output = feature vector
* Detection of keypoints (= SIFT features)
* Refine selection of keypoints
* Describe each keypoint using a feature vector
Properties of SIFT feature descriptor:
* Invariant to uniform scaling, orientation, illumination changes, and partially invariant to affine distortion
* Various features are often combined
* Useful in the application of image registration/matching, mosaicking
Preferred method for:
* Scale changes, rotations, changes in illumination and/or viewpoint
* Applications: object recognition, image stitching, 3D modeling, gesture recognition, video tracking, individual identification, …

51
Q

What are the limitations of hand-designed programs? How can machine learning overcome these limitations?

A

Limitation: hard to maintain for complex problems
Machine learning:
➔ Allows learning from data and improving the program in fluctuating environments
➔ Can help humans learn, by inspecting what the models have learned (although this can be tricky for some algorithms)
➔ Gives insights into complex problems and large amounts of data

52
Q

What is the difference between supervised and unsupervised learning? Give a machine learning algorithm for each of the types.

A

Unsupervised learning: no need for human supervision; training data is unlabelled; the model identifies structures like clusters
E.g. k-means, principal component analysis (PCA), association rule learning, GANs
Supervised learning: human supervision; training data fed to the algorithm includes known labels; the model learns decision boundaries and replicates the labelling
E.g. k-nearest neighbours, linear and logistic regression, support vector machines (SVM), decision trees and random forests, neural networks

53
Q

What is the definition of learning algorithms? Give an example for experience, task and performance.

A

… are able to learn from data.
„A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P improves with experience E.“
Task: how the system should process an example
E.g. Classification, Regression, Denoising, Anomaly detection
Performance: evaluate the abilities of the ML algorithm
E.g. ROC curve, Confusion matrix (with FP, FN, TP, TN, accuracy, specificity, sensitivity)
Experience: kind of experience during learning
E.g. supervised, unsupervised, design matrix, …

54
Q

Which datasets are usually involved to find the parameters of a machine learning model? What is a useful split for these datasets?

A

Distinct datasets:
* Training set: Data used to train the model
* Tuning set: Used to evaluate the performance of different models and used to find the hyperparameter values
* (Validation) test set: Used to evaluate the performance with unseen data
* (External validation test set)

55
Q

Assuming only a rather small dataset is available for your classification task. How can you evaluate the model thoroughly? How can you increase the dataset?

A

Cross-validation: if only a small dataset is available => split the dataset into training and test sets randomly multiple times
Most common: k-fold cross validation
* Dataset is split into k non-overlapping subsets
* k trials are performed
* Average test error of k trials is taken
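A sketch of producing k non-overlapping folds (plain Python; index lists only, shuffling omitted for clarity):

```python
# k-fold cross-validation sketch: k non-overlapping test folds over n examples;
# each trial trains on k-1 folds and tests on the held-out one, then errors are averaged.
def k_fold_indices(n, k):
    folds = []
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]  # near-equal sizes
    start = 0
    for s in sizes:
        folds.append(list(range(start, start + s)))
        start += s
    return folds

folds = k_fold_indices(10, 3)
print(folds)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```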

56
Q

What is the curse of dimensionality?

A

= Performance can degrade when the number of features is increased (with a small number of examples)

➔ Rule of thumb: number of examples / number of features > 10

57
Q

What is a classifier? What is over- and under fitting?

A

A classifier in machine learning is an algorithm that automatically orders or categorizes data into one or more of a set of “classes”.
Different classifiers take different approaches to deriving a boundary from training data; a classifier should:
* Capture the distribution of the training data faithfully
* Be fast during training
* Be fast during classification of new data
* Deal with noise in the training data
* Deal with imperfect feature extractors
Over-fitting: the classifier is too faithful to the training data and no longer reflects the underlying distribution (low bias - high variance)
➔ Regularization
Under-fitting: the classifier reflects neither the training data nor the underlying distribution (high bias - low variance)
➔ Possibly a classifier with more parameters is needed

58
Q

How do nearest neighbour classifiers work?

A

Assign the class label based on the training example with the lowest distance: NN classifier (also k-nearest neighbour, minimum distance classifiers)
Principle:
* Training set of labelled prototypes and classes
* An example is to be classified
* Find the prototype which is closest to it
* The example gets the label of that prototype
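A minimal nearest-neighbour classifier sketch (assuming NumPy; the prototypes and class names are made up):

```python
import numpy as np

def nn_classify(prototypes, labels, x):
    """Assign x the label of the closest training prototype (Euclidean distance)."""
    d = np.linalg.norm(prototypes - x, axis=1)
    return labels[int(np.argmin(d))]

# Hypothetical 2-D feature prototypes with their class labels.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
labels = ["background", "tissue", "lesion"]

print(nn_classify(prototypes, labels, np.array([4.2, 4.9])))  # lesion
```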

59
Q

What are the advantages of decision trees? What are the design considerations when constructing decision trees?

A

Feature vectors are mainly numerical vectors (e.g. length, width, intensity)
Advantages:
* Easy to understand and interpret
* Fast
* Versatile and powerful (but: minor changes in the training set can cause major changes in the classification)
Design considerations (CART):
1. Branching factor: how many splits should be attached to a node, two (binary) or more?
2. Query selection: which feature should be tested at a node?
3. Stopping criteria: when should a node become a leaf?
4. Pruning: if the tree gets too big, how can it be made smaller?
5. Classification: if a leaf is not pure, how is a decision made?

60
Q

What is ensemble learning? Give an example of a machine learning algorithm based on ensemble learning.

A
  • Aggregated predictions of a group of predictors (regressors, classifiers) are often superior to the best individual predictor
  • A group of predictors is called an ensemble → ensemble learning
    Example: group of decision trees – random forest
61
Q

Which dimensionality reduction methods do you know?

A

Autoencoders, principal component analysis (PCA)

62
Q

How is Bayes’ rule applied in pattern recognition?

A

X is a feature, Y is a class membership
* We want to know P(Y|X) => the probability that an example with feature X belongs to class Y
* Conditional probability: if we know P(X|Y), P(Y) and P(X), we can compute P(Y|X) using Bayes’ rule
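A toy numeric application of Bayes' rule with hypothetical screening-test probabilities:

```python
# Bayes' rule: P(Y|X) = P(X|Y) * P(Y) / P(X), with made-up numbers.
p_disease = 0.01                # prior P(Y)
p_pos_given_disease = 0.95      # likelihood P(X|Y)
p_pos_given_healthy = 0.05

# Evidence P(X) via the law of total probability.
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.161
```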

63
Q

What is a perceptron and how does it work? What are the limitations?

A

= Special case of a linear, binary classifier
* Assuming an n-dimensional feature vector and two classes
* Find a discriminant function which codes the class labels
* Goal: find weights and bias

  • Guaranteed to converge if the data are linearly separable

Limitations: good for linearly separable classes ➔ fails for the typical XOR problem (not linearly separable); solution: multiple perceptrons
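The perceptron learning rule sketched on the linearly separable AND problem (assuming NumPy; labels coded as ±1):

```python
import numpy as np

# AND problem: only (1, 1) belongs to the positive class.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])

w, b = np.zeros(2), 0.0
for _ in range(20):                    # converges for linearly separable data
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:     # misclassified -> perceptron update
            w += yi * xi
            b += yi

pred = np.sign(X @ w + b)
print(pred)  # [-1. -1. -1.  1.]
```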

64
Q

What is the definition of a neural network? Explain the basic architecture of feedforward neural networks.

A

= Interconnected perceptron-like computing elements called artificial neurons
* Interconnected neurons are organized in layers, where the output of one layer provides the input of the following layer (hence a sensitive activation function affects all subsequent layers)
* The activation function makes a neuron “fire” and must be differentiable
“Feedforward to next layer”: a neuron in hidden layer l feeds all neurons in layer l+1
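The layer-by-layer structure can be made concrete with a forward pass through a tiny network. Weights and input are made up; the sigmoid serves as the differentiable activation.

```python
# Sketch (pure Python, made-up weights): forward pass through a
# feedforward network with one hidden layer and one output neuron.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """Fully connected layer: weights[j] holds neuron j's weights."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

x = [0.5, -1.0]                                      # input pattern
h = layer(x, [[0.4, 0.3], [-0.2, 0.6]], [0.1, 0.0])  # hidden layer
y = layer(h, [[1.0, -1.0]], [0.2])                   # output layer
print(y)  # a single activation in (0, 1)
```

Each call to `layer` consumes the previous layer's output, which is exactly the "feedforward to next layer" structure described above.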

65
Q

Which parameters define a neural network? How are neural networks trained?

A
  • Weights W
  • Biases B
  • Activation functions h
    Training: use sets of training patterns to estimate these parameters
    Tool “backpropagation”:
    1. Input the pattern vectors
    2. Forward pass through the network and determine the classification error
    3. Backward (backpropagation) pass that feeds the output error back to estimate the required changes
    4. Update the weights and biases in the network
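The four steps above can be sketched for the simplest possible case, a single sigmoid neuron trained by gradient descent on made-up data (a full network applies the same chain rule layer by layer):

```python
# Sketch (pure Python, toy data): forward pass, error, backward pass,
# and update for one sigmoid neuron with squared-error loss.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

data = [(0.0, 0), (1.0, 1)]  # (input, target) pairs
w, b, lr = 0.0, 0.0, 1.0     # weight, bias, learning rate

def loss():
    return sum((sigmoid(w * x + b) - t) ** 2 for x, t in data)

before = loss()
for _ in range(100):
    for x, t in data:
        y = sigmoid(w * x + b)   # 1-2. forward pass
        err = y - t              #      output error
        grad = err * y * (1 - y) # 3. backward pass (chain rule)
        w -= lr * grad * x       # 4. update weight and bias
        b -= lr * grad
after = loss()
print(before > after)  # True: training reduces the squared error
```

Backpropagation in a multi-layer network differs only in that step 3 propagates `grad` through every layer's weights in reverse order.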
66
Q

What are convolutional neural networks and what are the differences to feedforward neural networks?

A

Convolutional neural networks:
* Have additional convolutional layers and pooling layers before the fully connected network => purpose: automatically detecting features

  • Accept (training) images directly as inputs
  • No features pre-defined by human experts needed → CNNs learn directly from raw image data
    FFNs: images must be converted to vectors (patterns organized in feature vectors), losing the spatial relationships
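The convolution itself, which preserves the spatial relationships an FFN loses, can be sketched as a single kernel sliding over a toy image (both kernel and image are made up):

```python
# Sketch (pure Python): one feature map produced by sliding a 3x3
# kernel over an image; all output positions share the same weights
# and bias, which is the core idea of a convolutional layer.

def conv2d(image, kernel, bias=0.0):
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            s = sum(image[i + a][j + c] * kernel[a][c]
                    for a in range(k) for c in range(k))
            row.append(s + bias)
        out.append(row)
    return out

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge_kernel = [[-1, 0, 1]] * 3  # responds to vertical edges
print(conv2d(image, edge_kernel))  # strong response at the 0->1 edge
```

During training, the kernel weights themselves are learned, so the network discovers which features (edges, textures, shapes) are useful.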
67
Q

What is a convolutional layer? What is pooling and what are common pooling methods?

A

CNN layer: collection of feature maps (one feature map is computed from receptive fields, i.e. neighbourhoods of the image, with shared weights and a single bias)

Pooling = downsampling to reduce the spatial resolution (saving memory and computing power)
* Typically 2x2, no overlap
* Pooling methods: average, max-pooling, L2 pooling
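Max-pooling, the most common of these methods, can be sketched on a made-up 4x4 feature map:

```python
# Sketch (pure Python): non-overlapping 2x2 max-pooling with stride 2
# halves the spatial resolution of a feature map.

def max_pool_2x2(fmap):
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
print(max_pool_2x2(fmap))  # [[4, 2], [2, 8]]
```

Average pooling would replace `max` with the block mean; in both cases a 4x4 map shrinks to 2x2, quartering the memory and computation of later layers.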

68
Q

What are the main differences between different deep CNN networks? Name an architecture mentioned in the lecture which you would use for the task of biomedical image segmentation?

A

Deep CNN networks: differ in architecture and make use of pre-trained models
U-Net: convolutional network for biomedical image segmentation
→ works with only a few annotated images

69
Q

What is generative modelling?

A

Goal: Given training data set, generate new samples from same distribution
* We train a model from samples drawn from a distribution
* It learns an estimate of this distribution

70
Q

What is the goal of unsupervised learning? What are the applications of unsupervised methods?

A

Learn some underlying structure of data
Examples: clustering, dimensionality reduction, density estimation, …

71
Q

Compare autoencoders against generative adversarial networks. What are the main differences?

A

Autoencoders
= Neural networks capable of learning efficient representations of the input data
* Encoder (recognition network) → decoder (generative network)
* Used for: dimensionality reduction, feature detection, generative modelling
GAN
= Make neural networks compete with each other, pushing them to excel
* Generator and discriminator
* Used for generative modelling: anomaly detection, data augmentation, super-resolution

72
Q

What are the principal steps in image-guided therapy? Which imaging modalities and which image processing steps are mainly involved?

A

= Aims to use imaging to improve the localization and targeting of diseased tissue, and to monitor and control treatments
Used in image-guided surgery and interventional radiology
Steps:
* Prior to the procedure, diagnostic imaging is performed
* Images are converted and modelled, e.g. in 3D, to represent the patient’s anatomy
* The information is used for:
pre-operative surgical planning
intraoperative surgical decision making

73
Q

How can medical imaging improve personalized medicine?

A

Individual treatment monitoring and endpoints – image-guided therapy (e.g. macular edema recurrence)
* Dose optimization, optimal efficacy
* Timely treatment
* Prediction of treatment response

Main challenge: find predictive markers for future disease progression and treatment response

74
Q

What is radiomics? What are the main steps involved? What is currently the bottleneck in radiomics?

A

= High-throughput extraction and analysis (mining) of radiological image data
* Extraction of quantitative features from imaging data such as MR or CT
* Bridge between medical imaging and personalized medicine
* Features are associated with predictive goals: diagnosis, prognosis
Motivation: image guidance to support clinical decisions (radiation, surgery, treatment)
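The feature-extraction step can be sketched with simple first-order features (intensity statistics) computed from a made-up image patch; real radiomics pipelines add many more texture and shape features.

```python
# Sketch (pure Python, made-up patch): first-order radiomic features,
# i.e. intensity statistics extracted from a region of interest.

import math

def first_order_features(patch):
    vals = [v for row in patch for v in row]
    n = len(vals)
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / n
    # Entropy of the intensity histogram
    counts = {}
    for v in vals:
        counts[v] = counts.get(v, 0) + 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": var, "entropy": entropy}

patch = [[10, 10, 50], [10, 50, 50], [90, 90, 90]]
print(first_order_features(patch))
```

In a radiomics study, such feature vectors from many patients would then be mined, e.g. with the classifiers discussed earlier, for associations with diagnosis or prognosis.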