TG273 - recommendations on AI flashcards
issues with most of the studies done so far
high risk of bias
goal of TG273
bring attention to proper training and validation of machine learning algorithms
tips for data collection
-data collected should represent the intended-use population
-use public databases to increase the amount of available data
data augmentation
create alterations of the training data or create synthetic data to increase the training set size
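For example, a minimal NumPy sketch of simple geometric and noise augmentations; the specific transforms and parameters are arbitrary illustrations, not TG273 requirements:

```python
import numpy as np

def augment(image, rng):
    """Return a randomly altered copy of a 2-D image array.

    The transforms (flips, 90-degree rotations, mild additive noise)
    and their parameters are illustrative choices only.
    """
    out = image.copy()
    if rng.random() < 0.5:                          # random horizontal flip
        out = np.fliplr(out)
    if rng.random() < 0.5:                          # random vertical flip
        out = np.flipud(out)
    out = np.rot90(out, k=rng.integers(0, 4))       # random 90-degree rotation
    out = out + rng.normal(0.0, 0.01, out.shape)    # mild additive noise
    return out

# Example: expand a small stand-in training set by a factor of 5
rng = np.random.default_rng(seed=0)
images = [np.random.rand(64, 64) for _ in range(10)]
augmented = [augment(img, rng) for img in images for _ in range(5)]
```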
data harmonization
Data may include images obtained at different sites, acquired with different equipment and image-acquisition parameters, and reconstructed and/or post-processed using different algorithms. These differences may result in systematic variations across images. Data harmonization aims to reduce these variations retrospectively after acquisition while preserving the biological variability captured in the images
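As a toy illustration of retrospective harmonization, a per-site z-score standardization of extracted features is sketched below; real harmonization methods (e.g., ComBat) are more involved and additionally model known biological covariates:

```python
import numpy as np

def harmonize_by_site(features, site_ids):
    """Per-site z-score standardization of extracted image features.

    features : (n_samples, n_features) array
    site_ids : (n_samples,) array of site labels

    Each site's features are shifted and scaled to zero mean and unit
    variance, removing gross site-level offsets while keeping the
    within-site variation. Illustrative sketch only.
    """
    features = np.asarray(features, dtype=float)
    site_ids = np.asarray(site_ids)
    out = np.empty_like(features)
    for site in np.unique(site_ids):
        mask = site_ids == site
        mu = features[mask].mean(axis=0)
        sd = features[mask].std(axis=0) + 1e-8   # avoid division by zero
        out[mask] = (features[mask] - mu) / sd
    return out
```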
reference standard
defining a reference standard can be subjective; increasing the number of readers who define the standard helps reduce subjectivity
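For instance, one simple way to combine several readers into a single reference label is a majority vote (a hypothetical sketch, not the TG273 procedure):

```python
from collections import Counter

def consensus_label(reader_labels):
    """Majority-vote consensus from several readers' labels for one case.

    reader_labels : list of labels, e.g. ["malignant", "benign", "malignant"]
    Ties are returned as "indeterminate" so they can be adjudicated separately.
    """
    counts = Counter(reader_labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "indeterminate"
    return counts[0][0]

print(consensus_label(["malignant", "benign", "malignant"]))  # -> "malignant"
```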
annotations
level of annotation granularity, or detail, depends on the task (see the sketch after this list)
-entire image
-region based
-pixel based
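A hypothetical example of how each granularity level might be represented for one image; the classes and coordinates are made up:

```python
import numpy as np

image = np.zeros((256, 256))                  # stand-in image

# Image-level annotation: one label for the entire image
image_label = "pneumonia"

# Region-level annotation: bounding box (row_min, col_min, row_max, col_max)
region_label = {"box": (40, 60, 120, 150), "class": "consolidation"}

# Pixel-level annotation: a segmentation mask the same size as the image
pixel_mask = np.zeros_like(image, dtype=np.uint8)
pixel_mask[40:120, 60:150] = 1                # 1 = annotated structure, 0 = background
```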
methods for acquiring annotations
-expert labels: subjective reference standard from human domain experts
-electronic health record
-crowdsourcing: switch from domain experts to many less experienced users; not recommended for general use
-phantoms
defining a true positive is difficult
3 partitions for dataset
-training, validation, and test sets
-validation set is used as part of training (e.g., model selection and hyperparameter tuning)
-testing is done on a completely independent dataset
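A minimal sketch of the three-way split using scikit-learn; the 70/15/15 fractions and the stand-in data are illustrative choices, not TG273 requirements:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 1000 cases with 20 features each and binary labels.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)

# Hold out 15% as the independent test set, then carve a validation set
# out of the remainder (roughly a 70/15/15 split overall).
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.15 / 0.85, random_state=0)

# The validation set guides development; the test set is touched only once,
# at the end. When multiple images come from the same patient, split by
# patient ID (e.g., GroupShuffleSplit) so the test set stays truly independent.
```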
types of supervised learning
In supervised learning, a model is trained to map input data to output data based on explicit examples of the desired input-output pairs, as provided by the user.
Semi-supervised learning algorithms exploit a combination of labeled and unlabeled data. In this case, the model is given some guidance about the desired outcome, but the annotations do not need to be as detailed or extensive as those used with supervised learning.
Self-supervised learning can exploit large unlabeled datasets for feature representation and has a regularizing effect on the learned features.
Unsupervised learning refers to a class of algorithms that can autonomously learn from data without reference to any labels or any instruction from the user.
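A tiny contrast of the two ends of this spectrum, using scikit-learn on stand-in data (the specific models are arbitrary): supervised learning fits explicit input-output pairs, while unsupervised learning receives no labels at all.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = np.random.rand(200, 16)              # stand-in image features
y = np.random.randint(0, 2, size=200)    # stand-in labels

# Supervised: the model is fit to explicit input-output pairs (X, y).
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Unsupervised: the model organizes X on its own, with no labels.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
```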
transfer learning
Transfer learning in DCNNs is commonly implemented by training a network on one task and then “transferring” the parameters (or weights) from the trained model to initialize the network for a new task, rather than randomly initializing it (also known as “training from scratch”).
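A common way this is done in practice, sketched with torchvision (assumes torchvision 0.13 or newer for the `weights` API; the two-class head is a made-up example):

```python
import torch.nn as nn
from torchvision import models

# Start from ImageNet-pretrained weights instead of random initialization.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Optionally freeze the transferred weights so only the new head is trained.
for p in model.parameters():
    p.requires_grad = False

# Replace the final fully connected layer for the new task, then fine-tune.
model.fc = nn.Linear(model.fc.in_features, 2)
```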
multi-task learning
Multi-task learning is a special type of transfer learning in which a DCNN is trained to jointly learn interrelated tasks, as opposed to addressing each task sequentially
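A minimal PyTorch-style sketch of the idea: one shared backbone feeds two task-specific heads, and both losses are combined into a single training objective. The layers and the two example tasks are invented for illustration:

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared feature extractor (backbone) with two task-specific heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classify_head = nn.Linear(16, 2)   # e.g., benign vs. malignant
        self.regress_head = nn.Linear(16, 1)    # e.g., lesion size

    def forward(self, x):
        feats = self.backbone(x)
        return self.classify_head(feats), self.regress_head(feats)

# Joint training: the losses of both tasks are summed into one objective.
net = MultiTaskNet()
x = torch.randn(4, 1, 64, 64)
logits, size = net(x)
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 0, 1])) \
       + nn.functional.mse_loss(size.squeeze(1), torch.rand(4))
loss.backward()
```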
federated learning
Federated learning is a distributed machine learning approach that enables collaborative training on decentralized datasets [121-124]. Each site trains the model locally with its own dataset and then only the trained model parameters are shared, thus producing a global model benefiting from access to a large corpus of data without requiring data sharing and without posing risks to patient privacy. There are, however, several open-ended questions with regard to federated learning that are relevant to medical imaging [125, 126]. In particular, there is no formalized training protocol yet to guarantee that the performance of a model trained with federated learning is comparable to that of a centrally trained model with access to all the data [127]. Also unknown is (1) the extent to which local model overfitting negatively impacts the global model, and (2) the tradeoff between access to more data through a federated process versus traditional learning with a fully controlled dataset.
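A sketch of one aggregation round in the FedAvg style, assuming PyTorch models at each site; the function and variable names are hypothetical:

```python
import copy
import torch

def federated_average(global_model, site_models, site_sizes):
    """One FedAvg-style aggregation round (illustrative sketch only).

    Each site trains a copy of the global model on its own data; only the
    resulting parameters are sent back and combined as a weighted average,
    so no patient images ever leave the sites.
    """
    total = float(sum(site_sizes))
    new_state = copy.deepcopy(global_model.state_dict())
    for key in new_state:
        new_state[key] = sum(
            m.state_dict()[key] * (n / total)
            for m, n in zip(site_models, site_sizes))
    global_model.load_state_dict(new_state)
    return global_model

# Usage sketch: three hospital sites with different dataset sizes.
global_model = torch.nn.Linear(10, 1)
site_models = [copy.deepcopy(global_model) for _ in range(3)]
# ... each site would train its copy locally on its own data here ...
global_model = federated_average(global_model, site_models, site_sizes=[500, 200, 300])
```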
continuous learning
Continuous or "life-long" learning emulates the human ability to continuously learn and adapt as new data are presented [128, 129]. Theoretically, continuously learning AI systems can accelerate model optimization and continuously improve their performance by taking advantage of new data presented during clinical use. In practice, adaptive training of shallow and deep neural networks using incrementally available data generally results in rapid overriding of their weights, a phenomenon known as "interference" or "catastrophic forgetting."
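A toy sketch of naive incremental updating plus one simple, commonly used mitigation (a replay buffer that mixes stored older cases into each update); this is an illustration, not a TG273-endorsed training protocol:

```python
import torch
import torch.nn as nn

# Without a safeguard, repeatedly fine-tuning on only the newest data drifts
# the weights toward recent cases (catastrophic forgetting). Mixing in a
# sample of earlier data at each update is one simple mitigation.
model = nn.Linear(20, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
replay_x, replay_y = [], []            # small store of previously seen cases

def incremental_update(x_new, y_new, replay_fraction=0.5):
    x, y = x_new, y_new
    if replay_x:                                   # mix in stored old examples
        k = int(len(x_new) * replay_fraction)
        idx = torch.randint(0, len(replay_x), (k,))
        x = torch.cat([x_new, torch.stack([replay_x[i] for i in idx])])
        y = torch.cat([y_new, torch.stack([replay_y[i] for i in idx])])
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
    replay_x.extend(x_new)                         # remember the new cases
    replay_y.extend(y_new)

# New data arriving over time during clinical use (stand-in random data).
for _ in range(5):
    incremental_update(torch.randn(32, 20), torch.randint(0, 2, (32,)))
```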