General Cards Flashcards

1
Q

Classification

A

A supervised learning task where the goal is to assign input data points to predefined discrete categories or classes

2
Q

Regression

A

A supervised learning task where the goal is to predict a continuous numerical output value for a given input data point

3
Q

Supervised Learning

A

A machine learning paradigm where an algorithm learns from labeled data (input-output pairs) to map inputs to outputs

4
Q

Feature Scaling

A

A preprocessing step which transforms the range of features to a similar scale, which can be essential for some machine learning models

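A minimal sketch of two common scalers in Python, assuming scikit-learn and NumPy are installed; the data matrix is invented for illustration:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two features on very different scales
X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])

# Standardization: zero mean, unit variance per feature
print(StandardScaler().fit_transform(X))

# Min-max scaling: each feature mapped to [0, 1]
print(MinMaxScaler().fit_transform(X))
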
5
Q

Bias-Variance Decomposition

A

A way to analyze the generalization error of a model by breaking it down into noise, bias squared, and variance

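In symbols, the standard form of the decomposition for squared error at a point x (notation chosen here: f is the true function, \hat{f} the trained model, \sigma^2 the irreducible noise variance):

\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\sigma^2}_{\text{noise}}
  + \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{variance}}
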
6
Q

Generalization

A

The ability of a trained machine learning model to perform well on new, unseen data

7
Q

Overfitting

A

A phenomenon where a model learns the training data too well, including the noise, leading to poor performance on new data

8
Q

Regularization

A

Techniques used to reduce overfitting by adding a penalty to the model’s loss function, which discourages overly complex models

9
Q

L1 Regularization (LASSO)

A

A type of regularization that adds a penalty proportional to the absolute value of the model’s coefficients, often leading to sparse models

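A minimal sketch of L1-induced sparsity, assuming scikit-learn and NumPy; the data and the alpha value are invented for illustration:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two features are informative
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # most coefficients come out exactly zero: a sparse model
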
10
Q

L2 Regularization (Ridge)

A

A type of regularization that adds a penalty proportional to the squared value of the model’s coefficients, shrinking them towards zero

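A minimal sketch contrasting ridge shrinkage with unregularized least squares, assuming scikit-learn and NumPy (the alpha value is arbitrary):

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([4.0, -3.0, 2.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=50)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)
print(ols.coef_)    # unpenalized coefficients
print(ridge.coef_)  # shrunk towards zero, but typically not exactly zero
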
11
Q

Support Vector Machine (SVM)

A

A supervised learning model which aims to find a hyperplane to separate data points with the largest margin between classes

12
Q

Margin (SVM)

A

The distance between the separating hyperplane and the closest data points

13
Q

Kernel Trick

A

A technique used in kernel methods, such as SVMs, which implicitly maps data into a higher-dimensional feature space via kernel functions.

Allows for learning of non-linear decision boundaries without explicitly computing the transformation

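A minimal sketch, assuming scikit-learn: on the non-linearly-separable "moons" data, an RBF-kernel SVM fits a boundary a linear SVM cannot:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)
print(linear.score(X, y))  # a straight line cannot separate the two moons
print(rbf.score(X, y))     # markedly better, via the implicit feature map
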
14
Q

Kernel Function

A

A function which computes the inner product between two data points in a potentially high-dimensional feature space

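A minimal sketch of the RBF kernel computed directly, assuming NumPy and scikit-learn (gamma is arbitrary); it matches scikit-learn's rbf_kernel:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(x, z, gamma=0.5):
    # k(x, z) = exp(-gamma * ||x - z||^2): an inner product in an
    # infinite-dimensional feature space that is never built explicitly
    return np.exp(-gamma * np.sum((x - z) ** 2))

x, z = np.array([1.0, 2.0]), np.array([2.0, 0.0])
print(rbf(x, z))
print(rbf_kernel(x.reshape(1, -1), z.reshape(1, -1), gamma=0.5))  # same value
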
15
Q

Decision Tree

A

A tree-like supervised learning model where each internal node represents a test on an attribute, each branch represents the outcome of that test, and each leaf node represents a class label (classification) or prediction (regression)

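A minimal sketch, assuming scikit-learn (the iris dataset ships with it); export_text prints the learned attribute tests and leaf labels:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))  # internal nodes test a feature; leaves assign a class
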
16
Q

Ensemble Learning

A

A machine learning technique that combines the predictions of multiple models to improve overall performance and robustness

17
Q

Bagging (Bootstrap Aggregating)

A

An ensemble learning technique where each model is trained on different bootstrap samples of the training data, and their predictions are aggregated

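A minimal sketch, assuming scikit-learn; BaggingClassifier defaults to decision trees as base learners, each trained on its own bootstrap sample:

from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
# 50 trees, each fit on a bootstrap resample; predictions are majority-voted
bag = BaggingClassifier(n_estimators=50, random_state=0)
print(cross_val_score(bag, X, y).mean())
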
18
Q

Boosting

A

An ensemble learning technique where models are trained sequentially, with each new model attempting to correct the mistakes of the previous models

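A minimal sketch using AdaBoost, one classic boosting algorithm, assuming scikit-learn (the dataset and parameters are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
# Each successive weak learner upweights the examples earlier ones got wrong
boost = AdaBoostClassifier(n_estimators=100, random_state=0)
print(cross_val_score(boost, X, y).mean())
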
19
Q

Random Forest

A

An ensemble learning method that builds multiple decision trees and merges their predictions to get a more accurate and stable prediction, using bootstrapped samples and randomly selected features for each tree

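A minimal sketch, assuming scikit-learn; note the per-feature importances the randomized trees produce:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.feature_importances_)  # contribution of each feature to the splits
print(forest.score(X, y))
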
20
Q

Gradient Boosting

A

A boosting algorithm that builds trees sequentially, where each new tree is fit to the residual errors of the previous trees so that the overall loss is progressively minimized

“building a strong learner through the combination of multiple weak learners”

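A minimal sketch of gradient-boosted regression trees, assuming scikit-learn (the dataset and parameters are illustrative):

from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
# Trees are added one at a time, each fit to the current residuals
gbr = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                random_state=0)
print(cross_val_score(gbr, X, y).mean())  # mean R^2 across folds
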
21
Q

Image Segmentation

A

The process of partitioning a digital image into multiple segments to simplify the image or convert it into something easier to analyze

22
Q

Multiple Kernel Learning (MKL)

A

A machine learning technique that learns a combination of multiple kernels, which can make a model more effective when working with multimodal data

23
Q

Sparsity

A

A property of a model where many of its parameters are zero, meaning only a subset of features are relevant for making predictions

24
Q

Structured Sparsity

A

A sparse model whose non-zero parameters exhibit some form of structure or pattern, often based on prior knowledge about the relationships between features

25
Q

Total Variation (TV) Regularization

A

A regularization technique often used in image processing that promotes piecewise constant solutions by penalizing the total variation of the signal

26
Q

Laplacian Regularization

A

A regularization technique that promotes smooth solutions by penalizing the differences between the values of neighboring features

27
Q

Principal Component Analysis (PCA)

A

An unsupervised dimensionality reduction technique that finds orthogonal linear combinations of the original features (principal components) that capture the maximum variance in the data

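A minimal sketch, assuming scikit-learn: projecting the 4-dimensional iris features onto the top 2 principal components:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)  # variance captured by each component
print(pca.transform(X).shape)         # (150, 2)
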
28
Q

Unsupervised Learning

A

A machine learning paradigm where the algorithm learns patterns from unlabeled data without explicit output labels

29
Q

Clustering

A

An unsupervised learning task where the goal is to group similar data points together into clusters based on intrinsic properties

30
Q

K-means

A

A popular unsupervised clustering algorithm that aims to partition the data into K clusters by iteratively assigning data points to the nearest centroid and updating the centroids

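A minimal sketch with K = 3 on synthetic blobs, assuming scikit-learn:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)  # one centroid per cluster
print(km.labels_[:10])      # cluster assignment for each point
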
31
Q

Hierarchical Clustering

A

An unsupervised clustering algorithm that builds a hierarchy of clusters, either in a bottom-up (agglomerative) or top-down (divisive) manner

32
Q

Gaussian Mixture Model (GMM)

A

A probabilistic model that assumes all data points are generated from a finite mixture of Gaussian distributions with unknown parameters

33
Q

Expectation-Maximization (EM) Algorithm

A

An iterative algorithm for maximum likelihood estimation in models with latent variables, commonly used to estimate the parameters of a Gaussian mixture model

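A minimal sketch, assuming scikit-learn, whose GaussianMixture is fit with EM internally; note the soft (probabilistic) cluster assignments:

from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
print(gmm.means_)                # estimated component means
print(gmm.predict_proba(X[:3]))  # per-component membership probabilities
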
34
Q

Partial Least Squares (PLS)

A

A statistical method that aims to find the fundamental relations between two blocks of data by finding directions in each space that have maximum covariance with each other

35
Q

Canonical Correlation Analysis (CCA)

A

A multivariate statistical method for finding linear combinations of two sets of variables that are maximally correlated with each other

36
Q

t-SNE

A

A non-linear dimensionality reduction technique primarily used for visualizing high-dimensional datasets in a low-dimensional space by preserving the local structure of the data

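A minimal sketch, assuming scikit-learn: embedding the 64-dimensional digits data in 2-D for plotting (the perplexity value is illustrative):

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (1797, 2): digits near each other in pixel space stay near in 2-D
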