ML Flashcards
What training is used for datasets without labels?
Unsupervised training
What training is used for datasets with labels?
Supervised
What training is used with a small labeled dataset and a larger unlabeled one?
Semi-supervised
What training is used when positive and negative feedback (rewards and penalties) is given?
Reinforcement
What training builds on knowledge from a previously trained model?
Transfer learning
What algorithm predicts the numerical value of a variable based on independent variables?
Linear regression
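A minimal sketch of the idea, assuming a single independent variable and made-up data; the helper name fit_line is hypothetical and not taken from any library:

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that follows y = 2x + 1 exactly
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Predict the numerical value for a new input x = 5
prediction = slope * 5.0 + intercept
```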
What algorithm predicts a binary outcome from independent variables?
Logistic regression (binary)
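A rough sketch of how a fitted binary logistic model turns an input into a 0/1 prediction. The weight and bias values are hypothetical placeholders (no training is shown), and the function names are illustrative only:

```python
import math


def predict_proba(x, weight, bias):
    """Probability of class 1: the sigmoid of a linear combination of the input."""
    z = weight * x + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(x, weight, bias, threshold=0.5):
    """Binary outcome: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0


# Hypothetical parameters, as if they had already been learned
weight, bias = 2.0, -4.0
label_high = predict_label(3.0, weight, bias)  # z = 2, probability above 0.5
label_low = predict_label(1.0, weight, bias)   # z = -2, probability below 0.5
```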
What algorithm predicts an outcome with more than two unordered categories from independent variables?
Logistic regression (multinomial)
What algorithm predicts an outcome with ordered categories from independent variables?
Logistic regression (ordinal)
What algorithm predicts results on data with curvilinear patterns?
Polynomial regression
What algorithms try to decrease overfitting in linear regression?
Ridge and Lasso regressions
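To illustrate how the two penalties shrink a coefficient, here is a simplified sketch for a single feature with no intercept. The penalty scaling (alpha) is an assumption of this example and does not match any particular library's convention; the function names are hypothetical:

```python
def ridge_slope(xs, ys, alpha):
    """Slope w minimizing sum((y - w*x)^2) + alpha * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


def lasso_slope(xs, ys, alpha):
    """Slope w minimizing sum((y - w*x)^2) + alpha * |w| (soft-thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    magnitude = max(abs(sxy) - alpha / 2.0, 0.0)
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * magnitude / sxx


# Made-up data that follows y = 2x exactly
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unpenalized = ridge_slope(xs, ys, alpha=0.0)     # plain least squares
ridge_shrunk = ridge_slope(xs, ys, alpha=14.0)   # shrinks toward zero
lasso_shrunk = lasso_slope(xs, ys, alpha=28.0)   # shrinks toward zero
lasso_zeroed = lasso_slope(xs, ys, alpha=56.0)   # penalty large enough to reach zero
```

In this sketch the Ridge slope gets smaller as alpha grows but never becomes exactly zero, while the Lasso slope hits exactly zero once the penalty is large enough.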
What is the name for how much the model changes when the training set changes?
Variance
What is the name of the error that measures how far the model's average predictions are from the true values because of its simplifying assumptions?
Bias
Simple models tend to have what kind of bias-variance trade-off?
High bias and low variance
Complex models tend to have what kind of bias-variance trade-off?
Low bias and high variance
Which bias-variance trade-off results in underfitting?
High bias and low variance
Which bias-variance trade-off results in overfitting?
Low bias and high variance
What is the name of the specific table layout that allows visualization of the performance of an algorithm in ML?
Confusion matrix
In a confusion matrix (also called an error matrix), what is the name of a prediction that is 1 when the actual classification is 0?
False positive
In a confusion matrix (also called an error matrix), what is the name of a prediction that is 0 when the actual classification is 1?
False negative
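The four cells of a binary confusion matrix can be counted as in the sketch below. The labels are made-up data and confusion_counts is a hypothetical helper, not a library function:

```python
def confusion_counts(actual, predicted):
    """Count the four cells of a binary confusion matrix (labels are 0 or 1)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1  # predicted 1, actually 1
        elif p == 1 and a == 0:
            counts["FP"] += 1  # predicted 1, actually 0: false positive
        elif p == 0 and a == 1:
            counts["FN"] += 1  # predicted 0, actually 1: false negative
        else:
            counts["TN"] += 1  # predicted 0, actually 0
    return counts


# Made-up labels for six examples
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
counts = confusion_counts(actual, predicted)
```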
Which dimensionality reduction algorithm is used to simplify the data model while keeping as much of the variance as possible?
Principal component analysis (PCA)
What kind of supervised algorithm is used for classification and regression of multi-dimensional data that cannot be separated linearly?
Support vector machines (SVM)
Which algorithm uses multiple decision trees trained on different subsets of the data and then combines their results?
Random forests
Which clustering algorithm builds a tree of clusters by iteratively merging or dividing groups, level by level?
Hierarchical clustering
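A small sketch of the bottom-up (agglomerative) variant, assuming 1-D points and single linkage (distance between the closest members of two clusters). The function name and the data are made up for illustration:

```python
def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    merges = []                       # record of merges, i.e. the tree structure
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members
                dist = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters], merges


points = [1.0, 1.5, 5.0, 5.2, 9.0]
clusters, merges = agglomerative(points, n_clusters=3)
```

The merges list records which groups were joined at each level, which is the information a dendrogram would display.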