10 ML Algorithms Flashcards

Question 1

Q

What are examples of ensemble ML methods?

Answer

A

Bayesian averaging
bagging
boosting
error-correcting output coding

Question 2

Q

What are the main clustering methodologies?

Answer

A

Centroid-based algorithms
Connectivity-based algorithms
Density-based algorithms
Probabilistic
Dimensionality Reduction
Neural networks / Deep Learning

Question 3

Q

What is Singular Value Decomposition?

Answer

A

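In linear algebra, SVD is a factorization of a real or complex matrix. For a given m × n matrix M, there exists a decomposition M = UΣV*, where U (m × m) and V (n × n) are unitary matrices, V* is the conjugate transpose of V, and Σ is an m × n rectangular diagonal matrix with non-negative real entries (the singular values).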
PCA is actually a simple application of SVD.
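A minimal sketch of the factorization, assuming NumPy is available. The matrix values are arbitrary and chosen only for illustration. Note that `np.linalg.svd` returns the third factor already (conjugate-)transposed, which is why it is named `Vh` here.

```python
import numpy as np

# A small 3 x 2 real matrix, chosen arbitrarily for illustration.
M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# full_matrices=False gives the "thin" SVD: U is 3 x 2, s has 2 entries, Vh is 2 x 2.
U, s, Vh = np.linalg.svd(M, full_matrices=False)

# Rebuild M from its factors: M = U @ diag(s) @ Vh.
M_rebuilt = U @ np.diag(s) @ Vh

# The singular values come back non-negative and sorted in descending order.
print(s)
print(np.allclose(M, M_rebuilt))
```

The singular values are returned as a 1-D array rather than as the full matrix Σ, so `np.diag(s)` is needed to rebuild the product.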

Question 4

Q

Top supervised ML algorithms:

Answer

A

1) Support Vector Machines
2) Ensemble Methods
3) Logistic Regression
4) Ordinary Least Squares Regression
5) Naïve Bayes Classification
6) Decision Trees

Question 5

Q

Machine learning algorithms can be divided into 3 broad categories. What are they?

Answer

A

Supervised learning, unsupervised learning, and reinforcement learning.

Question 6

Q

What is Reinforcement learning?

Answer

A

Reinforcement learning sits between supervised and unsupervised learning. Some form of feedback (a reward signal) is available for each predictive step or action, but there is no precise label or error message.

Question 7

Q

What are Ensemble ML Methods?

Answer

A

Learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions.
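The voting step can be sketched in a few lines of plain Python. The classifiers themselves are not modeled here; `predictions` and the weight lists are hypothetical stand-ins for the outputs of three already-trained classifiers.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among the votes."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Three hypothetical classifiers predict a label for the same data point.
predictions = ["spam", "ham", "ham"]
equal_weights = [1.0, 1.0, 1.0]    # plain majority vote
skewed_weights = [0.7, 0.2, 0.2]   # e.g. weights reflecting validation accuracy

majority = weighted_vote(predictions, equal_weights)
weighted = weighted_vote(predictions, skewed_weights)
print(majority, weighted)
```

With equal weights the two "ham" votes win. With the skewed weights the single, more trusted classifier outvotes the other two.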

Question 8

Q

Applications of Independent Component Analysis (ICA):

Answer

A

Digital images
Document databases
Economic indicators
Psychometric measurements

Question 9

Q

What is Principal Component Analysis?

Answer

A

PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
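One common way to compute the principal components is an SVD of the mean-centered data. The sketch below assumes NumPy and uses synthetic data generated only for illustration; it then checks that the resulting components are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 observations of 2 correlated variables.
x = rng.normal(size=200)
X = np.column_stack([x, 2.0 * x + rng.normal(scale=0.5, size=200)])

# Center each variable, then take the SVD of the centered data.
Xc = X - X.mean(axis=0)
U, s, Vh = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vh are the principal directions (an orthogonal basis).
# Projecting the centered data onto them gives the principal components.
scores = Xc @ Vh.T

# The covariance matrix of the scores is diagonal: the components are uncorrelated.
cov = np.cov(scores, rowvar=False)
explained_variance = s**2 / (len(X) - 1)
print(np.round(cov, 6))
print(explained_variance)
```

The diagonal of `cov` matches `explained_variance`, and the first component carries most of the variance because the two input variables are strongly correlated.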

Question 10

Q

Naïve Bayes Classification examples:

Answer

A

Classifying an email as spam or not spam
Classifying a news article as being about technology, politics, or sports
Face recognition software

Question 11

Q

What is clustering?

Answer

A

Clustering is the task of grouping a set of objects such that objects in the same group (cluster) are more similar to each other than to those in other groups.
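As one concrete instance, here is a minimal sketch of k-means (Lloyd's algorithm), a centroid-based method. The points and the starting centroids are made up for illustration, and the caller supplies the initial centroids so the run is deterministic.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on 2-D points with caller-supplied initial centroids."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Two visually separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

Points in the same cluster end up closer to their own centroid than to the other one, which is the "more similar to each other" criterion expressed as Euclidean distance.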

Question 12

Q

What is an Independent Component Analysis (ICA)?

Answer

A

ICA is a statistical technique for revealing hidden factors that underlie sets of random variables, measurements, or signals.
ICA is related to PCA, but it is a more powerful technique that can find the underlying factors or sources when classic methods such as PCA fail completely.

Question 13

Q

What is the advantage of SVM?

Answer

A

SVMs have been used to solve some of the biggest problems, such as large-scale image classification.
They work well when the number of features is large.

Question 14

Q

How do ensemble methods work, and why are they superior to individual models?

Answer

A

They average out biases
They reduce the variance
They are unlikely to over-fit
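The variance claim can be illustrated with a small simulation using only the standard library. It rests on a strong assumption: each model is unbiased and its errors are independent of the others. Real ensemble members are usually correlated, so the reduction in practice is smaller. The function names and the true value 10.0 are hypothetical.

```python
import random
import statistics

random.seed(0)

def noisy_model():
    """A hypothetical unbiased model: true value 10.0 plus Gaussian noise."""
    return 10.0 + random.gauss(0.0, 1.0)

def ensemble(n_models=25):
    """Average the outputs of n independent noisy models."""
    return statistics.fmean(noisy_model() for _ in range(n_models))

# Repeat both experiments many times and compare the spread of the results.
single_runs = [noisy_model() for _ in range(2000)]
ensemble_runs = [ensemble() for _ in range(2000)]

var_single = statistics.variance(single_runs)
var_ensemble = statistics.variance(ensemble_runs)
print(var_single, var_ensemble)
```

Under the independence assumption, averaging 25 models divides the variance by roughly 25, while the mean stays near the true value.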

Question 15

Q

What is the advantage of Decision Trees?

Answer

A

As a method, it allows you to approach the problem in a structured and systematic way to arrive at a logical conclusion.

Question 16

Q

Naïve Bayes Classification Components:

Answer

A

P(A|B) is posterior probability,
P(B|A) is likelihood,
P(A) is class prior probability,
P(B) is predictor prior probability.
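These four quantities are tied together by Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B). A small worked example follows; the probabilities are made up for illustration, with A standing for "email is spam" and B for "email contains the word 'offer'".

```python
def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Made-up numbers for illustration only.
p_spam = 0.2               # P(A), class prior probability
p_offer_given_spam = 0.6   # P(B|A), likelihood
p_offer_given_ham = 0.05   # P(B|not A)

# P(B), the predictor prior, via the law of total probability.
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

# P(A|B), the posterior probability.
p_spam_given_offer = posterior(p_offer_given_spam, p_spam, p_offer)
print(p_spam_given_offer)
```

Here P(B) = 0.12 + 0.04 = 0.16, so the posterior is 0.12 / 0.16 = 0.75: seeing the word raises the spam probability from 20% to 75%.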

Question 17

Q

What is Logistic Regression?

Answer

A

It measures the relationship between the categorical dependent variable and one or more independent variables by estimating probabilities using a logistic function, which is the cumulative logistic distribution.
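A minimal sketch of the logistic function and how it turns a linear score into a probability. The `weight` and `bias` values are hypothetical, standing in for coefficients that would normally be estimated from data.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(x, weight, bias):
    """Estimated probability of the positive class for a single feature x."""
    return sigmoid(weight * x + bias)

# Hypothetical coefficients; in practice they are fitted to training data.
weight, bias = 1.5, -3.0

for x in (0.0, 2.0, 4.0):
    print(x, round(predict_proba(x, weight, bias), 4))
```

The output always lies between 0 and 1 and increases with the linear score, so thresholding it (commonly at 0.5) gives a class prediction.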

Question 18

Q

Top unsupervised algorithms:

Answer

A

1) Clustering Algorithms
2) Principal Component Analysis
3) Singular Value Decomposition
4) Independent Component Analysis

Question 19

Q

What is Ordinary Least Squares Regression (OLSR)?

Answer

A

Least squares is a method for performing linear regression.
Linear regression is the task of fitting a straight line through a set of points.
"Linear" refers to the kind of model you are using to fit the data, while "least squares" refers to the kind of error metric you are minimizing.
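For a single input variable, the least-squares line has a closed-form solution: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The sketch below uses made-up data points, and `fit_line` and `sse` are illustrative helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared errors, the quantity that least squares minimizes."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up points that lie roughly on a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, sse(xs, ys, slope, intercept))
```

Any other line through these points, for example one with a slightly different slope, gives a larger sum of squared errors.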

Question 20

Q

What do we use PCA for?

Answer

A

Compression
Simplifying data for easier learning
Visualization
Note: PCA is not suitable in cases where the data is noisy.

Question 21

Q

Logistic Regression applications

Answer

A

Credit Scoring
Measuring the success rates of marketing campaigns
Predicting the revenues of a certain product