Machine Learning Flashcards

1
Q

Hierarchical clustering is most likely used when the problem involves

A

Classifying unlabeled data

2
Q

What is supervised machine learning?

A

Involves training an algorithm to take a set of inputs (X variables) and find a model that best relates them to outputs (Y variables).

Training algorithm - set of inputs - find a model that relates them to outputs.

3
Q

What is unsupervised machine learning?

A

Same as supervised learning, but does not make use of labeled training data.

We give it data and expect the algorithm to make sense of it.

4
Q

What is overfitting?

A

ML algorithms can produce overly complex models that fit the training data too well and thereby do not generalize well to new data.

The prediction model built on the training sample (in-sample data) is too complex.

The model trained on that data does not perform well on new data.

5
Q

Name Supervised ML Algorithms

A

Penalized regression
Support Vector Machine (SVM)
K-Nearest Neighbor (KNN)
Classification and Regression Trees (CART)
Ensemble learning
Random Forest

6
Q

Name unsupervised ML algorithms

A

Principal component analysis
K-means clustering
Hierarchical clustering

7
Q

High Bias Error in ML

A

High Bias Error means the model does not fit the training data well.

8
Q

High Variance Error in ML

A

High variance error means the model does not predict well on the test data.

9
Q

Name dimension reduction algorithms in ML

A

Principal component analysis
(unsupervised ML)

Penalized regression
(supervised ML)

10
Q

What does Penalized Regression do?

A
  • Similar to maximizing adjusted R².
  • Dimension reduction.
  • Eliminates/minimizes overfitting.

Regression coefficients are chosen to minimize the sum of the squared errors, plus a penalty term that increases with the number of included features.
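The objective above can be sketched in a few lines of Python (a hypothetical toy example; the L1-style penalty on coefficient size shown here is one common form, as in LASSO):

```python
# Minimal sketch of a penalized-regression objective (assumed example):
# sum of squared errors plus a penalty on the absolute size of the
# regression coefficients, weighted by the hyperparameter lam.

def penalized_sse(y, x, betas, lam):
    """SSE of the linear fit plus an L1 penalty weighted by lam."""
    sse = sum((yi - sum(b * xij for b, xij in zip(betas, xi))) ** 2
              for yi, xi in zip(y, x))
    penalty = lam * sum(abs(b) for b in betas)
    return sse + penalty

# With lam = 0 this is ordinary least squares; raising lam pushes
# small, unhelpful coefficients toward zero (feature elimination).
y = [1.0, 2.0, 3.0]
x = [[1.0], [2.0], [3.0]]
print(penalized_sse(y, x, [1.0], lam=0.0))   # perfect fit: 0.0
print(penalized_sse(y, x, [1.0], lam=0.5))   # same fit + penalty: 0.5
```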

11
Q

What is SVM?

A

Support Vector Machine

Used for classification, regression, and outlier detection.
Suited to classifying data that is not complex or non-linear.

It is a linear classifier that determines the hyperplane that optimally separates the observations into two sets of data points.

Does not require any hyperparameters.

Maximizes the probability of making a correct prediction by determining the boundary that is furthest from all observations.

Outliers do not affect either the support vectors or the discriminant boundary.
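A hedged sketch of the decision rule: the discriminant boundary is a hyperplane w·x + b = 0, and observations are assigned to a class by the sign of w·x + b. The weights below are made up for illustration, not produced by an actual SVM fit:

```python
# Classify a point by which side of a hypothetical separating
# hyperplane (w, b) it falls on.

def classify(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w, b = [1.0, -1.0], 0.0             # assumed hyperplane x1 - x2 = 0
print(classify(w, b, [3.0, 1.0]))   # one side of the boundary: 1
print(classify(w, b, [1.0, 3.0]))   # the other side: -1
```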

12
Q

What is K-Nearest Neighbor?

A

Classification

Classifies a new observation by finding similarities ("nearness") between it and the existing data.

Makes no assumption about the distribution of the data.
It is non-parametric.

KNN results can be sensitive to the inclusion of irrelevant or correlated features, so it may be necessary to select features manually, thereby removing irrelevant information.
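A minimal pure-Python sketch of the idea, on assumed toy data: the new point takes the majority label of its k nearest labeled neighbors.

```python
from collections import Counter
import math

def knn_predict(train, new_point, k=3):
    """train: list of (features, label); majority vote of k nearest."""
    nearest = sorted(train, key=lambda fl: math.dist(fl[0], new_point))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Two well-separated toy clusters
train = [([0, 0], "A"), ([0, 1], "A"), ([1, 0], "A"),
         ([5, 5], "B"), ([5, 6], "B"), ([6, 5], "B")]
print(knn_predict(train, [1, 1]))       # nearest neighbors are all "A"
print(knn_predict(train, [5.5, 5.5]))   # nearest neighbors are all "B"
```

Note there is no fitting step at all: the "model" is the stored training data, which is what makes KNN non-parametric.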

13
Q

What is CART?

A

Classification and Regression Trees

Part of supervised ML

Typically applied when the target is binary.

If the goal is regression, the prediction would be the mean of the values of the terminal node.

Makes no assumptions about the characteristics of the training data, so if left unconstrained, it can potentially learn the training data perfectly.

To avoid overfitting, regularization parameters can be added, such as the maximum depth of the tree.
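The core splitting step can be sketched as follows (an assumed single-feature toy example; a real CART implementation repeats this recursively and typically scores splits with measures such as Gini impurity rather than raw error counts):

```python
# Find the single-feature threshold that best separates a binary
# target: predict 1 at or above the threshold, 0 below it.

def best_split(xs, ys):
    """Return (threshold, misclassification count) for the best split."""
    best_t, best_err = None, len(ys) + 1
    for t in xs:
        err = sum(1 for x, y in zip(xs, ys)
                  if (1 if x >= t else 0) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
print(best_split(xs, ys))   # threshold 10 separates perfectly: (10, 0)
```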

14
Q

What are the 3 types of layers in a neural network?

A
  1. Input layer
  2. Hidden layer
  3. Output layer
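A minimal forward pass through the three layers, with made-up weights (a toy sketch, not a trained network):

```python
import math

def forward(x, w_hidden, w_out):
    # hidden layer: weighted sums of the inputs, squashed by a sigmoid
    hidden = [1 / (1 + math.exp(-sum(w * xi for w, xi in zip(ws, x))))
              for ws in w_hidden]
    # output layer: weighted sum of the hidden activations
    return sum(w * h for w, h in zip(w_out, hidden))

x = [1.0, 2.0]                        # input layer (2 features)
w_hidden = [[0.5, -0.5], [1.0, 1.0]]  # 2 hidden nodes (assumed weights)
w_out = [1.0, 1.0]
print(round(forward(x, w_hidden, w_out), 3))
```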
15
Q

What are non-linear functions more susceptible to?

A

Variance error and overfitting

16
Q

What are linear functions more susceptible to?

A

Bias error and underfitting

17
Q

The main distinction between clustering and classification algorithms is that

A

The groups in clustering are determined by the data.

In classification, they are determined by the analyst/researcher.

18
Q

What is K-Means clustering in ML?

A

K-means partitions observations into a fixed number, k, of non-overlapping clusters.

Each cluster is characterized by its centroid, and each observation is assigned by the algorithm to the cluster with the centroid to which that observation is closest.
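The assign-then-update loop can be sketched in pure Python (toy data and starting centroids assumed):

```python
import math

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:                     # assignment step
            i = min(range(len(centroids)),
                    key=lambda j: math.dist(p, centroids[j]))
            clusters[i].append(p)
        centroids = [                        # update step: move to mean
            [sum(col) / len(cl) for col in zip(*cl)] if cl else ctr
            for cl, ctr in zip(clusters, centroids)]
    return centroids

pts = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
print(kmeans(pts, [[0, 0], [10, 10]]))   # centroids settle on the two groups
```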

19
Q

High bias error and high variance error are indicative of…

A

Underfitting

High bias error = model does not fit the training data well.

High variance error = model does not predict well on test data.

The combination of both results in an underfitted model.

20
Q

Low bias error but high variance error is indicative of ..

A

Overfitting

Bias error = model does not fit the training data well.

Variance error = Model does not predict well on test data.

21
Q

What are linear models more susceptible to?

A

Bias Error (underfitting)

22
Q

What are non-linear models more prone to?

A

Variance Error
(overfitting)

23
Q

What is Principal Components Analysis?

A

It is part of unsupervised ML.
Dimension reduction.

Used to reduce highly correlated features of the data into a few main uncorrelated composite variables.
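For two features, the first principal component can be computed by hand from the covariance matrix (a sketch on assumed toy data; real PCA libraries generalize this to many features via eigen- or singular-value decomposition):

```python
import math

def first_component(xs, ys):
    """First principal component of two features, as a unit vector."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # sample covariance matrix entries
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # largest eigenvalue of the 2x2 symmetric matrix (quadratic formula)
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # corresponding eigenvector, normalized to unit length
    vx, vy = lam - syy, sxy
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# perfectly correlated features -> component points along the diagonal
vx, vy = first_component([1, 2, 3, 4], [1, 2, 3, 4])
print(round(vx, 3), round(vy, 3))   # 0.707 0.707
```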

24
Q

What are the 3 types of error in ML?

A

Bias error
Variance error
Base error

25
Q

What is variance error in ML?

A

Variance error is how much the model's results change in response to new data from validation and test samples.

Unstable models pick up noise and produce high variance, causing overfitting and ↑ out-of-sample error.

26
Q

What is Bias error in ML?

A

Bias error is the degree to which a model fits the training data.
Algorithms with erroneous assumptions produce high bias with poor approximation, causing underfitting and ↑ in-sample error.

(Adding more training samples will not improve the model)

27
Q

What is Base error in ML?

A

Base error is error due to randomness in the data.

(Out-of-sample accuracy increases as the training sample size increases)

28
Q

Name 2 ways of preventing overfitting in supervised machine learning

A

Occam’s Razor: the problem-solving principle that the simplest solution tends to be the correct one.

In supervised ML, it means preventing the algorithm from getting too complex during selection and training by limiting the no. of features and penalizing algorithms that are too complex or too flexible by constraining them to include only parameters that reduce out-of-sample error.

K-Fold Cross Validation: This strategy comes from the principle of avoiding sampling bias.
The challenge is having a large enough data set to make both training and testing possible on representative samples.
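The fold construction behind k-fold cross-validation can be sketched as follows (a hypothetical index-splitting helper; real implementations typically also shuffle the data first):

```python
# Split n observation indices into k folds; each fold serves once as
# the validation set while the other k-1 folds form the training set,
# so every observation is used for both training and validation.

def kfold_indices(n, k):
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

for train, val in kfold_indices(6, 3):
    print("validate on", sorted(val), "train on", sorted(train))
```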