Chapter 5: Machine Learning Flashcards

1
Q

What is machine learning?

A

Computer algorithms that can learn from data to make determinations or predictions on new data without being explicitly programmed

2
Q

Why isn’t search always effective?

A
  1. Can’t deal with new data
  2. Can’t deal with unforeseen circumstances

3
Q

What are the different types of ML models?

A
  1. Classifier (chooses among discrete output labels)
  2. Regressor (generates a continuous output)

4
Q

What is the difference between discriminative and generative models?

A

Discriminative models draw boundaries in data space, while generative models attempt to model the distribution of the data.

5
Q

What are evaluation metrics? Which one is often best to use?

A

Precision, accuracy, and recall are the main ones, but each can give misleading results on its own (e.g. accuracy on imbalanced data). The F1-score, the harmonic mean of precision and recall, often gives the most meaningful results.

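To make these metrics concrete, here is a minimal sketch of how they are computed from confusion counts (the counts below are made up for illustration):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: 8 true positives, 2 false positives, 4 false negatives
p, r, f = precision_recall_f1(tp=8, fp=2, fn=4)
# precision = 0.8, recall = 8/12, and F1 sits between them
```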
6
Q

What is a ROC curve?

A

Shows how well a classifier is working by plotting its true positive rate against its false positive rate as the decision threshold varies.

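One point on the ROC curve comes from picking a threshold on the classifier's scores; the full curve traces these points as the threshold sweeps from high to low. A small sketch with made-up scores and labels:

```python
def roc_point(scores, labels, threshold):
    """One ROC point: (TPR, FPR) when predicting positive for
    score >= threshold. labels are 1 (positive) / 0 (negative)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)

# Illustrative scores only
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1,   1,   0,   1,   0]
tpr, fpr = roc_point(scores, labels, threshold=0.5)  # (2/3, 1/2)
```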
7
Q

What is k-fold cross validation?

A

Split the data into k folds; train/validate on k-1 folds and test on the remaining fold, rotating so that each fold is used for testing exactly once.

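The fold rotation can be sketched as follows (assuming for simplicity that the dataset size divides evenly by k):

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for k-fold cross validation.
    Each fold serves once as the test set; the other k-1 folds train."""
    indices = list(range(n))
    fold_size = n // k  # assumes n is divisible by k
    for i in range(k):
        test = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, test

folds = list(k_fold_splits(n=10, k=5))
# 5 folds: each test fold has 2 points, each train set has 8
```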
8
Q

What is leave-one-out validation?

A

Pick N-1 data points for training/validation and the one remaining point for testing; repeat so that each point is left out once (i.e. k-fold cross validation with k = N).

9
Q

What is the no free lunch theorem?

A

There is no single machine learning algorithm that works best on all problems; different models must be tested.

10
Q

What is regression?

A

Fitting a curve (e.g. a polynomial) to data in order to predict a continuous output

11
Q

How do we train regression?

A

By minimizing an error (loss) function over the model parameters, e.g. the sum of squared errors

12
Q

What is univariate linear regression?

A

A linear (order N = 1) regression model with a single input variable

13
Q

What is the time complexity of gradient descent?

A

O(n^3) is the cost of the exact closed-form (normal equation) solution; each gradient descent step is much cheaper, which is why gradient descent is preferred for large problems.

14
Q

Possible outcomes of gradient descent?

A
  1. Converges
  2. Diverges
  3. Oscillates
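All three outcomes can be seen on the toy objective f(w) = w² (gradient 2w), where each update multiplies w by (1 − 2·lr); the learning rates below are chosen purely to illustrate each case:

```python
def descend(lr, w0=1.0, steps=50):
    """Gradient descent on f(w) = w**2, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w  # each step multiplies w by (1 - 2*lr)
    return w

converged = descend(lr=0.1)            # |1 - 0.2| = 0.8 < 1: shrinks toward 0
diverged  = descend(lr=1.1)            # |1 - 2.2| = 1.2 > 1: blows up
oscillate = descend(lr=1.0, steps=51)  # factor = -1: bounces between +1 and -1
```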
15
Q

Logistic Regression

A

A classification model: it outputs the probability of class membership by passing a linear score through the sigmoid (logistic) function

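A minimal one-feature sketch of that idea (the weight, bias, and inputs are arbitrary example values):

```python
import math

def logistic_predict(w, b, x):
    """Logistic regression: linear score squashed through the sigmoid,
    giving P(class = 1 | x)."""
    z = w * x + b
    return 1 / (1 + math.exp(-z))

p = logistic_predict(w=2.0, b=-1.0, x=0.5)  # z = 0, so probability = 0.5
```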
16
Q

What is Naive Bayes?

A

A generative classification model

17
Q

Can Naive Bayes be used for regression?

A

Yes

18
Q

What is the key assumption for Naive Bayes to work?

A

All features are independent of one another

19
Q

What is Overfitting?

A

When a model has a low error rate in training but a high error rate in testing.

20
Q

What is generalization?

A

The notion of learning from some data in order to make accurate conclusions on unseen/excluded/new data

21
Q

How to prevent overfitting?

A
  1. Lower order models
  2. More data
  3. Regularization
22
Q

What is bias-variance tradeoff?

A

Variance: the variability of the model with respect to its training inputs
Bias: the difference between the average model and the average of the target data

23
Q

Can K-nearest neighbor be used as a regressor?

A

Yes, by averaging the values of the k nearest neighbors.

24
Q

How should k be chosen in K-nearest neighbour?

A
  1. K should be odd to avoid ties
  2. Not too small or else overfit
  3. Not too large or else underfit
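A minimal kNN classifier on one-dimensional toy data, using an odd k as the card above recommends (the data and labels are illustrative):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k nearest
    (by absolute distance) labelled training points."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Two well-separated toy clusters
data = [(0.0, "a"), (0.2, "a"), (0.4, "a"), (5.0, "b"), (5.2, "b")]
label = knn_classify(data, query=0.3, k=3)  # the 3 nearest points are all "a"
```

The same structure gives a regressor if the vote is replaced by an average of the neighbours' values.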
25
Q

Advantages/Disadvantages of the kNN algorithm?

A

Advantage: easy to train, intuitive algorithm.
Disadvantage: computationally expensive as the dataset grows.

26
Q

What is a support vector machine?

A

A binary classifier that splits feature space with a hyperplane. The goal is to find the kernelized max-margin hyperplane.

27
Q

What do we do if data is not linearly separable?

A

Use the kernel trick to project the data into a higher-dimensional space
28
Q

What is entropy?

A

The measurement of uncertainty (a fair coin has high entropy, a biased coin has low entropy)
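The coin example can be checked directly with the Shannon entropy formula H = −Σ p·log₂(p):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair = entropy([0.5, 0.5])    # fair coin: 1 bit, maximum uncertainty
biased = entropy([0.9, 0.1])  # biased coin: noticeably lower entropy
```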
29
Q

Does the order of features in a (binary) decision tree make a difference?

A

Yes
30
Q

What is ensemble learning?

A

Create several different classifiers for a problem. Samples (data) are run through all of them and the outcome is determined by averaging or voting.

31
Q

What are the two most common methods of ensemble learning?

A

Bagging and boosting
32
Q

What is bootstrapping in the bagging method?

A

The concept of randomly sampling from a dataset with replacement, to increase the size of the dataset
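Sampling with replacement can be sketched in a few lines (a fixed seed is used here only to make the example reproducible):

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) points from `data` uniformly WITH replacement,
    so some points typically repeat and others are left out."""
    return [rng.choice(data) for _ in range(len(data))]

rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)
# same size as the original dataset, drawn only from its points
```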
33
Q

What is the assumption of classifiers using ensemble learning?

A
  1. Slightly better than chance
  2. Somewhat different
  3. High variance
34
Q

What is Boosting?

A

Does not use bootstrap sampling; instead, weak classifiers are trained based on the errors of previous iterations of classifiers
35
Q

How does Adaboost prevent overfitting?

A

Gives more weight to samples that were classified incorrectly in earlier iterations.
36
Q

What are random forests?

A

Use many decision trees to make a decision

37
Q

What is a bagged tree ensemble?

A

A number of trees are created through bootstrap-sampled data. The final classification is based on a vote or average.
38
Q

Do boosted trees sample data?

A

While rare, it is possible using stochastic gradient boosted trees or XGBoost
39
Q

What are the main fusion strategies of datasets?

A
  1. Input-level fusion
  2. Feature-level fusion
  3. Score-level fusion

40
Q

How is the complexity of an ANN (Artificial Neural Network) determined?

A

By its hidden layers

41
Q

What are some of the parameters trained in a regular ANN?

A
  1. Type of network
  2. Number of layers
  3. Transfer functions

42
Q

What is Regularization?

A

A technique to avoid overfitting
43
Q

What does the lambda (λ) parameter do in regularization?

A

Reduces overfitting (variance) but increases bias (underfitting)
44
Q

What does the p symbol mean in regularization?

A

Determines which type of regularization is done (p = 1 (L1/lasso) and p = 2 (L2/ridge) are most common)
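The roles of lambda and p can be shown with the penalty term itself (the weights below are arbitrary example values):

```python
def penalty(weights, lam, p):
    """Lp regularization term: lambda * sum(|w|^p).
    p = 1 gives L1 (lasso); p = 2 gives L2 (ridge)."""
    return lam * sum(abs(w) ** p for w in weights)

w = [3.0, -4.0]
l1 = penalty(w, lam=0.1, p=1)  # 0.1 * (3 + 4)  = 0.7
l2 = penalty(w, lam=0.1, p=2)  # 0.1 * (9 + 16) = 2.5
```

Raising lambda makes this term dominate the loss, shrinking the weights (less variance, more bias).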
45
Q

What is K-means?

A

An unsupervised machine learning algorithm that clusters data

46
Q

How do we train K-means?

A
  1. Select k
  2. Randomly select centroids
  3. Assign each data point to the closest centroid
  4. Re-calculate the center of each centroid
  5. Repeat 3 and 4 until convergence
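The assign/re-center loop can be sketched in one dimension (the points and starting centroids are toy values; a real implementation would also randomize initialization):

```python
def kmeans_1d(points, centroids, iters=10):
    """1-D k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its cluster."""
    for _ in range(iters):
        clusters = {c: [] for c in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # empty clusters keep their old centroid
        centroids = [sum(pts) / len(pts) if pts else centroids[c]
                     for c, pts in clusters.items()]
    return sorted(centroids)

points = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
centers = kmeans_1d(points, centroids=[0.0, 5.0])
# converges to roughly [1.0, 10.0], the two cluster means
```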
47
Q

When do we stop iterating in K-means?

A
  1. After a fixed number of iterations
  2. When the change in error between iterations becomes very small
48
Q

What is a CNN?

A

A Convolutional Neural Network: a network with many hidden layers built around convolution operations. Requires large datasets and large computing resources.
49
Q

Other than the convolution layer, what are the other types of layer in a CNN?

A
  1. Pooling layers (max pooling, average pooling)
  2. Fully connected layers

50
Q

What are some parameters that can be trained in a CNN?

A
  1. Number of filters
  2. Stride
  3. Linear spatial extent
  4. Batch size
51
Q

Is the filter preferred to be odd-sized or even-sized?

A

Odd-sized
52
Q

If the filter size is N, how much does the size of the output decrease?

A

By ⌊N/2⌋ on each side of the output. The output can retain its size if zero-padding is used.
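This follows from the standard spatial-size formula (W − F + 2P)/S + 1; a quick check with a 32-wide input and a 5-wide filter:

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Spatial output size of a convolution layer: (W - F + 2P) / S + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1

no_pad = conv_output_size(32, 5)             # 32 - 5 + 1 = 28 (shrinks by 2 per side)
same   = conv_output_size(32, 5, padding=2)  # zero-padding of floor(5/2) keeps size 32
```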