Chapter 5: Machine Learning Flashcards

1
Q

What is machine learning?

A

Computer algorithms that can learn from data to make determinations or predictions on new data without being explicitly programmed

2
Q

Why isn’t search always effective?

A
  1. Can’t deal with new data
  2. Can’t deal with unforeseen circumstances

3
Q

What are different types of ML models?

A
  1. Classifier (chooses among discrete output labels)
  2. Regressor (generates a continuous output value)

4
Q

What is difference between discriminative and generative models?

A

Discriminative models draw decision boundaries in the data space, while generative models attempt to model the underlying distribution of the data.

5
Q

What are evaluation metrics? Which one is often best to use?

A

Precision, accuracy, and recall are the main ones, but each can give misleading results on its own (e.g., accuracy looks high on imbalanced data). The F1-score, the harmonic mean of precision and recall, often gives the most meaningful results.
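As a minimal sketch, these metrics can be computed directly from the raw counts (the function name here is my own):

```python
def precision_recall_f1(tp, fp, fn):
    # Precision: of everything predicted positive, how much was right?
    precision = tp / (tp + fp)
    # Recall: of all actual positives, how many did we find?
    recall = tp / (tp + fn)
    # F1: harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, 8 true positives, 2 false positives, and 4 false negatives give precision 0.8 but recall only 2/3, and F1 sits between them.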

6
Q

What is an ROC curve?

A

Shows how well a classifier is working by plotting the true-positive rate against the false-positive rate across decision thresholds.
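A rough sketch of how the points on an ROC curve are computed from scores and 0/1 labels (names are mine):

```python
def roc_points(scores, labels):
    """One (FPR, TPR) point per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = []
    for t in sorted(set(scores), reverse=True):
        # Predict positive whenever score >= threshold t
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        pts.append((fp / neg, tp / pos))
    return pts
```

A perfect classifier’s curve passes through the top-left corner (FPR 0, TPR 1).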

7
Q

What is k-fold cross-validation?

A

Split the data into k folds; train/validate on k−1 folds and test on the remaining fold, rotating so every fold is used for testing exactly once.
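The splitting step can be sketched as follows (a simplification that assumes n is divisible by k):

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n))
    fold_size = n // k
    for i in range(k):
        # The i-th fold is held out for testing...
        test = idx[i * fold_size:(i + 1) * fold_size]
        # ...and the remaining k-1 folds are used for training.
        train = idx[:i * fold_size] + idx[(i + 1) * fold_size:]
        yield train, test
```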

8
Q

What is leave-one-out validation?

A

Train on N−1 data points and test on the single remaining point, repeating so each point is held out once (equivalent to k-fold with k = N).

9
Q

What is the no free lunch theorem?

A

There is no single machine learning algorithm that works best on all problems; different models must be tested.

10
Q

What is regression?

A

Fitting a function (e.g., a polynomial curve) to data in order to predict continuous outputs.

11
Q

How do we train regression?

A

By minimizing an error (loss) function, such as the sum of squared errors.

12
Q

What is univariate linear regression?

A

A first-order (linear) regression model with a single input variable.

13
Q

What is the time complexity of gradient descent?

A

O(n·d) per iteration for n samples and d features; the O(n³) cost belongs to the closed-form (normal-equation) solution that gradient descent avoids.

14
Q

Possible outcomes of gradient descent?

A
  1. Converges
  2. Diverges
  3. Oscillates
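All three outcomes can be demonstrated on the toy objective f(x) = x² (gradient 2x), where the learning rate alone decides which one occurs — a sketch, not a general claim about every objective:

```python
def gradient_descent(x0, lr, steps=50):
    """Minimise f(x) = x^2 starting from x0 with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x = x - lr * 2 * x  # update rule: x <- x - lr * f'(x)
    return x
```

With lr = 0.1 the iterate shrinks toward the minimum at 0 (converges); with lr = 1.0 it flips sign forever (oscillates); with lr = 1.5 it blows up (diverges).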
15
Q

What is logistic regression?

A

A model used for classification that outputs class probabilities by passing a linear score through the sigmoid function.
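A one-feature sketch of the idea (the weight w and bias b here are hypothetical, not fitted):

```python
import math

def predict_proba(x, w, b):
    """Logistic model: squash the linear score w*x + b through a
    sigmoid to get a probability in (0, 1)."""
    z = w * x + b
    return 1.0 / (1.0 + math.exp(-z))
```

A score of 0 maps to probability 0.5; large positive scores approach 1 and large negative scores approach 0, which is what makes the output usable as a class probability.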

16
Q

What is Naive Bayes?

A

A generative classification model

17
Q

Can Naive Bayes be used for regression?

A

Yes

18
Q

What is the key assumption for Naive Bayes to work?

A

All features are conditionally independent of one another given the class.

19
Q

What is Overfitting?

A

When a model has a low error rate in training but then a high error rate in testing.

20
Q

What is generalization?

A

Notion of learning from some data to make conclusions based on unseen/excluded/new data

21
Q

How to prevent overfitting?

A
  1. Lower order models
  2. More data
  3. Regularization
22
Q

What is bias-variance tradeoff?

A

Variance: variability of the model with respect to the data it was trained on
Bias: difference between the average model prediction and the average of the target data

23
Q

Can K-nearest neighbor be used as a regressor?

A

Yes — predict the average of the k nearest neighbours’ values.

24
Q

How should k be chosen in K-nearest neighbour?

A
  1. K should be odd to avoid ties
  2. Not too small or else overfit
  3. Not too large or else underfit
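A small sketch of kNN classification on 2-D points, majority-voting among the k nearest training examples (function and variable names are mine):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of ((x, y), label) pairs; returns the majority
    label among the k training points nearest to query."""
    # Sort training points by squared Euclidean distance to the query
    by_dist = sorted(train, key=lambda p: (p[0][0] - query[0]) ** 2
                                        + (p[0][1] - query[1]) ** 2)
    # Majority vote among the k nearest (odd k avoids ties)
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]
```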
25
Q

Advantages/Disadvantage of kNN algorithm?

A

Advantage: Easy to train (no explicit training phase), intuitive algorithm
Disadvantage: Prediction becomes computationally expensive as the dataset grows

26
Q

What is a support vector machine?

A

A binary classifier that splits the feature space with a hyperplane. The goal is to find the (possibly kernelized) maximum-margin hyperplane.

27
Q

What do we do if data is not linearly separable?

A

Use the kernel trick to implicitly project the data into a higher-dimensional space where it becomes linearly separable.

28
Q

What is entropy?

A

A measurement of uncertainty (a fair coin has high entropy; a biased coin has low entropy)
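The coin example as a sketch, using Shannon entropy in bits:

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a coin with P(heads) = p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```

A fair coin (p = 0.5) has the maximum entropy of 1 bit; a heavily biased coin (say p = 0.9) has much lower entropy.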

29
Q

Does the order of features in a decision tree make a difference?

A

Yes

30
Q

What is ensemble learning?

A

Create several different classifiers for a problem; samples are run through all of them and the outcome is determined by averaging or voting.

31
Q

What are two most common methods of ensemble learning?

A

Bagging and boosting

32
Q

What is bootstrapping in bagging method?

A

The concept of randomly sampling from a dataset with replacement, to create multiple training sets from the original data
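A sketch of drawing one bootstrap sample — the same size as the original, but with repeats allowed:

```python
import random

def bootstrap_sample(data, rng=random):
    """Sample len(data) points from data *with replacement*,
    so some points repeat and others are left out."""
    return [rng.choice(data) for _ in data]
```

Repeating this gives each classifier in a bagging ensemble its own slightly different training set.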

33
Q

What are the assumptions about the classifiers used in ensemble learning?

A
  • Slightly better than chance
  • Somewhat different
  • High variance
34
Q

What is Boosting?

A

Does not use bootstrap sampling; instead, weak classifiers are trained sequentially, each based on previous iterations of classifiers.

35
Q

How does Adaboost prevent overfitting?

A

Gives more weight to samples that were classified incorrectly, so later weak learners focus on them.

36
Q

What are random forests?

A

Use many decision trees to make a decision

37
Q

What is bagged tree ensemble?

A

A number of trees are created through bootstrapped-sampled data. The final classification is based on a vote or average.

38
Q

Do boosted trees sample data?

A

While rare, it is possible using stochastic gradient boosted trees or XGBoost

39
Q

What are the main fusion strategies of datasets?

A
  1. Input level fusion
  2. Feature-level fusion
  3. Score-level fusion
40
Q

How is the complexity of an ANN (Artificial Neural Network) determined?

A

Hidden layers

41
Q

What are some of the parameters tuned in a regular ANN?

A
  1. Type of network
  2. Number of layers
  3. Transfer functions
42
Q

What is Regularization?

A

Technique to avoid overfitting

43
Q

What does the lambda (λ) symbol do in regularization?

A

Controls regularization strength: a larger λ reduces overfitting (variance) but increases bias (underfitting)

44
Q

What does the p symbol mean in regularization?

A

Determines which type of regularization is applied (p = 1, L1/lasso, and p = 2, L2/ridge, are most common)
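A sketch of the penalty term these two symbols describe (`lam` and `p` are my parameter names):

```python
def lp_penalty(weights, lam, p):
    """Regularization term lambda * sum(|w|^p): p = 1 gives the L1
    (lasso) penalty, p = 2 the L2 (ridge) penalty."""
    return lam * sum(abs(w) ** p for w in weights)
```

This term is added to the loss during training, so larger weights cost more and the optimizer is pushed toward simpler models.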

45
Q

What is K-means?

A

Unsupervised machine learning algorithm that clusters data

46
Q

How do we train K-means?

A
  1. Select k
  2. Randomly select initial centroids
  3. Assign each data point to its closest centroid
  4. Re-calculate each centroid as the mean of its assigned points
  5. Repeat 3 and 4 until convergence
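The steps above as a toy 1-D sketch (illustrative, not a production implementation):

```python
import random

def kmeans(points, k, iters=20, rng=random):
    """Toy 1-D k-means following the steps above."""
    centroids = rng.sample(points, k)            # step 2: random initial centroids
    for _ in range(iters):                       # step 5: repeat
        clusters = [[] for _ in range(k)]
        for x in points:                         # step 3: assign to closest centroid
            i = min(range(k), key=lambda j: (x - centroids[j]) ** 2)
            clusters[i].append(x)
        for j in range(k):                       # step 4: recompute each centre
            if clusters[j]:                      # (skip a centroid that got no points)
                centroids[j] = sum(clusters[j]) / len(clusters[j])
    return sorted(centroids)
```

On two well-separated groups of points, the centroids settle onto the group means after a few iterations.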
47
Q

When do we stop iterating in K-means?

A
  1. Stop after a fixed number of iterations
  2. When the change in error between iterations becomes very small

48
Q

What is CNN?

A

A Convolutional Neural Network: a deep network with many hidden layers. Requires large datasets and large computing resources.

49
Q

Other than the convolution layer, what are the other types of layer in a CNN?

A
  1. Pooling layers (max pooling, average pooling)
  2. Fully connected layers

50
Q

What are some parameters that can be trained in CNN?

A
  1. Number of filters
  2. Stride
  3. Linear Spatial Extent
  4. Batch size
51
Q

Is the filter preferred to be odd size or even size?

A

Odd size

52
Q

If the filter size is N, by how much does the size of the output decrease?

A

By ⌊N/2⌋ on each side of the output. The output can retain its size if zero-padding is used.
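The standard output-size formula this card implies, as a sketch (W = input size, F = filter size, P = padding, S = stride):

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Spatial output size of a convolution layer:
    (W - F + 2P) / S + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1
```

For a 32-wide input and a 5-wide filter the output shrinks to 28 (losing 5//2 = 2 per side); zero-padding with P = 2 restores the full 32.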