Lecture Notes 2 Flashcards

1
Q

What are the three methods of learning?

A

Trial and error, listening to others, watching others

2
Q

What can models in Machine Learning be?

A

Tree diagram, neural network, collection of examples

3
Q

What are the three tasks a model is used for?

A
  • Describe the samples used to build the model
  • Predict something about unseen data
  • Generate new data
4
Q

What is Supervised Learning? What are the 2 types?

A

Takes in labeled samples s = <i,o> (input, output) and builds a model M such that M(i) -> o. The two types are Classification and Regression.

5
Q

What are the two main types of Supervised Learning?

A
  • Classification - find boundaries that separate the classes
  • Regression - find the best-fitting line to predict outcomes
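
The two types above can be illustrated with a minimal sketch in one dimension. The function names, the least-squares fit, and the "halfway between the class means" boundary are assumptions chosen for illustration; the notes do not name a specific method.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_boundary(xs, labels):
    """Classification: boundary halfway between the means of class 0 and class 1."""
    class0 = [x for x, label in zip(xs, labels) if label == 0]
    class1 = [x for x, label in zip(xs, labels) if label == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2


# Regression: labeled samples <i, o> where o is a number.
slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])

# Classification: labeled samples <i, o> where o is a class (0 or 1).
boundary = fit_boundary([1, 2, 8, 9], [0, 0, 1, 1])
predicted_class = 1 if 7 > boundary else 0
```

Here the fitted line predicts a number for an unseen input, while the boundary only decides which side an unseen input falls on.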
6
Q

What does Unsupervised Learning focus on?

A

Only gets unlabeled samples s = <i> and finds relationships between data points.

Examples: clustering (describing groups based on similarities), outlier detection, density estimation

7
Q

What is the goal of Semi-supervised Learning?

A

To build clusters of unlabeled data and label them using the labeled points
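
One way to picture this is the sketch below. It assumes the clusters have already been found, and it labels each cluster by a majority vote of its labeled members; the function name and the voting rule are assumptions, not something the notes specify.

```python
from collections import Counter


def label_clusters(cluster_ids, known_labels):
    """Give every sample the majority label of the labeled points in its cluster.

    cluster_ids:  cluster index of each sample (from any clustering method)
    known_labels: label of each sample, or None where the sample is unlabeled
    """
    votes = {}
    for cluster, label in zip(cluster_ids, known_labels):
        if label is not None:
            votes.setdefault(cluster, Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    # Clusters without any labeled point stay unlabeled (None).
    return [cluster_label.get(cluster) for cluster in cluster_ids]


labels = label_clusters(
    [0, 0, 0, 1, 1, 1],
    ["cat", None, None, None, "dog", None],
)
```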

8
Q

What is Reinforcement Learning?

A

Maps states ‘s’ to actions ‘a’ using a policy π(s) -> a, with the aim of optimizing reward over the agent's life.

Doesn’t build its model from data points; it tries various actions and receives reward/punishment.
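
A toy sketch of the idea, under heavy assumptions: the environment, its two states and two actions, and the +1/-1 rewards are all made up for illustration, and the agent simply tries every action in every state a few times.

```python
def reward(state, action):
    """Hypothetical environment: +1 (reward) for the fitting action, -1 (punishment) otherwise."""
    fitting_action = {"hungry": "eat", "tired": "sleep"}
    return 1 if fitting_action[state] == action else -1


def learn_policy(states, actions, trials=3):
    """Try each action in each state and keep the one with the highest total reward."""
    totals = {(s, a): 0 for s in states for a in actions}
    for _ in range(trials):
        for s in states:
            for a in actions:
                totals[(s, a)] += reward(s, a)
    # The policy π(s) -> a is stored as a plain dict from state to action.
    return {s: max(actions, key=lambda a: totals[(s, a)]) for s in states}


policy = learn_policy(["hungry", "tired"], ["eat", "sleep"])
```

No labeled samples are handed to the agent here; everything it knows comes from the rewards its own actions produced.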

9
Q

What does Generalization in Machine Learning refer to?

A

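How well a model carries what it learned from its samples over to unseen cases (the tigers example).
Generalizing leads to stereotyping, and it can go wrong in two ways: overfitting (under-generalizing) and underfitting (over-generalizing).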

10
Q

What is Bias Error in the context of Machine Learning?

A

Error produced by underfitting

11
Q

What is Variance Error in Machine Learning?

A

Error produced by overfitting

12
Q

What do probabilistic models in Machine Learning output?

A

A probability of success/fail instead of a simple yes or no
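
A minimal sketch of the difference, assuming a logistic model with made-up `weight` and `bias` values (the notes do not name a specific probabilistic model):

```python
import math


def predict_probability(x, weight=1.5, bias=-3.0):
    """Probabilistic output: a value between 0 and 1, read as P(success)."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


def predict_yes_no(x, threshold=0.5):
    """Simple yes/no output, obtained by thresholding the probability."""
    return predict_probability(x) >= threshold


p = predict_probability(4.0)   # a probability close to, but below, 1
answer = predict_yes_no(4.0)   # True
```

The probability keeps information the yes/no answer throws away: how sure the model is.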

13
Q

How do you test a Machine Learning model?

A

Build a model using half of the data (the training set) and test it on the other half
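
A minimal sketch of the split. Shuffling before splitting and the fixed `seed` are assumptions added so that the two halves are not ordered and the result is repeatable.

```python
import random


def split_half(samples, seed=0):
    """Shuffle the samples, then return (training half, testing half)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


training, testing = split_half(range(10))
# Build the model from `training` only, then measure its error on `testing`.
```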

14
Q

What is K-fold validation?

A

Divides the data into k subsets and iterates through all of them, each time using one subset for testing and the rest for training
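
The iteration can be sketched as a generator. The function name and the way samples are dealt into subsets (every k-th sample) are assumptions made for illustration.

```python
def k_fold_splits(samples, k):
    """Yield (training, testing) pairs; each subset is the testing set exactly once."""
    folds = [samples[i::k] for i in range(k)]  # deal the samples into k subsets
    for i in range(k):
        testing = folds[i]
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, testing


splits = list(k_fold_splits(list(range(6)), k=3))
# Three rounds; in each, 2 samples are used for testing and 4 for training.
```

Averaging the test error over the k rounds uses every sample for testing once, instead of relying on a single split.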

15
Q

What is the Curse Of Dimensionality?

A

A good model needs an adequate number of samples, and the number of samples needed grows exponentially as the number of dimensions increases
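
A small worked example of the exponential growth, assuming each dimension is cut into 10 bins and we want at least one sample per cell of the resulting grid (both numbers are illustrative assumptions):

```python
def samples_needed(bins_per_dimension, dimensions):
    """Number of grid cells, i.e. samples needed for one sample per cell."""
    return bins_per_dimension ** dimensions


for d in (1, 2, 3, 10):
    print(d, samples_needed(10, d))
# 1 10
# 2 100
# 3 1000
# 10 10000000000
```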

16
Q

What is Dimensionality Reduction?

A

The field concerned with recasting high-dimensional spaces into lower-dimensional ones

17
Q

What does Clustering in Unsupervised Learning aim to do? Why is clustering an ill-posed problem?

A

Builds a model from a set of unlabeled data
Describes/generates data based on similarities
Clustering is an ill-posed problem: there is no single correct way to group the data (the male / female / penguin problem)

18
Q

What is K-means in clustering?

A

A method that finds the ‘k’ means of clusters in the data through optimization. If you want the clusters to be the same size, consider using weights on the distance

19
Q

What are the steps involved in K-means clustering?

A

Looking to find ‘k’ means of clusters in the data. This is an optimization.

  • Start with ‘n’ samples ‘X’ and a collection of k means ‘M’. Each m in M starts as a randomly chosen x
  • Cycle through the next 2 steps until the means don't change:
  • Color each sample x in ‘X’ according to its closest mean ‘m’ in ‘M’
  • Re-average each ‘m’ over the samples of its color and move it to its new location
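
The steps above can be sketched in plain Python. Points are tuples, distance is squared Euclidean, and the handling of a mean that ends up with no samples (it stays where it is) is an assumption the notes do not cover.

```python
import random


def closest(point, means):
    """Index of the mean nearest to the point (squared Euclidean distance)."""
    return min(
        range(len(means)),
        key=lambda j: sum((p - m) ** 2 for p, m in zip(point, means[j])),
    )


def k_means(samples, k, seed=0):
    # Start: the k means are randomly chosen samples.
    means = random.Random(seed).sample(samples, k)
    while True:
        # Color each sample according to its closest mean.
        colors = [closest(x, means) for x in samples]
        # Re-average each mean over the samples of its color.
        new_means = []
        for j in range(k):
            members = [x for x, c in zip(samples, colors) if c == j]
            if members:
                new_means.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_means.append(means[j])  # assumption: an empty mean stays put
        # Stop once the means no longer change.
        if new_means == means:
            return means, colors
        means = new_means


samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
means, colors = k_means(samples, k=2)
```

On this small data set the loop ends with one mean inside each of the two visible groups; on harder data the result depends on the random starting means.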
20
Q

What is Expectation Maximization (EM) in clustering? Which algorithm is part of it? What is its main problem?

A

Labels the data according to the current model's predictions, then modifies the model to match the distribution of those labels
K-means is part of this
Its main problem is getting stuck in local optima

21
Q

What is K-nearest-neighbor? What type of machine learning algorithm is it? What are the steps?

A

A classification algorithm that stores all samples <point, label>. It is a Supervised Learning algorithm.
(1) Store all samples <point, label>.
(2) When queried with a new point, find the ‘k’ points closest to it. Those points then vote using their label values.
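
The two steps can be sketched as follows. Squared Euclidean distance and a plain majority vote are assumptions; the notes do not say which distance to use or how ties are broken.

```python
from collections import Counter


def knn_classify(samples, query, k):
    """samples: stored list of (point, label) pairs; returns the label that wins the vote."""
    # Find the k stored points closest to the query point.
    by_distance = sorted(
        samples,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )
    # The k closest points vote using their labels.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


stored = [
    ((0, 0), "red"), ((0, 1), "red"), ((1, 0), "red"),
    ((5, 5), "blue"), ((5, 6), "blue"), ((6, 5), "blue"),
]
label = knn_classify(stored, (1, 1), k=3)
```

Sorting every stored sample for each query is the slow part of this sketch.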

22
Q

How does K-nearest-neighbor classify a new point?

A

Finds ‘k’ points closest to it and those points vote using label values

23
Q

What is the effect of a low ‘k’ in K-nearest-neighbor?

A

The classification is easily affected by noise

24
Q

What is the effect of a large ‘k’ in K-nearest-neighbor?

A

Areas with few samples get corrupted, because the vote reaches into neighboring areas

25
Q

What is the purpose of a quad tree in K-nearest-neighbor?

A

To make finding the nearest points more efficient by reducing the big O of the search

26
Q

True or False: Reinforcement Learning uses data points for models.

A

False. It tries various actions and receives reward/punishment instead.

27
Q

Types of machine learning algos

A

Supervised Learning - <i,o>: classification, regression
Unsupervised Learning - <i>
Semi-supervised Learning
Reinforcement Learning - π(s) -> a