Lecture 4 Flashcards

1
Q

What is the difference between classification and regression tasks?

A

Classification is when the output is one of a finite set of values. Regression is when the output is a number (an integer or a real value).

  • Classification = sunny/cloudy/rainy or true/false
  • Regression = tomorrow’s temperature
2
Q

How do we choose a GOOD hypothesis space?

A

Choose a space that contains a hypothesis fitting the data well, while keeping it as simple as possible so the learned hypothesis generalizes.

3
Q

What is induction?

A

Going from a specific set of observations to a general rule. We assume that we can apply our model to future cases (e.g., image recognition). NOTE: Inductive conclusions can be incorrect.

4
Q

What is a deductive conclusion?

A

Conclusions that are guaranteed to be correct if the premises are correct.

5
Q

What are the 3 types of Learning?

A
  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning
6
Q

How do we choose a hypothesis space?

A

If you don’t have some prior knowledge about the process that generated the data, you perform exploratory data analysis to determine which hypothesis space is appropriate. (Or use trial and error.)

7
Q

How do we choose a good hypothesis from within the hypothesis space?

A

Look for a best-fit function h for which each h(xi) is close to yi on the training examples, and confirm that h also generalizes well to held-out (test) data.

8
Q

How can we perform exploratory data analysis?

A

Examine the data with statistical tests and visualizations (such as histograms, scatter plots, box plots, etc.)

9
Q

What is the true measure of a hypothesis?

A

How well it handles input it has not yet seen (e.g., test set), not how it does on the training set.

10
Q

Define bias.

A

The tendency of a predictive hypothesis to deviate from the expected value when averaged over different training sets.

// or //

A model’s tendency to resist change. High bias == highly resistant to change (e.g. linear model).

11
Q

Define variance.

A

The model’s magnitude of change.

The amount of change in the hypothesis due to fluctuation in the training data.

12
Q

When is a hypothesis underfitting?

A

When it fails to find a pattern in the data.

13
Q

When is a hypothesis overfitting?

A

When it performs poorly on unseen data because it pays too much attention to the particular data set it is trained on.

14
Q

Bias-variance tradeoff

A

A choice between:
1. more complex, low-bias hypotheses that fit the training data well
2. simpler, low-variance hypotheses that may generalize better

15
Q

What is Ockham’s Razor Principle?

A

Choose the simplest hypothesis that matches the data because there is often a bias-variance tradeoff.

16
Q

Regarding decision trees, what is the most important attribute?

A

The one that makes the most difference to the classification of an example.
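
Information gain (the expected reduction in entropy from splitting on an attribute) is the standard way to quantify "makes the most difference". A minimal sketch, with toy data and function names of my own choosing:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Expected entropy reduction from splitting on `attribute`."""
    remainder = 0.0
    for v in set(ex[attribute] for ex in examples):
        subset = [y for ex, y in zip(examples, labels) if ex[attribute] == v]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

# Toy data: "raining" perfectly predicts the label, "windy" tells us nothing.
examples = [{"raining": 1, "windy": 0}, {"raining": 1, "windy": 1},
            {"raining": 0, "windy": 0}, {"raining": 0, "windy": 1}]
labels = ["stay", "stay", "go", "go"]
print(information_gain(examples, labels, "raining"))  # 1.0 (most important)
print(information_gain(examples, labels, "windy"))    # 0.0 (useless)
```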

17
Q

What is entropy?

A

A measure of the uncertainty of a random variable. (e.g., a loaded coin that always lands on heads has entropy of 0)
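
A minimal sketch of computing entropy for a discrete distribution (the function name is my own):

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# A fair coin is maximally uncertain; the always-heads loaded coin is not.
print(entropy([0.5, 0.5]))  # 1.0
print(entropy([1.0, 0.0]))  # 0.0
```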

18
Q

What if we don’t have enough data to make all three of the data set splits?

A

You can use k-fold cross-validation.
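
A minimal sketch of how k-fold splitting works (assuming the data length divides evenly by k; names are my own):

```python
def k_fold_splits(data, k):
    """Yield k (train, validation) splits; each example is held out exactly once."""
    fold_size = len(data) // k
    for i in range(k):
        start, end = i * fold_size, (i + 1) * fold_size
        validation = data[start:end]            # the held-out fold
        train = data[:start] + data[end:]       # everything else
        yield train, validation

# Each of the 5 folds takes a turn as validation data.
for train, val in k_fold_splits(list(range(10)), 5):
    print("validate on", val, "train on", train)
```

You would train k times, once per split, and average the validation scores.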

19
Q

In your own words, explain how the Gradient Descent algorithm works.

A

There is a loop, and on every iteration you modify your weights by slightly increasing or decreasing them in the direction that reduces the loss (the negative of the gradient), until you reach a point where the loss is (locally) minimal.
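
A minimal sketch for a one-weight linear model fit by least squares (the learning rate, step count, and names are my own choices):

```python
def gradient_descent(xs, ys, lr=0.05, steps=500):
    """Fit y ~ w * x by repeatedly stepping the weight against the
    gradient of the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = (2 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= lr * grad  # small step downhill
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by y = 2x
print(gradient_descent(xs, ys))  # converges close to 2.0
```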

20
Q

Why would we want a machine to learn?

A
  1. The designers cannot anticipate all possible future situations
  2. Sometimes the designers have no idea how to program a solution
21
Q

Classification learning problem

A

When the output is one of a finite set of values (e.g., boolean, sunny/rainy)

22
Q

Regression learning problem

A

When the output is a number (integer or real number)

23
Q

Supervised Learning

A

The agent observes input-output pairs and learns a function that maps from input to output (aka Labels). There is always an expected label for each processed input.

24
Q

Unsupervised Learning

A

The agent learns patterns in the input without any explicit feedback.

(Unlike supervised learning, there are no expected output labels.)

25
Q

Reinforcement Learning

A

The agent learns from a series of reinforcements known as rewards and punishments. There are no explicit labels; instead, the reward signal acts as a critic.

(e.g., chess game: agent won = reward, agent lost = punishment)

26
Q

What is the Ground Truth?

A

Output yi - the true answer we’re asking our model to predict

27
Q

Why not let H be the class of all computer programs, or all Turing machines?

A
  1. There is a tradeoff between the expressiveness of a hypothesis space and the computational complexity of finding a good hypothesis within that space
  2. Simpler hypothesis spaces should be preferred because we want to use h after it’s learned
28
Q

Decision Boundary

A

A line (or a surface, in higher dimensions) that separates two classes.

(A decision boundary that is a straight line is referred to as a linear separator)

29
Q

When is data linearly separable?

A

When it admits a linear decision boundary (linear separator)
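
The classic perceptron update rule finds such a separator whenever one exists; a minimal sketch on AND-like data (names and epoch count are my own choices):

```python
def perceptron(points, labels, epochs=100):
    """Learn weights w and bias b so that sign(w.x + b) matches each label
    (+1 or -1). Converges only if the data is linearly separable."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                # Misclassified: nudge the boundary toward this point.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

# AND-like data is linearly separable, so the perceptron finds a separator.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
w, b = perceptron(points, labels)
print(all(y * (w[0] * x[0] + w[1] * x[1] + b) > 0
          for x, y in zip(points, labels)))  # True
```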

30
Q

What is a parametric model?

A

A learning model that summarizes data with a set of parameters of fixed size.

31
Q

What is a non-parametric model?

A

A learning model that cannot be characterized by a bounded set of parameters.

(e.g., instance-based or memory-based learning)