Supervised Learning Flashcards

Includes the topics: Input representation, hypothesis class, version space, VC Dimension, PAC, Noise, Learning multiple classes, model selection and generalization.

1
Q

What do you mean by the input representation of a problem?

A

A real-world problem can have a large number of input features, but not all of these features are important or relevant.

Only the significant features need to be considered when assigning class labels. These “input features” constitute the “input representation” for the given problem.

2
Q

What is a hypothesis?

A

It is a statement or a proposition that explains the given set of facts or observations.

3
Q

What is a hypothesis space?

A

It is the set of hypotheses for a problem.

4
Q

What do you mean by consistency?

A

A hypothesis is said to be consistent if h(x) = c(x) for every training example x,
where h(x) is the hypothesis function and
c(x) is the true class label.
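As a toy illustration (the helper name `is_consistent` and the example data are invented for this sketch), consistency can be checked by comparing the hypothesis to the class labels on every training example:

```python
# Hypothetical sketch: check whether a hypothesis h is consistent
# with the target concept c on a set of training examples.

def is_consistent(h, c, examples):
    """True iff h(x) == c(x) for every training example x."""
    return all(h(x) == c(x) for x in examples)

# Toy target concept c(x) = (x >= 3) and a candidate hypothesis.
c = lambda x: x >= 3
h = lambda x: x > 2  # agrees with c on all integers
print(is_consistent(h, c, [0, 1, 2, 3, 4, 5]))  # True
```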

5
Q

What is version Space?

A

It is the set of hypotheses consistent with the set of training examples.

The version space lies between the most general and the most specific hypotheses.
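As a toy sketch (invented data and helper name), the version space for 1-D threshold hypotheses h_t(x) = (x ≥ t) is just the set of thresholds consistent with every example; the smallest surviving threshold plays the role of the most general boundary and the largest the most specific:

```python
# Hypothetical sketch: enumerate the version space for 1-D threshold
# hypotheses h_t(x) = (x >= t) on a toy labeled data set.

def version_space(thresholds, data):
    """Thresholds t whose hypothesis x >= t fits every example."""
    return [t for t in thresholds
            if all((x >= t) == label for x, label in data)]

data = [(1, False), (2, False), (5, True), (7, True)]
print(version_space(range(10), data))  # [3, 4, 5]
```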

6
Q

When can we say that a hypothesis is consistent?

A

A hypothesis is consistent if it correctly classifies all training examples.

7
Q

What do you mean by the Most General Hypothesis(G)?

A

A hypothesis h is said to be the most general hypothesis if it covers none of the negative examples and there is no other hypothesis h′ covering no negative examples such that h′ is more general than h.

8
Q

What do you mean by the Most Specific Hypothesis(S)?

A

A hypothesis h is said to be the most specific hypothesis if it covers no negative examples and there is no other hypothesis h′ covering no negative examples such that h′ is more specific than h.

9
Q

What is noise?

A

Noise is any unwanted anomaly in the data.

10
Q

How does noise arise?

A

The factors affecting the creation of noise are:
1. Imprecision in recording the input attributes, which may shift the data points in the input space.
2. Errors in labelling the data points.
3. Neglecting attributes that are relevant to the prediction of labels.

11
Q

What are the effects of noise?

A
  1. Distorts the data.
  2. Leads to wrong predictions.
  3. Reduces the accuracy of the model.
  4. Increases the complexity of the induced classifier.
  5. Increases training time.
12
Q

What methods are used in learning multiple classes?

A
  1. One-against-all
  2. One-against-one
13
Q

Explain the One-vs-all approach.

A

In this approach, n classifiers are trained in parallel for the n output classes, each one learning to separate a single class from all the remaining classes.
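A minimal sketch of this idea, using an invented nearest-centroid scorer so the example stays self-contained (all names and data here are hypothetical, not a real library API):

```python
# Hypothetical one-vs-all sketch: one binary scorer per class;
# prediction picks the class whose scorer is most confident.

def train_one_vs_all(X, y, classes):
    models = {}
    for c in classes:
        members = [x for x, label in zip(X, y) if label == c]
        centroid = sum(members) / len(members)
        # Toy binary "classifier": higher score = closer to class c.
        models[c] = (lambda cen: lambda x: -abs(x - cen))(centroid)
    return models

def predict(models, x):
    return max(models, key=lambda c: models[c](x))

X = [1.0, 1.2, 5.0, 5.3, 9.0, 9.1]
y = ["a", "a", "b", "b", "c", "c"]
models = train_one_vs_all(X, y, ["a", "b", "c"])
print(predict(models, 5.1))  # b
```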

14
Q

Explain the One-vs-one approach.

A

One-vs-one is an alternative to One-vs-all: a model is trained for every pair of classes, i.e. n(n−1)/2 models for n classes, so the number of models is not linear in n. The predicted class is the one chosen by the majority of the pairwise models. In general, One-vs-one is more expensive than One-vs-all, and it is mainly useful when training each model on the complete data set is not preferred.
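A toy sketch of the pairwise scheme (invented helper names and data; the pairwise rule is a trivial centroid comparison, not a real classifier):

```python
# Hypothetical one-vs-one sketch: one pairwise model per pair of
# classes (n*(n-1)/2 models), prediction by majority vote.

from itertools import combinations
from collections import Counter

def train_one_vs_one(X, y, classes):
    models = {}
    for a, b in combinations(classes, 2):
        # Toy pairwise rule: pick the class with the nearer centroid.
        ca = sum(x for x, l in zip(X, y) if l == a) / y.count(a)
        cb = sum(x for x, l in zip(X, y) if l == b) / y.count(b)
        models[(a, b)] = (lambda ca, cb, a, b:
                          lambda x: a if abs(x - ca) < abs(x - cb) else b
                          )(ca, cb, a, b)
    return models

def predict(models, x):
    votes = Counter(m(x) for m in models.values())
    return votes.most_common(1)[0][0]

X = [1.0, 1.2, 5.0, 5.3, 9.0, 9.1]
y = ["a", "a", "b", "b", "c", "c"]
models = train_one_vs_one(X, y, ["a", "b", "c"])
print(len(models), predict(models, 8.8))  # 3 c
```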

15
Q

What do you mean by model selection?

A

It is the process of selecting a model for a problem. It may include selecting appropriate algorithms, choosing the set of input features, or choosing the initial values for certain parameters. It has also been described as the process of selecting the right inductive bias.

16
Q

What is Inductive Bias?

A

The set of assumptions we make to make learning possible is called the inductive bias of the learning algorithm.
One way we introduce inductive bias is by choosing a hypothesis class.

17
Q

What are the advantages in using a simple model?

A
  1. Easy to use.
  2. Easy to train.
  3. Easy to explain.
  4. Easy to generalize (Occam’s razor).
18
Q

What is Occam’s razor?

A

It states that simpler explanations are more plausible, and any unnecessary complexity should be shaved off.

19
Q

What do you mean by Generalization?

A

It refers to how well a model trained on the training set predicts the right output for new instances.
It reflects how well the concepts have been learned by a machine learning model.
A well-generalized model reduces the occurrence of both underfitting and overfitting.

20
Q

What is underfitting?

A

It is a scenario where the model is unable to capture the relationship between the input and output variables to give accurate results.

21
Q

What is overfitting?

A

It occurs when a model fits a limited set of training data points too closely, so that it captures their noise and fails to generalize to new data.

22
Q

What is cross-validation?

A

The training data is divided into two parts: the training set and the validation set.
Hypotheses are fitted on the training set, and the one that is most accurate on the validation set is selected as the best. This process is called cross-validation.

23
Q

What is VC Dimension?

A

It is a measure of the capacity (complexity, expressive power, flexibility) of the space of functions that can be learned by a classification algorithm.

It measures the power of the hypothesis space.

The VC dimension is the maximum number of data points that can be shattered by the hypothesis space.

24
Q

What do you mean by shattering?

A

Shattering is the ability of a hypothesis class to classify a set of points perfectly under every possible labeling.

A hypothesis class shatters a set of points if, for each possible assignment of labels to those points, some hypothesis in the class divides the two classes correctly.
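A small brute-force sketch (invented helper names): a class shatters a point set iff its hypotheses realize all 2^N labelings. Here the toy class is 1-D thresholds together with their complements:

```python
# Hypothetical sketch: a hypothesis class shatters a point set iff
# it realizes every one of the 2^N possible labelings.

def shatters(hypotheses, points):
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

# 1-D thresholds x >= t plus their complements x < t.
H = [lambda x, t=t: x >= t for t in range(6)] + \
    [lambda x, t=t: x < t for t in range(6)]

print(shatters(H, [2]))        # True
print(shatters(H, [1, 3]))     # True
print(shatters(H, [1, 3, 5]))  # False: (T, F, T) is unreachable
```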

25
Q

What is dichotomy?

A

Dichotomies refer to the division of data into two categories or classes.

N points -> 2^N dichotomies
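For instance, the 2^N dichotomies of N points can be enumerated directly (a toy sketch with invented names):

```python
# Hypothetical sketch: every dichotomy of N points is one way of
# assigning each point to class 0 or class 1.
from itertools import product

points = ["p1", "p2", "p3"]
dichotomies = list(product([0, 1], repeat=len(points)))
print(len(dichotomies))  # 8 == 2**3
```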

26
Q

Can a line hypothesis shatter 2 data points?

A

Yes. Two points admit 2^2 = 4 labelings, and each can be realized by a line (include the graphical representation).

27
Q

Can a line hypothesis shatter 3 data points?

A

Yes, provided the three points are not collinear: each of the 2^3 = 8 labelings can be realized by a line (include the graphical representation).

28
Q

Can a line hypothesis shatter 4 data points?

A

No. A line hypothesis cannot shatter 4 data points; for example, the XOR labeling, where diagonally opposite points share a class, cannot be separated by any line. Hence the VC dimension of a line is 3.

29
Q

How can you shatter 4 data points?

A

We can shatter 4 data points using an axis-parallel (axis-aligned) rectangle (include the graphical representation).

30
Q

What is Probably Approximately Correct Learning?

A

Probably approximately correct (PAC) learning is a framework for the mathematical analysis of machine learning.

In this framework, the learner receives samples and must select a generalization function (called the hypothesis) from a certain class of possible functions. The goal is that, with high probability (the “probably” part), the selected function will have low generalization error (the “approximately correct” part).
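For a finite hypothesis class and a learner that outputs a consistent hypothesis, a standard PAC bound states that m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice. A small sketch (the function name is hypothetical):

```python
# Hypothetical sketch of the standard PAC sample-complexity bound
# for a finite hypothesis class H and a consistent learner:
#   m >= (1/epsilon) * (ln|H| + ln(1/delta))
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Examples sufficient so that, with probability >= 1 - delta,
    any consistent hypothesis has error <= epsilon."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# |H| = 1000, error <= 0.1, confidence 95%:
print(pac_sample_bound(1000, 0.1, 0.05))  # 100
```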