Module 12: Naive Bayes and Perceptrons Flashcards

1
Q

T/F
The Naive Bayes model is “naive” because it assumes that the features are conditionally independent of each other, given the class.

A

True

2
Q

Write down the form of the joint probability model $P(X_1, X_2, X_3, Y)$ for this data using the Naive Bayes assumption. (Y is the class.)

A

𝑃(π‘Œ)𝑃(𝑋1|π‘Œ)𝑃(𝑋2|π‘Œ)𝑃(𝑋3|π‘Œ)

In Naive Bayes, the features are conditionally independent of each other, given the class, so the joint probability factors into the class prior times one per-feature conditional.
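The factorization can be sketched in a few lines of Python. This is only an illustration: the class names, the binary feature values, and every probability in the tables below are made-up assumptions, not values from the card's dataset.

```python
from itertools import product


def naive_bayes_joint(prior, likelihoods, y, xs):
    """Return P(x1, ..., xn, y) = P(y) * product over i of P(xi | y)."""
    p = prior[y]
    for table, x in zip(likelihoods, xs):
        p *= table[y][x]
    return p


# Hypothetical class prior P(Y) over two classes.
prior = {"spam": 0.4, "ham": 0.6}

# Hypothetical conditional tables, one per binary feature.
likelihoods = [
    {"spam": {0: 0.2, 1: 0.8}, "ham": {0: 0.7, 1: 0.3}},  # P(X1 | Y)
    {"spam": {0: 0.5, 1: 0.5}, "ham": {0: 0.9, 1: 0.1}},  # P(X2 | Y)
    {"spam": {0: 0.1, 1: 0.9}, "ham": {0: 0.6, 1: 0.4}},  # P(X3 | Y)
]

# Joint probability of one assignment: P(X1=1, X2=0, X3=1, Y=spam).
joint = naive_bayes_joint(prior, likelihoods, "spam", (1, 0, 1))
print("P(X1=1, X2=0, X3=1, Y=spam) =", joint)

# A valid joint distribution sums to 1 over all assignments.
total = sum(
    naive_bayes_joint(prior, likelihoods, y, xs)
    for y in prior
    for xs in product((0, 1), repeat=3)
)
print("sum over all assignments =", total)
```

Only the prior and three small conditional tables are stored, rather than one entry for every combination of feature values and class.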

3
Q

T/F
In the perceptron training algorithm, weights are updated after every training instance.

A

False

Training is error-driven: weights are updated only when the predicted label is incorrect.
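A minimal sketch of the error-driven update, assuming labels in {-1, +1}, a learning rate of 1, and a separate bias term. The function names and the toy AND dataset are illustrative assumptions, not part of the course material.

```python
def predict(w, b, x):
    """Perceptron inference: weighted sum of the inputs, then a threshold at 0."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation > 0 else -1


def train_perceptron(data, epochs=20):
    """Train on (features, label) pairs with labels in {-1, +1}.

    Returns the weights, the bias, and the number of updates performed.
    """
    w = [0.0] * len(data[0][0])
    b = 0.0
    updates = 0
    for _ in range(epochs):
        for x, y in data:
            if predict(w, b, x) != y:  # error-driven: correct predictions change nothing
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                updates += 1
    return w, b, updates


# Hypothetical, linearly separable toy data (logical AND).
and_data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b, updates = train_perceptron(and_data)
instances_seen = 20 * len(and_data)
print(f"updates: {updates} out of {instances_seen} instances seen")
```

The update count comes out well below the number of instances seen, because once the boundary classifies every point correctly no further updates occur.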

4
Q

Consider a two-dimensional data distribution where points belonging to class A are arranged around the origin in a circle with radius $r_a$, and points belonging to class B are arranged around the origin in a circle with radius $r_b$, where $r_a < r_b$.

When applied to this classification problem, the perceptron algorithm

A

can separate the two classes if we use feature augmentation.

If we augment the input with the features $x^2$ and $y^2$, and a bias term, the problem becomes linearly separable. Decaying the learning rate will ensure that the weight vector converges, but since the unaugmented data is not linearly separable, this will still fail to separate the classes. Changing the order in which the instances are fed into the perceptron will not allow the perceptron to find a separating boundary.
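The effect of augmentation can be checked with a small experiment. The sketch below assumes radii of 1 and 3, eight evenly spaced points per circle, labels in {-1, +1}, and a plain perceptron with learning rate 1; all of these choices and all function names are illustrative assumptions.

```python
import math


def points_on_circle(radius, n=8):
    """Return n evenly spaced points on a circle centered at the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def augment(point):
    """Map (x, y) to the squared features (x^2, y^2)."""
    x, y = point
    return (x * x, y * y)


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def train_perceptron(data, epochs=1000):
    """Error-driven training; stops early once an epoch has no mistakes."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, b, x) != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def accuracy(w, b, data):
    return sum(predict(w, b, x) == y for x, y in data) / len(data)


r_a, r_b = 1.0, 3.0  # assumed radii with r_a < r_b
raw = [(p, -1) for p in points_on_circle(r_a)] + [(p, 1) for p in points_on_circle(r_b)]
augmented = [(augment(p), y) for p, y in raw]

w_raw, b_raw = train_perceptron(raw)
w_aug, b_aug = train_perceptron(augmented)
print("raw accuracy:", accuracy(w_raw, b_raw, raw))
print("augmented accuracy:", accuracy(w_aug, b_aug, augmented))
```

In the augmented space the rule $x^2 + y^2 > c$ is linear in the features for any threshold $c$ between $r_a^2$ and $r_b^2$, which is why the perceptron can find a boundary there but not on the raw coordinates.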

5
Q

For which of the following datasets would a single perceptron be unable to find a separating boundary (assuming we do not add additional features to the data):

A

Class A: {(0,0), (1,1)}; Class B: {(0,1), (1,0)}

The correct answer corresponds to the XOR problem, which is not linearly separable. All other options are linearly separable.
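As a quick sanity check (not a proof), one can search a small grid of integer weights and biases for a linear threshold rule that classifies every point. The grid range and the AND dataset used for contrast are assumptions made for illustration; the other answer options are not reproduced here.

```python
from itertools import product


def separates(w1, w2, b, data):
    """True if the rule 'class A iff w1*x1 + w2*x2 + b > 0' fits every point."""
    return all(
        (w1 * x1 + w2 * x2 + b > 0) == (label == "A")
        for (x1, x2), label in data
    )


# The XOR dataset from the card.
xor_data = [((0, 0), "A"), ((1, 1), "A"), ((0, 1), "B"), ((1, 0), "B")]

# A hypothetical linearly separable dataset (logical AND) for contrast.
and_data = [((1, 1), "A"), ((0, 0), "B"), ((0, 1), "B"), ((1, 0), "B")]

grid = range(-5, 6)  # candidate integer values for w1, w2, and b
xor_solutions = [p for p in product(grid, repeat=3) if separates(*p, xor_data)]
and_solutions = [p for p in product(grid, repeat=3) if separates(*p, and_data)]

print("separating rules found for XOR:", len(xor_solutions))
print("separating rules found for AND:", len(and_solutions))
```

The search finds no rule for XOR, while several exist for AND. The general argument is that any line with (0,0) and (1,1) on the same side must also put at least one of (0,1) and (1,0) on that side.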

6
Q

T/F
Given a non-random dataset, the perceptron algorithm guarantees that the weight vector will converge after a finite number of training iterations.

A

False

This is only true if the dataset is linearly separable. If it is not, the weight vector may fluctuate indefinitely. The fluctuation can be remedied by techniques such as learning rate decay, but the basic perceptron algorithm is not guaranteed to converge.

7
Q

The inference procedure for the perceptron classification algorithm is best summarized as:

A

Compute a weighted sum, then apply a threshold.

First, we compute the dot product of the weight vector and the input vector (that is, a weighted sum of the input features), and then threshold this value to output either true or false. If you picked the “datum separatus” option, you should consider transferring to Hogwarts.
