Topic 2: Classification Flashcards

1
Q

What does the Naïve Bayes classifier assume about input features?

A

It assumes that input features are independent given the class label.

2
Q

What is the purpose of the m-estimate in Naïve Bayes classifiers?

A

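To smooth the probability estimates used by Naïve Bayes so that a feature value never seen with a class in training does not get probability zero. It blends the observed frequency with a prior estimate p, weighted by an equivalent sample size m that expresses how much the prior is trusted relative to the data: (n_c + m × p) / (n + m).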
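
A minimal sketch of the m-estimate in its commonly used form, (n_c + m × p) / (n + m). The function name and the example counts are illustrative assumptions, not part of the original card.

```python
def m_estimate(n_c, n, p, m):
    """Smoothed estimate of P(feature value | class).

    n_c: training examples of the class that have the feature value
    n:   training examples of the class
    p:   prior estimate of the probability (e.g. 1 / number of values)
    m:   equivalent sample size, i.e. how much weight the prior gets
    """
    return (n_c + m * p) / (n + m)

# A value never seen with this class still gets a non-zero probability.
unseen = m_estimate(n_c=0, n=10, p=0.5, m=2)   # (0 + 1) / 12

# With m = 0 the estimate falls back to the raw frequency n_c / n.
raw = m_estimate(n_c=3, n=10, p=0.5, m=0)
```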

3
Q

How is the ROC curve used in binary classification?

A

It plots the true positive rate versus the false positive rate as the discrimination threshold is varied.
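
As an illustration, the threshold sweep can be sketched in plain Python. The scores and labels are made-up data, and `roc_points` is a hypothetical helper name, not a library function.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

scores = [0.9, 0.8, 0.4, 0.3]   # hypothetical classifier scores
labels = [1, 0, 1, 0]           # 1 = positive, 0 = negative
curve = roc_points(scores, labels)
```

Each lower threshold admits more predicted positives, so the curve moves from (0, 0) toward (1, 1).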

4
Q

What is the difference between regression and classification problems?

A

Regression deals with continuous output values, while classification deals with discrete classes.

5
Q

Name a simple probabilistic classifier studied in classification tasks.

A

Naïve Bayes classifier.

6
Q

What is the purpose of splitting a dataset into training, validation, and test sets?

A

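To keep three jobs separate: the training set fits the model, the validation set is used to tune hyperparameters and select between candidate models, and the test set is held back to estimate the chosen model's performance on unseen data.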
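
A minimal sketch of a three-way split using only the standard library. The 60/20/20 proportions, the seed, and the function name are assumptions chosen for illustration.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of rows and cut it into train, validation, and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # whatever remains
    return train, val, test

train, val, test = split_dataset(list(range(10)))
```

Shuffling before cutting avoids a split that follows the original ordering of the data; the three parts never share a row.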

7
Q

What does the area under the ROC curve (AUC) indicate?

A

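A single-number summary of the ROC curve: the probability that the classifier scores a randomly chosen positive example higher than a randomly chosen negative one. An AUC of 1.0 means perfect ranking and 0.5 means no better than chance, so a higher AUC indicates better classifier performance.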
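
One way to see what AUC measures is to compute it as a ranking probability over all positive/negative pairs. This is a sketch on made-up scores; `auc_by_ranking` is a hypothetical helper, and ties are counted as half a win by the usual convention.

```python
def auc_by_ranking(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical scores: one positive (0.4) is ranked below one negative (0.8).
auc = auc_by_ranking([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

Three of the four positive/negative pairs are ordered correctly here, so the value is 0.75.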

8
Q

What is classification?

A

A supervised learning task where the goal is to assign labels to data points based on input features.

9
Q

What is the difference between binary and multiclass classification?

A

Binary classification involves two possible outcomes (e.g., spam vs. not spam).
Multiclass classification involves three or more possible outcomes (e.g., dog, cat, rabbit).

10
Q

Name three common algorithms for classification.

A

Decision trees, logistic regression, and support vector machines (SVM).

11
Q

What is a classifier?

A

A model that has been trained on labeled data to predict labels for new data points.

17
Q

What is accuracy?

A

The ratio of correctly predicted observations to the total observations: (TP + TN) / (TP + TN + FP + FN)

18
Q

Why might accuracy not be enough in some cases?

A

It can be misleading when dealing with imbalanced datasets where one class dominates.
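
A small made-up example shows the problem: a classifier that always predicts the majority class scores high accuracy while finding no positives at all. The 95/5 class split is an assumption for illustration.

```python
# Hypothetical imbalanced test set: 95 negatives (0) and 5 positives (1).
actual = [0] * 95 + [1] * 5
# A useless classifier that always predicts the majority class.
predicted = [0] * 100

correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

# Recall on the positive class exposes what accuracy hides.
true_positives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
recall = true_positives / sum(actual)
```

Accuracy comes out at 0.95 even though recall on the positive class is 0.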

19
Q

What is precision?

A

The ratio of true positives to all predicted positives:
TP / (TP + FP)

20
Q

What is recall?

A

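The ratio of true positives to all actual positives (also called sensitivity or the true positive rate):
TP / (TP + FN)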

21
Q

What is fallout?

A

The ratio of false positives to all actual negatives (the false positive rate):
FP / (FP + TN)
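
Precision, recall, and fallout are all simple ratios of confusion-matrix counts. A minimal sketch, with made-up counts and function names chosen for illustration:

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of actual positives that were found (true positive rate)."""
    return tp / (tp + fn)

def fallout(fp, tn):
    """Share of actual negatives wrongly flagged (false positive rate)."""
    return fp / (fp + tn)

# Hypothetical counts. Each function raises ZeroDivisionError if its
# denominator is zero, e.g. precision when nothing is predicted positive.
tp, fp, fn, tn = 40, 10, 20, 30
p = precision(tp, fp)   # 40 / 50
r = recall(tp, fn)      # 40 / 60
f = fallout(fp, tn)     # 10 / 40
```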

22
Q

What is F-measure?

A

In its most common form, the F1 score: the harmonic mean of precision and recall, useful for evaluating models when the classes are imbalanced:
2 × Precision × Recall / (Precision + Recall)
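
A short sketch of why the harmonic mean is used: it stays low unless both precision and recall are high. The input values are made up, and returning 0.0 when both inputs are zero is a convention assumed here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0   # assumed convention to avoid dividing by zero
    return 2 * precision * recall / (precision + recall)

balanced = f1_score(0.8, 0.8)   # equal inputs give the same value back
lopsided = f1_score(1.0, 0.1)   # far below the arithmetic mean of 0.55
```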

23
Q

What is a confusion matrix?

A

A table that shows the performance of a classification algorithm by comparing actual vs. predicted classes.
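
A minimal sketch that tallies a confusion matrix as counts of (actual, predicted) pairs. The labels are made up and `confusion_matrix` here is a hypothetical helper, not a library call.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

actual    = ["spam", "spam", "ham", "ham",  "ham"]
predicted = ["spam", "ham",  "ham", "spam", "ham"]
counts = confusion_matrix(actual, predicted)

# Treating "spam" as the positive class:
tp = counts[("spam", "spam")]   # spam correctly flagged
fn = counts[("spam", "ham")]    # spam that was missed
fp = counts[("ham", "spam")]    # ham wrongly flagged
tn = counts[("ham", "ham")]     # ham correctly passed
```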

24
Q

What is a support vector machine (SVM)?

A

A model that finds the hyperplane separating the classes with the largest margin, that is, the greatest distance to the nearest training points of each class.
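
Once a hyperplane w·x + b = 0 is known, prediction is just a check of which side a point falls on. The sketch below covers prediction only, not training; the weights and bias are hypothetical values standing in for a trained model.

```python
import math

# Hypothetical weights and bias, as if produced by training.
w = [3.0, 4.0]
b = -5.0

def side_of_hyperplane(x):
    """Return +1 or -1 depending on which side of w.x + b = 0 the point lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# For a max-margin SVM scaled so the closest points satisfy |w.x + b| = 1,
# the gap between the two classes is 2 / ||w||.
margin_width = 2 / math.hypot(*w)
```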

25
Q

What is Naïve Bayes classification?

A

A probabilistic classifier based on Bayes’ theorem that assumes the features are conditionally independent given the class label, and predicts the class with the highest posterior probability.
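
A toy sketch of the scoring rule: multiply the class prior by the per-feature likelihoods and pick the class with the largest product. The probability tables are invented for illustration; a real model would estimate them from training data.

```python
import math

# Hypothetical probabilities, not learned from any real dataset.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham":  {"free": 0.2, "meeting": 0.7},
}

def classify(words):
    """Return the best label and the unnormalised score of every label."""
    scores = {}
    for label, prior in priors.items():
        # Conditional independence: the joint likelihood is a plain product.
        scores[label] = prior * math.prod(likelihoods[label][w] for w in words)
    return max(scores, key=scores.get), scores

label_one, _ = classify(["free"])
label_two, _ = classify(["free", "meeting"])
```

The scores are proportional to the posteriors, which is enough for picking the most probable class.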

26
Q

How can you handle class imbalance?

A

Techniques include oversampling the minority class, undersampling the majority class, or using weighted loss functions.
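
As one example, random oversampling of the minority class can be sketched with the standard library. The function name, seed, and toy data are assumptions for illustration.

```python
import random

def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = max(len(group) for group in by_label.values())
    out_rows, out_labels = [], []
    for label, group in by_label.items():
        # Sample with replacement to fill the gap for smaller classes.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

rows = ["a", "b", "c", "d", "e"]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = oversample_minority(rows, labels)
```

Resampling like this is normally applied to the training split only, so that validation and test data keep the original class distribution.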