Chapter 1-2 What is Imbalanced Classification Flashcards

1
Q

What’s the difference between unbalanced and imbalanced data? P 21

A

Unbalanced refers to a class distribution that was once balanced but no longer is, whereas imbalanced refers to a class distribution that is inherently not balanced.

2
Q

How can we generate an artificial dataset for classification problems? P 27

A

The make_blobs() function from scikit-learn (sklearn.datasets) can be used to generate a specified number of examples from a synthetic test classification problem with a specified number of classes.

from sklearn.datasets import make_blobs
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
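As a minimal sketch of how the call above behaves, the snippet below generates the same dataset and counts the examples per class. The second call is an assumption added for illustration, not taken from the source: it passes a list of per-class counts as n_samples (with centers=None) to get a 99:1 split, and the 990/10 figures are arbitrary.

```python
from collections import Counter

from sklearn.datasets import make_blobs

# Balanced case: an integer n_samples is divided equally among the centers.
X, y = make_blobs(n_samples=1000, centers=2, n_features=2,
                  random_state=1, cluster_std=3)
balanced_counts = {int(label): n for label, n in Counter(y).items()}
print(X.shape, balanced_counts)

# Imbalanced case (illustrative assumption): a list of per-class counts.
# When n_samples is a list, centers must be None or an array of centers.
X_imb, y_imb = make_blobs(n_samples=[990, 10], centers=None, n_features=2,
                          random_state=1, cluster_std=3)
imbalanced_counts = {int(label): n for label, n in Counter(y_imb).items()}
print(X_imb.shape, imbalanced_counts)
```

In the second dataset, label 0 is the majority class and label 1 the minority class, because labels follow the order of the counts in the list.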
3
Q

Why should we make sure the minority class is denoted as 1 and the majority is denoted as 0? P 33

A

When working with binary classification problems, especially imbalanced problems, it is important that the majority class is assigned to class 0 and the minority class is assigned to class 1. This is because many evaluation metrics assume this relationship and treat class 1 as the positive class by default.
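To illustrate the point, here is a small sketch using scikit-learn's precision_score and recall_score, whose pos_label parameter defaults to 1. The labels and predictions are made up for the example and do not come from the source.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 0 = majority class, 1 = minority class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# With the default pos_label=1, the scores describe the minority class.
minority_precision = precision_score(y_true, y_pred)
minority_recall = recall_score(y_true, y_pred)

# Scoring the majority class requires setting pos_label=0 explicitly.
majority_recall = recall_score(y_true, y_pred, pos_label=0)

print(minority_precision, minority_recall, majority_recall)
```

If the labels were swapped so that the minority class was 0, the default scores would silently describe the majority class and look far better than the model's performance on the class of interest.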
