Lecture 3 - Multi-class classification and regression Flashcards

1
Q

Give an example of multi-class classification

A
  1. Disease type diagnosis
  2. Topic classification
2
Q

What are the two approaches for turning a binary classifier into a multi-class classifier?

A
  1. One versus rest
  2. One versus one
3
Q

What is one-versus-rest?

A

Each classifier distinguishes between one specific class and all other classes combined. The class with the highest confidence score from its respective classifier is chosen as the final prediction.

4
Q

What is the process of learning and inference in one-versus-rest?

A

Learning: train k or k-1 separate classifiers, where k is the number of classes.
Inference: use all the classifiers and form a code word from their outputs. Next, compare the code word against all the rows of the code matrix and find the nearest row.
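A minimal Python sketch of one-versus-rest inference, assuming each trained binary classifier exposes a real-valued confidence score for "its" class; the toy scoring functions and class centres below are illustrative, not from the slides:

```python
# One-versus-rest inference sketch: each binary classifier scores
# its own class against the rest; the highest-scoring class wins.

def ovr_predict(x, classifiers):
    # classifiers: dict mapping class label -> scoring function
    scores = {label: clf(x) for label, clf in classifiers.items()}
    return max(scores, key=scores.get)

# Toy 1-D example: three classes centred at 0, 5, and 10.
classifiers = {
    "A": lambda x: -abs(x - 0),   # higher score = closer to the class centre
    "B": lambda x: -abs(x - 5),
    "C": lambda x: -abs(x - 10),
}

print(ovr_predict(4.2, classifiers))  # nearest centre is 5 -> "B"
```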

5
Q

What is one-versus-one?

A

A strategy for multi-class classification in which a separate binary classifier is trained for every possible pair of classes.

6
Q

For n classes, this results in n(n−1)/2 classifiers in the symmetric case and n(n−1) in the asymmetric case.

To which approach does this apply?

A

one versus one
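The two counts can be checked with a small sketch (the function name is illustrative):

```python
# Number of pairwise classifiers in one-versus-one for n classes:
# n(n-1)/2 when each pair shares one symmetric classifier,
# n(n-1) when each ordered pair gets its own (asymmetric) classifier.

def ovo_counts(n):
    return n * (n - 1) // 2, n * (n - 1)

print(ovo_counts(4))   # (6, 12)
print(ovo_counts(10))  # (45, 90)
```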

7
Q

What is the process of training and inference in one-versus-one?

A

Training: train a separate classifier for each pair of classes.
Inference: use all the classifiers and form a code word from their outputs. Next, compare the code word against all the rows of the code matrix and find the nearest row. Use a voting scheme when the distances are not unique.
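A minimal Python sketch of one-versus-one inference with a majority vote over the pairwise classifiers; the toy pairwise classifiers and class centres are illustrative, not from the slides:

```python
from itertools import combinations
from collections import Counter

# One-versus-one inference sketch: one binary classifier per pair of
# classes; each casts a vote and the majority class wins.

def ovo_predict(x, classes, pair_clfs):
    # pair_clfs: dict mapping (class_i, class_j) -> function returning the winner
    votes = Counter(pair_clfs[pair](x) for pair in combinations(classes, 2))
    return votes.most_common(1)[0][0]

# Toy 1-D example: each pairwise "classifier" picks the nearer class centre.
centres = {"A": 0, "B": 5, "C": 10}
pair_clfs = {
    (i, j): (lambda x, i=i, j=j:
             i if abs(x - centres[i]) < abs(x - centres[j]) else j)
    for i, j in combinations(centres, 2)
}

print(ovo_predict(8.0, list(centres), pair_clfs))  # "C" wins 2 of 3 votes
```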

8
Q

How do you get the accuracy from a three-class confusion matrix?

A

Add up all the true positives (the diagonal) and divide by the total number of instances.
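A short sketch on a made-up 3x3 confusion matrix (rows = actual class, columns = predicted class is an assumed convention):

```python
# Accuracy from a 3-class confusion matrix: sum of the diagonal
# (the true positives of each class) divided by the total count.

cm = [
    [50,  3,  2],   # actual class 0
    [ 4, 40,  6],   # actual class 1
    [ 1,  5, 39],   # actual class 2
]

total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(len(cm)))
accuracy = correct / total
print(accuracy)  # (50 + 40 + 39) / 150 = 0.86
```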

9
Q

How do you get the (weighted-average) precision from a three-class confusion matrix?

A

Compute the precision for each class, weight it by that class's share of the total instances, then add the three weighted precisions together.
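A sketch of the weighted average on a made-up 3x3 confusion matrix (rows = actual class, columns = predicted class is an assumed convention):

```python
# Weighted-average precision from a 3-class confusion matrix.
# Per-class precision = TP_k / column sum; each precision is weighted by
# the class's share of all instances (its row sum / total).

cm = [
    [50,  3,  2],
    [ 4, 40,  6],
    [ 1,  5, 39],
]

n = len(cm)
total = sum(sum(row) for row in cm)
weighted_precision = 0.0
for k in range(n):
    col_sum = sum(cm[i][k] for i in range(n))   # everything predicted as class k
    precision_k = cm[k][k] / col_sum            # TP_k / (TP_k + FP_k)
    weight_k = sum(cm[k]) / total               # class k's share of the data
    weighted_precision += weight_k * precision_k

print(round(weighted_precision, 3))
```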

10
Q

How can we summarise how good a multi-class classifier is?

A
  1. macro-average
  2. micro-average
11
Q

What is macro-averaging?

A

Macro-averaging computes the metric independently for each class and then takes the average.

12
Q

What is micro-averaging?

A

Micro-averaging aggregates the contributions of all classes to compute the average metric.
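The difference between the two averages shows up when classes are imbalanced. A sketch on hypothetical per-class true-positive/false-positive counts (the counts are invented for illustration):

```python
# Macro- vs micro-averaged precision on the same per-class counts.
# Macro: average the per-class precisions (every class counts equally).
# Micro: pool all TP and FP first (every instance counts equally).

# Hypothetical per-class (TP, FP) counts; class "c" is rare and noisy.
counts = {"a": (90, 10), "b": (80, 20), "c": (1, 9)}

macro = sum(tp / (tp + fp) for tp, fp in counts.values()) / len(counts)

tp_total = sum(tp for tp, _ in counts.values())
fp_total = sum(fp for _, fp in counts.values())
micro = tp_total / (tp_total + fp_total)

print(round(macro, 3))  # the rare class drags the macro average down
print(round(micro, 3))  # pooling makes the rare class nearly invisible
```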

13
Q

How should AUC be used for multi-class classifiers?

A

Average the AUC over the binary classification tasks, in either a one-versus-rest or a one-versus-one setting.

14
Q

what does ROC stand for

A

Receiver operating characteristic

15
Q

What is regression loss

A

Regression models are evaluated by applying a loss function to the residuals, f(x) − f̂(x).
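A sketch of two common loss functions applied to the residuals; the data points are invented for illustration:

```python
# Common regression losses applied to residuals r_i = f(x_i) - fhat(x_i).

y_true = [3.0, -0.5, 2.0, 7.0]   # f(x)
y_pred = [2.5,  0.0, 2.0, 8.0]   # fhat(x)

residuals = [t - p for t, p in zip(y_true, y_pred)]
mse = sum(r * r for r in residuals) / len(residuals)   # squared loss
mae = sum(abs(r) for r in residuals) / len(residuals)  # absolute loss

print(mse)  # 0.375
print(mae)  # 0.5
```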

16
Q

How many parameters does an n-degree polynomial have?

A

n+1 parameters (one coefficient for each power of x, from x^0 up to x^n)

17
Q

How can overfitting be avoided in regression?

A

To avoid overfitting, the number of parameters estimated from the data must be considerably less than the number of data points.

18
Q

What is the bias-variance dilemma?

A

A low-complexity model suffers less from variability due to random variations in the training data, but may introduce a systematic bias that even large amounts of training data can't resolve; on the other hand, a high-complexity model eliminates such bias but can suffer non-systematic errors due to variance.

19
Q

In ____ learning the task is to come up with a description of the data

A

descriptive

20
Q

What is distance-based clustering?

A

Most distance-based clustering methods depend on the possibility of defining a 'centre of mass' or exemplar of an arbitrary set of instances, such that the exemplar minimises some distance-related quantity over all instances in the set, called its scatter. A good clustering is then one where the within-cluster scatter, summed over all clusters, is much smaller than the scatter of the entire data set.
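A minimal 1-D sketch of this criterion (the data and the squared-distance choice of scatter are illustrative assumptions):

```python
# Within-cluster scatter sketch: scatter of a set = sum of squared
# distances of its points to their mean (the "centre of mass").

def scatter(points):
    mean = sum(points) / len(points)
    return sum((p - mean) ** 2 for p in points)

# Two tight 1-D clusters vs. the whole data set.
c1, c2 = [1.0, 1.5, 2.0], [8.0, 8.5, 9.0]
within = scatter(c1) + scatter(c2)
total = scatter(c1 + c2)
print(within, total)  # a good clustering: within-cluster scatter << total scatter
```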

21
Q

What is purity

A

purity = (1/N) Σ_k max_j |ω_k ∩ c_j|

where N is the number of instances, ω_k is the set of instances in cluster k, and c_j is the set of instances with class label j.
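A small sketch of the purity computation; the clusters and labels are invented for illustration:

```python
from collections import Counter

# Purity sketch: for each cluster, count its most common true class,
# sum those counts over all clusters, and divide by the number of
# instances N.

def purity(clusters):
    # clusters: list of lists of true class labels, one list per cluster
    n = sum(len(c) for c in clusters)
    return sum(Counter(c).most_common(1)[0][1] for c in clusters) / n

# Toy example: 3 clusters over classes x, o, d (17 instances in total).
clusters = [
    ["x"] * 5 + ["o"],              # majority class x (5)
    ["o"] * 4 + ["x"],              # majority class o (4)
    ["d"] * 3 + ["x"] * 2 + ["o"],  # majority class d (3)
]
print(round(purity(clusters), 3))  # (5 + 4 + 3) / 17 ≈ 0.706
```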

22
Q

What are the three ways to evaluate clustering performance without ground truth?

A
  1. Calinski-Harabasz Index
  2. Davies-Bouldin Index
  3. Silhouette Coefficient
23
Q

What is the silhouette coefficient formula?

A

s = (b − a) / max(a, b)

a: the mean distance between an instance and all other points in the same cluster
b: the mean distance between an instance and all points in the nearest neighbouring cluster.
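A minimal sketch of the formula for a single instance in 1-D (the clusters and the absolute-distance choice are illustrative):

```python
# Silhouette coefficient for one instance: s = (b - a) / max(a, b),
# where a is the mean distance to the other points in its own cluster and
# b is the mean distance to the points of the nearest other cluster.

def mean_dist(p, points):
    return sum(abs(p - q) for q in points) / len(points)

def silhouette(p, own_cluster, other_clusters):
    a = mean_dist(p, [q for q in own_cluster if q != p])
    b = min(mean_dist(p, c) for c in other_clusters)
    return (b - a) / max(a, b)

# Toy 1-D example: p sits tightly in its own cluster, far from the other.
own = [1.0, 1.2, 1.4]
other = [[9.0, 9.5, 10.0]]
print(round(silhouette(1.2, own, other), 3))  # close to +1
```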

24
Q

What is the range of silhouette coefficient values?

A

-1 to +1: +1 means the instance is far from neighbouring clusters, 0 means it lies on the decision boundary between clusters, and -1 means it might have been assigned to the wrong cluster.

25
Q

Give 2 examples of subgroup discovery

A
  1. Detection of risk groups with coronary heart disease or cancer
  2. Finding patterns in traffic accidents
26
Q

How can we assess the performance of subgroup discovery?

A
  1. Chi-squared test
  2. Check whether the class distribution within the subgroup differs from the class distribution in the row marginals (the whole data set)