Classification Algorithms Flashcards

1
Q

What are the steps of generating a decision tree classification?

A
  1. Calculate the entropy of the root node (or of the whole dataset if no root exists yet)
  2. Calculate the entropy of each candidate attribute
  3. Calculate the information gain for each attribute
  4. Pick the attribute with the highest information gain as your ‘split’
  5. Traverse each branch. If the subset is pure (all one class), stop. Otherwise, repeat.
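The steps above can be sketched as a minimal recursive ID3-style builder in Python (the dict-based tree representation and function names are illustrative assumptions, not a fixed standard):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: -sum(p_i * log2(p_i)) over the classes present."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Recursively split on the attribute with the HIGHEST information gain."""
    # Step 5 base case: the subset is pure, so return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left: fall back to the majority class.
    if not attrs:
        return Counter(labels).most_common(1)[0][0]

    def gain(attr):
        # Entropy of this node minus the weighted entropy of each partition.
        parts = {}
        for row, label in zip(rows, labels):
            parts.setdefault(row[attr], []).append(label)
        return entropy(labels) - sum(
            len(p) / len(labels) * entropy(p) for p in parts.values())

    # Steps 1-4: choose the best split.
    best = max(attrs, key=gain)
    tree = {}
    for value in {row[best] for row in rows}:
        sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*sub)
        tree[value] = build_tree(list(sub_rows), list(sub_labels),
                                 [a for a in attrs if a != best])
    return {best: tree}
```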
2
Q

What is the entropy equation?

A

The sum of -pi log2(pi) over all classes of that attribute, where pi is the proportion of examples belonging to class i: H(S) = -Σ pi log2(pi).
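A minimal sketch of the equation in Python (the function name is an illustrative choice):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: -sum(p_i * log2(p_i)), where p_i is the
    proportion of examples in class i. Pure sets score 0; a 50/50
    two-class split scores 1."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())
```

For example, `entropy(['a', 'a', 'b', 'b'])` gives 1.0, while a pure set such as `entropy(['a', 'a', 'a'])` gives 0.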

3
Q

What is the information gain equation?

A

Entropy of the current node minus the weighted sum of the entropies of the subsets produced by splitting on the chosen attribute, where each subset's entropy is weighted by the fraction of examples that fall into it: Gain(S, A) = Entropy(S) - Σv (|Sv| / |S|) Entropy(Sv).
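The equation can be sketched in Python as follows (rows are assumed to be dicts mapping attribute names to values; this representation is an illustrative assumption):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v)."""
    total = len(labels)
    # Partition the labels by the value of the chosen attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    weighted = sum(len(part) / total * entropy(part)
                   for part in partitions.values())
    return entropy(labels) - weighted
```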

4
Q

What are the steps of generating a K-nearest neighbours classification?

A
  1. Calculate the distance from each training point to the new point.
  2. Find its K nearest neighbours, where K is a user-chosen parameter.
  3. Select the most common class among those neighbours.
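The three steps above can be sketched in a few lines of Python (Euclidean distance and the function name are illustrative assumptions):

```python
import math
from collections import Counter

def knn_classify(points, labels, query, k):
    """Label the query by majority vote among its k nearest neighbours."""
    # Step 1: distance from every training point to the query.
    dists = sorted(
        (math.dist(p, query), label) for p, label in zip(points, labels))
    # Steps 2-3: take the k closest and vote on their classes.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```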
5
Q

What is different between K-nearest neighbours classification and regression?

A

A K-nearest neighbours regression predicts a numerical value instead of a class: it takes the average of the neighbours’ numerical values. Sometimes the average isn’t used, instead opting for another aggregate such as the max. Either way, some summary of the neighbours’ values becomes the prediction.
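The regression variant differs only in the final step, which can be sketched as follows (the pluggable `agg` parameter is an illustrative assumption for the "sometimes max" case):

```python
import math

def knn_regress(points, values, query, k, agg=None):
    """Predict by aggregating (mean by default) the k nearest values."""
    dists = sorted((math.dist(p, query), v) for p, v in zip(points, values))
    nearest = [v for _, v in dists[:k]]
    # Default aggregate is the mean; pass e.g. agg=max for a max rule.
    agg = agg or (lambda vs: sum(vs) / len(vs))
    return agg(nearest)
```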
