Classification Algorithms Flashcards

1
Q

What are the steps of generating a decision tree classification?

A
  1. Calculate the entropy of the root node (or of the whole dataset if no root exists yet)
  2. Calculate the entropy of each candidate attribute
  3. Calculate the information gain for each attribute
  4. Pick the attribute with the highest information gain as your ‘split’
  5. Traverse each branch. If the subset is pure (all one class), stop. Otherwise, repeat.
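The steps above can be sketched as a minimal recursive ID3-style builder in Python (the dict-based tree representation and function names are illustrative assumptions, not a fixed standard):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: -sum(p_i * log2(p_i)) over the classes present."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Recursively split on the attribute with the HIGHEST information gain."""
    # Step 5 base case: the subset is pure, so return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left: fall back to the majority class.
    if not attrs:
        return Counter(labels).most_common(1)[0][0]

    def gain(attr):
        # Entropy of this node minus the weighted entropy of each partition.
        parts = {}
        for row, label in zip(rows, labels):
            parts.setdefault(row[attr], []).append(label)
        return entropy(labels) - sum(
            len(p) / len(labels) * entropy(p) for p in parts.values())

    # Steps 1-4: choose the best split.
    best = max(attrs, key=gain)
    tree = {}
    for value in {row[best] for row in rows}:
        sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*sub)
        tree[value] = build_tree(list(sub_rows), list(sub_labels),
                                 [a for a in attrs if a != best])
    return {best: tree}
```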
2
Q

What is the entropy equation?

A

The sum of -pi log2(pi) over all classes of that attribute, where pi is the proportion of examples belonging to class i: H(S) = -Σ pi log2(pi).
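A minimal sketch of the equation in Python (the function name is an illustrative choice):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: -sum(p_i * log2(p_i)), where p_i is the
    proportion of examples in class i. Pure sets score 0; a 50/50
    two-class split scores 1."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())
```

For example, `entropy(['a', 'a', 'b', 'b'])` gives 1.0, while a pure set such as `entropy(['a', 'a', 'a'])` gives 0.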

3
Q

What is the information gain equation?

A

Entropy of the current node minus the weighted sum of the entropies of the subsets produced by splitting on the chosen attribute, where each subset's entropy is weighted by the fraction of examples that fall into it: Gain(S, A) = Entropy(S) - Σv (|Sv| / |S|) Entropy(Sv).
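The equation can be sketched in Python as follows (rows are assumed to be dicts mapping attribute names to values; this representation is an illustrative assumption):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v)."""
    total = len(labels)
    # Partition the labels by the value of the chosen attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    weighted = sum(len(part) / total * entropy(part)
                   for part in partitions.values())
    return entropy(labels) - weighted
```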

4
Q

What are the steps of generating a K-nearest neighbours classification?

A
  1. Calculate the distance from each training point to the new point.
  2. Find its K nearest neighbours, where K is a user-chosen parameter.
  3. Select the most common class among those neighbours.
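The three steps above can be sketched in a few lines of Python (Euclidean distance and the function name are illustrative assumptions):

```python
import math
from collections import Counter

def knn_classify(points, labels, query, k):
    """Label the query by majority vote among its k nearest neighbours."""
    # Step 1: distance from every training point to the query.
    dists = sorted(
        (math.dist(p, query), label) for p, label in zip(points, labels))
    # Steps 2-3: take the k closest and vote on their classes.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```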
5
Q

What is different between K-nearest neighbours classification and regression?

A

A K-nearest neighbours regression predicts a numerical value instead of a class: it takes the average of the neighbours’ numerical values. Sometimes the average isn’t used, instead opting for another aggregate such as the max. Either way, some summary of the neighbours’ values becomes the prediction.
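The regression variant differs only in the final step, which can be sketched as follows (the pluggable `agg` parameter is an illustrative assumption for the "sometimes max" case):

```python
import math

def knn_regress(points, values, query, k, agg=None):
    """Predict by aggregating (mean by default) the k nearest values."""
    dists = sorted((math.dist(p, query), v) for p, v in zip(points, values))
    nearest = [v for _, v in dists[:k]]
    # Default aggregate is the mean; pass e.g. agg=max for a max rule.
    agg = agg or (lambda vs: sum(vs) / len(vs))
    return agg(nearest)
```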
