Lecture 4: Decision trees & k-means Flashcards

1
Q

What is logical inference?

A

Process of deriving new facts from a set of premises

2
Q

What are the 3 types of logical inference?

A

1. Deduction
2. Abduction
3. Induction

3
Q

What is deduction?

A

-Conclusion follows necessarily from the premises

4
Q

What is abduction?

A

-Conclusion is one hypothetical (most probable) explanation for the premises

5
Q

What is induction?

A

-Conclusion about all members of a class from the examination of only a few members of the class.

6
Q

What is inductive learning?

A

-The form of learning behind most work in ML
-Examples are given to train a system in a classification task

7
Q

What are 3 examples of techniques in ML?

A

-Probabilistic methods
-Decision trees
-Neural networks

8
Q

What are decision trees (2)?

A

-A simple, but very successful form of learning algo
-Very well-known algos are ID3 and its successor C4.5

9
Q

What is the Ockham’s razor principle?

A

-Always favor the simplest answer that correctly fits the training data (the smallest tree on average)

10
Q

What type of assumption is Ockham’s razor principle based on?

A

Inductive bias: making a choice beyond what the training instances contain

11
Q

What is maximum information gain and which algo uses it?

A

-Choose the attribute that has the largest information gain (see the sketch below)
-ID3 uses it
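
A minimal sketch of this computation in Python (the helper names and the list-of-dicts data format are illustrative assumptions, not from the lecture):

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy, in bits, of a list of class labels
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(examples, attribute, target="label"):
    # Gain(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v),
    # where `examples` is assumed to be a list of dicts mapping
    # attribute names to values.
    base = entropy([e[target] for e in examples])
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return base - remainder

# At each node, ID3 would split on the attribute with the largest gain:
# best = max(candidate_attributes, key=lambda a: information_gain(data, a))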

12
Q

What is essential information theory?

A

-Shannon in the 1940s
-Notion of entropy (information content)
-Measures how predictable a random variable (RV) is (defining formula below):
i. if you already have a good idea about the answer → low entropy
ii. if you have no idea about the answer → high entropy
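
For reference, the standard defining formula (not spelled out on the card): for a discrete RV X with outcome probabilities p(x), the entropy in bits is

H(X) = -\sum_x p(x) \log_2 p(x)

E.g. a fair coin has H = 1 bit (no idea about the answer), while a coin with P(heads) = 0.99 has H ≈ 0.08 bits (a very good idea about the answer).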

13
Q

What are the 2 things to watch out for in all types of learning?

A

-Noisy input
-Overfitting/underfitting the training data

14
Q

What are 3 cases of noisy input?

A

-Two examples have the same feature-value pairs, but different outputs
-Some values of features are incorrect or missing
-Some relevant attributes are not taken into account in the data set

15
Q

What are 3 cases of overfitting?

A

-Complicated boundaries overfit the data
-Large number of irrelevant features
-They do not generalize well to new data
*Training error is low
*Testing error is high

16
Q

What is underfitting (2)?

A

-Model is not expressive enough → not enough features
-There is no way to fit a linear decision boundary so that the training examples are well separated
*Training error is high
*Testing error is high

17
Q

What is clustering?

A

-The organization of unlabeled data into similarity groups called clusters

18
Q

What are clusters?

A

-A collection of data items which are similar to one another, and dissimilar to data items in other clusters.

19
Q

What are the pro and con of k-means?

A

Pro: time-efficient
Con: the algo is sensitive to outliers (see the sketch below)
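
A minimal sketch of the usual Lloyd's-algorithm form of k-means on 2-D points (illustrative only; the lecture does not prescribe an implementation):

import random

def kmeans(points, k, iterations=100):
    # Alternate between assigning each point to its nearest centroid
    # and recomputing each centroid as the mean of its cluster.
    centroids = random.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2 +
                                        (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # an empty cluster keeps its old centroid
                centroids[i] = (sum(p[0] for p in cluster) / len(cluster),
                                sum(p[1] for p in cluster) / len(cluster))
    return centroids, clusters

Each iteration is linear in the number of points, which is the time efficiency the card mentions; and because each centroid is a mean, a single distant outlier can drag its centroid away from the rest of the cluster, which is the sensitivity to outliers.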