Lesson 2 Flashcards
What is exhaustive search?
Decision trees
Trying every possible way to split the data set and constructing every possible tree
The number of possible trees grows exponentially with the number of features, so exhaustive search quickly becomes infeasible
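A small illustration (not from the lesson) of why the search space blows up: even for numeric features alone, every midpoint between sorted unique values is a candidate split, and an exhaustive search would have to consider every combination of such splits at every node. The helper `candidate_thresholds` and the toy matrix `X` are made up for this sketch.

```python
import numpy as np

def candidate_thresholds(feature_values):
    """All possible split points for one numeric feature:
    the midpoints between consecutive sorted unique values."""
    v = np.unique(feature_values)
    return (v[:-1] + v[1:]) / 2

# Toy data: 6 samples, 3 numeric features
X = np.array([[2.7, 1.0, 5.5],
              [1.3, 3.2, 4.1],
              [3.3, 2.1, 6.0],
              [0.5, 0.9, 4.4],
              [2.2, 2.8, 5.1],
              [1.8, 1.5, 4.9]])

# Each feature contributes its own set of candidate splits; combining them
# across all nodes of all possible trees is what makes the search explode.
for j in range(X.shape[1]):
    print(f"feature {j}: {len(candidate_thresholds(X[:, j]))} candidate splits")
```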
What does a high impurity mean?
It means the classes in a node are more evenly mixed (close to an equal distribution), so the node is far from pure
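A quick sketch of one common impurity measure, Gini impurity (assuming Gini is the measure used here; the `gini` helper is mine): an even class mix gives the maximum impurity, a pure node gives zero.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2, where p_k is the fraction of class k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini(["A", "A", "A", "A"]))  # 0.0   -> pure node, lowest impurity
print(gini(["A", "A", "B", "B"]))  # 0.5   -> equal mix, highest impurity for 2 classes
print(gini(["A", "A", "A", "B"]))  # 0.375 -> skewed mix, in between
```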
How can you calculate the depth of a decision tree?
It is the maximum number of questions (splits) asked along any path from the root down to a leaf
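A minimal sketch, assuming scikit-learn and its built-in iris dataset: the fitted tree reports exactly this longest chain of questions via `get_depth()`.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Each internal node asks one yes/no question about a feature; the depth is
# the longest chain of questions from the root to any leaf.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print("depth :", tree.get_depth())      # longest root-to-leaf path
print("leaves:", tree.get_n_leaves())
```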
What are 7 advantages of Decision trees?
- easy to understand
- can be visualised
- handles both numerical and categorical data
- requires little data preparation
- captures non-linear relationships
- provides feature importance scores (see the sketch after this list)
- fast to train and predict
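A small sketch of two of these advantages, assuming scikit-learn and its built-in iris dataset: a fitted tree can be drawn directly with `plot_tree`, and it exposes per-feature importance scores.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

data = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# Feature importances: how much each feature reduced impurity across the tree.
for name, importance in zip(data.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")

# The fitted tree can be visualised as-is, which is what makes it easy to explain.
plot_tree(clf, feature_names=data.feature_names,
          class_names=data.target_names, filled=True)
plt.show()
```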
What are 4 drawbacks of decision trees?
- Prone to overfitting
- Sensitive to noise and minor details in the data
- Unstable: a small change in the training data can produce a very different tree
- Biased toward features with more levels
What is ensemble learning?
Combining several ‘weak learner’ models, either sequentially or in parallel, into one stronger model.
“The wisdom of the crowd”
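A hedged sketch of both flavours, assuming scikit-learn and synthetic data: `VotingClassifier` combines independently trained weak learners in parallel, while AdaBoost chains them sequentially (boosting is used here purely to illustrate the sequential case).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# A single decision stump (depth-1 tree) is a classic "weak learner".
stump = DecisionTreeClassifier(max_depth=1, random_state=0)

# Parallel: independent models are trained separately and vote on the answer.
parallel = VotingClassifier(estimators=[
    ("stump", DecisionTreeClassifier(max_depth=1, random_state=0)),
    ("logreg", LogisticRegression(max_iter=1000)),
])

# Sequential: boosting trains weak learners one after another, each focusing
# on the previous ones' mistakes (AdaBoost's default base learner is a stump).
sequential = AdaBoostClassifier(n_estimators=50, random_state=0)

for name, model in [("single stump", stump),
                    ("parallel (voting)", parallel),
                    ("sequential (boosting)", sequential)]:
    print(name, round(cross_val_score(model, X, y, cv=5).mean(), 3))
```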
What is Bagging?
Decision trees
Bootstrap aggregating: each model is trained on a different bootstrap sample (drawn with replacement) from the training set, and their predictions are combined by voting or averaging.
A parallel ensemble learning technique
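A minimal bagging sketch, assuming scikit-learn and synthetic data; `BaggingClassifier`'s default base learner is a decision tree, so each tree sees its own bootstrap sample and the ensemble votes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging = bootstrap aggregating: each base tree is trained on a bootstrap
# sample (drawn with replacement); predictions are combined by majority vote.
bagged = BaggingClassifier(n_estimators=100, bootstrap=True, random_state=0)
single = DecisionTreeClassifier(random_state=0)

print("single tree :", round(cross_val_score(single, X, y, cv=5).mean(), 3))
print("bagged trees:", round(cross_val_score(bagged, X, y, cv=5).mean(), 3))
```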
What is Random Forest?
Building many randomised decision trees from the same training data (each on a bootstrap sample, with a random subset of features considered at each split) and predicting with the answer chosen most often across all of the DTs (majority vote, or the average for regression)
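A minimal sketch, assuming scikit-learn and synthetic data, of a random forest voting over many randomised trees.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Each tree is trained on a bootstrap sample and only considers a random
# subset of features at each split; the forest predicts by majority vote.
forest = RandomForestClassifier(
    n_estimators=200,
    max_features="sqrt",   # random subset of features per split
    random_state=0,
)
print("random forest accuracy:", round(cross_val_score(forest, X, y, cv=5).mean(), 3))
```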