SUL Topic 5a - Decision Tree Flashcards
Decision Tree
Supervised learning algorithm used for classification and regression problems
Advantages of Decision Trees
Easy interpretation
Handles numerical and categorical data
Disadvantages of Decision Trees
Prone to overfitting
Unstable with small data variations
Applications of Decision Trees
Predicting insurance premiums
Home prices
Types of Decision Trees
Regression trees (continuous variables)
Classification trees (categorical variables)
Handling Unseen Data in Regression Trees
Predict the mean or median of the training targets in the leaf the new sample reaches
Handling Unseen Data in Classification Trees
Predict the mode (most frequent class) of the training samples in the leaf the new sample reaches
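A minimal sketch of the two leaf-prediction rules above, using only the Python standard library. The leaf contents and the function names are hypothetical and exist only for illustration; a real tree would first route the new sample to a leaf.

```python
from statistics import mean, median, mode

# Hypothetical training targets that ended up in one leaf of each tree type.
regression_leaf_targets = [210.0, 190.0, 200.0, 400.0]
classification_leaf_labels = ["survived", "died", "survived", "survived"]

def predict_regression_leaf(targets, use_median=False):
    """Return the leaf's mean (or median) target as the prediction."""
    return median(targets) if use_median else mean(targets)

def predict_classification_leaf(labels):
    """Return the most frequent class (the mode) in the leaf."""
    return mode(labels)

mean_prediction = predict_regression_leaf(regression_leaf_targets)
median_prediction = predict_regression_leaf(regression_leaf_targets, use_median=True)
class_prediction = predict_classification_leaf(classification_leaf_labels)

print(mean_prediction, median_prediction, class_prediction)
```

The median is less affected by the outlying value 400.0 than the mean, which is one reason it is sometimes preferred.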
Pruning in Decision Trees
Removes branches that add little predictive power; an essential technique to prevent overfitting
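One way to see pruning in practice is scikit-learn's cost-complexity pruning, controlled by the `ccp_alpha` parameter of `DecisionTreeClassifier`. The sketch below is an illustration only: the Iris dataset and the value `0.02` are assumptions chosen for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Fully grown tree: no pruning applied.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Same settings, but subtrees whose benefit does not justify their
# complexity are collapsed (ccp_alpha=0.02 is an arbitrary example value).
pruned_tree = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)

full_leaves = full_tree.get_n_leaves()
pruned_leaves = pruned_tree.get_n_leaves()
print("leaves before pruning:", full_leaves)
print("leaves after pruning:", pruned_leaves)
```

In real use, the pruning strength would normally be chosen by validating on held-out data rather than fixed by hand.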
Popular Decision Tree Algorithms
ID3
CART (splits using the Gini index)
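The split measures behind these names can be sketched in a few lines. ID3 chooses splits by information gain (reduction in entropy), while the Gini index is the impurity measure usually associated with CART. The labels and candidate splits below are made up for illustration, and the helper names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: chance that two random draws have different classes."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_impurity(groups, impurity):
    """Average impurity of the child groups, weighted by group size."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * impurity(g) for g in groups)

def information_gain(parent, groups):
    """Entropy of the parent minus the weighted entropy of its children."""
    return entropy(parent) - weighted_impurity(groups, entropy)

# Hypothetical node with two candidate splits.
parent = ["yes", "yes", "yes", "no", "no", "no"]
perfect_split = [["yes", "yes", "yes"], ["no", "no", "no"]]
useless_split = [["yes", "no"], ["yes", "yes", "no", "no"]]

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
print(gini(parent), weighted_impurity(perfect_split, gini))
```

A split that separates the classes completely gets the highest information gain and the lowest weighted Gini impurity; a split that leaves the class mix unchanged gains nothing.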
Why are Decision Trees popular for beginners?
Easy to understand and interpret
Enhancing Decision Tree Accuracy
Combining many trees in ensemble algorithms like Random Forest
Implementation of Decision Trees
Simplified by Python libraries like scikit-learn
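A minimal sketch of what this looks like with scikit-learn, assuming the bundled Iris dataset as a stand-in for real data. `max_depth=2` is an arbitrary choice that keeps the printed rules short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree is easier to read and less likely to overfit.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Human-readable if/else rules learned by the tree.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)

# Accuracy on the training data (optimistic; use a held-out set in practice).
training_accuracy = clf.score(iris.data, iris.target)

# Relative importance of each input field; the values sum to 1.
importances = dict(zip(iris.feature_names, clf.feature_importances_))
print(training_accuracy, importances)
```

The printed rules and the importance scores illustrate two points made elsewhere in this deck: trees are easy to interpret, and they show which fields matter most.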
Real-World Applications of Decision Trees
Predicting survival rates
Pricing
Types of Decision Tree Algorithms
C5.0
CHAID
Stands for Chi-squared Automatic Interaction Detection
QUEST
Stands for Quick, Unbiased and Efficient Statistical Tree
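The chi-squared test in CHAID's name measures how strongly a candidate predictor is associated with the target class. The sketch below computes only the Pearson chi-squared statistic for a contingency table; it is not a CHAID implementation, and the counts are hypothetical.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if predictor and class were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are predictor categories, columns are classes.
associated = [[30, 10], [10, 30]]
independent = [[20, 20], [20, 20]]

print(chi_squared_statistic(associated))
print(chi_squared_statistic(independent))
```

A larger statistic indicates a stronger association, so a predictor like the first table would be a better split candidate than one like the second.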
Strengths of Decision Trees
Generate understandable rules
Handle continuous and categorical data
Weaknesses of Decision Trees
Less appropriate for continuous value prediction
Prone to errors when there are many classes and relatively few training examples
Computational Aspect of Decision Trees
Can be computationally expensive to train, since many candidate splits are evaluated at each node
Important Fields in Decision Trees
Decision trees provide a clear indication of which fields are most important for prediction or classification