Decision Tree MLM Flashcards
Decision Trees
Decision trees are a supervised machine learning method in which the data is recursively split according to the value of a chosen feature.
- Introduction
Decision trees are a predictive modeling approach used in statistics, data mining, and machine learning. They are simple to understand and interpret, and they are used for both classification and regression tasks.
- Tree Structure
A decision tree consists of nodes that form a rooted tree, meaning it is a directed tree with a node called the “root” that has no incoming edges. All other nodes have one (and only one) incoming edge. Nodes having outgoing edges are known as internal nodes. All other nodes are leaves.
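The definitions above can be checked mechanically. The following is a minimal sketch, assuming the tree is given as a list of (parent, child) edges; the node names and the helper `classify_nodes` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy tree as (parent, child) edges; names are illustrative only.
edges = [
    ("age<=30", "income<=50k"),
    ("age<=30", "leaf: approve"),
    ("income<=50k", "leaf: deny"),
    ("income<=50k", "leaf: review"),
]

def classify_nodes(edges):
    """Return (root, internal_nodes, leaves) for a rooted tree given as edges."""
    nodes = {n for edge in edges for n in edge}
    in_degree = Counter(child for _, child in edges)
    out_degree = Counter(parent for parent, _ in edges)

    # The root is the only node with no incoming edge.
    roots = [n for n in nodes if in_degree[n] == 0]
    # Every other node must have exactly one incoming edge.
    # Note: this checks only the in-degree conditions; it does not
    # detect cycles that are disconnected from the root.
    if len(roots) != 1 or any(in_degree[n] != 1 for n in nodes if n != roots[0]):
        raise ValueError("edges do not satisfy the rooted-tree conditions")

    # Nodes with outgoing edges are internal (the root included); the rest are leaves.
    internal = {n for n in nodes if out_degree[n] > 0}
    leaves = nodes - internal
    return roots[0], internal, leaves

root, internal, leaves = classify_nodes(edges)
```

Here `root` is `"age<=30"`, the two test nodes are internal, and the three `leaf: ...` nodes are leaves.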
- Building a Decision Tree
A trained decision tree predicts a target value as follows: (1) Start at the root, which tests the feature that best separates the training data. (2) At each internal node, compare the instance's value for that node's feature with the split condition and follow the matching branch (left or right in a binary tree). (3) Repeat until a leaf node is reached; the leaf provides the predicted target value.
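The root-to-leaf traversal described above can be sketched in a few lines. This is an illustrative sketch, not a library implementation: it assumes a binary tree with numeric features and "less than or equal to threshold goes left" splits, and the `Node` class, the `predict` function, and the toy tree are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Node:
    feature: Optional[int] = None       # index of the feature tested (internal nodes)
    threshold: Optional[float] = None   # split point for that feature (internal nodes)
    left: Optional["Node"] = None       # branch taken when value <= threshold
    right: Optional["Node"] = None      # branch taken when value > threshold
    prediction: Optional[str] = None    # set only on leaves

def predict(node: Node, x: Sequence[float]) -> str:
    """Walk from the root to a leaf and return the leaf's prediction."""
    while node.prediction is None:      # still at an internal node
        if x[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction

# Toy tree: feature 0 is tested at the root, feature 1 on the right branch.
tree = Node(
    feature=0, threshold=30.0,
    left=Node(prediction="deny"),
    right=Node(
        feature=1, threshold=50.0,
        left=Node(prediction="review"),
        right=Node(prediction="approve"),
    ),
)

print(predict(tree, [25.0, 80.0]))  # deny
print(predict(tree, [40.0, 60.0]))  # approve
```

The number of comparisons for one prediction equals the depth of the leaf that is reached.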
- Splitting Criteria
Decision trees use various metrics to decide which attribute to split on at each step. Common choices are Gini impurity and information gain for classification, and variance reduction for regression.
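To make these criteria concrete, here is a minimal sketch of each, written in plain Python. The function names are hypothetical, a split is assumed to be binary (a parent set divided into a left and a right subset), and information gain is computed with Shannon entropy in bits.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def variance(values):
    """Population variance of numeric targets."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def split_gain(impurity, parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

labels = ["yes", "yes", "no", "no"]
print(gini(labels))                                               # 0.5
print(split_gain(entropy, labels, ["yes", "yes"], ["no", "no"]))  # 1.0 (information gain)
print(split_gain(variance, [1, 1, 3, 3], [1, 1], [3, 3]))         # 1.0 (variance reduction)
```

In each case a larger gain means the split produces purer (or, for regression, less spread out) child nodes, so a tree builder would prefer the candidate split with the highest gain.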
- Pruning
Pruning is a technique in machine learning and search algorithms that reduces the size of decision trees by removing sections of the tree that contribute little to classifying instances. This lowers the complexity of the final model and helps reduce overfitting.
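One common way to do this is reduced-error pruning: working bottom-up, a subtree is replaced by a single leaf whenever that does not increase the error on held-out validation data. The sketch below is a simplified illustration under stated assumptions: trees are nested dicts (a hypothetical representation), splits are "value <= threshold goes left", and the replacement leaf uses the majority label of the validation rows reaching the node, whereas many formulations use the training majority instead.

```python
from collections import Counter

def predict(node, x):
    """Follow splits until a leaf (a dict with a 'label' key) is reached."""
    while "label" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

def prune(node, rows):
    """Reduced-error pruning sketch; rows are (x, y) validation pairs reaching node."""
    if "label" in node or not rows:
        return node  # leaves, and subtrees no validation row reaches, are left alone
    f, t = node["feature"], node["threshold"]
    left_rows = [(x, y) for x, y in rows if x[f] <= t]
    right_rows = [(x, y) for x, y in rows if x[f] > t]
    # Prune the children first (bottom-up), building a new node rather than mutating.
    node = dict(node, left=prune(node["left"], left_rows),
                right=prune(node["right"], right_rows))
    majority = Counter(y for _, y in rows).most_common(1)[0][0]
    subtree_errors = sum(predict(node, x) != y for x, y in rows)
    leaf_errors = sum(y != majority for _, y in rows)
    # Collapse to a leaf only if that is no worse on the validation rows.
    return {"label": majority} if leaf_errors <= subtree_errors else node

# Toy tree whose right subtree fits noise (the "b" leaf).
tree = {
    "feature": 0, "threshold": 5,
    "left": {"label": "c"},
    "right": {"feature": 1, "threshold": 2,
              "left": {"label": "b"}, "right": {"label": "a"}},
}
validation = [([1, 0], "c"), ([2, 1], "c"), ([7, 1], "a"), ([8, 3], "a"), ([9, 1], "a")]
pruned = prune(tree, validation)
```

On this toy data the right subtree collapses to a single "a" leaf, while the root split is kept because removing it would misclassify the two "c" rows.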
- Strengths
Decision trees are simple to understand and visualize, and they can handle both numerical and categorical data. The cost of using the tree (i.e., predicting data) is proportional to its depth, which for a reasonably balanced tree is logarithmic in the number of data points used to train it.
- Limitations
Decision trees can grow overly complex and then fail to generalize well (overfitting). They can also be unstable, because small variations in the data might result in a completely different tree being generated. Finally, learning an optimal decision tree is known to be NP-complete, so practical algorithms rely on greedy heuristics that make a locally optimal choice at each node and do not guarantee a globally optimal tree.
- Applications
Decision trees are used in a variety of fields, including medical diagnosis, credit scoring, and many areas of machine learning and artificial intelligence (AI).