Decision Trees Flashcards
What is a Decision Tree?
A Decision Tree is a tree-shaped diagram used to
determine a course of action.
What does each branch of the Decision Tree typically represent?
Each branch of the tree represents a possible decision,
occurrence, or reaction.
What type of learning algorithm is a Decision Tree, and for what tasks is it commonly used?
A decision tree is a non-parametric
supervised learning algorithm, which is utilized for
both classification and regression tasks.
How is the training dataset used in building models with the Decision Tree Algorithm?
The training dataset is fed to a tree induction algorithm during the learning phase. The learned model is the outcome of the tree induction algorithm processing the training set.
In terms of tasks, what are the two main applications of a decision tree?
classification and regression tasks.
Why is a decision tree considered a popular data mining technique?
A decision tree visualization helps outline the decisions in a way that is easy to understand
What is the primary goal of creating a model using the Decision Tree Algorithm?
The goal is to create a model that predicts the value
of a target variable based on several input variables.
How is an internal node represented in a decision tree, and what does it signify?
an internal node represents a feature (or attribute)
What does each branch in a decision tree represent?
represents a decision rule
What role does a leaf node play in a decision tree, and what does it represent?
Each leaf node represents the outcome.
What is the significance of the root node in a decision tree?
It learns to partition on the basis of the
attribute value
How does the root node contribute to the partitioning of a decision tree?
It partitions the data in a recursive manner, a process called recursive partitioning
What is the process known as when the decision tree partitions in a recursive manner?
recursive partitioning
Why is a decision tree often compared to a flowchart diagram?
Its flowchart-like visualization easily mimics human-level thinking.
What are the three main components of a decision tree, and what do they represent?
- Node
test for the value of a certain attribute - Edges
correspond to the outcome of a test - Leaves
terminal nodes that predict the outcome
In a classification tree, what is the purpose of determining a set of logical if-then conditions?
To classify records into classes.
When is a regression tree used, and how does it differ from a classification tree in terms of the target variable?
A regression tree is used when the target variable is numerical or continuous. We fit a regression model
to the target variable using each of the independent variables.
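A minimal numeric sketch of this idea (the data and the split point are hypothetical; each leaf predicts the mean target of its region):

```python
# A regression tree predicts a numeric target: each leaf returns
# the mean of the training targets that fall into that region.
def fit_leaf_means(xs, ys, threshold):
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return sum(left) / len(left), sum(right) / len(right)

def predict(x, threshold, left_mean, right_mean):
    return left_mean if x <= threshold else right_mean

xs = [2, 4, 8, 12, 15]
ys = [1.0, 1.2, 1.4, 3.0, 3.2]
lm, rm = fit_leaf_means(xs, ys, threshold=10)
print(predict(3, 10, lm, rm))  # mean of the left leaf (about 1.2)
```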
In the context of Decision trees:
Define Gain
Gain is a measure of decrease in entropy after splitting
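The definition above can be sketched numerically (pure Python; the example labels are illustrative):

```python
import math

# Entropy of a list of class labels; information gain is the
# parent's entropy minus the weighted entropy of the child
# subsets produced by a split.
def entropy(labels):
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent, children):
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
# A perfect split drops the entropy from 1.0 to 0, so the gain is 1.0
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))
```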
Decision Trees
How to split the data?
We have to frame the conditions that split the data in such a way that the information gain is highest.
How does the decision tree algorithm work?
- Select the best attribute using attribute selection measures to split the records
- Make that attribute a decision node
- Build the tree by recursively repeating this process for each child
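The steps above can be sketched as follows (an ID3-style sketch; the data and helper names are illustrative, and attribute selection is stubbed out where a real implementation would pick the highest-gain attribute):

```python
from collections import Counter

def build_tree(rows, attributes, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # all tuples share one class: leaf
        return labels[0]
    if not attributes:                 # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    best = attributes[0]               # stub for the attribute selection measure
    subtree = {}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        subtree[value] = build_tree(subset, remaining, target)
    return {best: subtree}             # the chosen attribute becomes a decision node

rows = [{"age": "youth", "buys": "no"}, {"age": "adult", "buys": "yes"}]
print(build_tree(rows, ["age"], "buys"))  # nested dict of decision nodes and leaves
```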
Generally, when does the process of Tree building stop?
- When there are no more attributes
- There are no more instances
- All the tuples belong to the same attribute value
List three advantages of Decision Trees in machine learning.
Simple to understand, interpret and visualize
What makes Decision Trees simple to understand and interpret for humans?
They look like simple if-else statements, and therefore can
be easily interpreted by humans
What are the advantages of Decision Trees regarding data preparation?
- No scaling needed
- Can work without extensive handling of missing data
- No need for dummy variables
In terms of handling variables, what types of variables can Decision Trees manage?
Can handle both categorical and numerical
variables
Do nonlinear parameters affect Decision Trees?
Nonlinear parameters don’t affect its performance
What is a notable characteristic of Decision Trees regarding assumptions compared to statistical models?
Do not require the assumptions of statistical models
Identify the major disadvantage of Decision Trees in machine learning.
overfitting
What is overfitting, and how does it impact the performance of a decision tree?
Overfitting can lead to wrong decisions. A decision tree will keep generating new nodes to fit the data, which makes it lose its generalization capability.
Why does a decision tree lose its generalization capabilities due to overfitting?
A decision tree will keep generating new nodes to fit the data
What happens to the overall tree when new data points are added?
Adding new data points leads to the regeneration of the overall tree, meaning that nodes need to be recalculated.
How is noise a factor that affects the stability of a decision tree model?
A little bit of noise can make a decision tree model unstable.
Why are Decision Trees considered unsuitable for large datasets, and what issue does it lead to?
A large dataset can cause the tree to grow too large and
complex, which will lead to overfitting.
What is a low-biased Tree?
It is a highly complicated tree with low bias, which makes it hard for the model to generalize to new data.
What can a high variance do to a decision tree?
The model can get unstable
What is the challenge associated with large decision trees?
They become difficult to interpret
How are classification rules extracted from a decision tree?
One rule is created for each path from the root to a leaf node.
Each splitting criterion along a given path is logically
joined by the AND operator to form the “IF” part. The leaf
node holds the class prediction, forming the rule’s
“THEN” part.
IF age = youth AND student = no THEN buys_computer = no
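Rule extraction can be sketched as simple string-building over a root-to-leaf path (the helper name is illustrative):

```python
# Turn a root-to-leaf path into an IF ... THEN ... rule:
# join each splitting criterion with AND, then append the
# class prediction held by the leaf.
def path_to_rule(tests, leaf_class, target="buys_computer"):
    condition = " AND ".join(f"{a} = {v}" for a, v in tests)
    return f"IF {condition} THEN {target} = {leaf_class}"

print(path_to_rule([("age", "youth"), ("student", "no")], "no"))
# IF age = youth AND student = no THEN buys_computer = no
```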
What is the process of forming the “IF” part of a classification rule from a decision tree path?
Each splitting criterion along a given path is logically
joined by the AND operator to form the “IF” part.
What information does the leaf node hold in the context of forming a classification rule?
The leaf node holds the class prediction, forming the rule’s “THEN” part.
What does the root node represent in a decision tree, and what kind of edges are associated with it?
It can be considered the starting point of the tree:
it has no incoming edges but zero or more
outgoing edges. The outgoing edges lead to either an
internal node or a leaf node.
The root node is usually an attribute of the decision tree model
How is an internal node defined in a decision tree, and what is its relationship with outgoing edges?
Appears after a root node or an internal node and is
followed by either internal nodes or leaf nodes. It has
only one incoming edge and at least two outgoing
edges.
Internal nodes are always attributes of the decision tree model
What characterizes leaf nodes in a decision tree, and what information do they typically represent?
These are the bottommost elements of the tree and
normally represent classes of the decision tree model.
Depending on the situation, each leaf node holds either a single class label or a class distribution.
How is the class distribution handled in leaf nodes, and what determines the number of outgoing edges from a leaf node?
Depending on the situation, each leaf node holds either
a single class label or a class distribution.
Leaf nodes have one incoming edge and no outgoing
edges
Name the Following:
A decision tree is created in two phases
- Recursive partitioning
- Pruning the tree
What is the idea of Recursive partitioning?
Repeatedly split the records into two or more branches, so as to achieve maximum homogeneity/purity within the new parts
What is the idea of pruning the tree?
Simplify the tree by pruning peripheral branches to avoid overfitting
How does the concept of purity relate to the subsets created by a good attribute split?
a good attribute splits the examples into subsets
that are (ideally) “all positive” or “all negative”
When dealing with numerical variables, how is the splitting process performed in decision trees?
- Order the records according to the numerical variable
- Find the midpoints between successive non-duplicate values
- Divide the records into those with x > midpoint and those with x < midpoint
E.g., for the three points 14, 14.8, 16, the midpoint between 14.0
and 14.8 is 14.4, and the midpoint between 14.8 and 16 is 15.4.
Split the records into those with lot_size > 14.4 and those with lot_size < 14.4.
After evaluating that split, try the next split point, 15.4.
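The midpoint procedure above, as a small sketch using the same three points:

```python
# Candidate split points for a numerical variable: the midpoints
# between successive non-duplicate sorted values.
def split_midpoints(values):
    ordered = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

print(split_midpoints([14, 14.8, 16]))  # midpoints 14.4 and 15.4
```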
Explain the process of finding midpoints between successive non-duplicate values.
Take the average of the two values. For example, for the three points 14, 14.8, 16, the midpoint between 14.0 and 14.8 is 14.4, and the midpoint between 14.8 and 16 is 15.4.
How are records divided based on the midpoints and the numerical variable in decision trees?
Divide the records into two groups based on whether they are greater than or less than the midpoint. For example, records with lot_size > 14.4 and those with lot_size < 14.4.
How do decision trees search for the best division of the input space?
Decision Trees greedily search for the best division of the
input space into exhaustive, mutually exclusive, pure rectangles.
When dealing with categorical variables, how are all possible ways of splitting the categories examined?
there are 2^(n−1) − 1 possible binary splits.
E.g., categories A, B, C can be split 3 ways
{A} and {B, C}
{B} and {A, C}
{C} and {A, B}
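The count and the three splits can be verified with a short sketch (the helper name is illustrative; mirror splits like {A},{B,C} and {B,C},{A} are counted once):

```python
from itertools import combinations

# Enumerate all 2**(n-1) - 1 binary splits of n categories.
def binary_splits(categories):
    cats = list(categories)
    splits = []
    # Fix the first category on the left side to avoid mirror duplicates.
    rest = cats[1:]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {cats[0], *combo}
            right = set(cats) - left
            if right:                  # skip the split with an empty side
                splits.append((left, right))
    return splits

print(len(binary_splits("ABC")))  # 3, matching 2**(3-1) - 1
```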