Trees Analysis Flashcards
A benefit of a TREE analysis is its visual outcome.
TRUE. This is one of the main reasons TREES are so widely used.
The benefit of TREES is much clearer when there is no single causation model for the whole population of interest
TRUE. A TREE can handle complexity in causation in a flexible way.
OVERFITTING is about the risk of poor “generalization” of our results to new samples
TRUE. This is a way of saying that, if we let the algorithm “overfit”, we take the risk of getting good results in the training sample but much weaker ones in new samples.
If you increase the minimum number of cases in a parent node to be divided or in a child node, you take a higher risk of overfitting
FALSE. By increasing the minimum number of cases required, we avoid splitting small nodes into even smaller ones, so we limit the flexibility of the tree in search of a better generalization of results.
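As an illustration only (these flashcards refer to Modeler, but scikit-learn exposes the same idea), the minimum parent-node and child-node sizes correspond roughly to the min_samples_split and min_samples_leaf parameters:

```python
# Illustration only: scikit-learn's min_samples_split / min_samples_leaf
# play the same role as Modeler's minimum parent- and child-node sizes.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)

# Raising these minimums stops small nodes from being split further,
# which LOWERS (not raises) the overfitting risk.
tree = DecisionTreeClassifier(
    min_samples_split=100,  # minimum cases in a parent node to divide it
    min_samples_leaf=50,    # minimum cases allowed in a child node
    random_state=0,
).fit(X, y)
print(tree.get_depth(), tree.get_n_leaves())
```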
The GAIN of a node in a tree measures the % of “hits” in the node compared to the % of “hits” in the whole sample
TRUE. This is the exact definition.
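A worked example with made-up numbers: if 40% of the cases in a node are hits while only 10% of the whole sample are, the node scores at 400%.

```python
# Made-up numbers for illustration: 40% hits inside the node
# versus 10% hits in the whole sample.
node_hit_rate = 0.40
sample_hit_rate = 0.10
print(f"Gain index: {node_hit_rate / sample_hit_rate:.0%}")  # 400%
```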
CHAID, CRT, TWO-STEP and PCA are all different types of TREES algorithms
FALSE. TWO-STEP is a clustering algorithm and PCA is a factor-analysis technique; only CHAID and CRT are TREE algorithms.
In a CREDIT DEFAULT RISK exercise, a false positive is much more expensive than a false negative
FALSE. A false positive means rejecting credit to a good customer (lost margin), whereas a false negative means granting credit to a risky customer who defaults (lost principal), which is usually far more expensive.
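A back-of-the-envelope sketch; the unit costs below are invented purely for illustration:

```python
# Hypothetical unit costs, invented for illustration: rejecting a good
# applicant (false positive) loses a margin; granting credit to a
# defaulter (false negative) can lose the principal itself.
COST_FP = 100     # assumed lost margin per rejected good customer
COST_FN = 5_000   # assumed loss per credit granted to a defaulter

def total_cost(n_fp: int, n_fn: int) -> int:
    """Monetary cost of a given confusion-matrix outcome."""
    return n_fp * COST_FP + n_fn * COST_FN

print(total_cost(n_fp=50, n_fn=0))  # 5,000: many FPs, modest cost
print(total_cost(n_fp=0, n_fn=50))  # 250,000: same count of FNs, 50x the cost
```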
If we predict a YES/NO target using a TREE, the main output we will get in terms of prediction is directly a YES/NO
FALSE. When we use a TREE to predict a YES (vs NO) target, we get a “propensity” to YES: a numerical score that we later transform into a YES/NO according to a given threshold. Modeler can generate this YES/NO automatically (on top of the score), but we have to control the cutoff value as analysts.
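A minimal sketch of this score-then-cut logic, with scikit-learn standing in for Modeler (the 0.3 cutoff is an arbitrary example):

```python
# The tree yields a propensity score first; the YES/NO comes from a
# cutoff that we, as analysts, control (0.3 here is arbitrary).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=1)
tree = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X, y)

propensity = tree.predict_proba(X)[:, 1]         # numerical score for "YES"
CUTOFF = 0.3                                     # analyst-chosen threshold
prediction = (propensity >= CUTOFF).astype(int)  # final YES/NO
```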
A Chi-Squared test is used by the CHAID tree algorithm to select the best predictors
TRUE. The name itself means Chi-Squared Automatic Interaction Detection. In effect, it uses a Chi-Squared TEST to select the best predictor at each split.
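A minimal sketch of the idea only (not the full CHAID algorithm), assuming pandas/scipy and made-up column names: crosstab each candidate predictor against the target and keep the one whose Chi-Squared test is most significant.

```python
# Sketch of the CHAID split-selection idea: the predictor with the
# lowest Chi-Squared p-value against the target wins the split.
import pandas as pd
from scipy.stats import chi2_contingency

def best_predictor(df: pd.DataFrame, target: str, candidates: list) -> str:
    """Return the candidate with the lowest Chi-Squared p-value."""
    pvalues = {}
    for col in candidates:
        table = pd.crosstab(df[col], df[target])
        _, p, _, _ = chi2_contingency(table)
        pvalues[col] = p
    return min(pvalues, key=pvalues.get)

# e.g. best_predictor(df, "churn", ["region", "plan", "age_band"])
```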
TREES can be used to predict a CATEGORICAL variable with more than two categories
TRUE, there is no restriction on the number of categories
We normally have to hold out a part of our dataset / sample to validate our TREES
TRUE. We try to keep the algorithm from “memorizing” the analysis sample (OVERFITTING), so we test the accuracy of our TREE on a holdout sample (not used to train the model)
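A minimal holdout sketch, assuming scikit-learn: train on 70% of the sample and judge accuracy only on the 30% the tree never saw.

```python
# Hold out 30% of the sample; the honest accuracy figure is the one
# measured on the part the tree never trained on.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=2)

tree = DecisionTreeClassifier(random_state=2).fit(X_train, y_train)
print("Training accuracy:", tree.score(X_train, y_train))  # often near 1.0
print("Holdout accuracy: ", tree.score(X_test, y_test))    # the honest figure
```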
We should put a lot of work into pre-selecting the main predictors (as candidate explanatory variables) before launching a TREE analysis.
FALSE. A TREE algorithm is able to select the best predictors by itself from a long list of candidates. This is, in fact, one of the advantages of this type of algorithm.
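An illustrative sketch, again with scikit-learn: hand the tree 20 candidate predictors, of which only a few are informative, and read off which ones it actually used.

```python
# Unused candidates simply end up with an importance of 0; the tree
# does the pre-selection work for us.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=4, random_state=3)
tree = DecisionTreeClassifier(max_depth=5, random_state=3).fit(X, y)
for i, imp in enumerate(tree.feature_importances_):
    if imp > 0:
        print(f"feature {i}: importance {imp:.2f}")
```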
TREES are part of the family of algorithms called “RULE INDUCTION models”
TRUE. The name comes from the idea that these models derive a set of rules describing distinct segments within the data in relation to the target. The model’s output shows the reasoning behind each rule and can therefore be used to understand the decision-making process that drives a particular outcome.
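A short illustration of this rule-style output: scikit-learn can dump a fitted tree as readable IF/THEN rules (here on the classic iris data, whose three-class target also echoes the earlier card on multi-category targets).

```python
# Dump a fitted tree as readable IF/THEN rules -- the "rule induction"
# flavour of the output.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)
print(export_text(tree, feature_names=list(iris.feature_names)))
```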
TREES are a kind of “classification” analysis
TRUE. It is used to predict CATEGORICAL targets.
CRT is a bit different from other TREE algorithms because it can be used to predict SCALE targets.
TRUE. In fact, the name CRT comes from Classification & Regression Tree. The word “Regression” (vs Classification) means that it can also be used to predict scale variables
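A minimal regression sketch, with scikit-learn's DecisionTreeRegressor standing in for the “R” in CRT:

```python
# The same splitting machinery, but fitted to a SCALE target:
# the predictions are numbers, not class labels.
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, random_state=4)
tree = DecisionTreeRegressor(max_depth=4, random_state=4).fit(X, y)
print(tree.predict(X[:3]))  # scale (numeric) predictions
```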
TREE analysis can also be understood as a type of CLUSTER algorithm because, in the end, it could be used to find similar groups according to the values of a set of variables.
FALSE. A TREE does not find similar groups according to a set of variables. The segments identified by a TREE (NODES) consist of groups of customers that have a similar propensity in relation to a TARGET VARIABLE. In this sense, the groups are CONDITIONED on the target variable; we use this target variable as the SUPERVISOR of the result.
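A compact way to see the contrast, assuming scikit-learn: clustering never sees a target, while the tree cannot even be fitted without one.

```python
# Unsupervised vs supervised: KMeans groups cases without any target,
# while the tree's nodes are always conditioned on y.
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=5)

clusters = KMeans(n_clusters=3, n_init=10, random_state=5).fit_predict(X)       # no y used
nodes = DecisionTreeClassifier(max_depth=2, random_state=5).fit(X, y).apply(X)  # y required
```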