Tree Analysis Flashcards
A benefit of a TREE analysis is its visual outcome.
TRUE. This is one of the main reasons TREES are so widely used.
The benefit of TREES is much clearer when there is not a single causation model for the whole population of interest
TRUE. A TREE can handle complex causation in a flexible way.
OVERFITTING is about the risk of poor “generalization” of our results to new samples
TRUE. This is a way of saying that, if we let the algorithm “overfit”, we take the risk of getting good results in the training sample but not-so-good results in new samples.
If you increase the minimum number of cases required in a parent node before it can be divided, or in a child node, you take a higher risk of overfitting
FALSE. By increasing the minimum number of cases needed, we avoid splitting small nodes into even smaller ones, so we limit the flexibility of the tree in search of a better generalization of results.
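These minimum-size controls exist in most tree implementations; as a minimal sketch, scikit-learn’s DecisionTreeClassifier exposes them as min_samples_split (parent node) and min_samples_leaf (child node). The parameter values below are purely illustrative:

    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)

    # Larger minimums stop small nodes from being split further: a shallower,
    # less flexible tree that should generalize better.
    conservative = DecisionTreeClassifier(min_samples_split=100, min_samples_leaf=50).fit(X, y)
    # Tiny minimums let the tree keep splitting: more flexibility, higher overfitting risk.
    risky = DecisionTreeClassifier(min_samples_split=2, min_samples_leaf=1).fit(X, y)
    print(conservative.get_depth(), "vs", risky.get_depth())   # the risky tree grows much deeper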
The GAIN of a node in a tree measures the % of “hits” in the node compared to the % of “hits” in the whole sample
TRUE. This is the exact definition
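A worked example with hypothetical numbers, computing the gain index of a node as the node hit rate over the overall hit rate:

    # Hypothetical numbers: 500 "hits" among 5,000 cases overall,
    # and a node containing 200 cases of which 60 are hits.
    overall_rate = 500 / 5000      # 10% hits in the whole sample
    node_rate = 60 / 200           # 30% hits in the node
    gain_index = node_rate / overall_rate * 100
    print(f"node: {node_rate:.0%}, overall: {overall_rate:.0%}, index: {gain_index:.0f}%")
    # index 300%: the node concentrates three times the base rate of hits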
CHAID, CRT, TWO-STEP and PCA are all different types of TREES algorithms
FALSE. TWO-STEP is a clustering algorithm and PCA is a factor-analysis (dimension-reduction) technique.
In a CREDIT DEFAULT RISK exercise, a false positive is much more expensive than a false negative
FALSE. A false positive means rejecting credit for a safe customer, but a false negative means giving credit to a risky customer, which is usually far more expensive.
If we predict a YES/NO target using a TREE, the main output we will get in terms of prediction is directly a YES/NO
FALSE. When we use a TREE to predict a YES (vs NO) target, we get a “propensity” to “YES”, a kind of numerical score that we later transform into a YES/NO according to a given threshold. Modeler can generate this YES/NO automatically (alongside the score), but as analysts we have to control the cutoff value.
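A minimal sketch of this score-then-cut workflow (scikit-learn here rather than Modeler; the 0.5 cutoff is just an illustrative default the analyst should question):

    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)
    tree = DecisionTreeClassifier(max_depth=4).fit(X, y)

    propensity = tree.predict_proba(X)[:, 1]   # numerical score: propensity to "YES"
    cutoff = 0.5                               # analyst-controlled threshold
    predicted_yes = propensity >= cutoff       # only now do we get a YES/NO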
A Chi-Squared test is used by the CHAID tree algorithm to select the best predictors
TRUE. The name itself stands for Chi-Squared Automatic Interaction Detection. In effect, it uses the Chi-Square TEST to select the best predictor at each split.
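The core idea can be sketched with a chi-square test on a predictor-by-target crosstab (toy data; real CHAID also merges categories and applies multiplicity adjustments):

    import pandas as pd
    from scipy.stats import chi2_contingency

    # Toy data: is "region" associated with "churn"?
    df = pd.DataFrame({"region": ["N", "N", "S", "S", "E", "E", "N", "S"],
                       "churn":  ["yes", "no", "yes", "yes", "no", "no", "no", "yes"]})
    crosstab = pd.crosstab(df["region"], df["churn"])
    chi2, p_value, dof, expected = chi2_contingency(crosstab)
    print(f"chi2={chi2:.2f}, p={p_value:.3f}")
    # CHAID-style selection: the candidate predictor with the lowest p-value splits the node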
TREES can be used to predict a CATEGORICAL variable with more than two categories
TRUE, there is no restriction on the number of categories.
We normally have to hold out a part of our dataset / sample to validate our TREES
TRUE. We try to prevent the algorithm from “memorizing” the analysis sample (OVERFITTING), so we test the accuracy of our TREE on a holdout sample (one not used to train the model).
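A minimal sketch of the holdout idea with scikit-learn (an illustrative 70/30 split):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)
    # 30% of the sample is held out and never shown to the algorithm during training
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    tree = DecisionTreeClassifier().fit(X_train, y_train)
    print("holdout accuracy:", tree.score(X_test, y_test))   # honest estimate of generalization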
We should put a lot of work into pre-selecting the main predictors (as candidate explanatory variables) before launching a TREE analysis.
FALSE. A TREE algorithm can select the best predictors from a long list of candidates. This is, in fact, one of the advantages of this type of algorithm.
TREES belong to the family of algorithms called “RULE INDUCTION models”
TRUE. That name comes from the idea that these models derive a set of rules that describe distinct segments within the data in relation to the target. The model’s output shows the reasoning for each rule and can therefore be used to understand the decision-making process that drives a particular outcome.
TREES are a kind of “classification” analysis
TRUE. It is used to predict CATEGORICAL targets.
CRT is a bit different from other TREES algorithms because it can be used to predict SCALE targets.
TRUE. In fact, the name CRT comes from Classification & Regression TREE. The word “Regression” (vs Classification) means that it can be used to predict scale variables.
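A sketch of the “Regression” side with scikit-learn’s DecisionTreeRegressor, which grows CART-style trees for scale targets (synthetic data, illustrative depth):

    from sklearn.datasets import make_regression
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
    # Same recursive splitting, but each leaf predicts a scale value
    # (the mean of the training cases falling into that node)
    reg = DecisionTreeRegressor(max_depth=3).fit(X, y)
    print(reg.predict(X[:3]))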
TREES analysis can also be understood as a type of CLUSTER algorithm because, at the end, it could be used to find similar groups according to the values of a set of variables.
FALSE. A TREE does not find similar groups according to a set of variables. The segments identified by a TREE (NODES) consist of groups of customers with a similar propensity in relation to A TARGET VARIABLE. In this sense, the groups are CONDITIONED on the target variable; we use this target variable as the SUPERVISOR of the result.
TREES can be called SUPERVISED technique BUT ONLY if we build the TREE in an interactive way, “supervising” the outcome.
FALSE. The “supervised” label comes from the fact that a variable is used as a SUPERVISOR, as a target, for the whole result.
Imagine that we run a FACTOR analysis using a group of satisfaction variables for our customer dataset. Could we use the factor score(s) as input variables (perhaps among others) in a TREES analysis to predict “churn”?
TRUE. Why not? A factor score is a metric variable and WE CAN use metric variables as inputs in a TREE. If we have several different satisfaction indicators, a FACTOR would be a good way of introducing our own “satisfaction measure” into our TREE.
In a SPAM EMAIL FILTER exercise, a false positive means to receive a junk mail in your inbox
FALSE. A false positive means predicting “SPAM” for an actual “HAM” message, so we move safe email to our junk folder.
Normally, it is easy to increase TRUE POSITIVES if you are also willing to accept FALSE POSITIVES
TRUE. If you tend to predict POSITIVE, you will capture TRUE POSITIVES but also FALSE POSITIVES.
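A toy sketch of the trade-off using hypothetical propensity scores: lowering the cutoff predicts POSITIVE more often, so both counts rise together:

    # Hypothetical propensity scores with their actual labels (1 = positive)
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    actual = [1,   1,   0,   1,   0,   1,   0,   0]

    for cutoff in (0.75, 0.5, 0.25):
        predicted = [s >= cutoff for s in scores]
        tp = sum(p and a == 1 for p, a in zip(predicted, actual))
        fp = sum(p and a == 0 for p, a in zip(predicted, actual))
        print(f"cutoff={cutoff}: TP={tp}, FP={fp}")   # both grow as the cutoff falls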
We call CLASSIFICATION techniques those used to predict or explain SCALE variables
FALSE. Classification is for CATEGORICAL (or ordinal) targets; techniques that predict SCALE variables are usually called REGRESSION.
A TREE is somewhere in the middle between pure predictive and pure explanatory techniques
TRUE. It can be used to predict but also to give some information about the determinants of our variable of interest.
TREES are flexible classification algorithms in the sense that they can capture complex relationships in the presence of lots of explanatory variables
TRUE. The benefit of TREES is much clearer when there is not a single causation model for the whole population of interest. At the same time, the algorithms are able to discriminate between good and bad predictors.
CHAID, CRT, C5 and QUEST are different types of TREES algorithms
TRUE. These are very common TREES algorithms
An F-test is used by the CHAID tree algorithm to select the best predictors
FALSE. It uses the Chi-Square TEST.
In a CREDIT DEFAULT RISK exercise, a false positive means to give credit to a risky customer
FALSE. If it is about DEFAULT, a false positive means predicting RISK for a safe customer (and thus denying him credit).
In a SPAM EMAIL FILTER exercise, a false negative means to receive a junk mail in your inbox
TRUE. We predict “HAM” instead of actual SPAM, and we let the junk email enter our inbox.
In a CREDIT DEFAULT RISK exercise, a false negative is much more expensive than a false positive
TRUE. Because a false negative means giving credit to a risky customer (a potential default), whereas a false positive only means turning away a safe one.
In a CHAID exercise, lower p-values from chi-squared tests are used to identify and select the best predictors
TRUE. A low p-value for a crosstab chi-square test means evidence of association between predictor and target
Scale variables can also be used as predictors in CHAID analysis
TRUE. Scale variables are automatically transformed into ORDINAL ones by the CHAID algorithm.
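That automatic discretization can be sketched with pandas (an illustrative 5-bin equal-frequency cut; the real number of intervals is an algorithm setting):

    import pandas as pd

    age = pd.Series([18, 22, 25, 31, 38, 44, 52, 60, 67, 75])
    # What CHAID does internally: cut the scale predictor into ordered intervals,
    # then treat those intervals as an ORDINAL variable when testing splits
    age_ordinal = pd.qcut(age, q=5)
    print(age_ordinal.cat.categories)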
An interactive session permits the user to grow a tree applying their own criteria in the selection of predictors
TRUE. Working interactively, the analyst may influence the TREE result for the sake of a better model in terms of the business goal.
The more a tree grows, the better the result we get in terms of VALIDATION
FALSE. Excessive growth increases the risk of overfitting (WORSE RESULTS IN TERMS OF VALIDATION).
We normally have to control the tree growth in order to avoid OVERFITTING
TRUE. We need to balance accuracy in the TRAIN and TEST samples.
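A sketch of that balance: growing deeper keeps improving TRAIN accuracy while TEST accuracy stalls or drops (synthetic data, illustrative depths):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_informative=5, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for depth in (2, 5, 10, None):          # None = unlimited growth
        tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
        print(depth, round(tree.score(X_tr, y_tr), 2), round(tree.score(X_te, y_te), 2))
    # Typical pattern: train accuracy climbs toward 1.0 while test accuracy
    # flattens or falls back: the signature of overfitting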
A variable may appear as predictor in a TREE more than one time, in different tree levels
TRUE. Yes, it is possible: AGE may appear as the main predictor and then appear again within a subset of the sample.
A TREE algorithm can select the best predictors from a long list of candidates
TRUE. This is, in fact, one of the advantages of this type of algorithm
For ordinal predictors only adjacent categories are compared and possibly merged in a CHAID analysis
TRUE. This is because, normally, it makes no sense to merge categories that are not adjacent (people below 18 and over 65, for instance).
The GAIN of a node in a tree measures the % of “hits” in the node
FALSE. It measures that % of hits compared to the overall % of hits (in the whole sample).
For categorical targets YES/NO, a classification table will always be a 2x2 table
TRUE. YES/NO predicted VS YES/NO observed
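A sketch of such a table with scikit-learn’s confusion_matrix (toy labels):

    from sklearn.metrics import confusion_matrix

    observed  = ["yes", "no", "yes", "no", "yes", "no"]
    predicted = ["yes", "no", "no",  "no", "yes", "yes"]
    # Rows = observed, columns = predicted: for a YES/NO target this is always 2x2
    print(confusion_matrix(observed, predicted, labels=["yes", "no"]))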