Decision Tree Regression Flashcards
What does CART stand for?
Classification And Regression Trees
How is a leaf formed in a regression tree?
Data points (x) plotted within a graph are split away from one another into groups
… the algorithm decides each split using an information criterion (for regression, typically how much the split reduces the variance / squared error of the target values, rather than the classification entropy used for classification trees)
[Diagram: a scatter plot of data points, with a dashed split line dividing them into two groups]
- = Split line
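The split-picking idea above can be sketched in a few lines of Python. This is a minimal sketch with made-up data, using the squared-error (variance) criterion typical for regression trees: try each candidate threshold on x and keep the one that leaves each side's y-values most internally similar.

```python
# Sketch of a regression split search (data is made up for illustration):
# pick the x threshold that most reduces the total squared error of y.
xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 5, 20, 21, 19]

def sse(vals):
    # Sum of squared deviations from the group mean (0 for tiny groups).
    if len(vals) < 2:
        return 0.0
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def split_cost(t):
    # Total squared error after splitting at threshold t.
    left = [v for x, v in zip(xs, ys) if x <= t]
    right = [v for x, v in zip(xs, ys) if x > t]
    return sse(left) + sse(right)

best_t = min(xs[:-1], key=split_cost)
print(best_t)  # -> 3: the split that best separates the low and high y groups
```

Here the points with x ≤ 3 all have small y-values and the rest have large ones, so splitting at 3 gives the biggest drop in squared error.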
Give some insight into what this splitting criterion looks for:
The algorithm looks for splits that add information to the graph, i.e. groupings whose points are more similar to each other than before the split
Once a further split would no longer add information by separating the data, it stops splitting
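That stopping point can be sketched as a simple rule. The hyperparameter names and values below are assumptions for illustration, not something the flashcard fixes:

```python
# Hypothetical stopping rule: only accept a split if it adds information
# (reduces squared error noticeably) and both child leaves stay big enough.
MIN_LEAF = 2       # assumed minimum number of points per leaf
MIN_GAIN = 1e-3    # assumed minimum error reduction to accept a split

def should_split(parent_sse, left_sse, right_sse, n_left, n_right):
    # Information gained = error before the split minus error after it.
    gain = parent_sse - (left_sse + right_sse)
    return gain > MIN_GAIN and n_left >= MIN_LEAF and n_right >= MIN_LEAF

print(should_split(100.0, 10.0, 12.0, 3, 3))    # -> True: big error reduction
print(should_split(100.0, 99.9995, 0.0, 3, 3))  # -> False: negligible gain
```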
Where does the decision tree part come in?
Once the algorithm has split the plotted data, it builds a decision tree describing where each split was made:
E.g. start from where the algorithm first split the data, e.g. at 10 on the x-axis
Then from there add the other splits as further branches
1st split: x-axis < 10?
  Yes → leaf (points with x < 10)
  No  → 2nd split: y-axis < 3?
    Yes → 3rd split: x-axis < 25?
      Yes → leaf
      No  → leaf
    No  → leaf
- = algorithm split
[Diagram: the scatter plot with the three split lines drawn in (1st split at x = 10, 2nd split at y = 3, 3rd split at x = 25), dividing the points into leaves; x-axis ticks at 0, 10, 20, 30]
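One way to picture the answer above in code: the tree is just nested yes/no questions. A minimal sketch using the flashcard's thresholds; which branch each later split hangs off, and the leaf labels, are assumptions:

```python
# Nested-dict encoding of the example tree (branch order assumed).
tree = {"q": "x < 10",
        "yes": "leaf",
        "no": {"q": "y < 3",
               "yes": {"q": "x < 25", "yes": "leaf", "no": "leaf"},
               "no": "leaf"}}

def show(node, depth=0):
    # Print each split question, then its yes/no subtrees, indented.
    pad = "  " * depth
    if isinstance(node, str):
        print(pad + node)
        return
    print(pad + node["q"] + "?")
    for branch in ("yes", "no"):
        print(pad + branch + ":")
        show(node[branch], depth + 1)

show(tree)
```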
How is a regression tree leaf used to predict values for new data?
After the splits, the average of the data points' values in each leaf is calculated
- = algorithm split
[Diagram: the same split scatter plot, now labelled with each leaf's average value, e.g. 4 for one leaf and 7 (a made-up average) for another]
If a new data point falls into one of the leaves (e.g. a point at (9, 2)), its predicted value is that leaf's group average (e.g. the leaf average is 4, so the prediction is 4)
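The prediction step can be sketched as a walk down the tree to a leaf average. The tree shape and the stored averages below are illustrative, like the flashcard's made-up 7:

```python
# Hypothetical fitted regression tree: internal nodes hold a feature and a
# threshold, leaves hold the group average computed at training time.
tree = {"feature": "x", "threshold": 10,
        "yes": {"value": 7.0},                    # average for points with x < 10
        "no": {"feature": "y", "threshold": 3,
               "yes": {"value": 4.0},             # x >= 10 and y < 3
               "no": {"value": 5.5}}}             # x >= 10 and y >= 3

def predict(node, point):
    # Descend until a leaf, then return its stored average.
    while "value" not in node:
        node = node["yes"] if point[node["feature"]] < node["threshold"] else node["no"]
    return node["value"]

print(predict(tree, {"x": 9, "y": 2}))    # falls in the x < 10 leaf -> 7.0
print(predict(tree, {"x": 12, "y": 2}))   # x >= 10 and y < 3 leaf -> 4.0
```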