Block 5: Tree-based methods, bootstrap, bagging, data ethics Flashcards

1
Q

Explain Boosting

A

Aim: reduce the error rate by putting more weight on previously misclassified observations.

  • Initial weights are wi = 1/n
  • For m = 1, …, M (loop to build M classifiers):
  • fit Gm(x) to the observations (xi, yi) weighted by the wi
  • errm = sum_i( wi 1{yi ≠ Gm(xi)} ) / sum_i( wi )
  • αm = log((1 − errm)/errm) and update wi ← wi exp{αm 1{yi ≠ Gm(xi)}}
  • Final classifier: G*(x) = sgn{ sum(m=1 to M) αm Gm(x) }
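
A minimal Python sketch of this loop, assuming hypothetical helpers fit_stump(X, y, w) and predict_stump(model, X) for a weighted weak learner whose labels are in {−1, +1}:

    import numpy as np

    def adaboost_train(X, y, M, fit_stump, predict_stump):
        # y is assumed to take values in {-1, +1}; assumes 0 < errm < 1
        n = X.shape[0]
        w = np.full(n, 1.0 / n)              # initial weights wi = 1/n
        models, alphas = [], []
        for m in range(M):
            G = fit_stump(X, y, w)           # fit Gm to the weighted observations
            miss = predict_stump(G, X) != y  # indicator 1{yi != Gm(xi)}
            err = w[miss].sum() / w.sum()
            alpha = np.log((1 - err) / err)
            w = w * np.exp(alpha * miss)     # up-weight the misclassified points
            models.append(G)
            alphas.append(alpha)
        return models, alphas

    def adaboost_predict(models, alphas, X, predict_stump):
        # G*(x) = sgn( sum_m alpha_m Gm(x) )
        scores = sum(a * predict_stump(G, X) for a, G in zip(alphas, models))
        return np.sign(scores)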
2
Q

Explain regression trees (CART)

A

Build the tree using training data:
- decide which variable Xj to split on and at which value s (binary or ternary split) by minimising
  min(j,s) { min(c1) sum(xi ∈ R1(j,s)) (yi − c1)² + min(c2) sum(xi ∈ R2(j,s)) (yi − c2)² }
  where the minimisers ĉ1 and ĉ2 are the centroids (mean responses) of R1 and R2
- stop when no further split reduces the sum of squares, giving the final M leaves
- or stop using cost-complexity pruning (see the dedicated card)

Predict value:
- the predicted value ŷi is the centroid ĉm of the leaf Rm for which xi ∈ Rm
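
A minimal sketch of the exhaustive split search above, assuming a numeric feature matrix X and response vector y (illustrative, not an optimised implementation):

    import numpy as np

    def best_split(X, y):
        # search over every variable j and candidate threshold s for the
        # binary split minimising the two-region sum of squares; for fixed
        # regions the optimal constants c1, c2 are the region means (centroids)
        n, p = X.shape
        best_j, best_s, best_sse = None, None, np.inf
        for j in range(p):
            for s in np.unique(X[:, j]):
                left, right = y[X[:, j] <= s], y[X[:, j] > s]
                if len(left) == 0 or len(right) == 0:
                    continue
                sse = ((left - left.mean()) ** 2).sum() \
                    + ((right - right.mean()) ** 2).sum()
                if sse < best_sse:
                    best_j, best_s, best_sse = j, s, sse
        return best_j, best_s, best_sse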

3
Q

Pros and cons of Classification and Regression Tree (CART)

A

Pros: fast, simple and interpretable method
Cons: lack of continuity (the fitted function is piecewise constant, so predictions are very volatile) and statistically inefficient in some cases (e.g. smooth or additive relationships)

4
Q

Explain cost-complexity pruning

A
  • grow an over-fitted tree T0, stopping for example when each leaf has 5 or fewer data points, or at some minimal node size
  • find the subtree T ⊂ T0 (T0 with internal nodes collapsed, i.e. leaves regrouped) minimising the cost-complexity criterion:
    Cα(T) = sum(m=1 to |T|) nm Qm(T) + α|T|, where Qm is the within-leaf sum of squares for leaf m (wrt its centroid), nm is the number of observations in leaf m, |T| the number of leaves, and α ≥ 0 is a hyperparameter (typically chosen by cross-validation)
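
In practice this criterion is what scikit-learn's ccp_alpha parameter implements; a sketch, with random placeholder data standing in for a real training set:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import cross_val_score

    X, y = np.random.rand(200, 3), np.random.rand(200)   # placeholder data

    # grow an over-fitted tree T0 with small leaves, then compute the
    # sequence of alphas at which nodes would be collapsed
    t0 = DecisionTreeRegressor(min_samples_leaf=5, random_state=0)
    path = t0.cost_complexity_pruning_path(X, y)

    # choose alpha by cross-validation and refit the pruned tree
    scores = [cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0),
                              X, y, cv=5).mean() for a in path.ccp_alphas]
    best_alpha = path.ccp_alphas[int(np.argmax(scores))]
    pruned = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)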
5
Q

Explain Bootstrap and Bagging

A

Aim: obtain information about a statistic's sampling distribution (mean, variance, …) without making strong assumptions on the Xi or on F.

  • create B random samples of size n from X, drawn with replacement
  • e.g. estimate the statistic θ^ via the plug-in estimator based on the empirical CDF F^(x) = 1/n sum(i=1 to n) 1{Xi ≤ x}

Bagging: fit the model to each bootstrap sample and average the predictions,
f^bag(x) = 1/B sum(b=1 to B) f^b(x)
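
A minimal bootstrap sketch in Python, here estimating the standard error of the sample median (the choice of statistic is illustrative):

    import numpy as np

    def bootstrap(x, stat, B=1000, seed=0):
        # draw B samples of size n from x with replacement and evaluate the
        # statistic on each, approximating its sampling distribution
        rng = np.random.default_rng(seed)
        n = len(x)
        return np.array([stat(rng.choice(x, size=n, replace=True))
                         for _ in range(B)])

    x = np.random.default_rng(42).exponential(size=100)  # placeholder data
    reps = bootstrap(x, np.median, B=2000)
    se_hat = reps.std(ddof=1)    # bootstrap estimate of the standard error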

6
Q

Explain Random Forest

A

Random forest is bagging of trees (fit one tree per bootstrap sample and average) with an added step: at each node, only a random subset of the variables is considered as split candidates, which decorrelates the trees.
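
A short usage sketch with scikit-learn, where X_train, y_train, X_test are hypothetical arrays; max_features controls how many variables are sampled as split candidates at each node:

    from sklearn.ensemble import RandomForestRegressor

    # 500 bagged trees; at each split only sqrt(p) of the p variables are
    # considered, which decorrelates the individual trees
    rf = RandomForestRegressor(n_estimators=500, max_features="sqrt",
                               random_state=0)
    # rf.fit(X_train, y_train); y_hat = rf.predict(X_test)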
