Generalised Linear Model (week 6-8) Flashcards

1
Q

What are GLMs used for?

A

General / health insurance pricing

2
Q

GLM formula?

A

g(μ) = g(E(Y)) = α + β1X1 + … + βkXk = η

3
Q

what does g represent?

A

link function

4
Q

what is η?

A

linear predictor

5
Q

what is μ?

A

the mean of the response: μ = E(Y) = g^(-1)(η)

6
Q

what is the symbol for the dispersion parameter?

A

φ (phi)

7
Q

What does b''(θ) from the PDF represent?

A

variance function

8
Q

what is the canonical link?

A

the link that maps the mean μ to the natural (canonical) parameter θ of the exponential family: g(μ) = θ, so η = θ

9
Q

Why do we need GLM?

A

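When the response is normally distributed, we can use the normal PDF to calculate p-values or confidence intervals. When the response is not normally distributed, is heteroskedastic, or has a non-linear relationship with the predictors, we use a GLM.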

10
Q

What does the link function do?

A

It transforms the predicted mean μ so that it equals the linear predictor, g(μ) = η. The dependent variable Y itself is not transformed.

11
Q

Which model is used for a binomial (binary) response?

A

logistic regression

12
Q

when do we use Poisson?

A

when we have a skewed discrete (count) distribution
- e.g. “number of times you …”

13
Q

when to use negative binomial?

A

for count data where the mean and variance differ (overdispersion), unlike Poisson, which assumes they are equal

14
Q

when to use the gamma distribution?

A

for a continuous response variable that must be strictly positive (> 0)

15
Q

how to fit a GLM by hand? (long)

A
  1. Identify the distribution.
  2. Look at the table (formula sheet) to see which μ to use.
  3. Write the likelihood function ∏ f(y).
  4. Compute the log-likelihood: change ∏ f(y) to ∑ log f(y).
  5. f(y) comes from formula sheet page 5 (remember that taking the log cancels an exp directly: log(exp(z)) = z).
  6. Substitute the μ from step 2 into the f(y) from step 5.
  7. Differentiate with respect to α and β and set each derivative to 0 (when differentiating, a term with x in front keeps its x, which cannot be removed; a term without x loses its α or β entirely).
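
As an illustration of steps 3 to 7, here is a minimal sketch for one specific case: a Poisson response with a log link and a single covariate, so μ_i = exp(α + β·x_i). The data, the function name `fit_poisson_log_link`, and the use of Newton's method to solve the two score equations are assumptions made for this example, not part of the original notes.

```python
import math

def fit_poisson_log_link(x, y, iterations=25):
    """Solve d(logL)/d(alpha) = 0 and d(logL)/d(beta) = 0 for a Poisson GLM
    with mu_i = exp(alpha + beta * x_i), using Newton's method."""
    # Start from the intercept-only fit: mu = mean(y), beta = 0.
    alpha, beta = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(alpha + beta * xi) for xi in x]
        # Log-likelihood (up to a constant): sum(y_i * eta_i - exp(eta_i)).
        # Derivative w.r.t. alpha has no x in front; w.r.t. beta it keeps x.
        score_a = sum(yi - mi for yi, mi in zip(y, mu))
        score_b = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Negative second derivatives (information matrix entries).
        i_aa = sum(mu)
        i_ab = sum(xi * mi for xi, mi in zip(x, mu))
        i_bb = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i_aa * i_bb - i_ab * i_ab
        # Newton update: parameters += inverse(information) * score.
        alpha += (i_bb * score_a - i_ab * score_b) / det
        beta += (i_aa * score_b - i_ab * score_a) / det
    return alpha, beta

# Made-up illustrative counts that grow roughly exponentially with x.
x = [0, 1, 2, 3, 4]
y = [1, 2, 4, 7, 12]
alpha_hat, beta_hat = fit_poisson_log_link(x, y)
print(alpha_hat, beta_hat)
```

At the solution both score equations are zero, which for this model means the fitted means sum to the observed total and the x-weighted sums also match.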
16
Q

What are Information Criteria (IC) used for?

A

- Assess goodness of fit together with parameter parsimony
- Compare different linear predictors / link functions

17
Q

How to choose a good model using an IC?

A

pick the model with the lowest value

18
Q

What are 2 types of IC?

A

AIC and BIC (BIC penalises extra parameters more heavily, so it is more likely to underfit)
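
A small sketch of how the two criteria can disagree, using the standard definitions AIC = -2·logL + 2k and BIC = -2·logL + k·log(n). The two candidate models, their log-likelihoods, and the sample size are hypothetical numbers chosen only to show the effect.

```python
import math

def aic(log_likelihood, n_params):
    # AIC = -2 * logL + 2 * k
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    # BIC = -2 * logL + k * log(n); the penalty grows with sample size
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical fits to the same n_obs observations: (log-likelihood, parameters)
n_obs = 200
candidates = {"small": (-520.0, 3), "large": (-514.0, 8)}

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}

# The preferred model is the one with the lowest criterion value.
best_aic = min(aic_scores, key=aic_scores.get)
best_bic = min(bic_scores, key=bic_scores.get)
print(best_aic, best_bic)
```

With these made-up numbers AIC prefers the larger model while BIC prefers the smaller one, which is the sense in which BIC leans towards underfitting.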

19
Q

How do forward and backward selection work when using AIC/BIC?

A

same procedure, but at each step choose the model with the lowest AIC/BIC

20
Q

When are Pearson residuals vs deviance residuals used?

A

Pearson when Y is normal
Deviance otherwise, since deviance residuals are usually closer to a normal distribution
If Y is normally distributed, Pearson and deviance residuals are equal

21
Q

A positive trend occurs when

A

plotting the absolute standardised residuals vs scaled fitted values, and the variance function b''(θ) increases too slowly

22
Q

A negative trend occurs when

A

plotting the absolute standardised residuals vs scaled fitted values, and the variance function b''(θ) increases too fast

23
Q

Short-tailed line of business

A

takes less than a few years to settle all claims, e.g. motor, home, fire

24
Q

Long-tailed line of business

A

takes more than a few years to settle all claims, e.g. workers’ compensation, public & product liability

25
Q

What are Outstanding Claim Liabilities (OCL)?

A

claims incurred before the valuation date but not yet paid by the valuation date

26
Q

IBNR is

A

Incurred but not reported

27
Q

How to estimate OCL?

A

express past claims data as a run-off triangle, then apply a reserving method (Chain Ladder Method)

28
Q

How to construct the Claims Run-Off Triangle

A

left to right is the development year (the year the claim is paid); top to bottom is the accident year (the year the accident occurred)

29
Q

How to apply the chain ladder method?

A
  1. Convert the triangle to cumulative claims.
  2. Find each development factor: sum of column 2 / sum of column 1, using the same number of rows in both sums.
  3. Because the lengths are matched, the last value of each column/row is left unused; multiply it by the result from step 2 (start from the top-right corner, or column 9).
  4. Same as step 3, but this time also multiply by the step 2 results from before, e.g. column 8 is multiplied by the development factors 9-10 and 8-9.
  5. Subtract the last value of each column/row.
  6. Repeat as in step 4, so each earlier column is multiplied by more development factors.
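
The steps above can be sketched on a small triangle as follows. The 3x3 triangle values and the function name `chain_ladder` are made up for illustration; rows are accident years and columns are development years, with each later accident year having one fewer observed column.

```python
from itertools import accumulate

def chain_ladder(cumulative):
    """Return (development factors, outstanding claims per accident year)."""
    n = len(cumulative)
    factors = []
    for j in range(n - 1):
        # Use only rows observed in both column j and column j + 1,
        # so the two sums have the same length.
        rows = [row for row in cumulative if len(row) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    outstanding = []
    for row in cumulative:
        ultimate = row[-1]  # latest observed cumulative value
        # Multiply by every remaining development factor for this row.
        for f in factors[len(row) - 1:]:
            ultimate *= f
        # Projected ultimate minus latest observed = outstanding claims.
        outstanding.append(ultimate - row[-1])
    return factors, outstanding

# Hypothetical incremental payments (accident years down, development years across)
incremental = [[100, 50, 25], [110, 55], [120]]
cumulative = [list(accumulate(row)) for row in incremental]  # step 1

factors, outstanding = chain_ladder(cumulative)
total_ocl = sum(outstanding)
print(factors, outstanding, total_ocl)
```

The oldest accident year is fully developed, so its outstanding amount is zero, while the most recent year is multiplied by every development factor.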
30
Q

residual bootstrapping

A

allows for both process error and parameter error

31
Q

what is process error?

A

process uncertainty: the inherent randomness of future outcomes

32
Q

what is parameter error?

A

uncertainty in the parameter estimates that comes from fitting the model

33
Q

how to do residual bootstrapping?

A
  1. Bootstrap data: the residuals are picked randomly with replacement.
  2. The bootstrapped residuals are combined with the fitted values to generate pseudo data.
  3. A new model is fitted to the pseudo data.
  4. The expected OCL for the pseudo data is estimated.
  5. Repeat!
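
A minimal sketch of the resample-and-refit loop, under simplifying assumptions: the model is a straight-line least-squares fit rather than a claims model, raw residuals are used, and the statistic collected in step 4 is the fitted slope standing in for the expected OCL. The data and the names `fit_line` and `residual_bootstrap` are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_bootstrap(x, y, n_boot=1000, seed=0):
    rng = random.Random(seed)
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    estimates = []
    for _ in range(n_boot):  # step 5: repeat
        # Step 1: pick residuals randomly with replacement.
        resampled = rng.choices(residuals, k=len(residuals))
        # Step 2: combine with the fitted values to get pseudo data.
        pseudo = [fi + ri for fi, ri in zip(fitted, resampled)]
        # Step 3: refit the model to the pseudo data.
        _, slope_star = fit_line(x, pseudo)
        # Step 4: record the statistic of interest (stand-in for OCL).
        estimates.append(slope_star)
    return estimates

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
estimates = residual_bootstrap(x, y)
mean_estimate = sum(estimates) / len(estimates)
print(mean_estimate)
```

The spread of `estimates` across repetitions is what gives a picture of the uncertainty in the fitted quantity.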
34
Q

what does pmax do?

A

sets a minimum (floor): in R, pmax(x, m) returns the element-wise maximum, so no value falls below m

35
Q

What is logistic regression?

A

A GLM with a binomial distribution and the logit link function (the canonical link)

36
Q

What is the estimated probability for logistic regression?

A

p = 1 / (1 + exp(-η)), the inverse logit of the fitted linear predictor η

37
Q

What happens if we lower the threshold?

A

True positives increase, but false positives increase too

38
Q

What is a false positive?

A

When the model predicts “yes” but the truth is no

39
Q

What happens if we increase the threshold?

A

False positives decrease, but false negatives increase

40
Q

What is a false negative?

A

When the model predicts “no” but the truth is yes

41
Q

Downside of logistic regression?

A

sensitive to class imbalance, model may predict majority class more frequently

42
Q

How to solve the class imbalance?

A

Oversampling the minority class

43
Q

How to do logistic regression

A

1. Oversample the minority class and summarise the data
2. Specify the model using the explanatory variables (data$variable)
3. Fit it with glm using family = binomial and link = logit
4. Combine explanatory variables
5. Using the new, improved model, do model checking
6. Check TP, FP, TN, FN
7. Check the ratio between TP and FP
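
Steps 6 and 7 can be sketched as below, assuming the model has already been fitted and we only have its linear predictors. The linear predictor values, the true labels, and the two thresholds are made up for illustration; `inverse_logit` and `confusion_counts` are hypothetical helper names.

```python
import math

def inverse_logit(eta):
    # Estimated probability from the linear predictor: 1 / (1 + exp(-eta))
    return 1.0 / (1.0 + math.exp(-eta))

def confusion_counts(probs, labels, threshold):
    """Count TP, FP, TN, FN when predicting "yes" if prob >= threshold."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, actual in zip(probs, labels):
        predicted_yes = p >= threshold
        if predicted_yes and actual == 1:
            counts["TP"] += 1
        elif predicted_yes and actual == 0:
            counts["FP"] += 1  # predicted "yes" but the truth is no
        elif not predicted_yes and actual == 1:
            counts["FN"] += 1  # predicted "no" but the truth is yes
        else:
            counts["TN"] += 1
    return counts

# Hypothetical fitted linear predictors and true outcomes (1 = yes, 0 = no)
etas = [-2.0, -1.0, -0.5, 0.2, 0.4, 1.0, 1.5, 3.0]
labels = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [inverse_logit(eta) for eta in etas]

low = confusion_counts(probs, labels, threshold=0.3)
high = confusion_counts(probs, labels, threshold=0.7)
print(low, high)
```

On this toy data the lower threshold gives more true positives and more false positives, while the higher threshold trades false positives for false negatives.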
