Retail Credit Risk Flashcards

1
Q

define retail lending

A

an exposure to an individual or small business that is guaranteed by that person

2
Q

what are 4 examples of retail lending.

A

credit cards
residential mortgages
small business facilities
installment loans

3
Q

what are two characteristics of retail lending

A

low individual exposure
managed collectively rather than individually

4
Q

what is a credit risk score

A

a total number of points that predicts a borrower’s future repayment performance based on historical information

5
Q

what is a scorecard

A

a mathematical algorithm used to generate a score for rank-order risk analysis

6
Q

what are scorecards used for

A
  1. lending decisions
  2. mitigation of portfolio credit risk
7
Q

what are two benefits of using a scorecard

A

easy to interpret
easy to monitor

8
Q

what are the 6 stages in model development

A
  1. business objectives
  2. data preparation
  3. model development
  4. model approval
  5. model deployment
  6. monitoring
9
Q

what are 3 aspects of business objectives

A
  1. key issues
  2. expectations for the model
  3. structure
10
Q

define key issues

A

trends, challenges and concerns outlined by the business

11
Q

define structure

A

project team members, data and timeline

12
Q

what are the 5 C’s of data preparation

A
  1. Comprehensiveness
  2. Clean
  3. Consistent
  4. Current
  5. Caretaking
13
Q

define comprehensiveness

A

ensuring the data captures the full scope and complexity of the underlying information

14
Q

define clean

A

ensuring the accuracy of the data

15
Q

define consistent

A

ensuring the uniformity of the data across different sources.

16
Q

define current

A

ensuring the data is up to date

17
Q

define caretaking

A

the ongoing management of the data to preserve its quality

18
Q

what are 6 aspects of the data preparation in the model development lifecycle

A

the 5 C’s
exclusion criteria
timeframe
defining the target and explanatory variables
segmentation (# of models)
sampling

19
Q

what are three sources of exclusion criteria

A

scope
data errors
operational

20
Q

what two periods are involved in the timeframe of model creation

A
  1. observation period
  2. performance period
21
Q

what are two aspects of the observation period

A
  1. the window over which the explanatory variables are measured
  2. should be representative of the current/future environment
22
Q

what are two aspects of the performance period

A
  1. the window over which the target variable is measured
  2. should be long enough to have a sufficient number of defaults.
23
Q

what are the two modeling techniques

A
  1. industry standard
  2. other methodologies
24
Q

compare the advantages of the two modeling techniques

A

industry-standard:
1. few variables
2. expert judgment

other methods:
1. many variables
2. one step for variable reduction and model fitting
3. adaptive learning

25
Q

compare the disadvantages of the two modeling techniques

A

industry-standard:
1. few variables
2. distributional assumptions
3. separate steps for variable reduction and model fitting

other methods:
1. many variables
2. risk of overfitting

26
Q

what are the 5 steps of the industry standard model development technique?

A
  1. variable transformation
  2. variable reduction
  3. model fitting
  4. scorecard scaling
  5. scorecard assessment
27
Q

what technique can be used in variable transformation

A

weight of evidence

28
Q

define variable reduction

A

removing any variable that cannot be used or doesn't make sense

29
Q

what are two techniques for variable reduction

A
  1. grouping
  2. variable clustering
30
Q

what is grouping

A

creating bins within a variable

31
Q

what are three benefits of grouping?

A

i. Accounts for non-linear relationships between the target and explanatory variables.
ii. Accounts for outliers
iii. Allows for the treatment of missing values as a separate category.

32
Q

how should grouping be performed?

A
  1. Start by creating 20 equal bins.
  2. Calculate the WOE of each bin.
  3. Collapse bins with similar WOE.
  4. Remove variables with weak IV.
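Step 3 above can be sketched in Python. This is a minimal illustration, not a definitive procedure: it assumes the fine bins are already given as (non-defaults, defaults) counts in their natural order, that every bin contains at least one default and one non-default, and that "similar" means the WOE of a bin differs from that of its left neighbour by less than a hypothetical `tolerance`. The function names and the sample counts are made up for the example, and only six starting bins are used for brevity.

```python
import math

def bin_woe(good, bad, total_good, total_bad):
    """WOE of one bin: ln(share of non-defaults / share of defaults)."""
    return math.log((good / total_good) / (bad / total_bad))

def collapse_similar(bins, tolerance):
    """Merge adjacent (non_defaults, defaults) bins whose WOE differs by
    less than `tolerance` (single left-to-right pass)."""
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    merged = [bins[0]]
    for good, bad in bins[1:]:
        prev_good, prev_bad = merged[-1]
        gap = abs(bin_woe(good, bad, total_good, total_bad)
                  - bin_woe(prev_good, prev_bad, total_good, total_bad))
        if gap < tolerance:
            # Similar risk profile: fold this bin into the previous one.
            merged[-1] = (prev_good + good, prev_bad + bad)
        else:
            merged.append((good, bad))
    return merged

# Made-up fine bins as (non_defaults, defaults) counts.
fine_bins = [(50, 40), (60, 45), (150, 50), (160, 50), (190, 10), (190, 5)]
coarse_bins = collapse_similar(fine_bins, tolerance=0.2)
```

With these counts the six fine bins collapse to four. A variable whose information value is still weak after collapsing would then be dropped, as in step 4.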
33
Q

what is variable clustering

A

grouping correlated variables together such that variables within a cluster are highly correlated and variables in different clusters are uncorrelated, to reduce the multicollinearity of the model.

34
Q

which two variables should represent the cluster when using variable clustering?

A
  1. the variable with the highest IV
  2. the variable with the lowest 1-R^2
35
Q

what are two aspects of model fitting in the industry standard technique?

A

variable selection: forward, backward, ridge, lasso
assumptions that historical experiences predict future behaviour and that consumer behaviour will not change significantly

36
Q

define scorecard scaling when using the industry standard technique

A

raw scores are scaled to a three-digit number

37
Q

what is the formula for the score in scorecard scaling

A

score=offset+factor⋅ln(2⋅odds)-PDO
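A minimal Python sketch of this scaling, under the usual convention that PDO is the number of points that doubles the odds, so that factor = PDO / ln(2) and the formula above reduces to score = offset + factor⋅ln(odds). The odds are assumed to be non-default:default odds, and the anchor values (600 points at odds of 50:1, PDO of 20) and function names are illustrative assumptions, not taken from the card.

```python
import math

def scaling_params(target_score, target_odds, pdo):
    """Return (offset, factor) so that `target_odds` maps to `target_score`
    and doubling the odds adds `pdo` points."""
    factor = pdo / math.log(2)
    offset = target_score - factor * math.log(target_odds)
    return offset, factor

def scale_score(odds, offset, factor):
    """Scaled score; algebraically equal to offset + factor*ln(2*odds) - PDO."""
    return offset + factor * math.log(odds)

# Illustrative anchor (assumed): 600 points at odds of 50:1, PDO = 20.
offset, factor = scaling_params(600, 50, 20)
base = scale_score(50, offset, factor)      # score at the anchor odds
doubled = scale_score(100, offset, factor)  # score after the odds double
```

Here `doubled` exceeds `base` by the PDO, which is the property the offset and factor are chosen to satisfy.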

38
Q

what are the 3 types of scorecard assessment

A
  1. rank ordering
  2. population stability
  3. benchmarking
39
Q

what are the 5 evaluation metrics used in rank ordering scorecard assessment

A
  1. KS statistic
  2. misclassification
  3. ROC curve
  4. accuracy ratio
  5. lift chart
40
Q

what does population stability do

A

quantify population differences by measuring the shift between two sample distributions

41
Q

what is the formula for the population stability index (PSI)

A

PSI=∑[(N_bin-B_bin)⋅ln(N_bin/B_bin)]

42
Q

what values of PSI indicate: no significant shift, a minor shift, a significant shift

A

<0.1: no significant shift
0.1-0.25: minor shift
>0.25: significant shift
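The PSI and its thresholds can be sketched in Python as follows. This assumes N_bin and B_bin are the proportions of the new and baseline samples falling in each bin (the cards do not define them), that both lists use the same bins, and that no bin is empty. The function names and sample proportions are made up for illustration, and placing exactly 0.25 in the "minor shift" band is a choice made here.

```python
import math

def psi(new_props, base_props):
    """PSI over matching bins; inputs are per-bin proportions summing to 1."""
    return sum((n - b) * math.log(n / b)
               for n, b in zip(new_props, base_props))

def psi_label(value):
    """Map a PSI value to the bands on the card above."""
    if value < 0.1:
        return "no significant shift"
    if value <= 0.25:
        return "minor shift"
    return "significant shift"

baseline = [0.25, 0.25, 0.25, 0.25]  # made-up development sample
recent = [0.40, 0.30, 0.20, 0.10]    # made-up recent sample
value = psi(recent, baseline)
label = psi_label(value)
```

For these made-up proportions the index comes out at roughly 0.23, which falls in the minor-shift band.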

43
Q

what is benchmarking

A

comparing a scorecard to an existing scorecard

44
Q

what is the KS statistic

A

the maximum difference between the CDFs of the distributions of defaults and non-defaults
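A minimal Python sketch of this definition, evaluating both empirical CDFs at every observed score and keeping the largest gap. The function name and the score samples are made up for illustration.

```python
def ks_statistic(default_scores, nondefault_scores):
    """Largest gap between the empirical CDFs of the two score samples."""
    cutoffs = sorted(set(default_scores) | set(nondefault_scores))
    best = 0.0
    for c in cutoffs:
        # Share of each group scoring at or below the cutoff.
        cdf_d = sum(s <= c for s in default_scores) / len(default_scores)
        cdf_n = sum(s <= c for s in nondefault_scores) / len(nondefault_scores)
        best = max(best, abs(cdf_d - cdf_n))
    return best

defaults = [510, 540, 560, 600]      # made-up scores of defaulted accounts
non_defaults = [580, 620, 650, 700]  # made-up scores of performing accounts
ks = ks_statistic(defaults, non_defaults)
```

On these samples the gap peaks at 0.75. Two fully separated samples give 1, and identical samples give 0.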

45
Q

what is misclassification

A

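the confusion matrix: counts of defaults and non-defaults that are correctly and incorrectly classified at a chosen score cutoff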
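As a small Python illustration, the matrix can be built by comparing each score to a cutoff. This sketch assumes that a lower score means higher risk, so an account is predicted to default when its score is below the cutoff. The function name, cutoff and data are hypothetical.

```python
def confusion_matrix(scores, is_default, cutoff):
    """Counts at a cutoff; 'positive' means predicted default (score < cutoff)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for score, flag in zip(scores, is_default):
        predicted_default = score < cutoff
        if predicted_default and flag:
            counts["tp"] += 1   # default correctly flagged
        elif predicted_default and not flag:
            counts["fp"] += 1   # non-default wrongly flagged
        elif flag:
            counts["fn"] += 1   # default missed
        else:
            counts["tn"] += 1   # non-default correctly passed
    return counts

scores = [510, 540, 560, 580, 600, 620, 650, 700]  # made-up scores
flags = [1, 1, 1, 0, 1, 0, 0, 0]                   # 1 = defaulted
matrix = confusion_matrix(scores, flags, cutoff=590)
misclassification_rate = (matrix["fp"] + matrix["fn"]) / len(scores)
```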

46
Q

what is the ROC curve

A

a plot of the true positive rate against the false positive rate; the area under it (AUC) is the probability that a randomly chosen non-default will be ranked higher than a randomly chosen default
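The pairwise probability in this answer can be computed directly by comparing every non-default with every default. The sketch below assumes that a higher score means lower risk and counts ties as half a win. The function name and data are made up for illustration.

```python
def auc(default_scores, nondefault_scores):
    """Share of (non-default, default) pairs in which the non-default
    has the higher score; ties count as half."""
    wins = 0.0
    for n in nondefault_scores:
        for d in default_scores:
            if n > d:
                wins += 1.0
            elif n == d:
                wins += 0.5
    return wins / (len(default_scores) * len(nondefault_scores))

defaults = [510, 540, 560, 600]      # made-up scores of defaulted accounts
non_defaults = [580, 620, 650, 700]  # made-up scores of performing accounts
area = auc(defaults, non_defaults)
```

In this sample 15 of the 16 pairs are ordered correctly, so the result is 0.9375.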

47
Q

what is the formula of the accuracy ratio

A

AR=GINI/(Perfect GINI)

48
Q

what is the GINI index

A

the area between the Lorenz curve and the random (diagonal) curve
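The accuracy ratio and this area can be tied together in a short Python sketch. It assumes that a lower score means higher risk, that the Lorenz curve plots the cumulative share of defaults captured against the cumulative share of accounts (riskiest first), and that scores contain no ties. The function name and data are made up for illustration.

```python
def accuracy_ratio(scores, is_default):
    """AR = area between the model's Lorenz curve and the random line,
    divided by the same area for a perfect model."""
    # Riskiest accounts first (assumes lower score = higher risk, no ties).
    ranked = sorted(zip(scores, is_default), key=lambda pair: pair[0])
    n = len(ranked)
    total_defaults = sum(is_default)
    area = 0.0
    captured = 0.0
    for _, flag in ranked:
        nxt = captured + flag / total_defaults
        area += (captured + nxt) / 2 / n  # trapezoid of width 1/n
        captured = nxt
    gini = area - 0.5                               # area above the diagonal
    perfect_gini = (1 - total_defaults / n) / 2     # all defaults ranked first
    return gini / perfect_gini

scores = [510, 540, 560, 580, 600, 620, 650, 700]  # made-up scores
flags = [1, 1, 1, 0, 1, 0, 0, 0]                   # 1 = defaulted
ar = accuracy_ratio(scores, flags)
```

For this sample the ratio is 0.875. A model that ranks every default ahead of every non-default gives 1.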

49
Q

what is a lift chart

A

the cumulative % of defaults per decile divided by the total population % of defaults.
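One reading of this definition, sketched in Python: rank accounts from riskiest to safest, and for each decile divide the default rate among all accounts seen so far by the overall default rate. The interpretation as a cumulative default rate, the assumption that a lower score means higher risk, the function name and the data are all assumptions made for illustration. The sketch also needs at least as many accounts as groups.

```python
def cumulative_lift(scores, is_default, n_groups=10):
    """Cumulative default rate through each group divided by the overall rate."""
    # Riskiest accounts first (assumes lower score = higher risk).
    ranked = sorted(zip(scores, is_default), key=lambda pair: pair[0])
    n = len(ranked)
    overall_rate = sum(is_default) / n
    lifts = []
    for g in range(1, n_groups + 1):
        seen = ranked[: (g * n) // n_groups]  # all accounts up to this group
        rate = sum(flag for _, flag in seen) / len(seen)
        lifts.append(rate / overall_rate)
    return lifts

scores = list(range(1, 21))          # made-up scores, 20 accounts
flags = [1, 1, 1, 0, 1] + [0] * 15   # 4 defaults among the lowest scores
lifts = cumulative_lift(scores, flags)
```

The first decile here has a lift of about 5, and the final value is always 1 because the whole population is included.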

50
Q

what does weight of evidence do

A

transforms explanatory variables into a set of groups based on the similarity of the target variable distributions.

51
Q

what does WOE measure

A

how strong a group is at separating defaults from non-defaults

52
Q

what does a negative WOE signify?

A

the group holds a larger share of total defaults than of total non-defaults

53
Q

what is the formula for WOE

A

WOE=ln[((# non-defaults)/(total non-defaults))/((# defaults)/(total defaults))]
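A minimal Python version of this formula for a single group. The function name and the counts are made up for illustration, and the sketch assumes the group contains at least one default and one non-default.

```python
import math

def woe(non_defaults_in_bin, defaults_in_bin, total_non_defaults, total_defaults):
    """Weight of evidence of one group."""
    share_good = non_defaults_in_bin / total_non_defaults
    share_bad = defaults_in_bin / total_defaults
    return math.log(share_good / share_bad)

# Made-up group: 100 of 800 non-defaults and 50 of 200 defaults.
risky_group = woe(100, 50, 800, 200)  # negative: over-represented in defaults
safe_group = woe(400, 50, 800, 200)   # positive: over-represented in non-defaults
```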

54
Q

what is a variable’s information value

A

the predictive power of a single variable (its ability to separate defaults from non-defaults)

55
Q

what is the formula for information value

A

IV=∑[((# non-defaults)/(total non-defaults)-(# defaults)/(total defaults))⋅WOE_i]

56
Q

what IV value ranges indicate:
very weak
weak
moderate
strong

A

<0.02: very weak
0.02-0.1: weak
0.1-0.3: moderate
0.3+: strong
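The information value and the bands above can be sketched in Python as follows. The function names and bin counts are made up for illustration. The sketch assumes each group has at least one default and one non-default, and it assigns each boundary value (0.02, 0.1, 0.3) to the higher band.

```python
import math

def information_value(bins):
    """IV of one variable; `bins` is a list of (non_defaults, defaults) counts."""
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    iv = 0.0
    for good, bad in bins:
        share_good = good / total_good
        share_bad = bad / total_bad
        # (share difference) * WOE of the group
        iv += (share_good - share_bad) * math.log(share_good / share_bad)
    return iv

def iv_strength(iv):
    """Map an IV to the bands on the card above."""
    if iv < 0.02:
        return "very weak"
    if iv < 0.1:
        return "weak"
    if iv < 0.3:
        return "moderate"
    return "strong"

bins = [(100, 50), (300, 90), (400, 60)]  # made-up counts per group
iv = information_value(bins)
strength = iv_strength(iv)
```

These counts give an IV of about 0.20, which the bands classify as moderate.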