Wednesday, September 12, 2012

A common pitfall of PDO (Points to Double the Odds)


Points to Double the Odds (PDO) is a widely used metric to track scorecard performance over time and to measure the deterioration of a scorecard’s ranking ability. It is calculated on validation data and compared to the fixed PDO based on benchmark data. An increase in PDO indicates deterioration in the model’s ranking performance, and the scorecard hence needs to be recalibrated or even redeveloped.

However, such use of PDO is dangerous, and an acceptable PDO value may be misleading. It is not unusual to observe a stable validation PDO alongside poor performance indicated by other metrics such as AUC and KS. Let us go through the calculation of PDO to see why. Assume there is an existing score to rank customers’ risk: the higher the score, the lower the risk. We first need to fit a logistic regression of the following form,
\[log\left ( \frac{p}{1-p} \right )=intercept + slope \times score\]
where \( p \) stands for the probability of being ‘good’. We then obtain PDO by solving the above equation simultaneously with the following one,
\[log\left ( \frac{2p}{1-p} \right )=intercept + slope \times \left ( score+PDO \right )\]
which illustrates how many additional score points are needed to double the odds. Subtracting the first equation from the second gives \(PDO = \frac{log2}{slope}\).
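As a minimal sketch of this step (assuming a validation data set named VALIDATION with a binary flag GOOD, where 1 means ‘good’, and the scorecard variable SCORE; these names are illustrative and not from the text), the slope can be estimated with PROC LOGISTIC and PDO derived from it:

proc logistic data=validation;
    model good(event='1') = score;        /* log-odds of 'good' regressed on the score */
    ods output ParameterEstimates=est;    /* save the intercept and slope estimates */
run;

data pdo;
    set est;
    where upcase(variable)='SCORE';
    PDO = log(2)/estimate;                /* point estimate: PDO = log2 / slope */
run;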
However, something is missing here. What happens if we put hats on the quantities in the equation?
\[\widehat{PDO}= \frac{log2}{\widehat{slope}}\]

What is the implication of those hats? The hats mean that PDO, as well as the slope, is a statistic calculated from a sample and therefore carries uncertainty. Fortunately, the uncertainty can be measured by an interval estimate. Since only an increase in PDO is of concern, a percentile estimate on the right tail, the 95th percentile for instance, denoted as \(\widehat{PDO_{95}}\), can be used as another metric to measure the model’s deterioration. In addition, \(\widehat{PDO_{95}}= \frac{log2}{\widehat{slope_{5}}} \), and \(\widehat{slope_{5}}\) is easy to calculate using a normal approximation.
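Continuing the sketch above (reusing the hypothetical EST data set, which also carries the slope’s standard error), the 5th percentile of the slope under the normal approximation directly gives the 95th percentile of PDO:

data pdo_interval;
    set est;
    where upcase(variable)='SCORE';
    slope_5 = estimate - probit(0.95)*stderr;   /* 5th percentile of the slope */
    PDO     = log(2)/estimate;                  /* point estimate of PDO */
    PDO_95  = log(2)/slope_5;                   /* 95th percentile of PDO */
run;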

A typical case is that a modeler observes a validation PDO of 20.5 against a benchmark of 20, which seems acceptable, but the 95th percentile of the validation PDO is 25, which indicates the scorecard’s stability may be questionable. A one-sided hypothesis test can be formulated to check whether the PDO estimate is significantly less than a pre-specified threshold.
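One way to write down such a test (a sketch, assuming a positive slope, i.e. higher scores mean lower risk): testing whether PDO is below a pre-specified threshold \(c\) is equivalent to testing whether the slope exceeds \(\frac{log2}{c}\),
\[H_0: PDO \geq c \;\;vs.\;\; H_1: PDO < c \quad \Longleftrightarrow \quad H_0: slope \leq \frac{log2}{c} \;\;vs.\;\; H_1: slope > \frac{log2}{c},\]
and the test statistic under the normal approximation is
\[z = \frac{\widehat{slope} - log2/c}{\widehat{se}\left ( \widehat{slope} \right )}, \quad \text{reject } H_0 \text{ when } z > z_{0.95}.\]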

By the way, people may ask a related question: do we also need to consider interval estimates of KS and AUC, since both of them are estimates with uncertainty as well? The answer is usually no. The interval estimates of the Kolmogorov–Smirnov statistic and the Mann–Whitney U statistic do not depend on the empirical score distributions in the good and bad groups, although their point estimates do. In other words, scorecard deterioration affects only the point estimates of KS and AUC (a change in the two samples’ bad rates does relate to the two interval estimates, but the magnitude is small in real-world cases), and thus their estimated values are directly comparable without concern about uncertainty. However, scorecard deterioration may affect the mean estimate of PDO, its variance estimate, or both; checking one but ignoring the other would miss critical information.

Saturday, September 1, 2012

Another way to simulate a multinomial distribution using SAS

SAS has a variety of built-in DATA step functions to generate a random variable from a chosen distribution, but most of these distributions are for continuous variables. Since version 9.2, SAS has provided a built-in SAS/IML function, RANDMULTINOMIAL(), to simulate a multinomial distribution, and its algorithm, derived from the binomial distribution, has also been written as a user-defined IML function that can be used in older SAS versions. However, SAS users who have no access to the SAS/IML product may need to rewrite the function in DATA step language or via PROC FCMP. Alternatively, we can tweak PROC SURVEYSELECT for the task with just a few lines of simple and straightforward code.

First, let us specify the parameters of a 4-class multinomial distribution.

data multinomial_dist;
    /* class probabilities of the 4-class multinomial distribution (must sum to 1) */
    class=1; prob=0.13; output;
    class=2; prob=0.07; output;
    class=3; prob=0.39; output;
    class=4; prob=0.41; output;
run;


The following code simulates a random sample of 10000 trials.

/* PPS sampling with replacement: each of the 10000 hits is one multinomial trial,
   and OUTHITS writes one record per hit to the output data set */
proc surveyselect data=multinomial_dist sampsize=10000 out=sample OUTSIZE outhits method=pps_wr;
    size prob;   /* use PROB as the size measure, i.e. the selection probability */
run;
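
To sanity-check the draw (a quick optional verification, not part of the simulation itself), the class frequencies in the SAMPLE data set can be tabulated and compared with the specified probabilities:

proc freq data=sample;
    tables class / nocum;   /* empirical percentages should be close to 13%, 7%, 39% and 41% */
run;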