Paired ROC curves in fbroc 0.3.0

July 5, 2015

Currently I am working on fbroc 0.3.0. The main feature will be support for paired ROC curves. Paired ROC curves are needed when you are comparing two different classifiers on the same dataset. In my experience this use case is even more common than looking at a classifier in isolation, since there is usually at least an imperfect standard method already. If that is not the case, then what you are trying to predict might not be considered interesting by other people.

As an example, look at the different tests available for HIV. They have different sensitivities, specificities and costs, so that it is important to know the limitations and advantages of each test.

Bootstrapping paired ROC curves

When bootstrapping paired data, it is important not to bootstrap both classifiers separately but to do so jointly. This means that you have to use the same bootstrap resamples for both classifiers. Remember that

$\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)$.

So when X and Y are highly correlated, the variance of the difference is smaller than the individual variances of the two classifiers. Often a sample that is difficult to classify for one prediction model is also difficult for the alternative model, which leads to high correlation between two different classifiers. Therefore, if you want to compare the AUCs of two classifiers, it is important to use the correct bootstrap procedure: bootstrapping both classifiers separately ignores this correlation and leads to misleading results.
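
To make the joint resampling concrete, here is a minimal sketch in plain R (not fbroc's actual C++ implementation): each bootstrap replicate draws one set of observation indices and evaluates both classifiers on that identical resample. Refinements such as stratified resampling are omitted.

# Minimal sketch: jointly bootstrap the AUC difference of two classifiers
# evaluated on the same observations.
auc <- function(pred, is.pos) {
  # AUC via the Mann-Whitney rank-sum statistic
  r <- rank(pred)
  n.pos <- sum(is.pos)
  n.neg <- sum(!is.pos)
  (sum(r[is.pos]) - n.pos * (n.pos + 1) / 2) / (n.pos * n.neg)
}

paired.auc.diff.ci <- function(pred1, pred2, is.pos, n.boot = 1000) {
  n <- length(is.pos)
  diffs <- numeric(n.boot)
  for (b in seq_len(n.boot)) {
    idx <- sample.int(n, n, replace = TRUE)  # one resample, shared by both classifiers
    diffs[b] <- auc(pred1[idx], is.pos[idx]) - auc(pred2[idx], is.pos[idx])
  }
  quantile(diffs, c(0.025, 0.975))           # percentile CI for the AUC difference
}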

Since fbroc now has classes for ROC curves, and these were written with this feature in mind, the amount of work required in the C++ part of the fbroc code was negligible. The only difficulty was that much of the code requires the observations to be sorted in order of ascending prediction values. Because this ordering is not the same for both classifiers, I had to shuffle some of the outputs back into the original order.
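
As a small illustration of this reordering issue (in R rather than the actual C++ code), results computed on data sorted by one predictor can be mapped back to the original observation order with an inverse permutation:

pred <- c(0.9, 0.1, 0.4, 0.7)
ord  <- order(pred)                          # positions sorted by ascending prediction
sorted.result <- pred[ord] * 2               # stand-in for output computed on the sorted data
back.in.place <- sorted.result[order(ord)]   # inverse permutation restores the input order
all.equal(back.in.place, pred * 2)           # TRUE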

Estimating performance for paired ROC curves with fbroc

The interface for working with paired ROC curves is similar to what is already implemented in fbroc 0.2.1. Here is a code example for you.

library(fbroc)
data(roc.examples)
result.boot <- boot.paired.roc(roc.examples$Cont.Pred, 
                               roc.examples$Cont.Pred.Outlier,
                               roc.examples$True.Class)
result.perf <- perf.paired.roc(result.boot)
str(result.perf)
List of 13
 $ Observed.Performance.Predictor1: num 0.929
 $ CI.Performance.Predictor1      : num [1:2] 0.891 0.962
 $ Observed.Performance.Predictor2: num 0.895
 $ CI.Performance.Predictor2      : num [1:2] 0.842 0.944
 $ Observed.Difference            : num 0.0341
 $ CI.Performance.Difference      : num [1:2] 0 0.0761
 $ conf.level                     : num 0.95
 $ Cor                            : num 0.654
 $ metric                         : chr "AUC"
 $ params                         : num 0
 $ n.boot                         : int 1000
 $ boot.results.pred1             : num [1:1000] 0.92 0.963 0.932 0.928 0.891 ...
 $ boot.results.pred2             : num [1:1000] 0.899 0.951 0.882 0.907 0.866 ...
 - attr(*, "class")= chr [1:2] "list" "fbroc.perf.paired"

Unfortunately, I have not yet implemented a nice printing function, but even so the output should be understandable. Note that this is an example where there is a high correlation between the two classifiers, so that the confidence interval for the difference is actually narrower than that for the performance of the second classifier alone.
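
Because both bootstrap vectors in the output above come from the same resamples, you can already work with them directly. For example (assuming they are stored as plain numeric vectors, as the str() output suggests), their element-wise difference is the bootstrap distribution of the AUC difference:

diff.boot <- result.perf$boot.results.pred1 - result.perf$boot.results.pred2
quantile(diff.boot, c(0.025, 0.975))  # should roughly match CI.Performance.Difference
cor(result.perf$boot.results.pred1,   # the correlation reported as Cor
    result.perf$boot.results.pred2)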

Visualizing the relative performance of paired ROC curves

Of course, fbroc 0.3.0 will also allow you to plot paired ROC curves. The straightforward way is to just show both ROC curves in the same plot, each with its own confidence region, similar to the existing single-classifier plot.
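
The plots below come from the new, not yet released plotting code. If you want something comparable today, a generic base R sketch that overlays the two empirical ROC curves from the example data might look like this (an illustration only, not fbroc's plot method):

roc.points <- function(pred, is.pos) {
  # empirical TPR/FPR at every observed threshold, highest threshold first
  thr <- sort(unique(pred), decreasing = TRUE)
  tpr <- sapply(thr, function(t) mean(pred[is.pos] >= t))
  fpr <- sapply(thr, function(t) mean(pred[!is.pos] >= t))
  list(fpr = c(0, fpr), tpr = c(0, tpr))
}
is.pos <- roc.examples$True.Class            # assumed to be logical, TRUE = positive class
r1 <- roc.points(roc.examples$Cont.Pred, is.pos)
r2 <- roc.points(roc.examples$Cont.Pred.Outlier, is.pos)
plot(r1$fpr, r1$tpr, type = "s", xlab = "FPR", ylab = "TPR")
lines(r2$fpr, r2$tpr, type = "s", col = "red")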

Figure: Two paired ROC curves with their overlapping confidence regions.

It is also illuminating to see for which false positive rates our favored classifier is actually better than the alternative. If an FPR smaller than 10% is required, then a higher TPR at an FPR between 30% and 40% is worthless.

Figure: Difference in true positive rate (TPR) between two paired ROC curves.

Note that both of these graphs are still works in progress. For example, I am not completely happy with the default colors yet. I will also implement a similar plot for the difference in FPR over the TPR, but this requires some more small changes on the C++ side of the code.

Note

Please remember that the released versions of fbroc do not yet contain these features. If you really need them now, drop me a note or try to get the latest version from GitHub to work for you.
