fbroc 0.2.1 release

By | June 7, 2015

The fbroc 0.2.1 release was added to CRAN yesterday. Actually, I released fbroc 0.2.0 the day before and then had to update again because of a pointer bug that valgrind found, as Brian Ripley kindly pointed out. The bug did not cause crashes or errors in my own testing. Future versions will always be tested thoroughly with valgrind before I release them to avoid this happening again.

The complete change log is below, but the main new features are

  • Analysis of TPR at a fixed FPR and vice versa
  • Included example data to test the package
  • Minor performance increase due to reduced memory overhead

If you want to get started quickly, the shiny app was also updated and can be found here.

Example

In the example data now included with fbroc, one of the numerical predictors suffers from extreme outliers in the negative class. These outliers have higher values than all other samples, including the positives. As a result, the number of outliers included in a bootstrap sample determines the shape of the ROC curve near a false positive rate of 0. Let us look at the data in the shiny app. You can also generate the same plot in the R console.

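library(fbroc)
data(roc.examples)
# Bootstrap the ROC curve of the predictor with outliers (2000 replicates)
roc.obj <- boot.roc(roc.examples$Cont.Pred.Outlier, roc.examples$True.Class,
                    n.boot = 2000)
# Plot the curve and show the TPR at a fixed FPR of 0.03
plot(roc.obj, show.metric = "tpr", fpr = 0.03)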

Using fbroc 0.2.1 and the updated shiny app to analyse example data included in the package. There are some extreme outliers in the negative samples, so there is no cutoff with a TPR > 0 at a FPR of zero. The TPR at a FPR of 0.03 depends upon how many of the outliers are included, making the confidence interval very wide.
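To see why a handful of outliers matters so much, here is a minimal base R sketch on simulated data. It does not use fbroc or the roc.examples data: the sample sizes, the number of outliers (n.out) and the helper tpr.at.fpr are all made up for illustration, and the bootstrap is assumed to resample positives and negatives separately.

```r
# Hypothetical simulated data, NOT the roc.examples data shipped with fbroc
set.seed(1)
n.pos <- 100
n.neg <- 100
n.out <- 5                      # assumed number of extreme negative outliers
pos <- rnorm(n.pos, mean = 1.5)
neg <- rnorm(n.neg, mean = 0)
neg[seq_len(n.out)] <- 1000     # outliers larger than every positive

# Illustrative helper: empirical TPR at a fixed FPR, calling a sample
# positive when its value is strictly above the cutoff
tpr.at.fpr <- function(pos, neg, fpr = 0.03) {
  k <- floor(fpr * length(neg) + 1e-9)           # tolerated false positives
  cutoff <- sort(neg, decreasing = TRUE)[k + 1]  # (k+1)-th largest negative
  mean(pos > cutoff)                             # share of positives above it
}

# Resample each class with replacement and recompute the TPR each time
boot.tpr <- replicate(2000, {
  tpr.at.fpr(sample(pos, replace = TRUE), sample(neg, replace = TRUE))
})

mean(boot.tpr == 0)  # fraction of bootstrap samples with a TPR of exactly zero
hist(boot.tpr)       # bimodal: a spike at zero plus a second group above it
```

Whenever a resample draws more outliers than the FPR budget allows, the cutoff jumps to the outlier value and the TPR collapses to zero. Otherwise the cutoff falls among the ordinary negatives and the TPR is clearly positive.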

Note the wide confidence intervals. In this case, the performance histogram is very useful since it shows that the distribution has two discrete peaks. Again, you can also use the R console.

# Bootstrap the TPR at a fixed FPR of 0.03, then plot its histogram
perf.obj <- perf.roc(roc.obj, "tpr", fpr = 0.03)
plot(perf.obj)
The histogram is bimodal with a large peak at zero and a smaller peak at around 0.45.

Histogram of the bootstrapped TPR at a fixed FPR of 0.03. There is a high chance of a TPR of zero and a small chance of a larger TPR if the outliers are not included in the bootstrap sample.
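The size of the peak at zero can be roughly reasoned about with a binomial calculation. The numbers below are hypothetical (they are not taken from roc.examples), and the sketch assumes that the negatives are resampled with replacement on their own. If a resample contains more outliers than the FPR budget allows, every admissible cutoff lies at or above the outlier values and the TPR is zero.

```r
# Hypothetical numbers for illustration only
n.neg <- 100    # assumed number of negative samples
n.out <- 5      # assumed number of extreme outliers among them
fpr   <- 0.03   # fixed false positive rate

# Number of false positives tolerated at this FPR
k <- floor(fpr * n.neg + 1e-9)

# Each of the n.neg draws hits an outlier with probability n.out / n.neg,
# so the outlier count in a resample is Binomial(n.neg, n.out / n.neg).
# The TPR is zero whenever more than k outliers are drawn.
p.zero <- pbinom(k, size = n.neg, prob = n.out / n.neg, lower.tail = FALSE)
p.zero
```

Under these assumptions, the chance of a zero TPR depends only on the number of negatives, the share of outliers among them and the chosen FPR, which is why the histogram splits into two separate peaks instead of one broad one.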

Please feel free to experiment further. The results for the discrete predictor can be very interesting as well. Note that I documented how fbroc handles discrete predictors here.

Complete changelog

fbroc 0.2.1

Bugfixes

  • Fixed an off-by-one pointer error

fbroc 0.2.0

New features

  • Allow uncached bootstrap of the ROC curve to avoid memory issues; this is now the default
  • New performance metrics: TPR at fixed FPR and FPR at fixed TPR

Other changes

  • The stand-alone function for finding thresholds, calculate.thresholds, was removed. To calculate
    thresholds, please call boot.roc and look at the list item roc of the output
  • Smarter default for the number of steps in conf.roc
  • Smarter default for the number of bins in plot.fbroc.perf
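
As a rough sketch of the replacement for calculate.thresholds described above, the thresholds can be read from the roc item of the object returned by boot.roc. The small n.boot is only there to keep the example fast, and since the layout of the roc item is not described in this post, the sketch just inspects it with str.

```r
library(fbroc)
data(roc.examples)

# Any boot.roc call will do; a small n.boot keeps this quick
roc.obj <- boot.roc(roc.examples$Cont.Pred.Outlier, roc.examples$True.Class,
                    n.boot = 100)

# The thresholds now live in the list item "roc" of the output
str(roc.obj$roc)
```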

Internal Changes

  • Completely refactored C++ code for improved maintainability

Bugfixes

  • Function boot.tpr.at.fpr now works properly
  • For duplicated predictions, not all relevant thresholds were found reliably; this was fixed
