Predicting skin sensitizers with confidence – Using conformal prediction to determine applicability domain
Objective
GARD (Genomic Allergen Rapid Detection) is a state-of-the-art, non-animal-based technology platform for classification of skin sensitizing chemicals. The assay has proven to be reliable and highly accurate at identifying skin sensitizers, and consistently reports predictive performance above 90% across external test sets. The aim of the current project is to complement assessments of average model performance with an estimate of the uncertainty in each individual prediction, thus allowing skin sensitizers to be classified with confidence.
Results
An internal validation procedure was first performed on the samples in the GARD training set (n = 38) using the strategy described in Fig. 3A. The results are summarized in Fig. 4A. Conformal prediction allows the user to choose an acceptable significance level, which bounds the maximum error rate of the predictions. The significance level was set to 15%, i.e. the model was allowed to make errors in at most 15% of its predictions. Performance of the conformal predictor was measured by validity and efficiency. A model was considered valid if the proportion of prediction errors did not exceed the significance level, while efficiency corresponded to the percentage of single-class predictions. Internal validation of the training data resulted in a valid and highly efficient model (92% single classifications, 1 empty, 2 both), indicating that the ambitious significance level was reasonable for the GARD® assay. Following internal validation, samples in a large external test set (n = 70) were classified within the conformal prediction (CP) framework as described in Fig. 3B, which resulted in a valid and highly efficient model (99% single classifications, 0 empty, 1 both) (Fig. 4B). Additional data on model performance are presented in Table 1.
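
The validity and efficiency measures described above can be illustrated with a minimal Python sketch. It is not the GARD implementation: the function names, the class labels ("sensitizer", "non-sensitizer"), the use of a generic nonconformity score, and the toy p-values are all assumptions made for illustration. The sketch uses the standard conformal p-value, keeps every class whose p-value exceeds the significance level, and counts a prediction as an error when the true class is missing from the prediction set (so empty sets count as errors).

```python
from typing import Dict, List, Sequence, Set


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Standard conformal p-value for one candidate class.

    Counts calibration nonconformity scores at least as large as the
    test score, with the usual +1 correction for the test sample itself.
    """
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(p_values: Dict[str, float], significance: float) -> Set[str]:
    """Keep every class whose p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}


def summarize(pred_sets: List[Set[str]], true_labels: List[str],
              significance: float) -> Dict[str, object]:
    """Compute validity and efficiency for a list of prediction sets."""
    n = len(pred_sets)
    # An error occurs when the true class is not in the prediction set.
    errors = sum(1 for s, y in zip(pred_sets, true_labels) if y not in s)
    singles = sum(1 for s in pred_sets if len(s) == 1)
    return {
        "error_rate": errors / n,
        "valid": errors / n <= significance,   # validity criterion
        "efficiency": singles / n,             # share of single-class sets
        "empty": sum(1 for s in pred_sets if len(s) == 0),
        "both": sum(1 for s in pred_sets if len(s) > 1),
    }


# Toy usage with hypothetical p-values (not GARD data).
significance = 0.15
toy_p_values = [
    {"sensitizer": 0.40, "non-sensitizer": 0.05},   # single class
    {"sensitizer": 0.03, "non-sensitizer": 0.60},   # single class
    {"sensitizer": 0.30, "non-sensitizer": 0.25},   # both classes
    {"sensitizer": 0.10, "non-sensitizer": 0.08},   # empty set
]
toy_truth = ["sensitizer", "non-sensitizer", "sensitizer", "non-sensitizer"]
toy_sets = [prediction_set(p, significance) for p in toy_p_values]
toy_summary = summarize(toy_sets, toy_truth, significance)
```

In this toy example the "both" set is uninformative but not an error, whereas the empty set is an error; this is why efficiency is reported alongside validity.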