Holistic Evaluation of Biodegradation Pathway Prediction: Assessing Multi-Step Reactions and Intermediate Products

Jason Tam, Tim Lorsbach, Sebastian Schmidt, Jörg Wicker: Holistic Evaluation of Biodegradation Pathway Prediction: Assessing Multi-Step Reactions and Intermediate Products. In: ChemRxiv, 2021, (preprint).

Abstract

The prediction of metabolism and biotransformation pathways of xenobiotics is a highly desired tool in environmental and life sciences. There are several systems that currently predict single transformation steps or complete pathways as series of parallel and subsequent steps. Their accuracy is often evaluated on the level of a single transformation step. Such an approach cannot account for some specific challenges that are related to the nature of the biotransformation experiments. This is particularly true for missing transformation products in the reference data that occur only in low concentrations, e.g. transient intermediates or higher-generation metabolites. Furthermore, some rule-based prediction systems evaluate accuracy only based on the defined set of transformation rules. Therefore, the performance of different models cannot be directly compared. In this paper, we introduce a new evaluation framework that extends the evaluation of biotransformation prediction to holistically evaluating predicted pathways, taking into account multiple generations of metabolites. We introduce a procedure to address transient intermediates and propose a weighted scoring system that acknowledges the uncertainty of higher-generation metabolites. We implemented this framework in enviPath and demonstrate its strict performance metrics on predictions of in vitro biotransformation and degradation of xenobiotics in soil. Our approach is model-agnostic and can be transferred to other prediction systems. It is also capable of revealing knowledge gaps in terms of incompletely defined sets of transformation rules.

BibTeX

@article{tam2021holistic,
title = {Holistic Evaluation of Biodegradation Pathway Prediction: Assessing Multi-Step Reactions and Intermediate Products},
author = {Jason Tam and Tim Lorsbach and Sebastian Schmidt and J\"{o}rg Wicker},
url = {https://chemrxiv.org/articles/preprint/Holistic_Evaluation_of_Biodegradation_Pathway_Prediction_Assessing_Multi-Step_Reactions_and_Intermediate_Products/14315963},
doi = {10.26434/chemrxiv.14315963},
year = {2021},
date = {2021-03-27},
journal = {ChemRxiv},
abstract = {The prediction of metabolism and biotransformation pathways of xenobiotics is a highly desired tool in environmental and life sciences. There are several systems that currently predict single transformation steps or complete pathways as series of parallel and subsequent steps. Their accuracy is often evaluated on the level of a single transformation step. Such an approach cannot account for some specific challenges that are related to the nature of the biotransformation experiments. This is particularly true for missing transformation products in the reference data that occur only in low concentrations, e.g. transient intermediates or higher-generation metabolites. Furthermore, some rule-based prediction systems evaluate accuracy only based on the defined set of transformation rules. Therefore, the performance of different models cannot be directly compared. In this paper, we introduce a new evaluation framework that extends the evaluation of biotransformation prediction to holistically evaluating predicted pathways, taking into account multiple generations of metabolites. We introduce a procedure to address transient intermediates and propose a weighted scoring system that acknowledges the uncertainty of higher-generation metabolites. We implemented this framework in enviPath and demonstrate its strict performance metrics on predictions of in vitro biotransformation and degradation of xenobiotics in soil. Our approach is model-agnostic and can be transferred to other prediction systems. It is also capable of revealing knowledge gaps in terms of incompletely defined sets of transformation rules.},
note = {preprint},
keywords = {biodegradation, cheminformatics, computational sustainability, data mining, enviPath, machine learning, metabolic pathways},
pubstate = {published},
tppubtype = {article}
}