Predicting biodegradation products and pathways: a hybrid knowledge- and machine learning-based approach

Jörg Wicker, Kathrin Fenner, Lynda Ellis, Larry Wackett, Stefan Kramer: Predicting biodegradation products and pathways: a hybrid knowledge- and machine learning-based approach. In: Bioinformatics, 26 (6), pp. 814-821, 2010.

Abstract

Motivation: Current methods for the prediction of biodegradation products and pathways of organic environmental pollutants either do not take into account domain knowledge or do not provide probability estimates. In this article, we propose a hybrid knowledge- and machine learning-based approach to overcome these limitations in the context of the University of Minnesota Pathway Prediction System (UM-PPS). The proposed solution performs relative reasoning in a machine learning framework, and obtains one probability estimate for each biotransformation rule of the system. As the application of a rule then depends on a threshold for the probability estimate, the trade-off between recall (sensitivity) and precision (selectivity) can be addressed and leveraged in practice.

Results: Results from leave-one-out cross-validation show that a recall and precision of ∼0.8 can be achieved for a subset of 13 transformation rules. Therefore, it is possible to optimize precision without compromising recall. We are currently integrating the results into an experimental version of the UM-PPS server.

Availability: The program is freely available on the web at http://wwwkramer.in.tum.de/research/applications/biodegradation/data.

Contact: kramer@in.tum.de

BibTeX

@article{wicker2010predicting,
title = {Predicting biodegradation products and pathways: a hybrid knowledge- and machine learning-based approach},
author = {Jörg Wicker and Kathrin Fenner and Lynda Ellis and Larry Wackett and Stefan Kramer},
url = {http://bioinformatics.oxfordjournals.org/content/26/6/814.full},
doi = {10.1093/bioinformatics/btq024},
year = {2010},
date = {2010-01-01},
journal = {Bioinformatics},
volume = {26},
number = {6},
pages = {814--821},
publisher = {Oxford University Press},
abstract = {Motivation: Current methods for the prediction of biodegradation products and pathways of organic environmental pollutants either do not take into account domain knowledge or do not provide probability estimates. In this article, we propose a hybrid knowledge- and machine learning-based approach to overcome these limitations in the context of the University of Minnesota Pathway Prediction System (UM-PPS). The proposed solution performs relative reasoning in a machine learning framework, and obtains one probability estimate for each biotransformation rule of the system. As the application of a rule then depends on a threshold for the probability estimate, the trade-off between recall (sensitivity) and precision (selectivity) can be addressed and leveraged in practice. Results: Results from leave-one-out cross-validation show that a recall and precision of ∼0.8 can be achieved for a subset of 13 transformation rules. Therefore, it is possible to optimize precision without compromising recall. We are currently integrating the results into an experimental version of the UM-PPS server. Availability: The program is freely available on the web at http://wwwkramer.in.tum.de/research/applications/biodegradation/data. Contact: kramer@in.tum.de},
keywords = {application, biodegradation, cheminformatics, computational sustainability, enviPath, machine learning, metabolic pathways},
pubstate = {published},
tppubtype = {article}
}