Jörg Simon Wicker
Senior Lecturer | CEO | Founder


enviPath Limited

My research is both in applied and non-applied machine learning. Currently, I am particularly interested in reliability of machine learning algorithms, adversarial machine learning, and bias, with applications in chemistry and environmental science. For more information about my current research, please check our lab webpage.

Co-founder and CEO of enviPath Limited, a university spin-out built around the enviPath system that develops AI solutions for chemistry. The company builds on more than 15 years of research in the area. We employ a team of experts in AI, machine learning, chemistry, and software engineering.


Senior Lecturer at the School of Computer Science of the University of Auckland. Check out my lab page for more details. We are always looking for students interested in joining our lab.


Winner of the Ig Nobel Prize in Chemistry in 2021, for chemically analyzing the air inside movie theaters to test whether the odors produced by an audience reliably indicate the levels of violence, sex, antisocial behavior, drug use, and bad language in the movie the audience is watching [26, 49, 53, 71, 72].




Experience

enviPath Limited
CEO & Co-Founder
since December 2024
University of Auckland – School of Computer Science
Senior Lecturer
since February 2020
Lecturer
August 2017 – January 2020
enviPath UG & Co. KG
Co-Founder
CTO
September 2019 – November 2025
CEO
January 2017 – August 2019
Johannes Gutenberg University Mainz – Data Mining Group
Research Associate
November 2011 - August 2017
Technical University of Munich – Machine Learning and Data Mining in Bioinformatics Group
Research Associate
July 2007 - September 2011

Education

Technical University of Munich – PhD – Computer Science
July 2007 - September 2013
Ludwig Maximilian University of Munich & Technical University of Munich – Diplom (equivalent to M.Sc.) - Bioinformatics
September 2000 - May 2007

Publications

[2026-01-07 Wed]

Book chapters

[1]
Wicker, J., Fenner, K. and Kramer, S. 2016. A hybrid machine learning and knowledge based approach to limit combinatorial explosion in biodegradation prediction. Computational sustainability. J. Lässig, K. Kersting, and K. Morik, eds. Springer International Publishing. 75–97.
[2]
Wicker, J., Richter, L. and Kramer, S. 2010. SINDBAD and SiQL: Overview, applications and future developments. Inductive databases and constraint-based data mining. S. Džeroski, B. Goethals, and P. Panov, eds. Springer New York. 289–309.

Papers in peer-reviewed journals

[3]
Dost, K., Muraoka, K., Ausseil, A.-G., Benavidez, R., Blue, B., Conland, N., Daughney, C., Semadeni-Davies, A., Hoang, L., Hooper, A., Kpodonu, T.A., Marapara, T., McDowell, R., Nguyen, T., Nguyet, D.A., Norton, N., Özkundakci, D., Pearson, L., Rolinson, J., Smith, R., Stephens, T., Tamepo, R., Taylor, K., van Uitregt, V., Jackson, B., Sarris, T., Elliott, A. and Wicker, J. 2026. Freshwater modeling in Aotearoa New Zealand: Current practice and future directions. Environmental modelling & software. 197, (2026), 106820. https://doi.org/10.1016/j.envsoft.2025.106820.
[4]
Yang, Q., Wang, L., Wicker, J. and Dobbie, G. 2025. Continual learning: A systematic literature review. Neural networks. (Oct. 2025), 108226. https://doi.org/10.1016/j.neunet.2025.108226.
[5]
Dai, K., Kim, J., Džeroski, S., Wicker, J., Dobbie, G. and Dost, K. 2025. Assessing the risk of discriminatory bias in classification datasets. Machine learning. 114, (Aug. 2025), 204. https://doi.org/10.1007/s10994-025-06843-9.
[6]
Miller, C.J., Golovina, E., Gokuladhas, S., Wicker, J., Jacobson, J.C. and O’Sullivan, J.M. 2025. Unraveling ADHD: genes, co-occurring traits, and developmental dynamics. Life science alliance. 8, 5 (Feb. 2025). https://doi.org/10.26508/lsa.202403029.
[7]
Brydon, L., Zhang, K., Dobbie, G., Taskova, K. and Wicker, J. 2025. Predictive modeling of biodegradation pathways using transformer architectures. Journal of cheminformatics. 17, 1 (Feb. 2025), 21. https://doi.org/10.1186/s13321-025-00969-7.
[8]
Albrecht, S., Broderick, D., Dost, K., Cheung, I., Nghiem, N., Wu, M., Zhu, J., Poonawala-Lohani, N., Jamison, S., Rasanathan, D., Huang, S., Trenholme, A., Stanley, A., Lawrence, S., Marsh, S., Castelino, L., Paynter, J., Turner, N., McIntyre, P., Riddle, P., Grant, C., Dobbie, G. and Wicker, J. 2024. Forecasting severe respiratory disease hospitalizations using machine learning algorithms. BMC medical informatics and decision making. 24, (Oct. 2024), 293. https://doi.org/10.1186/s12911-024-02702-0.
[9]
Hua, Y.C., Denny, P., Wicker, J. and Taskova, K. 2024. A systematic review of aspect-based sentiment analysis: Domains, methods, and trends. Artificial intelligence review. 57, 11 (Sep. 2024), 296. https://doi.org/10.1007/s10462-024-10906-z.
[10]
Hafner, J., Lorsbach, T., Schmidt, S., Brydon, L., Dost, K., Zhang, K., Fenner, K. and Wicker, J. 2024. Advancements in biotransformation pathway prediction: Enhancements, datasets, and novel functionalities in enviPath. Journal of cheminformatics. 16, 1 (Aug. 2024), 93. https://doi.org/10.1186/s13321-024-00881-6.
[11]
Lyu, J., Dost, K., Koh, Y.S. and Wicker, J. 2024. Regional bias in monolingual English language models. Machine learning. (Jul. 2024). https://doi.org/10.1007/s10994-024-06555-6.
[12]
Long, D., Eade, L., Dost, K., Meier-Menches, S.M., Goldstone, D.C., Sullivan, M.P., Hartinger, C., Wicker, J. and Taskova, K. 2024. Adducthunter: Identifying protein-metal complex adducts in mass spectra. Journal of cheminformatics. 16, (Feb. 2024). https://doi.org/10.1186/s13321-023-00797-7.
[13]
Miller, C.J., Golovina, E., Wicker, J., Jacobson, J.C. and O’Sullivan, J.M. 2023. De novo network analysis reveals autism causal genes and developmental links to co-occurring traits. Life science alliance. 6, 10 (Aug. 2023). https://doi.org/10.26508/lsa.202302142.
[14]
Dost, K., Pullar-Strecker, Z., Brydon, L., Zhang, K., Hafner, J., Riddle, P. and Wicker, J. 2023. Combatting over-specialization bias in growing chemical databases. Journal of cheminformatics. 15, (May 2023), 53. https://doi.org/10.1186/s13321-023-00716-w.
[15]
Bensemann, J., Cheena, H., Huang, D.T.J., Broadbent, E., Williams, J. and Wicker, J. 2023. From what you see to what we smell: Linking human emotions to bio-markers in breath. IEEE transactions on affective computing. (May 2023), 1–13. https://doi.org/10.1109/TAFFC.2023.3275216.
[16]
Roeslin, S., Ma, Q., Chigullapally, P., Wicker, J. and Wotherspoon, L. 2023. Development of a seismic loss prediction model for residential buildings using machine learning Christchurch, New Zealand. Natural hazards and earth system sciences. 23, 3 (Mar. 2023), 1207–1226. https://doi.org/10.5194/nhess-23-1207-2023.
[17]
Pullar-Strecker, Z., Dost, K., Frank, E. and Wicker, J. 2022. Hitting the target: Stopping active learning at the cost-based optimum. Machine learning. (Oct. 2022). https://doi.org/10.1007/s10994-022-06253-1.
[18]
Tam, J., Lorsbach, T., Schmidt, S. and Wicker, J. 2021. Holistic evaluation of biodegradation pathway prediction: Assessing multi-step reactions and intermediate products. Journal of cheminformatics. 13, 1 (Sep. 2021), 63. https://doi.org/10.1186/s13321-021-00543-x.
[19]
Stepišnik, T., Škrlj, B., Wicker, J. and Kocev, D. 2021. A comprehensive comparison of molecular feature representations for use in predictive modeling. Computers in biology and medicine. 130, (Mar. 2021), 104197. https://doi.org/10.1016/j.compbiomed.2020.104197.
[20]
Roeslin, S., Ma, Q., Juárez-Garcia, H., Gómez-Bernal, A., Wicker, J. and Wotherspoon, L. 2020. A machine learning damage prediction model for the 2017 Puebla-Morelos, Mexico, earthquake. Earthquake spectra. 36, 2 (Jul. 2020), 314–339. https://doi.org/10.1177/8755293020936714.
[21]
Jonauskaite, D., Wicker, J., Mohr, C., Dael, N., Havelka, J., Papadatou-Pastou, M., Zhang, M. and Oberfeld, D. 2019. A machine learning approach to quantifying the specificity of color-emotion associations and their cultural differences. Royal society open science. 6, 9 (Sep. 2019), 190741. https://doi.org/10.1098/rsos.190741.
[22]
Stönner, C., Edtbauer, A., Derstorff, B., Bourtsoukidis, E., Klüpfel, T., Wicker, J. and Williams, J. 2018. Proof of concept study: Testing human volatile organic compounds as tools for age classification of films. PLOS ONE. 13, 10 (Oct. 2018), 1–14. https://doi.org/10.1371/journal.pone.0203044.
[23]
Wicker, J. and Kramer, S. 2017. The best privacy defense is a good privacy offense: Obfuscating a search engine user’s profile. Data mining and knowledge discovery. 31, 5 (Sep. 2017), 1419–1443. https://doi.org/10.1007/s10618-017-0524-z.
[24]
Latino, D., Wicker, J., Gütlein, M., Schmid, E., Kramer, S. and Fenner, K. 2017. Eawag-Soil in enviPath: a new resource for exploring regulatory pesticide soil biodegradation pathways and half-life data. Environmental science: Processes & impacts. (Jan. 2017). https://doi.org/10.1039/C6EM00697C.
[25]
Wicker, J., Lorsbach, T., Gütlein, M., Schmid, E., Latino, D., Kramer, S. and Fenner, K. 2016. enviPath - the environmental contaminant biotransformation pathway resource. Nucleic acids research. 44, D1 (Jan. 2016), D502–D508. https://doi.org/10.1093/nar/gkv1229.
[26]
Williams, J., Stönner, C., Wicker, J., Krauter, N., Derstorff, B., Bourtsoukidis, E., Klüpfel, T. and Kramer, S. 2016. Cinema audiences reproducibly vary the chemical composition of air during films, by broadcasting scene specific emissions on breath. Scientific reports. 6, (Jan. 2016). https://doi.org/10.1038/srep25464.
[27]
Hardy, B., Douglas, N., Helma, C., Rautenberg, M., Jeliazkova, N., Jeliazkov, V., Nikolova, I., Benigni, R., Tcheremenskaia, O., Kramer, S., Girschick, T., Buchwald, F., Wicker, J., Karwath, A., Gütlein, M., Maunz, A., Sarimveis, H., Melagraki, G., Afantitis, A., Sopasakis, P., Gallagher, D., Poroikov, V., Filimonov, D., Zakharov, A., Lagunin, A., Gloriozova, T., Novikov, S., Skvortsova, N., Druzhilovsky, D., Chawla, S., Ghosh, I., Ray, S., Patel, H. and Escher, S. 2010. Collaborative development of predictive toxicology applications. Journal of cheminformatics. 2, 1 (Jan. 2010), 7. https://doi.org/10.1186/1758-2946-2-7.
[28]
Wicker, J., Fenner, K., Ellis, L., Wackett, L. and Kramer, S. 2010. Predicting biodegradation products and pathways: a hybrid knowledge- and machine learning-based approach. Bioinformatics. 26, 6 (Jan. 2010), 814–821. https://doi.org/10.1093/bioinformatics/btq024.

Conference papers

[29]
Dost, K., Albrecht, S., MacLean, P., Wicker, J. and Gupta, S. 2025. Understanding rumen methanogen interactions in sheep using machine learning. Lecture notes in computer science (Oct. 2025), 253–269.
[30]
Albrecht, S., Kim, A., Madelino, J., Dost, K., Zhu, J., Broderick, D., Poonawala-Lohani, N., Jamison, S., Rasanathan, D., Stanley, A., Lawrence, S., Marsh, S., Castelino, L., Trenholme, A., Turner, N., McIntyre, P., Paynter, J., Riddle, P., Grant, C., Wicker, J. and Dobbie, G. 2025. How does a GPT perform in forecasting severe respiratory disease hospitalizations? 31st international conference on neural information processing (ICONIP) (Mar. 2025).
[31]
Park, S., Wicker, J. and Dost, K. 2025. Resource-constrained binary image classification. Discovery science (Cham, Jan. 2025), 215–230.
[32]
Graffeuille, O., Koh, Y.S., Wicker, J. and Lehmann, M. 2024. Remote sensing for water quality: A multi-task, metadata-driven hypernetwork approach. Proceedings of the thirty-third international joint conference on artificial intelligence (IJCAI-24) (Aug. 2024), 7287–7295.
[33]
Kim, J., Urschler, M., Riddle, P. and Wicker, J. 2024. Attacking the loop: Adversarial attacks on graph-based loop closure detection. Proceedings of the 19th international joint conference on computer vision, imaging and computer graphics theory and applications (Feb. 2024), 90–97.
[34]
Pullar-Strecker, Z., Chang, X., Brydon, L., Ziogas, I., Dost, K. and Wicker, J. 2023. Memento: Facilitating effortless, efficient, and reliable ml experiments. Machine learning and knowledge discovery in databases: Applied data science and demo track (Cham, Sep. 2023), 310–314.
[35]
Chang, L., Dost, K., Zhai, K., Demontis, A., Roli, F., Dobbie, G. and Wicker, J. 2023. Baard: Blocking adversarial examples by testing for applicability, reliability and decidability. The 27th pacific-asia conference on knowledge discovery and data mining (pakdd) (Cham, May 2023), 3–14.
[36]
Chen, Z., Dost, K., Zhu, X., Chang, X., Dobbie, G. and Wicker, J. 2023. Targeted attacks on time series forecasting. The 27th pacific-asia conference on knowledge discovery and data mining (pakdd) (Cham, May 2023), 314–327.
[37]
Kim, J., Urschler, M., Riddle, P. and Wicker, J. 2022. Closing the loop: Graph networks to unify semantic objects and visual features for multi-object scenes. 2022 IEEE/RSJ international conference on intelligent robots and systems (IROS 2022) (Oct. 2022), 4352–4358.
[38]
[39]
Graffeuille, O., Koh, Y.S., Wicker, J. and Lehmann, M. 2022. Semi-supervised conditional density estimation with Wasserstein Laplacian regularisation. Proceedings of the thirty-sixth AAAI conference on artificial intelligence (Jun. 2022), 6746–6754.
[40]
Dost, K., Duncanson, H., Ziogas, I., Riddle, P. and Wicker, J. 2022. Divide and imitate: Multi-cluster identification and mitigation of selection bias. 26th Pacific-Asia conference on knowledge discovery and data mining (PAKDD 2022) (Berlin, Heidelberg, May 2022), 149–160.
[41]
Poonawala-Lohani, N., Riddle, P., Adnan, M. and Wicker, J. 2022. A novel approach for time series forecasting of influenza-like illness using a regression chain method. Pacific symposium on biocomputing (Jan. 2022), 301–312.
[42]
Kim, J., Urschler, M., Riddle, P. and Wicker, J. 2021. Symbiolcd: Ensemble-based loop closure detection using CNN-extracted objects and visual bag-of-words. 2021 IEEE/RSJ international conference on intelligent robots and systems (IROS) (Sep. 2021), 5425.
[43]
Chester, A., Koh, Y.S., Wicker, J., Sun, Q. and Lee, J. 2020. Balancing utility and fairness against privacy in medical data. IEEE symposium series on computational intelligence (SSCI) (Dec. 2020), 1226–1233.
[44]
Dost, K., Taskova, K., Riddle, P. and Wicker, J. 2020. Your best guess when you know nothing: Identification and mitigation of selection bias. 2020 IEEE international conference on data mining (ICDM) (Nov. 2020), 996–1001.
[45]
Roeslin, S., Ma, Q., Chigullapally, P., Wicker, J. and Wotherspoon, L. 2020. Feature engineering for a seismic loss prediction model using machine learning, Christchurch experience. 17th world conference on earthquake engineering (Sep. 2020).
[46]
Wicker, J., Hua, Y.C., Rebello, R. and Pfahringer, B. 2019. XOR-based Boolean matrix decomposition. 2019 IEEE international conference on data mining (ICDM) (Nov. 2019), 638–647.
[47]
Roeslin, S., Ma, Q., Wicker, J. and Wotherspoon, L. 2019. Data integration for the development of a seismic loss prediction model for residential buildings in new zealand. Machine learning and knowledge discovery in databases (Cham, Sep. 2019), 88–100.
[48]
Williams, J., Stönner, C., Edtbauer, A., Derstorff, B., Bourtsoukidis, E., Klüpfel, T., Krauter, N., Wicker, J. and Kramer, S. 2019. What can we learn from the air chemistry of crowds? 8th international conference on proton transfer reaction mass spectrometry and its applications (Innsbruck, May 2019), 121–123.
[49]
Stönner, C., Edtbauer, A., Derstorff, B., Bourtsoukidis, E., Klüpfel, T., Wicker, J. and Williams, J. 2018. Investigating human emissions of volatile organic compounds in a cinema, flux rates, links to scene content, and possible applications. 15th conference of the international society of indoor air quality and climate, indoor air 2018 (Jul. 2018).
[50]
Wicker, J., Tyukin, A. and Kramer, S. 2016. A nonlinear label compression and transformation method for multi-label classification using autoencoders. The 20th pacific asia conference on knowledge discovery and data mining (pakdd) (Switzerland, Apr. 2016), 328–340.
[51]
Raza, A., Wicker, J. and Kramer, S. 2016. Trading off accuracy for efficiency by randomized greedy warping. Proceedings of the 31st annual acm symposium on applied computing (New York, NY, USA, Jan. 2016), 883–890.
[52]
Tyukin, A., Kramer, S. and Wicker, J. 2015. Scavenger - a framework for the efficient evaluation of dynamic and modular algorithms. Machine learning and knowledge discovery in databases (Jan. 2015), 325–328.
[53]
Wicker, J., Krauter, N., Derstorff, B., Stönner, C., Bourtsoukidis, E., Klüpfel, T., Williams, J. and Kramer, S. 2015. Cinema data mining: The smell of fear. Proceedings of the 21st acm sigkdd international conference on knowledge discovery and data mining (New York, NY, USA, Jan. 2015), 1235–1304.
[54]
Tyukin, A., Kramer, S. and Wicker, J. 2014. BMaD – a Boolean matrix decomposition framework. Machine learning and knowledge discovery in databases (Jan. 2014), 481–484.
[55]
Wicker, J., Pfahringer, B. and Kramer, S. 2012. Multi-label classification using boolean matrix decomposition. Proceedings of the 27th annual acm symposium on applied computing (Jan. 2012), 179–186.
[56]
Richter, L., Wicker, J., Kessler, K. and Kramer, S. 2008. An inductive database and query language in the relational model. Proceedings of the 11th international conference on extending database technology: Advances in database technology (Jan. 2008), 740–744.
[57]
Wicker, J., Brosdau, C., Richter, L. and Kramer, S. 2008. SINDBAD SAILS: A service architecture for inductive learning schemes. Proceedings of the first workshop on third generation data mining: Towards service-oriented knowledge discovery (Jan. 2008).
[58]
Wicker, J., Fenner, K., Ellis, L., Wackett, L. and Kramer, S. 2008. Machine learning and data mining approaches to biodegradation pathway prediction. Proceedings of the second international workshop on the induction of process models at ECML PKDD 2008 (Jan. 2008).
[59]
Wicker, J., Richter, L., Kessler, K. and Kramer, S. 2008. SINDBAD and SiQL: An inductive database and query language in the relational model. Machine learning and knowledge discovery in databases (Jan. 2008), 690–694.
[60]
Kramer, S., Aufschild, V., Hapfelmeier, A., Jarasch, A., Kessler, K., Reckow, S., Wicker, J. and Richter, L. 2006. Inductive databases in the relational model: The data as the bridge. Knowledge discovery in inductive databases (Jan. 2006), 124–138.

Preprints

[61]
Cheena, A., Dost, K., Sarris, T., Straathof, N. and Wicker, J. 2025. Don’t swim in data: Real-time microbial forecasting for New Zealand recreational waters. SSRN.
[62]
Dost, K., Muraoka, K., Ausseil, A.-G., Benavidez, R., Blue, B., Conland, N., Daughney, C., Semadeni-Davies, A., Hoang, L., Hooper, A., Kpodonu, T.A., Marapara, T., McDowell, R.W., Nguyen, T., Nguyet, D.A., Norton, N., Özkundakci, D., Pearson, L., Rolinson, J., Smith, R., Stephens, T., Tamepo, R., Taylor, K., van Uitregt, V., Jackson, B., Sarris, T., Elliott, A. and Wicker, J. 2025. Freshwater quality modeling in Aotearoa New Zealand: Current practice and future directions. SSRN.
[63]
Graffeuille, O., Koh, Y.S., Wicker, J. and Lehmann, M. 2024. Enabling asymmetric knowledge transfer in multi-task learning with self-auxiliaries. arXiv.
[64]
Graffeuille, O., Lehmann, M., Allan, M., Wicker, J. and Koh, Y.S. 2024. Lake by lake, globally: Enhancing water quality remote sensing with multi-task learning models.
[65]
Dost, K., Tam, J., Lorsbach, T., Schmidt, S. and Wicker, J. 2023. Defining applicability domain in biodegradation pathway prediction.
[66]
Chang, X., Dost, K., Dobbie, G. and Wicker, J. 2023. Poison is not traceless: Fully-agnostic detection of poisoning attacks.
[67]
Chang, X., Dobbie, G. and Wicker, J. 2023. Fast adversarial label-flipping attack on tabular data.

Dissertation

[68]
Wicker, J. 2013. Large classifier systems in bio- and cheminformatics. Technische Universität München.

Miscellaneous

[69]
Chang, X., Brydon, L. and Wicker, J. 2024. Memento: v1.1.1. Zenodo.
[70]
Lorsbach, T. and Wicker, J. 2024. Envipath-python: v0.2.3. Zenodo.
[71]
Stönner, C., Edtbauer, A., Derstorff, B., Bourtsoukidis, E., Klüpfel, T., Wicker, J. and Williams, J. 2023. Cinema experiments 2015.
[72]
Wicker, J., Krauter, N., Derstorff, B., Stönner, C., Bourtsoukidis, E., Klüpfel, T., Williams, J. and Kramer, S. 2023. Cinema experiments 2013.
[73]
Kim, J., Urschler, M., Riddle, P. and Wicker, J. 2022. Symbiolcd - datasets. data set.

Supervision

PhD

Lewis Msasa
Groundwater Health Index (GHI): A Data Driven Approach to Climate-Resilient Water Management – starting in 2026
Alexander Bikeyev
Molecule Generation through Adversarial Learning – since 11/2025
Mark Chen
Adversarial Attacks on Time Series – UoA – since 03/2023
Ioannis Ziogas
Machine Learning Models for Rare Event Data: Applications, Limitations, and Performance – since 02/2023
Annie Lu
Machine Learning in Longitudinal Studies – since 05/2020
Nooriyan Poonawala-Lohani
Predictive Analytics for Early Warning of Influenza-like Illness – since 02/2019

Xinglong (Luke) Chang
Adversarial Learning – 10/2019-03/2025
Jonathan Kim
Towards Robust Semantic Scene Understanding through Joint Optimisation of Visual SLAM and Deep Convolutional Neural Networks – 01/2019–06/2025
Katharina Dost
Selection Bias – Identification and Mitigation with no Ground Truth Information – 07/2019-08/2023

Co-supervisor

Qinwen Yang
Addressing the Challenges of Knowledge Discovery Using Machine Learning Methods – since 11/2023
Run Luo
A Combinatorial Chemistry and Deep Learning Method for Environmental Effect Analysis of Nanopesticides – since 12/2021
Olivier Graffeuille
Machine Learning for Extreme Event Detection – 08/2020-08/2025
Cathy Hua
Query-Focused, Analysis-Friendly Text Summarisation of Survey Responses – since 02/2022
Xuan (Johnny) Zhu
A mathematical model guided machine learning method for understanding epidemic multiple wave mechanisms – since 01/2021

Master of Science

Irsyaad Rijwan
Predicting biodegradation pathways – 2025/2026

Asif Cheena
Don’t Swim in Data: Real-Time Microbial Forecasting for New Zealand Recreational Waters – 2024
Marrick Lip
A Machine Learning Framework for the Analysis of Bat Calls – 2022

Co-supervisor

Andrew Chester
Detecting Bias in Machine Learning Algorithms: End to End De-identification Framework for Clinical Text – 2020

Honours

Saurav Krishnakumar
Evaluation of Pollutant Pathway Prediction Models – S1/S2 2025
Vince Guan
Designing Chemical Compounds Using Adversarial Learning – S1/S2 2025
Sam Chen
Adversarial Attacks on Clustering Algorithms – S2 2022/S1 2023
Liam Brydon
Finding Patterns in Chemical Reactions – S1/S2 2022
Maxwell Zhu
Machine Learning Matching Algorithms in Dating Platforms – S1/S2 2022
Viaan Saunderson
Adversarial Attacks on Graphs – S1/S2 2022
Zac Pullar-Strecker
enviPath – S1/S2 2022
Chong Chuah
Bias in Machine Learning – S1/S2 2022
Mark Chen
Adversarial Learning on Time Series – S2 2021/S1 2022
Hamish Duncanson
IMITATE: Identification and Mitigation of Selection Bias – S1 2021
Milan Law
Data Analysis of COVID-19 Data Sets – S1/S2 2020
Kitty Li
Mining the RDF Graph to Improve the Performance of Classifiers – S1/S2 2019

Engineering Part 4 Projects

Emma Wang
Streamlining adversarial machine learning on Memento – S1/S2 2024
Lina Yuan
Streamlining adversarial machine learning on Memento – S1/S2 2024
Emily Zou
Connecting Adversarial Learning and Applicability Domain in Cheminformatics – S1/S2 2024
Lee Violet Ong
Connecting Adversarial Learning and Applicability Domain in Cheminformatics – S1/S2 2024
Clemen Sun
Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks – S1/S2 2024
Eugene Chua
Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks – S1/S2 2024
Hannah Zhang
A dating platform for interpersonal relationship research – S1/S2 2023
Lang Cheng
A dating platform for interpersonal relationship research – S1/S2 2023

Co-supervisor

Angela Hollings
Ear, nose, and throat app development – S1/S2 2021
Elizabeth Yap
Ear, nose, and throat app development – S1/S2 2021

Masters of Data Science

Charlie Chen
AI and Freshwater Modelling – 07/2023-11/2023
Tsz Fung Ip
Finding Patterns in Recordings of Bat Calls – S2 2021/S1 2022
Xianzhong Li
Inference of Cluster Information based on Unlabelled Datasets – S1 2021
Xiao Li
Prediction and Analysis of Earthquake Mainshocks in the Ring of Fire by Machine Learning Approaches – S1 2021
Yuanchi Ma
Inference of Cluster Information based on Unlabelled Datasets – S1 2021
Zhe Wu
Aftershocks Predictions Following a Major Earthquake in the Ring of Fire Region – S1 2021
Bruno Naveen Joswa
Identifying and Analysing Bat Calls – S2 2020
Mary Grace De la Pena
Machine Learning-based Prediction of Biodegradation Persistence – S2 2020
Josh Bensemann
Change Mining in the Smell of Fear Data Set – S2 2019
Owen Meyer
Analysis of the CARIBIC Data Set – S2 2019
Charles Tremlett
Generating Chemical Structures and Improving Models using Reinforcement Learning – S1/S2 2019
Catherine Liu
Dynamic Pricing – S2 2018/S1 2019
Loukas Lyden
Modelling User Behaviour in Online Shopping – S2 2018/S1 2019
Masoumeh Shariat
Analysis of Petrol related VOCs in the CARIBIC Data Set – S2 2018/S1 2019
Samantha Cen
Identifying Contrails in the CARIBIC Data Set – S2 2018/S1 2019
Ziqing Yan
A New Field of Data Mining: Classification of Movies based on VOCs – S1 2018

CS380 projects

Yena Ahn
Auditing Artificial Intelligence with Adversarial Learning: Meta-Learning – S1 2023
Jonathan Leung
Model Response to Electroconvulsive Therapy Changes based on EEG Traces – S2 2021
Hasnain Cheena
Machine Learning Approaches for Mass Spectrometry Data Analysis – S1 2020
Cathy Hua
Machine Learning Analysis of Student Feedback – S1 2020
Chloe Haigh
Biodegradation Half-Life Prediction – S1 2020
Aryan Lobie
Weather Prediction using Deep Neural Networks – S2 2019
Sichun (Victor) Yin
Advanced Boolean Matrix Decomposition – S1 2018
Tom Février
Identifying Markers for Human Emotion in Breath Using Convolutional Autoencoders on Movie Data – S1 2018

Summer Scholarships

Kevin Zou
Adversarial Time Series – 2025
Maishi Huang
Machine Learning-based Prediction of the Environmental Fate of Pollutants – 2025

Mihnea Vlad
Adversarial Time Series – 2024
Saurav Krishnakumar
Predicting Persistence of Environmental Pollutants – 2024
Liam Brydon
Finding Patterns in Chemical Reactions – 2022
Ryan La
Auditing Machine Learning Models: Quantifying Reliability using Adversarial Regions – 2022
Sarah Kim
Identifying and analysing bat calls – 2022
Yuye Zhang
Auditing Machine Learning Models: Quantifying Reliability using Adversarial Regions – 2022
Zac Pullar-Strecker
Adversarial Active Learning – 2021
Chloe Haigh
Privacy Defense – 2020
Matthew Mulvey
Machine Learning in the Analysis of Mass Spectrometry Data – 2020
Cathy Hua
Advanced Methods for Boolean Matrix Decomposition – 2019
Hasnain Cheena
The Smell of Fear – 2019
Rayner Rebello
Advanced Methods for Boolean Matrix Decomposition – 2019

Master of Information Technology

Dao (Robin) Gu
The Great Unmatched – 2023
Qiong Zhou
Fast & Accurate Chromosome Assembly – Streaming curation using image-based methods – 2023
Dingguan Lyu
Business Analyst – 2022
Aditio Nugroho
Proof of Concept – Salesforce Utility Cloud – 2021
Hiu Wing Doris
Proof of Concept – Salesforce Utility Cloud – 2021
Shakeel Khan
Proof of Concept – Salesforce Utility Cloud – 2021
Shriya Sadhu
Proof of Concept – Salesforce Utility Cloud – 2021
Hongnan Dou
Crash Prediction – 2020
Jiangning Lin
Crash Prediction – 2020
Wenjie Xu
Crash Prediction – 2020
Xiangli (Ben) Cheng
Crash Prediction – 2020
Catherine Blandin De Chalain
Managed Service Customer Portal – 2019
Pradeep Kumar
Brave New Coin – 2018
Xeshu Shen
AT Camera Testing – 2018
Ning Hua
Explainability in Medical Models – 2017
Samil Farouqui
Ad Bidding – 2017

Diplom (equivalent to M.Sc. – at Johannes Gutenberg University Mainz and Technical University of Munich, Germany)

Christian Sußenberger
Predicting Toxicity of Biodegradation Products Using REST – JGU – 2014

Co-supervisor

Katharina Dost
Boolean Matrix Decomposition for Giant Matrices – JGU – 2016
  • Award for Outstanding Master's Thesis – Faculty for Physics, Mathematics, and Computer Science
Steffen Albrecht
Data Mining for The Cancer Genome Atlas – JGU – 2016
Christoph Brosdau
Service Oriented Data Mining for Biological Data – TUM – 2008
Daniela Bieley
Integration of String Mining in an Inductive Database – TUM – 2008

B.Sc. (equivalent to Honours – co-supervisor at Johannes Gutenberg University Mainz and Technical University of Munich, Germany)

Steven Lang
Predicting the Persistence of Environmental Pollutants – 2017
Nicolas Krauter
Understanding a Search Engine's User Categorization – 2016
Tim Lorsbach
Optimizing the Coherence of Binary Relevance at Prediction Time – 2016
Konstantinos Katikakis
An Empirical Review on Modular Boolean Matrix Decomposition – 2015
Andrey Tyukin
A Framework for Parallel Computation in Data Mining and its Application to Autoencoders – 2015
Florian Seifert
Data Mining in Medical Databases – TUM – 2012
Sebastian Lehnerer
Feature and Label-Selection for Multi-Label Classification – TUM – 2011
Julian Lemke
Data Imputation for Multi-Label Classification – TUM – 2011

Talks

Invited Talks

Advancements in Biotransformation Pathway Prediction: Enhancements, Datasets, and Novel Functionalities in enviPath
European Chemicals Agency (2025)
Advancements in Biotransformation Pathway Prediction: Enhancements, Datasets, and Novel Functionalities in enviPath
Technical University of Munich (2025)
Reliable Machine Learning – Methods and Applications in Environmental Sciences
Institut Jožef Stefan (2025)
Advancements in Biotransformation Pathway Prediction: Enhancements, Datasets, and Novel Functionalities in enviPath
Boehringer Ingelheim (2025)
Reliable Machine Learning – Methods and Applications in Environmental Sciences
University of Stavanger (2025)
Reliable Machine Learning – Methods and Applications in Environmental Sciences
INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, Porto (2025)
Advancements in Biotransformation Pathway Prediction: Enhancements, Datasets, and Novel Functionalities in enviPath
Environmental Protection Agency NZ (2024)
Reliable Machine Learning – Methods and Applications in Environmental Sciences
Technical University of Munich (2024)
Reliable Machine Learning – Methods and Applications in Environmental Sciences
RIKEN AIP (2023)
Beyond the Smell of Fear: Data Mining for Atmospheric Chemistry
University of Waikato (2016)
Cinema Data Mining: The Smell of Fear
RMIT University (2016)
enviPath – database and prediction system for the microbial biotransformation of organic environmental contaminants
Biochemical Pathways and Large Scale Metabolic Networks, Swiss Institute of Bioinformatics (2016)
Cinema Data Mining: The Smell of Fear
University of Waikato (2015)
enviPath – Technical Aspects and New Data
enviPath Workshop (2015)
Large Classifier Systems in Bio- and Cheminformatics
University of Waikato (2014)
Machine Learning in Biodegradation Pathway Prediction
RMIT University (2016)
NGE-PPS: The Next-Generation Biotransformation Pathway Prediction System
ATHENE Workshop (2014)
Using Classifier Systems in Biodegradation Pathway Prediction and Predictive Toxicology
Lhasa Ltd. (2014)
SINDBAD and SiQL: an Inductive Database and Query Language in the Relational Model
Institut Jožef Stefan (2009)

Demos

Scavenger – A Framework for the Efficient Evaluation of Dynamic and Modular Algorithms
ECML/PKDD (2015)
BMaD – A Boolean Matrix Decomposition Framework
ECML/PKDD (2014)
An Inductive Database and Query Language in the Relational Model
EDBT (2008)
SINDBAD and SiQL: An Inductive Database and Query Language in the Relational Model
ECML/PKDD (2008)

Conference Talks

XOR-based Boolean Matrix Decomposition
International Conference on Data Mining (2019)
The best Privacy Defence is a good Privacy Offence – Obfuscating a Search Engine User’s Profile
ECML/PKDD (2017)
A Nonlinear Label Compression and Transformation Method for Multi-Label Classification using Autoencoders
PAKDD (2016)
Cinema Data Mining: The Smell of Fear
KDD (2015)
Multi-Label Classification Using Boolean Matrix Decomposition
ACM SAC Data Mining Track (2012)
Predicting Biodegradation Products and Pathways: A Hybrid Knowledge-Based and Machine Learning-Based Approach
TransCon (2010)
SINDBAD SAILS: A Service Architecture for Inductive Learning Schemes
First Workshop on Third Generation Data Mining: Towards Service-Oriented Knowledge Discovery (2008)
Machine Learning and Data Mining Approaches to Biodegradation Pathway Prediction
International Workshop on the Induction of Process Models at ECML/PKDD (2008)

Posters

Advancements in Biotransformation Pathway Prediction: Enhancements, Datasets, and Novel Functionalities in enviPath
Symposium on Pesticide Chemistry (2024)
Advanced Multi-Label Classification Methods for the Prediction of ToxCast Endpoints
OpenTox Euro Meeting (2013)
Extending the Prediction of the Environmental Fate of Chemicals Using REST Webservices
OpenTox Euro Meeting (2013)
An Extensive Multi-Label Analysis of the ToxCast Data Set
OpenTox Euro Meeting (2011)
Multi-Relational Learning by Boolean Matrix Factorization and Multi-Label Classification
International Conference on Inductive Logic Programming (2011)
An Extensive Multi-Label Analysis of the ToxCast Data Set
European Symposium on Quantitative Structure-Activity Relationship (2010)

Teaching

CS101
Introduction to Programming
  • SS 2019
CS351
Fundamentals of Database Systems
  • S1 2018, S1 2019 (coordinator)
CS361
Introduction to Machine Learning
  • S1 2019, S1 2020, S1 2021, S1 2023, S1 2024, S1 2025 (coordinator)
CS361 (remotely at NEFU)
Introduction to Machine Learning
  • S1 2022
CS751
Advanced Topics in Database Systems
  • S1 2018, S1 2019 (coordinator)
CS760
Data Mining and Machine Learning
  • S1 2018, S2 2018 (coordinator), S1 2019, S2 2019 (coordinator), S1 2020 (coordinator), S2 2020, S2 2023
CS762
Advanced Machine Learning
  • S1 2020, S1 2021 (coordinator), S1 2022, S1 2023, S1 2024, S1 2025 (coordinator)
Software Engineering
Johannes Gutenberg University Mainz
  • 2016
Data Mining
Johannes Gutenberg University Mainz
  • 2016

Service

Program Committee

since 2017
AAAI - Conference on Artificial Intelligence
since 2019
ECAI - European Conference on Artificial Intelligence
since 2013
ECML / PKDD - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
since 2020
ICDM - International Conference on Data Mining
since 2017
IJCAI - International Joint Conference on Artificial Intelligence
since 2021
KDD - SIGKDD Conference on Knowledge Discovery and Data Mining
since 2018
PAKDD - Pacific-Asia Conference on Knowledge Discovery and Data Mining

Reviewing

  • Atmospheric Chemistry and Physics (Copernicus Publications)
  • Computers in Biology and Medicine (Elsevier)
  • Conference on Computer Vision and Pattern Recognition (CVPR)
  • Environmental Science: Processes & Impacts (Royal Society of Chemistry)
  • Nature Communications (Nature)
  • Pattern Recognition (Elsevier)
  • Pattern Recognition Letters (Elsevier)

Organising Committee

Local Organizer

  • MLSB – International Workshop on Machine Learning in Systems Biology – 2011
  • OpenTox – Innovation in Predictive Toxicology: OpenTox InterAction Meeting – 2011
  • ICDM – International Conference on Data Mining – 2021