2021
1.
Chang, Luke; Dost, Katharina; Zhai, Kaiqi; Demontis, Ambra; Roli, Fabio; Dobbie, Gillian; Wicker, Jörg
Intriguing Usage of Applicability Domain: Lessons from Cheminformatics Applied to Adversarial Learning Journal Article
In: arXiv:2105.00495, 2021, (preprint).
@article{chang2021intriguing,
title = {Intriguing Usage of Applicability Domain: Lessons from Cheminformatics Applied to Adversarial Learning},
author = {Luke Chang and Katharina Dost and Kaiqi Zhai and Ambra Demontis and Fabio Roli and Gillian Dobbie and J\"{o}rg Wicker},
url = {https://arxiv.org/abs/2105.00495},
year = {2021},
date = {2021-05-02},
journal = {arXiv},
volume = {2105.00495},
abstract = {Defending machine learning models from adversarial attacks is still a challenge: to date, no robust model is entirely immune to adversarial examples. Different defences have been proposed, but most of them are tailored to particular ML models and adversarial attacks, so their effectiveness and applicability are strongly limited. A similar problem plagues cheminformatics: Quantitative Structure-Activity Relationship (QSAR) models struggle to predict biological activity for the entire chemical space because they are trained on a very limited number of compounds with known effects. This problem is relieved by a technique called Applicability Domain (AD), which rejects compounds that are unsuitable for the model. Adversarial examples are intentionally crafted inputs that exploit blind spots the model has not learned to classify, and adversarial defences try to make the classifier more robust by covering these blind spots. There is an apparent similarity between AD and adversarial defences. Inspired by the concept of AD, we propose a multi-stage data-driven defence that tests for Applicability: abnormal values, namely inputs not compliant with the intended use case of the model; Reliability: samples far from the training data; and Decidability: samples whose predictions contradict the predictions of their neighbours. It can be applied to any classification model and is not limited to specific types of adversarial attacks. With an empirical analysis, this paper demonstrates how Applicability Domain can effectively reduce the vulnerability of ML models to adversarial examples.},
note = {preprint},
keywords = {adversarial learning, cheminformatics, machine learning},
pubstate = {published},
tppubtype = {article}
}
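The multi-stage defence described in this entry's abstract rejects inputs in three steps: Applicability (abnormal feature values), Reliability (distance to the training data), and Decidability (agreement with neighbouring predictions). The following Python sketch shows one way such a filter could be wired around an arbitrary classifier. It is not the authors' implementation; the class name ADDefence, the parameters k and reliability_quantile, and the rejection label -1 are illustrative assumptions.

import numpy as np
from sklearn.neighbors import NearestNeighbors

class ADDefence:
    """Toy multi-stage Applicability Domain filter around any fitted classifier.
    Assumes numeric features and non-negative integer class labels."""

    def __init__(self, model, k=5, reliability_quantile=0.95):
        self.model = model      # any object with a .predict(X) method
        self.k = k
        self.q = reliability_quantile

    def fit(self, X_train, y_train):
        X_train = np.asarray(X_train)
        self.y_train_ = np.asarray(y_train)
        # Stage 1 statistics: observed feature ranges define "applicable" inputs.
        self.lo_, self.hi_ = X_train.min(axis=0), X_train.max(axis=0)
        # Stages 2 and 3 use the k nearest training neighbours.
        self.nn_ = NearestNeighbors(n_neighbors=self.k).fit(X_train)
        # Calibrate the reliability threshold on the training data itself
        # (each training point is its own nearest neighbour, which makes the
        # threshold slightly optimistic; acceptable for a sketch).
        dists, _ = self.nn_.kneighbors(X_train)
        self.dist_thresh_ = np.quantile(dists.mean(axis=1), self.q)
        return self

    def predict(self, X):
        """Return the model's predictions, with -1 marking rejected samples."""
        X = np.asarray(X)
        preds = np.asarray(self.model.predict(X))
        # Stage 1, applicability: reject abnormal feature values.
        applicable = np.all((X >= self.lo_) & (X <= self.hi_), axis=1)
        # Stage 2, reliability: reject samples far from the training data.
        dists, idx = self.nn_.kneighbors(X)
        reliable = dists.mean(axis=1) <= self.dist_thresh_
        # Stage 3, decidability: reject predictions that contradict the
        # majority label of the k nearest training neighbours.
        votes = np.array([np.bincount(self.y_train_[i]).argmax() for i in idx])
        decidable = votes == preds
        out = preds.copy()
        out[~(applicable & reliable & decidable)] = -1
        return out

Wrapping a fitted classifier as ADDefence(model).fit(X_train, y_train) would then return -1 for queries that fail any of the three checks instead of forcing a prediction, which is the rejection behaviour the abstract describes.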
2017
2.
Wicker, Jörg; Kramer, Stefan
The Best Privacy Defense is a Good Privacy Offense: Obfuscating a Search Engine User's Profile Journal Article
In: Data Mining and Knowledge Discovery, vol. 31, no. 5, pp. 1419-1443, 2017, ISSN: 1573-756X.
@article{wicker2017best,
title = {The Best Privacy Defense is a Good Privacy Offense: Obfuscating a Search Engine User's Profile},
author = {J\"{o}rg Wicker and Stefan Kramer},
editor = {Kurt Driessens and Dragi Kocev and Marko Robnik-\v{S}ikonja and Myra Spiliopoulou},
url = {http://rdcu.be/tL0U},
doi = {10.1007/s10618-017-0524-z},
issn = {1573-756X},
year = {2017},
date = {2017-09-01},
journal = {Data Mining and Knowledge Discovery},
volume = {31},
number = {5},
pages = {1419-1443},
abstract = {User privacy on the internet is an important and unsolved problem. So far, no sufficient and comprehensive solution has been proposed that helps a user protect his or her privacy while using the internet. Data are collected and assembled by numerous service providers. Solutions so far have focused on the service providers' side, storing encrypted or transformed data that can still be used for analysis. This has a major flaw, as it relies on the service providers to do so; the user has no way of actively protecting his or her privacy. In this work, we suggest a new approach that empowers the user to take advantage of the same tool the other side has, namely data mining, to produce data that obfuscates the user's profile. We apply this approach to search engine queries and use the search engine's feedback, in the form of personalized advertisements, in an algorithm similar to reinforcement learning to generate new queries that potentially confuse the search engine. We evaluated the approach on a real-world data set. While evaluation is hard, our results indicate that it is possible to influence the profile that the search engine generates for the user. This shows that it is feasible to defend a user's privacy from a new and more practical perspective.},
keywords = {adversarial learning, machine learning, personalized ads, privacy, reinforcement learning, search engines},
pubstate = {published},
tppubtype = {article}
}
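This entry's abstract frames profile obfuscation as a feedback loop: issue decoy queries, observe the personalized ads that come back, and reinforce query topics that steer the profile away from the user's real interests. Below is a heavily simplified, self-contained Python sketch of that idea as an epsilon-greedy bandit; the decoy topics, the simulated_ad_feedback stub, and all parameter values are illustrative assumptions, not the paper's actual algorithm or data.

import random
from collections import defaultdict

# Hypothetical decoy topics and true interests; the paper derives its query
# vocabulary and feedback from a real search engine, not from a simulation.
DECOY_TOPICS = ["gardening", "astronomy", "knitting", "fishing", "chess"]
TRUE_INTERESTS = {"machine learning", "privacy"}

def simulated_ad_feedback(topic):
    """Stand-in for the ads a search engine returns after a decoy query."""
    # Toy behaviour: ads mostly follow the query topic, sometimes the old profile.
    return [topic if random.random() < 0.7 else random.choice(sorted(TRUE_INTERESTS))
            for _ in range(5)]

def obfuscate(n_rounds=200, epsilon=0.2):
    """Epsilon-greedy loop rewarding topics whose ads ignore the true interests."""
    value, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_rounds):
        if random.random() < epsilon:
            topic = random.choice(DECOY_TOPICS)                   # explore
        else:
            topic = max(DECOY_TOPICS, key=lambda t: value[t])     # exploit
        ads = simulated_ad_feedback(topic)
        reward = sum(ad not in TRUE_INTERESTS for ad in ads) / len(ads)
        counts[topic] += 1
        value[topic] += (reward - value[topic]) / counts[topic]   # running mean
    return sorted(value.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    random.seed(0)
    for topic, score in obfuscate():
        print(f"{topic}: obfuscation reward {score:.2f}")

In the actual system the reward would come from parsing the ads the engine serves in response to generated query text, rather than from a fixed handful of topics and a simulated response.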