Research on adversarial examples for machine learning has received much attention in recent years. Most previous approaches are white-box attacks; this means the attacker needs to obtain the internal parameters of a target classifier beforehand to generate adversarial examples for it. This condition is hard to satisfy in practice. There is also research on black-box attacks, in which the attacker can only obtain partial information about target classifiers; however, it seems we can prevent these attacks, since they need to issue many suspicious queries to the target classifier. In this paper, we show that a naive defense strategy based on surveillance of the number of queries will not suffice. More concretely, we propose to generate not pixel-wise but block-wise adversarial perturbations to reduce the number of queries. Our experiments show that such rough perturbations can confuse the target classifier. We succeed in reducing the number of queries needed to generate adversarial examples in most cases. Our simple method is an untargeted attack and may have lower success rates than previously reported black-box attacks, but it needs fewer queries on average. Surprisingly, the minimum number of queries (one and three on the MNIST and CIFAR-10 datasets, respectively) is enough to generate adversarial examples in some cases. Moreover, based on these results, we propose a detailed classification of black-box attackers and discuss countermeasures against the above attacks.
Yuya SENZAKI
Idein Inc.
Satsuya OHATA
National Institute of Advanced Industrial Science and Technology (AIST)
Kanta MATSUURA
The University of Tokyo
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yuya SENZAKI, Satsuya OHATA, Kanta MATSUURA, "Simple Black-Box Adversarial Examples Generation with Very Few Queries" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 2, pp. 212-221, February 2020, doi: 10.1587/transinf.2019INP0002.
Abstract: Research on adversarial examples for machine learning has received much attention in recent years. Most previous approaches are white-box attacks; this means the attacker needs to obtain the internal parameters of a target classifier beforehand to generate adversarial examples for it. This condition is hard to satisfy in practice. There is also research on black-box attacks, in which the attacker can only obtain partial information about target classifiers; however, it seems we can prevent these attacks, since they need to issue many suspicious queries to the target classifier. In this paper, we show that a naive defense strategy based on surveillance of the number of queries will not suffice. More concretely, we propose to generate not pixel-wise but block-wise adversarial perturbations to reduce the number of queries. Our experiments show that such rough perturbations can confuse the target classifier. We succeed in reducing the number of queries needed to generate adversarial examples in most cases. Our simple method is an untargeted attack and may have lower success rates than previously reported black-box attacks, but it needs fewer queries on average. Surprisingly, the minimum number of queries (one and three on the MNIST and CIFAR-10 datasets, respectively) is enough to generate adversarial examples in some cases. Moreover, based on these results, we propose a detailed classification of black-box attackers and discuss countermeasures against the above attacks.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019INP0002/_p
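The abstract describes the approach only at a high level. The following is a minimal sketch of the block-wise idea it mentions, not the authors' exact algorithm: every pixel inside a block shares the same +/- epsilon offset, so a perturbation has far fewer degrees of freedom than a pixel-wise one, and random candidates are queried against the black-box classifier until the predicted label changes. The oracle name query_label and the parameter values (block_size, epsilon, query budget) are assumptions introduced purely for illustration.

import numpy as np

def block_wise_perturbation(image, block_size, epsilon, rng):
    # Assign one random +/- epsilon offset per block instead of per pixel.
    h, w = image.shape[:2]
    delta = np.zeros_like(image, dtype=np.float32)
    for y in range(0, h, block_size):
        for x in range(0, w, block_size):
            sign = rng.choice([-1.0, 1.0])
            delta[y:y + block_size, x:x + block_size] = sign * epsilon
    return delta

def untargeted_attack(query_label, image, true_label, block_size=4,
                      epsilon=0.1, max_queries=100, seed=0):
    # query_label is a hypothetical black-box oracle: image -> predicted class.
    # Each candidate perturbation costs exactly one query.
    rng = np.random.default_rng(seed)
    for n_queries in range(1, max_queries + 1):
        delta = block_wise_perturbation(image, block_size, epsilon, rng)
        candidate = np.clip(image + delta, 0.0, 1.0)
        if query_label(candidate) != true_label:
            return candidate, n_queries   # label changed: adversarial example found
    return None, max_queries              # attack failed within the query budget

Because the search space is coarse, a lucky draw can succeed on the very first query, which is consistent with the paper's observation that a handful of queries sometimes suffices.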
@ARTICLE{e103-d_2_212,
author={Yuya SENZAKI and Satsuya OHATA and Kanta MATSUURA},
journal={IEICE TRANSACTIONS on Information},
title={Simple Black-Box Adversarial Examples Generation with Very Few Queries},
year={2020},
volume={E103-D},
number={2},
pages={212-221},
abstract={Research on adversarial examples for machine learning has received much attention in recent years. Most previous approaches are white-box attacks; this means the attacker needs to obtain the internal parameters of a target classifier beforehand to generate adversarial examples for it. This condition is hard to satisfy in practice. There is also research on black-box attacks, in which the attacker can only obtain partial information about target classifiers; however, it seems we can prevent these attacks, since they need to issue many suspicious queries to the target classifier. In this paper, we show that a naive defense strategy based on surveillance of the number of queries will not suffice. More concretely, we propose to generate not pixel-wise but block-wise adversarial perturbations to reduce the number of queries. Our experiments show that such rough perturbations can confuse the target classifier. We succeed in reducing the number of queries needed to generate adversarial examples in most cases. Our simple method is an untargeted attack and may have lower success rates than previously reported black-box attacks, but it needs fewer queries on average. Surprisingly, the minimum number of queries (one and three on the MNIST and CIFAR-10 datasets, respectively) is enough to generate adversarial examples in some cases. Moreover, based on these results, we propose a detailed classification of black-box attackers and discuss countermeasures against the above attacks.},
keywords={},
doi={10.1587/transinf.2019INP0002},
ISSN={1745-1361},
month={February},}
TY - JOUR
TI - Simple Black-Box Adversarial Examples Generation with Very Few Queries
T2 - IEICE TRANSACTIONS on Information
SP - 212
EP - 221
AU - Yuya SENZAKI
AU - Satsuya OHATA
AU - Kanta MATSUURA
PY - 2020
DO - 10.1587/transinf.2019INP0002
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2020
AB - Research on adversarial examples for machine learning has received much attention in recent years. Most previous approaches are white-box attacks; this means the attacker needs to obtain the internal parameters of a target classifier beforehand to generate adversarial examples for it. This condition is hard to satisfy in practice. There is also research on black-box attacks, in which the attacker can only obtain partial information about target classifiers; however, it seems we can prevent these attacks, since they need to issue many suspicious queries to the target classifier. In this paper, we show that a naive defense strategy based on surveillance of the number of queries will not suffice. More concretely, we propose to generate not pixel-wise but block-wise adversarial perturbations to reduce the number of queries. Our experiments show that such rough perturbations can confuse the target classifier. We succeed in reducing the number of queries needed to generate adversarial examples in most cases. Our simple method is an untargeted attack and may have lower success rates than previously reported black-box attacks, but it needs fewer queries on average. Surprisingly, the minimum number of queries (one and three on the MNIST and CIFAR-10 datasets, respectively) is enough to generate adversarial examples in some cases. Moreover, based on these results, we propose a detailed classification of black-box attackers and discuss countermeasures against the above attacks.
ER -