The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Recent top-performing object detectors usually depend on a two-stage approach, which benefits from region proposal and refinement but suffers from low detection speed. By contrast, one-stage approaches offer high efficiency while sacrificing some accuracy. In this paper, we propose a novel single-shot object detection network that inherits the merits of both. Motivated by the idea of semantically enriching the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response that augments low-level convolutional features with strong class-specific semantic information. The resulting self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
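The abstract gives no formulas for the self-attending module, so the following is only a minimal sketch of the general idea of channel attention and position attention over a feature map, not the authors' implementation. It assumes a feature map flattened to a C x N matrix (C channels, N spatial positions), plain dot-product similarity with a softmax, no learned projections, and a residual fusion weighted by a hypothetical scalar `gamma`. All function names are illustrative.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def channel_context(x):
    """x is C x N. Each channel becomes a weighted mix of all channels."""
    attn = [softmax(r) for r in matmul(x, transpose(x))]  # C x C weights
    return matmul(attn, x)                                # C x N

def position_context(x):
    """x is C x N. Each position becomes a weighted mix of all positions."""
    xt = transpose(x)                                     # N x C
    attn = [softmax(r) for r in matmul(xt, x)]            # N x N weights
    return transpose(matmul(attn, xt))                    # back to C x N

def self_attend(x, gamma=1.0):
    """Assumed fusion: original features plus gamma times both contexts."""
    cc = channel_context(x)
    pc = position_context(x)
    return [
        [xv + gamma * (c + p) for xv, c, p in zip(xr, cr, pr)]
        for xr, cr, pr in zip(x, cc, pc)
    ]

if __name__ == "__main__":
    features = [[1.0, 0.0, 2.0], [0.5, 1.0, -1.0]]  # toy map: C=2, N=3
    enhanced = self_attend(features, gamma=0.5)
    print(len(enhanced), len(enhanced[0]))
```

With `gamma=0.0` the sketch returns the input features unchanged, which illustrates the "self-guided" enhancement as a residual added on top of the original convolutional features. A real detector would use learned projections and operate on batched tensors.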
Xinyu ZHU
Fudan University
Jun ZHANG
Fudan University
Gengsheng CHEN
Fudan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Xinyu ZHU, Jun ZHANG, Gengsheng CHEN, "ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 3, pp. 648-659, March 2020, doi: 10.1587/transinf.2019EDP7164.
Abstract: Recent top-performing object detectors usually depend on a two-stage approach, which benefits from region proposal and refinement but suffers from low detection speed. By contrast, one-stage approaches offer high efficiency while sacrificing some accuracy. In this paper, we propose a novel single-shot object detection network that inherits the merits of both. Motivated by the idea of semantically enriching the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response that augments low-level convolutional features with strong class-specific semantic information. The resulting self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7164/_p
@ARTICLE{e103-d_3_648,
author={Xinyu ZHU and Jun ZHANG and Gengsheng CHEN},
journal={IEICE TRANSACTIONS on Information},
title={ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection},
year={2020},
volume={E103-D},
number={3},
pages={648--659},
abstract={Recent top-performing object detectors usually depend on a two-stage approach, which benefits from region proposal and refinement but suffers from low detection speed. By contrast, one-stage approaches offer high efficiency while sacrificing some accuracy. In this paper, we propose a novel single-shot object detection network that inherits the merits of both. Motivated by the idea of semantically enriching the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response that augments low-level convolutional features with strong class-specific semantic information. The resulting self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.},
keywords={},
doi={10.1587/transinf.2019EDP7164},
ISSN={1745-1361},
month={March},}
TY  - JOUR
TI  - ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection
T2  - IEICE TRANSACTIONS on Information
SP  - 648
EP  - 659
AU  - Xinyu ZHU
AU  - Jun ZHANG
AU  - Gengsheng CHEN
PY  - 2020
DO  - 10.1587/transinf.2019EDP7164
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E103-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information
Y1  - March 2020
AB  - Recent top-performing object detectors usually depend on a two-stage approach, which benefits from region proposal and refinement but suffers from low detection speed. By contrast, one-stage approaches offer high efficiency while sacrificing some accuracy. In this paper, we propose a novel single-shot object detection network that inherits the merits of both. Motivated by the idea of semantically enriching the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of a classification-trained CNN, the semantic activating module learns a semantically meaningful convolutional response that augments low-level convolutional features with strong class-specific semantic information. The resulting self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
ER  -