Ryuta SHINGAI
Nara Institute of Science and Technology
Yuria HIRAGA
Nara Institute of Science and Technology
Hisakazu FUKUOKA
Nara Institute of Science and Technology
Takamasa MITANI
Nara Institute of Science and Technology
Takashi NAKADA
Nara Institute of Science and Technology
Yasuhiko NAKASHIMA
Nara Institute of Science and Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryuta SHINGAI, Yuria HIRAGA, Hisakazu FUKUOKA, Takamasa MITANI, Takashi NAKADA, Yasuhiko NAKASHIMA, "Construction of an Efficient Divided/Distributed Neural Network Model Using Edge Computing" in IEICE TRANSACTIONS on Information and Systems,
vol. E103-D, no. 10, pp. 2072-2082, October 2020, doi: 10.1587/transinf.2019EDP7326.
Abstract: Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for the inference process of a neural network is large, it is processed not at the data acquisition location, such as a surveillance camera, but on a server with abundant computing power installed in a data center. Edge computing is attracting considerable attention as a way to solve this problem. However, edge computing can provide only limited computation resources. Therefore, we assumed a divided/distributed neural network model that uses both the edge device and the server. By processing part of the convolution layers on the edge, the amount of communication becomes smaller than that of the raw sensor data. In this paper, we evaluated AlexNet and eight other models in the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced a compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set the FPS to 30 or higher and the object recognition accuracy to 69.7% or higher. The accuracy threshold is determined based on that of an approximation model that binarizes the activations of the neural network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through a comprehensive evaluation, we found the optimal configurations for all nine models. For small models, such as AlexNet, processing the entire model on the edge was best. On the other hand, for huge models, such as VGG16, processing the entire model on the server was best. For medium-size models, the distributed models were good candidates.
We confirmed that our model found the most energy-efficient configuration while satisfying the FPS and accuracy requirements, and the distributed models successfully reduced energy consumption by up to 48.6%, and by 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.
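The abstract's core idea can be sketched as a split-point search: for each candidate layer at which to hand off from the edge device to the server, estimate end-to-end latency (edge compute + transfer + server compute) and energy, then pick the minimum-energy split that still meets the 30 FPS requirement. This is a minimal illustrative sketch only; the per-layer times, data sizes, bandwidths, and power figures below are invented placeholders, not values from the paper, and the paper's actual models also account for compression and accuracy.

```python
# Hypothetical split-point search for a divided/distributed inference model.
# All numbers are illustrative assumptions, not measurements from the paper.

FPS_TARGET = 30.0

# Per-layer profile (assumed): (edge_time_s, server_time_s, output_bytes)
LAYERS = [
    (0.004, 0.0005, 800_000),   # conv1
    (0.006, 0.0007, 400_000),   # conv2
    (0.008, 0.0009, 200_000),   # conv3
    (0.010, 0.0012, 100_000),   # fc
]
INPUT_BYTES = 600_000           # compressed sensor frame size (assumed)

EDGE_POWER_W = 5.0              # assumed edge device power draw
SERVER_POWER_W = 150.0          # assumed server power draw
RADIO_POWER_W = 2.0             # assumed transmit power draw

def evaluate_split(split, bandwidth_bps):
    """Split after layer index `split`: layers [0, split) run on the edge,
    the rest on the server. split == 0 means everything on the server
    (the raw input is transferred); split == len(LAYERS) means everything
    on the edge (no transfer). Returns (latency_s, energy_J)."""
    edge_t = sum(t for t, _, _ in LAYERS[:split])
    server_t = sum(t for _, t, _ in LAYERS[split:])
    tx_bytes = INPUT_BYTES if split == 0 else LAYERS[split - 1][2]
    tx_t = 0.0 if split == len(LAYERS) else tx_bytes * 8 / bandwidth_bps
    latency = edge_t + tx_t + server_t
    energy = (edge_t * EDGE_POWER_W + tx_t * RADIO_POWER_W
              + server_t * SERVER_POWER_W)
    return latency, energy

def best_split(bandwidth_bps):
    """Minimum-energy split meeting the FPS target, or None if infeasible."""
    feasible = [(evaluate_split(s, bandwidth_bps)[1], s)
                for s in range(len(LAYERS) + 1)
                if evaluate_split(s, bandwidth_bps)[0] <= 1.0 / FPS_TARGET]
    return min(feasible)[1] if feasible else None
```

With these placeholder numbers, a slow link makes only the all-on-edge configuration feasible, mirroring the abstract's observation that small models are best processed entirely on the edge; different layer profiles or a cheaper server would shift the optimum toward a distributed or server-only split.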
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7326/_p
@ARTICLE{e103-d_10_2072,
author={Ryuta SHINGAI and Yuria HIRAGA and Hisakazu FUKUOKA and Takamasa MITANI and Takashi NAKADA and Yasuhiko NAKASHIMA},
journal={IEICE TRANSACTIONS on Information and Systems},
title={Construction of an Efficient Divided/Distributed Neural Network Model Using Edge Computing},
year={2020},
volume={E103-D},
number={10},
pages={2072-2082},
abstract={Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for the inference process of a neural network is large, it is processed not at the data acquisition location, such as a surveillance camera, but on a server with abundant computing power installed in a data center. Edge computing is attracting considerable attention as a way to solve this problem. However, edge computing can provide only limited computation resources. Therefore, we assumed a divided/distributed neural network model that uses both the edge device and the server. By processing part of the convolution layers on the edge, the amount of communication becomes smaller than that of the raw sensor data. In this paper, we evaluated AlexNet and eight other models in the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced a compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set the FPS to 30 or higher and the object recognition accuracy to 69.7% or higher. The accuracy threshold is determined based on that of an approximation model that binarizes the activations of the neural network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through a comprehensive evaluation, we found the optimal configurations for all nine models. For small models, such as AlexNet, processing the entire model on the edge was best. On the other hand, for huge models, such as VGG16, processing the entire model on the server was best. For medium-size models, the distributed models were good candidates. We confirmed that our model found the most energy-efficient configuration while satisfying the FPS and accuracy requirements, and the distributed models successfully reduced energy consumption by up to 48.6%, and by 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.},
keywords={},
doi={10.1587/transinf.2019EDP7326},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Construction of an Efficient Divided/Distributed Neural Network Model Using Edge Computing
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 2072
EP - 2082
AU - Ryuta SHINGAI
AU - Yuria HIRAGA
AU - Hisakazu FUKUOKA
AU - Takamasa MITANI
AU - Takashi NAKADA
AU - Yasuhiko NAKASHIMA
PY - 2020
DO - 10.1587/transinf.2019EDP7326
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E103-D
IS - 10
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - October 2020
AB - Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for the inference process of a neural network is large, it is processed not at the data acquisition location, such as a surveillance camera, but on a server with abundant computing power installed in a data center. Edge computing is attracting considerable attention as a way to solve this problem. However, edge computing can provide only limited computation resources. Therefore, we assumed a divided/distributed neural network model that uses both the edge device and the server. By processing part of the convolution layers on the edge, the amount of communication becomes smaller than that of the raw sensor data. In this paper, we evaluated AlexNet and eight other models in the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced a compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set the FPS to 30 or higher and the object recognition accuracy to 69.7% or higher. The accuracy threshold is determined based on that of an approximation model that binarizes the activations of the neural network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through a comprehensive evaluation, we found the optimal configurations for all nine models. For small models, such as AlexNet, processing the entire model on the edge was best. On the other hand, for huge models, such as VGG16, processing the entire model on the server was best. For medium-size models, the distributed models were good candidates. We confirmed that our model found the most energy-efficient configuration while satisfying the FPS and accuracy requirements, and the distributed models successfully reduced energy consumption by up to 48.6%, and by 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.
ER -