The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Scene character recognition has been studied intensively for about two decades because of its potential in many applications, including automatic translation, signboard recognition, and reading assistance for the visually impaired. However, scene characters are difficult to recognize with sufficient accuracy owing to various kinds of noise and image distortion. Japanese scene character recognition is even more challenging and requires a large amount of character data for training, because the language has thousands of character classes. Some researchers have proposed training data augmentation techniques that use Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose the Random Filter, a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Because no large Japanese scene character dataset has been available for evaluating recognition systems, we have developed an open dataset, JPSC1400, which consists of a large number of real Japanese scene characters. Introducing the RI-Feature method into the ensemble scheme improved the accuracy from 70.9% to 83.1%.
Fuma HORIE
Tohoku University
Hideaki GOTO
Tohoku University
Takuo SUGANUMA
Tohoku University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Fuma HORIE, Hideaki GOTO, Takuo SUGANUMA, "Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition" in IEICE TRANSACTIONS on Information, vol. E104-D, no. 11, pp. 2002-2010, November 2021, doi: 10.1587/transinf.2021EDP7058.
Abstract: Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7058/_p
@ARTICLE{e104-d_11_2002,
author={Fuma HORIE and Hideaki GOTO and Takuo SUGANUMA},
journal={IEICE TRANSACTIONS on Information},
title={Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition},
year={2021},
volume={E104-D},
number={11},
pages={2002-2010},
abstract={Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.},
keywords={},
doi={10.1587/transinf.2021EDP7058},
ISSN={1745-1361},
month={November},}
TY - JOUR
TI - Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 2002
EP - 2010
AU - Fuma HORIE
AU - Hideaki GOTO
AU - Takuo SUGANUMA
PY - 2021
DO - 10.1587/transinf.2021EDP7058
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 11
JA - IEICE TRANSACTIONS on Information
Y1 - November 2021
AB - Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.
ER -