The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations; for example, some numerals may appear as "XNUMX".
In this paper, we propose a new speaker-class modeling technique and an associated adaptation method for large-vocabulary continuous speech recognition (LVCSR) systems, and evaluate them on the Corpus of Spontaneous Japanese (CSJ). For each evaluation speaker, the training speakers closest to that speaker are selected, and acoustic models are trained on their utterances. A major issue with speaker-class models is determining the selection range of speakers. To address this, several models covering different speaker ranges are prepared in advance for each evaluation speaker, and the most appropriate model is selected on a likelihood basis at the recognition step. We further improve recognition performance by applying unsupervised speaker adaptation to the speaker-class models. In recognition experiments, the proposed speaker adaptation based on speaker-class models yielded a significant improvement over the conventional adaptation method.
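The selection procedure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the speaker distance measure or the model type, so the sketch assumes a squared Euclidean distance between mean feature vectors and uses a single diagonal Gaussian as a stand-in for a full acoustic model. All function names, speaker names, and data values are hypothetical.

```python
import math

def mean_vector(frames):
    """Per-dimension mean of a list of feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def fit_diag_gaussian(frames, var_floor=1e-3):
    """Fit a diagonal Gaussian (toy stand-in for an acoustic model)."""
    mu = mean_vector(frames)
    var = [
        max(sum((f[d] - mu[d]) ** 2 for f in frames) / len(frames), var_floor)
        for d in range(len(mu))
    ]
    return mu, var

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under the model."""
    mu, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mu, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def rank_speakers(test_frames, train_data):
    """Order training speakers from closest to farthest.

    Assumption: closeness is the squared Euclidean distance between
    mean feature vectors; the paper may use a different measure.
    """
    target = mean_vector(test_frames)

    def dist(name):
        m = mean_vector(train_data[name])
        return sum((a - b) ** 2 for a, b in zip(target, m))

    return sorted(train_data, key=dist)

def select_speaker_class_model(test_frames, train_data, ranges):
    """Build one model per speaker range and keep the most likely one."""
    ranked = rank_speakers(test_frames, train_data)
    best = None
    for n in ranges:
        chosen = ranked[:n]
        pooled = [f for name in chosen for f in train_data[name]]
        model = fit_diag_gaussian(pooled)
        ll = avg_log_likelihood(test_frames, model)
        if best is None or ll > best[1]:
            best = (n, ll, chosen)
    return best

# Hypothetical toy data: two similar training speakers and one outlier.
train_data = {
    "spk_a": [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.1]],
    "spk_b": [[0.3, 0.2], [0.1, -0.2], [0.4, 0.0]],
    "spk_c": [[5.0, 5.0], [5.2, 4.9], [4.8, 5.1]],
}
test_frames = [[0.1, 0.0], [0.0, 0.1], [0.2, -0.1]]

ranked = rank_speakers(test_frames, train_data)
best_n, best_ll, chosen = select_speaker_class_model(
    test_frames, train_data, ranges=[1, 2, 3]
)
print(ranked, best_n, chosen)
```

In this toy setting, too narrow a range gives a model with too little data and too wide a range pulls in dissimilar speakers, which is why the sketch compares several ranges by likelihood rather than fixing one in advance. In a real system the likelihood would come from decoding with each candidate acoustic model rather than from a single Gaussian.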
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Tetsuo KOSAKA, Yuui TAKEDA, Takashi ITO, Masaharu KATO, Masaki KOHDA, "Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 9, pp. 2363-2369, September 2010, doi: 10.1587/transinf.E93.D.2363.
Abstract: In this paper, we propose a new speaker-class modeling and its adaptation method for the LVCSR system and evaluate the method on the Corpus of Spontaneous Japanese (CSJ). In this method, closer speakers are selected from training speakers and the acoustic models are trained by using their utterances for each evaluation speaker. One of the major issues of the speaker-class model is determining the selection range of speakers. In order to solve the problem, several models which have a variety of speaker range are prepared for each evaluation speaker in advance, and the most proper model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, a significant improvement could be obtained by using the proposed speaker adaptation based on speaker-class models compared with the conventional adaptation method.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.2363/_p
@ARTICLE{e93-d_9_2363,
author={Tetsuo KOSAKA and Yuui TAKEDA and Takashi ITO and Masaharu KATO and Masaki KOHDA},
journal={IEICE TRANSACTIONS on Information},
title={Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition},
year={2010},
volume={E93-D},
number={9},
pages={2363-2369},
abstract={In this paper, we propose a new speaker-class modeling and its adaptation method for the LVCSR system and evaluate the method on the Corpus of Spontaneous Japanese (CSJ). In this method, closer speakers are selected from training speakers and the acoustic models are trained by using their utterances for each evaluation speaker. One of the major issues of the speaker-class model is determining the selection range of speakers. In order to solve the problem, several models which have a variety of speaker range are prepared for each evaluation speaker in advance, and the most proper model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, a significant improvement could be obtained by using the proposed speaker adaptation based on speaker-class models compared with the conventional adaptation method.},
keywords={},
doi={10.1587/transinf.E93.D.2363},
ISSN={1745-1361},
month={September},
}
TY - JOUR
TI - Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 2363
EP - 2369
AU - Tetsuo KOSAKA
AU - Yuui TAKEDA
AU - Takashi ITO
AU - Masaharu KATO
AU - Masaki KOHDA
PY - 2010
DO - 10.1587/transinf.E93.D.2363
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2010
AB - In this paper, we propose a new speaker-class modeling and its adaptation method for the LVCSR system and evaluate the method on the Corpus of Spontaneous Japanese (CSJ). In this method, closer speakers are selected from training speakers and the acoustic models are trained by using their utterances for each evaluation speaker. One of the major issues of the speaker-class model is determining the selection range of speakers. In order to solve the problem, several models which have a variety of speaker range are prepared for each evaluation speaker in advance, and the most proper model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, a significant improvement could be obtained by using the proposed speaker adaptation based on speaker-class models compared with the conventional adaptation method.
ER -