Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because it is difficult to estimate the local noise spectrum accurately. In this paper, a simple noise estimation method that uses a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that a VAD using the bark-scale spectral entropy parameter, called BS-Entropy, is superior to energy-based approaches, especially under variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower band must be preserved. In contrast, the WCT tends to increase in the lower band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid under various noise conditions, especially colored noise and non-stationary noise conditions.
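The per-frame threshold adaptation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the number of lower bands (low_bands), and the scaling factors (boost, relax) are hypothetical values chosen only for illustration, and the frame class ("noise", "voiced", "unvoiced") is assumed to come from a separate VAD and voicing decision. The entropy function is the ordinary Shannon entropy of normalized subband energies, which is small for a tone-like spectrum and large for a flat one; the exact BS-Entropy definition and decision rule are not given in the abstract.

```python
import math


def spectral_entropy(band_energies):
    """Shannon entropy (in nats) of the normalized subband energies."""
    total = sum(band_energies)
    if total <= 0:
        return 0.0
    probs = [e / total for e in band_energies if e > 0]
    return -sum(p * math.log(p) for p in probs)


def frame_thresholds(base, num_bands, frame_class,
                     low_bands=4, boost=3.0, relax=0.25):
    """Per-subband thresholds for one frame.

    low_bands, boost and relax are illustrative assumptions,
    not values taken from the paper.
    """
    if frame_class == "noise":
        # Noise-dominated frame: raise the threshold in every subband.
        return [base * boost] * num_bands
    if frame_class == "voiced":
        # Voiced frame: lower the threshold in the lower bands
        # so that tone-like coefficients are preserved.
        low_scale = relax
    elif frame_class == "unvoiced":
        # Unvoiced frame: raise the threshold in the lower bands.
        low_scale = boost
    else:
        raise ValueError("unknown frame class: " + frame_class)
    return [base * (low_scale if b < low_bands else 1.0)
            for b in range(num_bands)]


def soft_threshold(coeffs, threshold):
    """Shrink each wavelet coefficient toward zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]
```

In use, each frame's subband coefficients would be passed through soft_threshold with the matching entry of frame_thresholds. Soft thresholding is one common shrinkage rule and is used here as an assumption; the abstract does not state which rule the paper applies.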
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kun-Ching WANG, "An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment" in IEICE TRANSACTIONS on Information and Systems,
vol. E93-D, no. 2, pp. 341-349, February 2010, doi: 10.1587/transinf.E93.D.341.
Abstract: Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because it is difficult to estimate the local noise spectrum accurately. In this paper, a simple noise estimation method that uses a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that a VAD using the bark-scale spectral entropy parameter, called BS-Entropy, is superior to energy-based approaches, especially under variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower band must be preserved. In contrast, the WCT tends to increase in the lower band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid under various noise conditions, especially colored noise and non-stationary noise conditions.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.341/_p
@ARTICLE{e93-d_2_341,
author={Kun-Ching WANG},
journal={IEICE TRANSACTIONS on Information and Systems},
title={An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment},
year={2010},
volume={E93-D},
number={2},
pages={341-349},
abstract={Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because it is difficult to estimate the local noise spectrum accurately. In this paper, a simple noise estimation method that uses a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that a VAD using the bark-scale spectral entropy parameter, called BS-Entropy, is superior to energy-based approaches, especially under variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower band must be preserved. In contrast, the WCT tends to increase in the lower band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid under various noise conditions, especially colored noise and non-stationary noise conditions.},
keywords={},
doi={10.1587/transinf.E93.D.341},
ISSN={1745-1361},
month={February},
}
TY - JOUR
TI - An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 341
EP - 349
AU - Kun-Ching WANG
PY - 2010
DO - 10.1587/transinf.E93.D.341
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E93-D
IS - 2
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - February 2010
AB - Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because it is difficult to estimate the local noise spectrum accurately. In this paper, a simple noise estimation method that uses a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that a VAD using the bark-scale spectral entropy parameter, called BS-Entropy, is superior to energy-based approaches, especially under variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower band must be preserved. In contrast, the WCT tends to increase in the lower band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid under various noise conditions, especially colored noise and non-stationary noise conditions.
ER -