The graphical method of pauses detection in English speech signals
ABSTRACT
This paper addresses the problem of pause detection in English speech signals. The aim of the study is to create a new method of speech pause detection that avoids the drawbacks of other algorithms. Analysis of the proposed graphical method suggests that it can be used in real-time applications. The article offers a new perspective on the pause detection problem and a new solution to it. The resulting graphical method may be applied to real-time signal processing and text-to-speech synthesis, or used to deepen understanding of these problems.