Issues of Radio Electronics. 2019: 47-52
SPEECH RECOGNITION BASED ON CONVOLUTIONAL NEURAL NETWORKS
Belorutsky R. Yu., Zhitnik S. V.
https://doi.org/10.21778/2218-5453-2019-4-47-52
Abstract
This paper addresses the recognition of human speech in the form of dictaphone recordings of the spoken digits from 1 to 10. Recognition is performed on the spectrogram of the audio signal using convolutional neural networks. Algorithms are implemented for preprocessing the input data (spectrogram images), for training the network, and for recognizing the spoken words. Recognition quality is evaluated for different numbers of convolutional layers; based on this evaluation, the number of layers is chosen and a network structure is proposed. Recognition quality is also compared for two kinds of network input: the spectrogram of the audio signal and the first two formants extracted from it. The recognition algorithm is tested on male and female voices with different pronunciation durations.
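The abstract does not give implementation details, so the following is only a minimal sketch of how a log-magnitude spectrogram of the kind fed to a convolutional network might be computed. The function name `log_spectrogram`, the frame length, the hop size, the Hann window, and the synthetic 1 kHz test tone are all assumptions made for illustration, not the authors' preprocessing.

```python
import numpy as np

def log_spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Return (freqs, times, spec_db); spec_db has shape (n_freqs, n_frames).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Index matrix that picks overlapping frames: shape (n_frames, frame_len)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * window
    # Magnitude of the one-sided FFT of every windowed frame
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    spec_db = 20.0 * np.log10(magnitude + 1e-10)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, spec_db.T

# Synthetic stand-in for a recorded digit: a 1 kHz tone, 0.5 s at 8 kHz
sample_rate = 8000
t = np.arange(int(0.5 * sample_rate)) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

freqs, times, spec = log_spectrogram(tone, sample_rate)
# Frequency bin with the highest average energy across all frames
peak_freq = freqs[np.argmax(spec.mean(axis=1))]
```

The resulting two-dimensional array can be treated as a single-channel image, which is what makes a convolutional network applicable to it. In a real pipeline the array would typically be cropped or resized to a fixed shape before being passed to the network.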