Zhehuai Chen
Verified email at nvidia.com
Title
Cited by
Year
Google USM: Scaling automatic speech recognition beyond 100 languages
Y Zhang, W Han, J Qin, Y Wang, A Bapna, Z Chen, N Chen, B Li, ...
arXiv preprint arXiv:2303.01037, 2023
Cited by 178 · Year 2023
Maestro: Matched speech text representations through modality matching
Z Chen, Y Zhang, A Rosenberg, B Ramabhadran, P Moreno, A Bapna, ...
arXiv preprint arXiv:2204.03409, 2022
Cited by 89 · Year 2022
Progressive joint modeling in unsupervised single-channel overlapped speech recognition
Z Chen, J Droppo, J Li, W Xiong
IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (1), 184-196, 2017
Cited by 82 · Year 2017
Knowledge Distillation for Sequence Model.
M Huang, Y You, Z Chen, Y Qian, K Yu
Interspeech, 3703-3707, 2018
Cited by 71 · Year 2018
Improving speech recognition using consistent predictions on synthesized speech
G Wang, A Rosenberg, Z Chen, Y Zhang, B Ramabhadran, Y Wu, ...
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by 58 · Year 2020
Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR.
Z Chen, M Jain, Y Wang, ML Seltzer, C Fuegen
Interspeech, 3490-3494, 2019
Cited by 55 · Year 2019
End-to-end contextual speech recognition using class language models and a token passing decoder
Z Chen, M Jain, Y Wang, ML Seltzer, C Fuegen
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by 53 · Year 2019
Phone synchronous speech recognition with CTC lattices
Z Chen, Y Zhuang, Y Qian, K Yu
IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (1), 90-101, 2016
Cited by 43 · Year 2016
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection.
Z Chen, A Rosenberg, Y Zhang, G Wang, B Ramabhadran, PJ Moreno
Interspeech, 556-560, 2020
Cited by 38 · Year 2020
Injecting text in self-supervised speech pretraining
Z Chen, Y Zhang, A Rosenberg, B Ramabhadran, G Wang, P Moreno
2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021
Cited by 35 · Year 2021
On modular training of neural acoustics-to-word model for LVCSR
Z Chen, Q Liu, H Li, K Yu
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by 34 · Year 2018
JOIST: A joint speech and text streaming model for ASR
TN Sainath, R Prabhavalkar, A Bapna, Y Zhang, Z Huo, Z Chen, B Li, ...
2022 IEEE Spoken Language Technology Workshop (SLT), 52-59, 2023
Cited by 30 · Year 2023
Tacotron: Towards end-to-end speech synthesis. arXiv 2017
Y Wang, R Skerry-Ryan, D Stanton, Y Wu, RJ Weiss, N Jaitly, Z Yang, ...
arXiv preprint arXiv:1703.10135, 2017
Cited by 29 · Year 2017
Phone Synchronous Decoding with CTC Lattice.
Z Chen, W Deng, T Xu, K Yu
Interspeech, 1923-1927, 2016
Cited by 24 · Year 2016
TTS4Pretrain 2.0: Advancing the use of text and speech in ASR pretraining with consistency and contrastive losses
Z Chen, Y Zhang, A Rosenberg, B Ramabhadran, P Moreno, G Wang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by 22 · Year 2022
Sequence discriminative training for deep learning based acoustic keyword spotting
Z Chen, Y Qian, K Yu
Speech Communication 102, 100-111, 2018
Cited by 22 · Year 2018
Sequence modeling in unsupervised single-channel overlapped speech recognition
Z Chen, J Droppo
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by 21 · Year 2018
PaLM 2 technical report. arXiv 2023
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
arXiv preprint arXiv:2305.10403, 2023
Cited by 20
A GPU-based WFST decoder with exact lattice generation
Z Chen, J Luitjens, H Xu, Y Wang, D Povey, S Khudanpur
arXiv preprint arXiv:1804.03243, 2018
Cited by 19 · Year 2018
Accented speech recognition: Benchmarking, pre-training, and diverse data
A Aksënova, Z Chen, CC Chiu, D van Esch, P Golik, W Han, L King, ...
arXiv preprint arXiv:2205.08014, 2022
Cited by 18 · Year 2022