Conformer: Convolution-augmented transformer for speech recognition A Gulati, J Qin, CC Chiu, N Parmar, Y Zhang, J Yu, W Han, S Wang, ... arXiv preprint arXiv:2005.08100, 2020 | 767 | 2020 |
Pushing the limits of semi-supervised learning for automatic speech recognition Y Zhang, J Qin, DS Park, W Han, CC Chiu, R Pang, QV Le, Y Wu arXiv preprint arXiv:2010.10504, 2020 | 150 | 2020 |
Lingvo: a modular and scalable framework for sequence-to-sequence modeling J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ... arXiv preprint arXiv:1902.08295, 2019 | 142 | 2019 |
ContextNet: Improving convolutional neural networks for automatic speech recognition with global context W Han, Z Zhang, Y Zhang, J Yu, CC Chiu, J Qin, A Gulati, R Pang, Y Wu arXiv preprint arXiv:2005.03191, 2020 | 130 | 2020 |
LaMDA: Language models for dialog applications R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ... arXiv preprint arXiv:2201.08239, 2022 | 51 | 2022 |
w2v-BERT: Combining contrastive learning and masked language modeling for self-supervised speech pre-training YA Chung, Y Zhang, W Han, CC Chiu, J Qin, R Pang, Y Wu 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 50 | 2021 |
A better and faster end-to-end model for streaming ASR B Li, A Gulati, J Yu, TN Sainath, CC Chiu, A Narayanan, SY Chang, ... ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 43 | 2021 |
BigSSL: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ... IEEE Journal of Selected Topics in Signal Processing, 2022 | 21 | 2022 |
Scaling end-to-end models for large-scale multilingual ASR B Li, R Pang, TN Sainath, A Gulati, Y Zhang, J Qin, P Haghani, WR Huang, ... 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 18 | 2021 |
Conformer: Convolution-augmented transformer for speech recognition. arXiv 2020 A Gulati, J Qin, CC Chiu, N Parmar, Y Zhang, J Yu, W Han, S Wang, ... arXiv preprint arXiv:2005.08100, 0 | 12 | |
An efficient streaming non-recurrent on-device end-to-end model with improvements to rare-word modeling TN Sainath, YR He, A Narayanan, R Botros, R Pang, DJ Rybach, ... | 11 | 2021 |
Parallel rescoring with transformer for streaming on-device speech recognition W Li, J Qin, CC Chiu, R Pang, Y He arXiv preprint arXiv:2008.13093, 2020 | 11 | 2020 |
Vector-quantized image modeling with improved VQGAN J Yu, X Li, JY Koh, H Zhang, R Pang, J Qin, A Ku, Y Xu, J Baldridge, Y Wu arXiv preprint arXiv:2110.04627, 2021 | 10 | 2021 |
Self-supervised learning with random-projection quantizer for speech recognition CC Chiu, J Qin, Y Zhang, J Yu, Y Wu arXiv preprint arXiv:2202.01855, 2022 | 7 | 2022 |
Improving the latency and quality of cascaded encoders TN Sainath, Y He, A Narayanan, R Botros, W Wang, D Qiu, CC Chiu, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 4 | 2022 |
LaMDA: Language Models for Dialog Applications AD Cohen, A Roberts, A Molina, A Butryna, A Jin, A Kulshreshtha, ... | | 2022 |
ading WER for Latency W Li, J Qin, CC Chiu, R Pang, Y He | | 2020 |
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ... | | |