Ego4D: Around the world in 3,000 hours of egocentric video K Grauman, A Westbury, E Byrne, Z Chavis, A Furnari, R Girdhar, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 129 | 2022 |
Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR L Sarı, N Moritz, T Hori, J Le Roux ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 24 | 2020 |
A Multi-View Approach to Audio-Visual Speaker Verification L Sarı, K Singh, J Zhou, L Torresani, N Singhal, Y Saraf ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 17 | 2021 |
Fusion of LVCSR and posteriorgram based keyword search L Sarı, B Gündoğdu, M Saraçlar Sixteenth Annual Conference of the International Speech Communication …, 2015 | 16 | 2015 |
Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News L Sari, S Thomas, M Hasegawa-Johnson, M Picheny 2019 IEEE International Conference on Acoustics, Speech and Signal …, 2019 | 14 | 2019 |
Template-based keyword search with pseudo posteriorgrams B Gündoğdu, L Sarı, G Çetinkaya, M Saraçlar Signal Processing and Communication Application Conference (SIU), 2016 24th …, 2016 | 13 | 2016 |
Training Spoken Language Understanding Systems with Non-Parallel Speech and Text L Sarı, S Thomas, M Hasegawa-Johnson ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 12 | 2020 |
ELISA system description for LoReHLT 2017 L Cheung, T Gowda, U Hermjakob, N Liu, J May, A Mayn, ... Proc. Low Resource Human Lang. Technol, 51-59, 2017 | 9 | 2017 |
Counterfactually fair automatic speech recognition L Sarı, M Hasegawa-Johnson, CD Yoo IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3515-3525, 2021 | 8 | 2021 |
Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks L Sari, S Thomas, M Hasegawa-Johnson Interspeech, 769-773, 2019 | 8 | 2019 |
Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions C Liu, M Picheny, L Sarı, P Chitkara, A Xiao, X Zhang, M Chou, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 7 | 2022 |
Auxiliary Networks for Joint Speaker Adaptation and Speaker Change Detection L Sari, M Hasegawa-Johnson, S Thomas IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 324-333, 2021 | 6 | 2021 |
Texture defect detection using independent vector analysis in wavelet domain L Sari, A Ertüzün 2014 22nd International Conference on Pattern Recognition, 1639-1644, 2014 | 6 | 2014 |
Seamless equal accuracy ratio for inclusive CTC speech recognition H Gao, X Wang, S Kang, R Mina, D Issa, J Harvill, L Sari, ... Speech Communication, 2021 | 5 | 2021 |
Posteriorgram based approaches in keyword search L Sarı, B Gündoğdu, M Saraçlar 2015 23rd Signal Processing and Communications Applications Conference (SIU …, 2015 | 5 | 2015 |
Speaker Adaptation with an Auxiliary Network L Sarı, M Hasegawa-Johnson Machine Learning in Speech and Language Processing Workshop (MLSLP), 2018 | 4 | 2018 |
Worldly Wise (WoW)-Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering K Ramnath, L Sari, M Hasegawa-Johnson, C Yoo Proceedings of the 2021 Conference of the North American Chapter of the …, 2021 | 3 | 2021 |
Identify Speakers in Cocktail Parties with End-to-End Attention J Zhu, M Hasegawa-Johnson, L Sari Interspeech, 2020 | 2 | 2020 |
Speaker Adaptive Audio-Visual Fusion for the Open-Vocabulary Section of AVICAR L Sari, M Hasegawa-Johnson, S Kumaran, G Stemmer, NN Krishnakumar Interspeech, 2018 | 2 | 2018 |
Deep F-measure Maximization for End-to-End Speech Understanding L Sarı, M Hasegawa-Johnson arXiv preprint arXiv:2008.03425, 2020 | 1 | 2020 |