Yuan Gong

Cited by

	All	Since 2019
Citations	2391	2361
h-index	19	19
i10-index	23	22

Citations per year
Year	Citations
2018	22
2019	54
2020	136
2021	178
2022	477
2023	817
2024	692

Public access

1 article available, 0 articles not available (based on funding mandates)

Co-authors

James Glass - MIT Computer Science and Artificial Intelligence Laboratory - Verified email at mit.edu
Christian Poellabauer - Professor, Florida International University - Verified email at cs.fiu.edu
Yu-An Chung - Facebook AI Research (FAIR) - Verified email at fb.com
Leonid Karlinsky - Principal Research Scientist, MIT-IBM Watson AI Lab, IBM Research - Verified email at ibm.com
Alexander H. Liu - Massachusetts Institute of Technology - Verified email at mit.edu
Andrew Rouditchenko - PhD Student at MIT CSAIL - Verified email at mit.edu
Hongyin Luo - MIT CSAIL - Verified email at mit.edu
Hilde Kuehne - University of Bonn, MIT-IBM Watson Lab - Verified email at uni-bonn.de
Sameer Khurana - Mitsubishi Electric Research Lab (MERL); MIT PhD - Verified email at merl.com
Cheng-I Jeff Lai - Massachusetts Institute of Technology - Verified email at mit.edu
Bryan (Ning) Xia - Research Scientist, Microsoft - Verified email at microsoft.com
Yizhe Zhang - Nanjing University of Science and Technology - Verified email at njust.edu.cn
Jian Yang - Research Scientist, Meta - Verified email at meta.com
Yoon Kim - Assistant Professor, MIT - Verified email at mit.edu
Yung-Sung Chuang - Massachusetts Institute of Technology - Verified email at mit.edu
David Harwath - The University of Texas at Austin - Verified email at utexas.edu
Yiyu Shi - University of Notre Dame - Verified email at nd.edu
Boyang Li - University of Notre Dame - Verified email at nd.edu
Peng Chang - PAII Inc. - Verified email at paii-labs.com
Jin Yu - Teradyne/Northeastern University - Verified email at northeastern.edu

Yuan Gong

Research Scientist, MIT CSAIL

Verified email at mit.edu

Audio Processing, Speech Processing, Natural Language Processing, Large Language Models


Title	Authors	Venue	Cited by	Year
AST: Audio Spectrogram Transformer	Y Gong, YA Chung, J Glass	Interspeech 2021, 2021	813	2021
SSAST: Self-Supervised Audio Spectrogram Transformer	Y Gong, CIJ Lai, YA Chung, J Glass	AAAI 2022, 2022	243	2022
Second-order non-local attention networks for person re-identification	BN Xia, Y Gong, Y Zhang, C Poellabauer	ICCV 2019, 3760-3769, 2019	239	2019
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation	Y Gong, YA Chung, J Glass	IEEE Transactions on Audio, Speech, and Language Processing, 2021	159	2021
Topic modeling based multi-modal depression detection	Y Gong, C Poellabauer	Proceedings of the 7th annual workshop on Audio/Visual emotion challenge, 69-76, 2017	140	2017
Crafting adversarial examples for speech paralinguistics applications	Y Gong, C Poellabauer	Proceedings of 2018 DYnamic and Novel Advances in Machine Learning and …, 2017	129	2017
Contrastive Audio-Visual Masked Autoencoder	Y Gong, A Rouditchenko, AH Liu, D Harwath, L Karlinsky, H Kuehne, ...	ICLR 2023, 2022	98	2022
Listen, Think, and Understand	Y Gong, H Luo, AH Liu, L Karlinsky, J Glass	ICLR 2024, 2023	71	2023
Real-time Adversarial Attacks	Y Gong, B Li, C Poellabauer, Y Shi	IJCAI 2019, 2019	62	2019
Transformer-based multi-aspect multi-granularity non-native English speaker pronunciation assessment	Y Gong, Z Chen, IH Chu, P Chang, J Glass	ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	51	2022
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers	Y Gong, S Khurana, L Karlinsky, J Glass	Interspeech 2023, 2023	49	2023
ReMASC: realistic replay attack corpus for voice controlled systems	Y Gong, J Yang, J Huber, M MacKnight, C Poellabauer	Interspeech 2019, 2019	42	2019
An overview of vulnerabilities of voice controlled systems	Y Gong, C Poellabauer	1st International Workshop on Security and Privacy for the Internet-of …, 2018	39	2018
Protecting voice controlled systems using sound source identification based on acoustic cues	Y Gong, C Poellabauer	2018 27th International Conference on Computer Communication and Networks …, 2018	38	2018
CMKD: CNN/Transformer-based cross-model knowledge distillation for audio classification	Y Gong, S Khurana, A Rouditchenko, J Glass	arXiv preprint arXiv:2203.06760, 2022	31	2022
Detecting replay attacks using multi-channel audio: A neural network-based method	Y Gong, J Yang, C Poellabauer	IEEE Signal Processing Letters 27, 920-924, 2020	31	2020
Search augmented instruction learning	H Luo, T Zhang, YS Chuang, Y Gong, Y Kim, X Wu, H Meng, J Glass	Findings of the Association for Computational Linguistics: EMNLP 2023, 3717-3729, 2023	25*	2023
Impact of Aliasing on Deep CNN-Based End-to-End Acoustic Models	Y Gong, C Poellabauer	Interspeech 2018, 2698-2702, 2018	23	2018
Vocalsound: A dataset for improving human vocal sounds recognition	Y Gong, J Yu, J Glass	ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	22	2022
Joint Audio and Speech Understanding	Y Gong, AH Liu, H Luo, L Karlinsky, J Glass	ASRU 2023, 2023	19	2023
