MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models C Fu, P Chen, Y Shen, Y Qin, M Zhang, X Lin, J Yang, X Zheng, K Li, ... arXiv preprint arXiv:2306.13394, 2023 | 947* | 2023 |
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification C Fu, Y Hu, X Wu, H Shi, T Mei, R He IEEE International Conference on Computer Vision (ICCV), 2021 | 134 | 2021 |
Woodpecker: Hallucination Correction for Multimodal Large Language Models S Yin, C Fu, S Zhao, T Xu, H Wang, D Sui, Y Shen, K Li, X Sun, E Chen arXiv preprint arXiv:2310.16045, 2023 | 130 | 2023 |
Information Bottleneck Disentanglement for Identity Swapping G Gao, H Huang, C Fu, Z Li, R He IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 | 103 | 2021 |
DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition C Fu, X Wu, Y Hu, H Huang, R He IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 | 101 | 2022 |
Dual Variational Generation for Low Shot Heterogeneous Face Recognition C Fu, X Wu, Y Hu, H Huang, R He Advances in Neural Information Processing Systems (NeurIPS), 2019 | 84 | 2019 |
Cross-Spectral Face Hallucination via Disentangling Independent Factors B Duan, C Fu, Y Li, X Song, R He IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 | 72 | 2020 |
High-Fidelity Face Manipulation With Extreme Poses and Expressions C Fu, Y Hu, X Wu, G Wang, Q Zhang, R He IEEE Transactions on Information Forensics and Security (TIFS), 2021 | 67 | 2021 |
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs in Video Analysis C Fu, Y Dai, Y Luo, L Li, S Ren, R Zhang, Z Wang, C Zhou, Y Shen, ... arXiv preprint arXiv:2405.21075, 2024 | 65 | 2024 |
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise C Fu, R Zhang, H Lin, Z Wang, T Gao, Y Luo, Y Huang, Z Zhang, L Qiu, ... arXiv preprint arXiv:2312.12436, 2023 | 44 | 2023 |
AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection H Zhu, C Fu, Q Wu, W Wu, C Qian, R He Advances in Neural Information Processing Systems (NeurIPS), 2020 | 32 | 2020 |
Multi-modal Queried Object Detection in the Wild Y Xu, M Zhang, C Fu, P Chen, X Yang, K Li, C Xu Advances in Neural Information Processing Systems (NeurIPS), 2023 | 24 | 2023 |
Heterogeneous Face Recognition via Face Synthesis With Identity-Attribute Disentanglement Z Yang, J Liang, C Fu, M Luo, XY Zhang IEEE Transactions on Information Forensics and Security (TIFS), 2022 | 24 | 2022 |
VITA: Towards Open-Source Interactive Omni Multimodal LLM C Fu, H Lin, Z Long, Y Shen, M Zhao, Y Zhang, X Wang, D Yin, L Ma, ... arXiv preprint arXiv:2408.05211, 2024 | 18 | 2024 |
Aligning and Prompting Everything All at Once for Universal Visual Perception Y Shen, C Fu, P Chen, M Zhang, K Li, X Sun, Y Wu, S Lin, R Ji IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 | 18 | 2024 |
Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models YF Zhang, Q Wen, C Fu, X Wang, Z Zhang, L Wang, R Jin arXiv preprint arXiv:2406.08487, 2024 | 16 | 2024 |
Deep Momentum Uncertainty Hashing C Fu, G Wang, X Wu, Q Zhang, R He Pattern Recognition (PR), 2022 | 14 | 2022 |
Rethinking Image Cropping: Exploring Diverse Compositions From Global Views G Jia, H Huang, C Fu, R He IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 | 13 | 2022 |
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? YF Zhang, H Zhang, H Tian, C Fu, S Zhang, J Wu, F Li, K Wang, Q Wen, ... arXiv preprint arXiv:2408.13257, 2024 | 11 | 2024 |
Pareidolia Face Reenactment L Song, W Wu, C Fu, C Qian, CC Loy, R He IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 | 11 | 2021 |