Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 1583 | 2023 |
PaLM 2 technical report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 1248 | 2023 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 1020 | 2022 |
Gemma: Open models based on Gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 457 | 2024 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 395 | 2024 |
UL2: Unifying language learning paradigms Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei, X Wang, HW Chung, ... arXiv preprint arXiv:2205.05131, 2022 | 245 | 2022 |
Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training O Agarwal, H Ge, S Shakeri, R Al-Rfou arXiv preprint arXiv:2010.12688, 2020 | 214 | 2020 |
PaLI-X: On scaling up a multilingual vision and language model X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ... arXiv preprint arXiv:2305.18565, 2023 | 128 | 2023 |
End-to-end synthetic data generation for domain adaptation of question answering systems S Shakeri, CN Santos, H Zhu, P Ng, F Nan, Z Wang, R Nallapati, B Xiang arXiv preprint arXiv:2010.06028, 2020 | 97 | 2020 |
Sunipa Dev R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vladimir Feinberg, Fangxiaoyu …, 2023 | 65 | 2023 |
Transcending scaling laws with 0.1% extra compute Y Tay, J Wei, HW Chung, VQ Tran, DR So, S Shakeri, X Garcia, HS Zheng, ... arXiv preprint arXiv:2210.11399, 2022 | 61 | 2022 |
Capabilities of Gemini models in medicine K Saab, T Tu, WH Weng, R Tanno, D Stutz, E Wulczyn, F Zhang, ... arXiv preprint arXiv:2404.18416, 2024 | 59 | 2024 |
Embedding-based zero-shot retrieval through query generation D Liang, P Xu, S Shakeri, CN Santos, R Nallapati, Z Huang, B Xiang arXiv preprint arXiv:2009.10270, 2020 | 40 | 2020 |
Machine translation aided bilingual data-to-text generation and semantic parsing O Agarwal, M Kale, H Ge, S Shakeri, R Al-Rfou Proceedings of the 3rd international workshop on natural language generation …, 2020 | 38 | 2020 |
ParsiNLU: A Suite of Language Understanding Challenges for Persian D Khashabi, A Cohan, S Shakeri, P Hosseini, P Pezeshkpour, M Alikhani, ... Transactions of the Association for Computational Linguistics 9, 1147-1162, 2021 | 36 | 2021 |
PaLM 2 technical report. arXiv 2023 R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 25 | 2023 |
Characterizing attribution and fluency tradeoffs for retrieval-augmented large language models R Aksitov, CC Chang, D Reitter, S Shakeri, Y Sung arXiv preprint arXiv:2302.05578, 2023 | 21 | 2023 |
Towards zero-shot multilingual synthetic question and answer generation for cross-lingual reading comprehension S Shakeri, N Constant, MS Kale, L Xue arXiv preprint arXiv:2010.12008, 2020 | 19 | 2020 |
EncT5: Fine-tuning T5 encoder for non-autoregressive tasks F Liu, S Shakeri, H Yu, J Li arXiv preprint arXiv:2110.08426, 2021 | 17 | 2021 |
Brainformers: Trading simplicity for efficiency Y Zhou, N Du, Y Huang, D Peng, C Lan, D Huang, S Shakeri, D So, ... International Conference on Machine Learning, 42531-42542, 2023 | 16 | 2023 |