Jiawei Zhao
Meta AI (FAIR)
Verified email at meta.com
Title · Cited by · Year
signSGD with majority vote is communication efficient and fault tolerant
J Bernstein, J Zhao, K Azizzadenesheli, A Anandkumar
arXiv preprint arXiv:1810.05291, 2018
Cited by 216 · 2018
Galore: Memory-efficient llm training by gradient low-rank projection
J Zhao, Z Zhang, B Chen, Z Wang, A Anandkumar, Y Tian
arXiv preprint arXiv:2403.03507, 2024
Cited by 139* · 2024
Cost-effective training of deep cnns with active model adaptation
SJ Huang, JW Zhao, ZY Liu
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge …, 2018
Cited by 86 · 2018
Learning compositional functions via multiplicative weight updates
J Bernstein, J Zhao, M Meister, MY Liu, A Anandkumar, Y Yue
Advances in neural information processing systems 33, 13319-13330, 2020
Cited by 24 · 2020
Lns-madam: Low-precision training in logarithmic number system using multiplicative weight update
J Zhao, S Dai, R Venkatesan, B Zimmer, M Ali, MY Liu, B Khailany, ...
IEEE Transactions on Computers 71 (12), 3179-3190, 2022
Cited by 21 · 2022
Zero initialization: Initializing neural networks with only zeros and ones
J Zhao, F Schäfer, A Anandkumar
arXiv preprint arXiv:2110.12661, 2021
Cited by 19 · 2021
Incremental fourier neural operator
J Zhao, RJ George, Y Zhang, Z Li, A Anandkumar
arXiv preprint arXiv:2211.15188, 2022
Cited by 15 · 2022
Zero initialization: Initializing residual networks with only zeros and ones
J Zhao, FT Schaefer, A Anandkumar
Cited by 11 · 2021
From galore to welore: How low-rank weights non-uniformly emerge from low-rank gradients
A Jaiswal, L Yin, Z Zhang, S Liu, J Zhao, Y Tian, Z Wang
arXiv preprint arXiv:2407.11239, 2024
Cited by 10 · 2024
Q-galore: Quantized galore with int4 projection and layer-adaptive low-rank gradients
Z Zhang, A Jaiswal, L Yin, S Liu, J Zhao, Y Tian, Z Wang
arXiv preprint arXiv:2407.08296, 2024
Cited by 10 · 2024
Inrank: Incremental low-rank learning
J Zhao, Y Zhang, B Chen, F Schäfer, A Anandkumar
arXiv preprint arXiv:2306.11250, 2023
Cited by 10 · 2023
Machine learning training in logarithmic number system
Z Jiawei, SH Dai, R Venkatesan, MY Liu, WJ Dally, A Anandkumar
US Patent App. 17/346,100, 2022
Cited by 6 · 2022
Incremental spectral learning in Fourier neural operator
J Zhao, RJ George, Z Li, A Anandkumar
arXiv preprint arXiv:2211.15188, 2022
Cited by 5 · 2022
Incremental spatial and spectral learning of neural operators for solving large-scale PDEs
RJ George, J Zhao, J Kossaifi, Z Li, A Anandkumar
Transactions on Machine Learning Research, 2024
Cited by 2 · 2024
SFT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
X Yang, J Leng, G Guo, J Zhao, R Nakada, L Zhang, H Yao, B Chen
arXiv preprint arXiv:2412.06289, 2024
Cited by 1 · 2024
MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training
C Luo, J Zhao, Z Chen, B Chen, A Anandkumar
arXiv preprint arXiv:2407.15892, 2024
Cited by 1 · 2024
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
Z Liu, C Zhao, H Huang, S Chen, J Zhang, J Zhao, S Roy, L Jin, Y Xiong, ...
arXiv preprint arXiv:2502.02631, 2025
2025
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition
RJ George, D Pitt, J Zhao, J Kossaifi, C Luo, Y Tian, A Anandkumar
arXiv preprint arXiv:2501.02379, 2025
2025
Incremental Low-Rank Learning
J Zhao, Y Zhang, B Chen, FT Schaefer, A Anandkumar
Workshop on Efficient Systems for Foundation Models @ ICML 2023