Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction | GS Novikov, D Bershatsky, J Gusak, A Shonenkov, DV Dimitrov, ... | International Conference on Machine Learning, 26363-26381, 2023 | 8 | 2023 |
Survey on Large Scale Neural Network Training | J Gusak, D Cherniuk, A Shilova, A Katrutsa, D Bershatsky, X Zhao, ... | IJCAI-ECAI 2022-31st International Joint Conference on Artificial …, 2022 | 7 | 2022 |
Memory-Efficient Backpropagation through Large Linear Layers | D Bershatsky, A Mikhalev, A Katrutsa, J Gusak, D Merkulov, I Oseledets | arXiv preprint arXiv:2201.13195, 2022 | 4 | 2022 |
NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer | V Leplat, D Merkulov, A Katrutsa, D Bershatsky, O Tsymboi, I Oseledets | arXiv preprint arXiv:2209.14937, 2022 | 2 | 2022 |
LoTR: Low Tensor Rank Weight Adaptation | D Bershatsky, D Cherniuk, T Daulbaev, A Mikhalev, I Oseledets | arXiv preprint arXiv:2402.01376, 2024 | 1 | 2024 |
Federated Privacy-preserving Collaborative Filtering for On-Device Next App Prediction | A Saiapin, G Balitskiy, D Bershatsky, A Katrutsa, E Frolov, A Frolov, ... | User Modeling and User-Adapted Interaction, 1-30, 2024 | | 2024 |