Optimus: an efficient dynamic resource scheduler for deep learning clusters Y Peng, Y Bao, Y Chen, C Wu, C Guo Proceedings of the Thirteenth EuroSys Conference, 1-14, 2018 | 523 | 2018 |
A generic communication scheduler for distributed DNN training acceleration Y Peng, Y Zhu, Y Chen, Y Bao, B Yi, C Lan, C Wu, C Guo Proceedings of the 27th ACM Symposium on Operating Systems Principles, 16-29, 2019 | 375 | 2019 |
DL2: A deep learning-driven scheduler for deep learning clusters Y Peng, Y Bao, Y Chen, C Wu, C Meng, W Lin IEEE Transactions on Parallel and Distributed Systems 32 (8), 1947-1960, 2021 | 93 | 2021 |
BGL: GPU-efficient GNN training by optimizing graph data I/O and preprocessing T Liu, Y Chen, D Li, C Wu, Y Zhu, J He, Y Peng, H Chen, H Chen, C Guo 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023 | 73 | 2023 |
Preemptive all-reduce scheduling for expediting distributed DNN training Y Bao, Y Peng, Y Chen, C Wu IEEE INFOCOM 2020-IEEE Conference on Computer Communications, 626-635, 2020 | 69 | 2020 |
Elastic parameter server load distribution in deep learning clusters Y Chen, Y Peng, Y Bao, C Wu, Y Zhu, C Guo Proceedings of the 11th ACM Symposium on Cloud Computing, 507-521, 2020 | 42 | 2020 |
SP-GNN: Learning structure and position information from graphs Y Chen, J You, J He, Y Lin, Y Peng, C Wu, Y Zhu Neural Networks 161, 505-514, 2023 | 13 | 2023 |
SAPipe: Staleness-aware pipeline for data parallel DNN training Y Chen, C Xie, M Ma, J Gu, Y Peng, H Lin, C Wu, Y Zhu Advances in Neural Information Processing Systems 35, 17981-17993, 2022 | 10 | 2022 |