Bingyang Wu
A survey of resource-efficient LLM and multimodal foundation models
M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu, Y Zhao, C Yang, S Wang, ...
arXiv preprint arXiv:2401.08092, 2024
Cited by 82 · 2024
Fast distributed inference serving for large language models
B Wu, Y Zhong, Z Zhang, S Liu, F Liu, Y Sun, G Huang, X Liu, X Jin
arXiv preprint arXiv:2305.05920, 2023
Cited by 78 · 2023
AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction
S Zheng, R Chen, A Wei, Y Jin, Q Han, L Lu, B Wu, X Li, S Yan, Y Liang
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by 59 · 2022
Transparent GPU sharing in container clouds for deep learning workloads
B Wu, Z Zhang, Z Bai, X Liu, X Jin
20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023
Cited by 40 · 2023
LoongServe: Efficiently serving long-context large language models with elastic sequence parallelism
B Wu, S Liu, Y Zhong, P Sun, X Liu, X Jin
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles …, 2024
Cited by 17 · 2024
NeoFlow: A flexible framework for enabling efficient compilation for high-performance DNN training
S Zheng, R Chen, Y Jin, A Wei, B Wu, X Li, S Yan, Y Liang
IEEE Transactions on Parallel and Distributed Systems 33 (11), 3220-3232, 2021
Cited by 13 · 2021
dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving
B Wu, R Zhu, Z Zhang, P Sun, X Liu, X Jin
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by 11 · 2024
XRON: A hybrid elastic cloud overlay network for video conferencing at planetary scale
B Wu, K Qian, B Li, Y Ma, Q Zhang, Z Jiang, J Zhao, D Cai, E Zhai, X Liu, ...
Proceedings of the ACM SIGCOMM 2023 Conference, 696-709, 2023
Cited by 5 · 2023
RLHFuse: Efficient RLHF training for large language models with inter- and intra-stage fusion
Y Zhong, Z Zhang, B Wu, S Liu, Y Chen, C Wan, H Hu, L Xia, R Ming, ...
arXiv preprint arXiv:2409.13221, 2024
Cited by 1 · 2024