η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities X Zhang, H Xia, D Zhuang, H Sun, X Fu, MB Taylor, SL Song 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021 | 14 | 2021 |
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song arXiv preprint arXiv:2309.10285, 2023 | 13 | 2023 |
Shift-BNN: Highly-Efficient Probabilistic Bayesian Neural Network Training via Memory-Friendly Pattern Retrieving Q Wan, H Xia, X Zhang, L Wang, SL Song, X Fu MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021 | 7 | 2021 |
Enabling Fast and Memory-Efficient Acceleration for Pattern Matching Workloads: The Lightweight Automata Processing Engine L Gong, C Wang, H Xia, X Chen, X Li, X Zhou IEEE Transactions on Computers 72 (4), 1011-1025, 2022 | 4 | 2022 |
LAP: A Lightweight Automata Processor for Pattern Matching Tasks H Xia, L Gong, C Wang, X Chen, X Zhou 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), 844-849, 2021 | 3 | 2021 |
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design H Xia, Z Zheng, X Wu, S Chen, Z Yao, S Youn, A Bakhtiari, M Wyatt, ... arXiv preprint arXiv:2401.14112, 2024 | 1 | 2024 |
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks X Wu, H Xia, S Youn, Z Zheng, S Chen, A Bakhtiari, M Wyatt, Y He, ... arXiv preprint arXiv:2312.08583, 2023 | 1 | 2023 |