Haojun Xia
Title
Cited by
Year
η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities
X Zhang, H Xia, D Zhuang, H Sun, X Fu, MB Taylor, SL Song
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021
Cited by: 14; Year: 2021
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song
arXiv preprint arXiv:2309.10285, 2023
Cited by: 13; Year: 2023
Shift-BNN: Highly-Efficient Probabilistic Bayesian Neural Network Training via Memory-Friendly Pattern Retrieving
Q Wan, H Xia, X Zhang, L Wang, SL Song, X Fu
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021
Cited by: 7; Year: 2021
Enabling Fast and Memory-Efficient Acceleration for Pattern Matching Workloads: The Lightweight Automata Processing Engine
L Gong, C Wang, H Xia, X Chen, X Li, X Zhou
IEEE Transactions on Computers 72 (4), 1011-1025, 2022
Cited by: 4; Year: 2022
LAP: A Lightweight Automata Processor for Pattern Matching Tasks
H Xia, L Gong, C Wang, X Chen, X Zhou
2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), 844-849, 2021
Cited by: 3; Year: 2021
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
H Xia, Z Zheng, X Wu, S Chen, Z Yao, S Youn, A Bakhtiari, M Wyatt, ...
arXiv preprint arXiv:2401.14112, 2024
Cited by: 1; Year: 2024
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
X Wu, H Xia, S Youn, Z Zheng, S Chen, A Bakhtiari, M Wyatt, Y He, ...
arXiv preprint arXiv:2312.08583, 2023
Cited by: 1; Year: 2023