RTP-LLM: High-Performance Alibaba LLM Inference Engine

Published as arXiv preprint arXiv:2605.29639, May 2026