Ultra-Low Latency
-
Far Outperforming vLLM and SGLang! TileRT: A Compiler-Driven Tile-Based Runtime
Keywords: TileRT, ultra-low latency, LLM inference, tile-level runtime, multi-GPU, compiler-driven

TileRT: Tile-Based Runtime for Ultra-Low-Latency LLM Inference

https://github.com/tile-ai/TileRT https://github.com/tile-ai/TileRT/relea…