KTransformers Accelerates SGLang's Heterogeneous Inference
KTransformers, developed by Tsinghua University's MadSys group and Approaching.AI, optimizes CPU/GPU collaborative inference for sparse MoE models through AMX-optimized kernels, efficient device coordination, and an expert-deferral mechanism. It is now integrated into SGLang for enhanced performance.
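To give a flavor of the heterogeneous-placement idea mentioned above, here is a minimal, hypothetical sketch (not the KTransformers API): in CPU/GPU collaborative MoE inference, frequently activated ("hot") experts can be kept in GPU memory while the long tail of rarely activated experts is served from CPU. The function name and capacity parameter below are illustrative assumptions.

```python
# Hypothetical sketch of hot/cold expert placement for heterogeneous
# MoE inference; NOT the actual KTransformers or SGLang API.
from collections import Counter

def plan_expert_placement(activation_counts, gpu_capacity):
    """Place the most frequently activated experts on GPU; the rest
    are served from CPU (where AMX-optimized kernels could apply)."""
    # Rank experts by observed activation frequency, highest first.
    ranked = [expert for expert, _ in Counter(activation_counts).most_common()]
    gpu_experts = set(ranked[:gpu_capacity])   # hot experts stay on GPU
    cpu_experts = set(ranked[gpu_capacity:])   # cold experts go to CPU
    return gpu_experts, cpu_experts

# Example: 8 experts, GPU memory budget for 3 of them.
counts = {0: 50, 1: 5, 2: 40, 3: 2, 4: 30, 5: 1, 6: 3, 7: 4}
gpu, cpu = plan_expert_placement(counts, gpu_capacity=3)
# gpu -> {0, 2, 4}; the remaining five experts run on CPU
```

In a real system the placement would be refreshed as routing statistics drift, and deferred (CPU-resident) experts would overlap their computation with GPU work to hide latency.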