EPD Disaggregation in SGLang: Elastic Encoder Scaling for Vision-Language Models
SGLang introduces an Encoder-Prefill-Decode (EPD) disaggregation architecture that separates vision encoding from language processing in vision-language models (VLMs), so the encoder can scale independently of the prefill and decode stages. In image-intensive scenarios, this reduces time to first token (TTFT) by 6-8x.