ISO-AUS

Feb 10, 2026 - Source: MLC

Tags: MLC, ISO-AUS, AI Benchmark, LMSYS, MLCommons, SGLang

MLCommons, in collaboration with LMSYS Org, has released the ISO-AUS benchmark, an AI model evaluation framework designed for isolation-aware serving scenarios.

Overview of ISO-AUS Benchmark

ISO-AUS (Isolation-Aware Serving Optimization) aims to simulate the demands of production AI inference, emphasizing model isolation, fair resource allocation, and low-latency response. Unlike the traditional Chatbot Arena, ISO-AUS adds multi-tenant load testing to evaluate how models perform when they share resources.
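
The article does not say how ISO-AUS scores fair resource allocation. As a rough illustration only, one widely used measure for shared systems is Jain's fairness index, computed here over hypothetical per-tenant throughputs. The function name and the sample numbers are assumptions, not part of the benchmark.

```python
from typing import Sequence


def jain_fairness(throughputs: Sequence[float]) -> float:
    """Jain's fairness index: 1.0 means every tenant gets an equal share,
    1/n means a single tenant gets everything.

    Assumption: this metric is illustrative; the article does not state
    which fairness measure ISO-AUS uses.
    """
    if not throughputs:
        raise ValueError("need at least one tenant")
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if sum_sq == 0:
        # All tenants received nothing, which is (trivially) an equal split.
        return 1.0
    return (total * total) / (len(throughputs) * sum_sq)


# Hypothetical queries-per-second observed for three tenants on one server.
balanced = jain_fairness([10.0, 10.0, 10.0])
starved = jain_fairness([30.0, 0.0, 0.0])
print(balanced, starved)
```

A value near 1.0 indicates that tenants sharing the hardware receive similar service, while a value near 1/n indicates that one tenant dominates.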

Key Metrics

Elo Rating: A ranking system based on blind user tests, with GPT-4o leading at 1420 points.
Throughput: Queries per second, improved by 25% with SGLang optimization.
Isolation Score: Effectiveness at preventing side-channel attacks, with open-source models averaging 85%.
Resource Utilization: Memory/CPU utilization kept at or below 90%.
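
For readers unfamiliar with Elo ratings, the sketch below shows the standard online Elo update after a single blind pairwise comparison. The article does not describe how ISO-AUS computes its ratings, so the K-factor of 32, the 400-point scale, and the function names are assumptions chosen for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # assumed K-factor; not taken from the article
) -> Tuple[float, float]:
    """Return new ratings for A and B after one comparison.

    score_a is 1.0 if A won the blind vote, 0.0 if B won, 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Points gained by one model are lost by the other.
    return rating_a + delta, rating_b - delta


# Two equally rated models; the first wins one blind vote.
new_a, new_b = update_elo(1400.0, 1400.0, score_a=1.0)
print(new_a, new_b)
```

Because the update is zero-sum, the total rating across both models is unchanged, and a win against an equally rated opponent moves each rating by half the K-factor.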

Test Results Highlights

On standard datasets, Claude 3.5 Sonnet excels at complex queries, while Llama 3.1-405B achieves the best cost-performance ratio under SGLang, with latency reduced by 40%. (The chart from the original post is not reproduced here.)

The benchmark is compatible with NVIDIA H100 and AMD MI300X hardware, supporting edge deployment.

Industry Impact

ISO-AUS fills a gap in AI benchmarking around secure isolation, which should ease the transition of models from lab to production. LMSYS Org stated that ISO-AUS will be integrated into Chatbot Arena to provide real-time Elo updates.

For more details, see the official link.

This article is from the MLC blog, translated in full by Winzheng (winzheng.com). When republishing the translation, please credit the source. Thank you!
