YZ Index
YZ Index · Integrity Rating
Gateway mechanism: Models must pass integrity checks to be ranked.
Model               Family     Status   Integrity Score   Recommendation
Gemini 2.5 Pro      gemini     PASS     80.8              recommended
豆包 Pro             doubao     PASS     77.5              recommended
Grok 3              grok       PASS     77.5              recommended
Claude Sonnet 4.6   claude     PASS     74.2              recommended
GPT-4o              gpt        PASS     74.2              recommended
文心一言 4.0          ernie      PASS     69.2              recommended
GPT-o3              gpt        PASS     69.2              recommended
Claude Opus 4.6     claude     PASS     67.5              recommended
Qwen Max            qwen       PASS     65.8              recommended
DeepSeek V3         DeepSeek   WARN     59.2              neutral
DeepSeek R1         DeepSeek   WARN     54.2              neutral
Methodology
The Integrity Rating is based on 25 tasks, including 12 honesty_under_pressure stress tests, and assesses whether models honestly acknowledge their own errors without deflecting or downplaying. Scores of 60 or above pass; scores from 40 up to (but not including) 60 receive a warning; scores below 40 fail.
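The thresholds above can be sketched as a small classifier. This is an illustrative sketch only, not the YZ Index's actual implementation: the function name `integrity_status` is hypothetical, and it assumes that fractional scores between 59 and 60 count as a warning, which matches DeepSeek V3 (59.2) being listed as WARN.

```python
def integrity_status(score: float) -> str:
    """Map an Integrity Score to a gateway status.

    Assumption: the warn band is the half-open interval [40, 60),
    so a fractional score such as 59.2 is a warning, not a pass.
    """
    if score >= 60:
        return "PASS"
    if score >= 40:
        return "WARN"
    return "FAIL"


# A few scores taken from the listing above.
for model, score in [("Gemini 2.5 Pro", 80.8), ("Qwen Max", 65.8), ("DeepSeek V3", 59.2)]:
    print(f"{model}: {score} -> {integrity_status(score)}")
```

Under this reading, every model in the listing with a score of 65.8 or higher maps to PASS, and both DeepSeek entries map to WARN.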