YZ Index · API Documentation

Overview

YZ Index provides a RESTful JSON API. All endpoints accept GET requests, require no authentication, and support cross-origin access (CORS). Response data is encoded in UTF-8.

Base URL: https://www.winzheng.com/yz-index/api/
Response Format: application/json; charset=utf-8
All endpoints return "ok": true on success. On failure, they return "ok": false along with an "error" field.

Leaderboard Data

GET /yz-index/api/rankings

Retrieve model leaderboard data for a given dimension. By default, the endpoint returns the overall ranking from the latest published full evaluation run.

Request Parameters

Parameter	Type	Required	Description
dimension	string	Optional	Sort dimension. Accepted values: execution_raw, grounding_raw, core_overall_display, value, stability. Default: core_overall_display. The legacy values coding, knowledge, longctx, and overall still work but are deprecated after 2026-06-30.
run_id	int	Optional	Evaluation run ID. If omitted, the latest published run is used.

Response Example

{
  "ok": true,
  "run_id": 16,
  "dimension": "core_overall_display",
  "run": {
    "id": 16,
    "run_type": "full",
    "status": "done",
    "started_at": "2026-03-16 00:30:00",
    "finished_at": "2026-03-16 03:45:00",
    "formula_version": "v3",
    "judge_set_version": "v5",
    "benchmark_version": "v7"
    // ... more run fields
  },
  "rankings": [
    {
      "model_slug": "claude-opus",
      "model_name": "Claude Opus 4",
      "model_version": "claude-opus-4-6-20250619",
      "provider": "anthropic",
      "execution_raw": 89.5,
      "grounding_raw": 85.2,
      "core_overall_display": 82.7,
      "integrity_label": "pass",
      "value": 62.3,
      "stability": 91.0,
      "availability": 100.0,
      // deprecated fields (sunset 2026-06-30)
      "coding": 89.5,
      "knowledge": 85.2,
      "longctx": 78.0,
      "overall": 82.7
    }
    // ... more models
  ]
}
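As a rough illustration of the request described above, the following Python sketch builds the rankings URL from the two optional parameters and decodes the response. The helper names rankings_url and fetch_rankings are hypothetical and not part of the API; the sketch assumes only the Base URL, parameters, and "ok"/"error" fields documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL as given in the Overview section.
BASE_URL = "https://www.winzheng.com/yz-index/api/"


def rankings_url(dimension=None, run_id=None):
    """Build the rankings URL; omitted parameters are left to server defaults."""
    params = {}
    if dimension is not None:
        params["dimension"] = dimension
    if run_id is not None:
        params["run_id"] = run_id
    query = urlencode(params)
    return BASE_URL + "rankings" + ("?" + query if query else "")


def fetch_rankings(dimension=None, run_id=None, timeout=10):
    """Perform a real HTTP GET when called and return the "rankings" list."""
    with urlopen(rankings_url(dimension, run_id), timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    if not payload.get("ok"):
        # On failure the API returns "ok": false with an "error" field.
        raise RuntimeError(payload.get("error", "unknown error"))
    return payload["rankings"]
```

Calling fetch_rankings() with no arguments requests the server defaults; fetch_rankings("execution_raw", 16) requests a specific dimension and run.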

This Week's Changes

GET /yz-index/api/changes

Retrieve model ranking change data for a given week. Returns three groups of models (rising, falling, stable) with change magnitudes.

Request Parameters

Parameter	Type	Required	Description
week	string	Optional	Week tag, format 2026-W12. If omitted, returns the latest week.

Response Example

{
  "ok": true,
  "week": "2026-W12",
  "weeks": ["2026-W12", "2026-W11", "2026-W10"],
  "up": [
    {
      "model_slug": "gpt-4o",
      "model_name": "GPT-4o",
      "direction": "up",
      "delta": 3.2,
      "current_score": 80.5,
      "previous_score": 77.3,
      "provider": "openai"
      // ... more fields
    }
  ],
  "down": [
    // ... declining models
  ],
  "stable": [
    // ... stable models
  ],
  "total": 11,
  "run": {
    "id": 16,
    "run_type": "full",
    "started_at": "2026-03-16 00:30:00",
    "model_count": 11
    // ... more run fields
  }
}
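A minimal sketch of consuming this response in Python follows. It assumes the week tag follows ISO 8601 week numbering (the documentation only gives the format 2026-W12) and makes no assumption about the sign of delta in the "down" group, so movers are ranked by absolute value. The function names and the slug example-model are hypothetical.

```python
from datetime import date


def week_tag(day):
    """Format a date as a week tag such as 2026-W12 (assumes ISO 8601 weeks)."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def biggest_movers(payload, top=3):
    """Return (model_slug, delta) pairs with the largest absolute change."""
    movers = list(payload.get("up", [])) + list(payload.get("down", []))
    movers.sort(key=lambda m: abs(m["delta"]), reverse=True)
    return [(m["model_slug"], m["delta"]) for m in movers[:top]]


# Trimmed sample in the documented shape; "example-model" is made up.
sample = {
    "ok": True,
    "week": "2026-W12",
    "up": [{"model_slug": "gpt-4o", "delta": 3.2}],
    "down": [{"model_slug": "example-model", "delta": -4.1}],
    "stable": [],
}
```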

Specific Dimension and Run

GET /yz-index/api/rankings?dimension=execution_raw&run_id=16

Combine the dimension and run_id parameters to retrieve the leaderboard for a specific evaluation run and dimension. This is useful for historical comparison and in-depth analysis of a single dimension.

Request Parameters

Parameter	Type	Required	Description
dimension	string	Required	Sort dimension; in this example execution_raw, so results are sorted by Code Execution score in descending order
run_id	int	Required	Evaluation run ID; in this example 16

Response Example

{
  "ok": true,
  "run_id": 16,
  "dimension": "coding",
  "run": {
    "id": 16,
    "run_type": "full",
    "status": "done"
    // ... more run fields
  },
  "rankings": [
    {
      "model_slug": "claude-opus",
      "model_name": "Claude Opus 4",
      "coding": 89.5,
      "knowledge": 85.2,
      "longctx": 78.0,
      "value": 62.3,
      "stability": 91.0,
      "availability": 100.0,
      "overall": 82.7
      // sorted by coding DESC
    }
    // ... more models
  ]
}

Error Handling

When a server-side error occurs, the HTTP status code is 500 and the response has the following structure:

{ "ok": false, "error": "error description" }

When the requested dimension parameter is not in the allowed list, the API automatically falls back to overall. When no evaluation data is available, it returns an empty rankings array instead of an error.
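The behaviour above suggests three cases a client may want to distinguish: an explicit failure, a silent dimension fallback, and an empty result. The Python sketch below is one way to handle them. It assumes the "dimension" field echoed in the response reflects the dimension actually used, which the documentation does not state explicitly; YZIndexError, unwrap, and used_fallback are hypothetical names.

```python
class YZIndexError(Exception):
    """Raised when a response carries "ok": false."""


def unwrap(payload):
    """Return the payload unchanged on success, raise on "ok": false."""
    if payload.get("ok") is not True:
        raise YZIndexError(payload.get("error", "unknown error"))
    return payload


def used_fallback(requested_dimension, payload):
    """True if the echoed dimension differs from the one requested.

    Assumption: the response's "dimension" field reports the dimension
    the server actually sorted by.
    """
    return payload.get("dimension") != requested_dimension


def has_data(payload):
    """False when no evaluation data is available (empty rankings array)."""
    return len(payload.get("rankings", [])) > 0
```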

v1: Leaderboard

GET /yz-index/api/v1/leaderboard

Retrieve the overall leaderboard with ranking changes. Sorted by core_overall_display by default.

Request Parameters

Parameter	Type	Required	Description
dimension	string	Optional	Sort dimension: core_overall_display, execution_raw, or grounding_raw. Default: core_overall_display. The legacy values overall, coding, knowledge, and longctx still work but are deprecated after 2026-06-30.
limit	int	Optional	Number of models to return, 1-50. Default: 11 (all models).

Response Example

{
  "status": "ok",
  "data": [
    {
      "rank": 1,
      "model_name": "Claude Opus 4.6",
      "model_slug": "claude-opus-4.6",
      "score": 82.7,
      "change": 1.2
    }
  ],
  "run_id": 37,
  "run_date": "2026-03-22 06:26:12",
  "attribution": "Powered by Winzheng Index (winzheng.com)"
}
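Note that v1 responses use a "status" string rather than the "ok" boolean of the earlier endpoints. The following Python sketch, with the hypothetical helper format_leaderboard, shows one way to turn the documented v1 payload into display lines while retaining the attribution field.

```python
def format_leaderboard(payload):
    """Render a v1 leaderboard payload as text lines, attribution last."""
    if payload.get("status") != "ok":
        # v1 errors are documented as {"status": "error", "error": "..."}.
        raise ValueError(payload.get("error", "unknown error"))
    lines = []
    for row in payload["data"]:
        lines.append(
            f'{row["rank"]:>2}. {row["model_name"]} '
            f'{row["score"]:.1f} ({row["change"]:+.1f})'
        )
    lines.append(payload["attribution"])
    return lines


# Sample copied from the response example above.
sample = {
    "status": "ok",
    "data": [
        {
            "rank": 1,
            "model_name": "Claude Opus 4.6",
            "model_slug": "claude-opus-4.6",
            "score": 82.7,
            "change": 1.2,
        }
    ],
    "run_id": 37,
    "run_date": "2026-03-22 06:26:12",
    "attribution": "Powered by Winzheng Index (winzheng.com)",
}
```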

v1: Changes and Incidents

GET /yz-index/api/v1/changes

Retrieve the latest changes and incident data. Results can optionally be filtered by model slug.

Request Parameters

Parameter	Type	Required	Description
model	string	Optional	Model slug, e.g. deepseek-v3. If omitted, returns all models.

Response Example

{
  "status": "ok",
  "data": {
    "changes": [
      {
        "model_slug": "gpt-4o",
        "model_name": "GPT-4o",
        "dimension": "core_overall_display",
        "delta": 3.2,
        "direction": "up",
        "summary": "execution & grounding both improved"
      }
    ],
    "incidents": [],
    "run_id": 37,
    "run_date": "2026-03-22 06:26:12",
    "engine_version": "v6"
  },
  "attribution": "Powered by Winzheng Index (winzheng.com)"
}

v1: Model Profile

GET /yz-index/api/v1/models/{slug}

Retrieve a model's detailed profile: scores, dimensions, pricing, and the last 5 evaluation history entries. Raw questions and answers are not returned.

Path Parameters

Parameter	Type	Required	Description
{slug}	string	Required	Model slug, e.g. claude-opus-4.6 or deepseek-v3

Response Example

{
  "status": "ok",
  "data": {
    "name": "Claude Opus 4.6",
    "slug": "claude-opus-4.6",
    "provider": "anthropic",
    "scores": {
      "execution_raw": 89.5,
      "grounding_raw": 85.2,
      "core_overall_display": 82.7,
      "integrity_label": "pass"
    },
    "dimensions": {
      "execution_raw": 89.5,
      "grounding_raw": 85.2,
      "judgment_raw": 76.8,
      "communication_raw": 81.3,
      "value": 62.3,
      "stability": 91.0,
      "availability": 100.0
    },
    "pricing": {
      "input_cost": 15.0,
      "output_cost": 75.0
    },
    "history": [
      {
        "run_id": 37,
        "run_date": "2026-03-22",
        "core_overall_display": 82.7
      }
    ]
  },
  "attribution": "Powered by Winzheng Index (winzheng.com)"
}
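As a small sketch of working with the profile endpoint in Python, the helpers below build the path-parameter URL and extract a score trend from the history array. The order of history entries is not documented, so the sketch sorts by run_date itself; model_profile_url and score_trend are hypothetical names, and the second history entry in the sample is invented for illustration.

```python
from urllib.parse import quote

MODELS_URL = "https://www.winzheng.com/yz-index/api/v1/models/"


def model_profile_url(slug):
    """Build the profile URL, percent-encoding the slug as a path segment."""
    return MODELS_URL + quote(slug, safe="")


def score_trend(profile):
    """Return (run_date, core_overall_display) pairs, oldest first."""
    history = profile["data"]["history"]
    # ISO-style dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted((h["run_date"], h["core_overall_display"]) for h in history)


sample = {
    "status": "ok",
    "data": {
        "slug": "claude-opus-4.6",
        "history": [
            {"run_id": 37, "run_date": "2026-03-22", "core_overall_display": 82.7},
            # Hypothetical earlier entry, not taken from the documentation.
            {"run_id": 30, "run_date": "2026-03-15", "core_overall_display": 81.5},
        ],
    },
}
```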

v1 General Specification

Rate Limit: 60 requests per IP per minute; exceeding the limit returns 429 Too Many Requests
CORS: Access-Control-Allow-Origin: *
Cache: Cache-Control: public, max-age=3600 (1 hour)
No API key is required; send GET requests directly.
All responses include an attribution field. Please retain the source when citing data.
Error Response Format: {"status":"error","error":"..."}
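To stay under the documented limit of 60 requests per minute, a client can throttle itself before sending. The Python sketch below is a simple sliding-window limiter. Whether the server counts requests in a sliding or a fixed window is not documented, so this is a conservative assumption; the class name MinuteRateLimiter is hypothetical.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding-window limiter (default: 60 calls per 60 seconds)."""

    def __init__(self, max_calls=60, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested
        self.calls = deque()

    def wait_time(self):
        """Seconds to wait before the next request; 0.0 if allowed now."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self):
        """Record that a request was just sent."""
        self.calls.append(self.clock())
```

A typical loop would call time.sleep(limiter.wait_time()) before each request and limiter.record() after sending it. Given the 1-hour cache header, caching responses locally is likely to reduce request volume further.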

v6 Scoring Field Reference

v6 introduces an entirely new scoring dimension system. Below are the new fields and their meanings.

New Fields (v6)

Field	Type	Description
execution_raw	number	Code Execution raw score (0-100)
grounding_raw	number	Grounding raw score (0-100)
judgment_raw	number	Engineering Judgment raw score (0-100, side-panel AI-assisted evaluation)
communication_raw	number	Task Communication raw score (0-100, side-panel AI-assisted evaluation)
integrity_raw	number	Integrity Rating raw score (0-100)
integrity_label	string	Integrity label (pass/warn/fail)
recommendation_status	string	Recommendation status (recommended/neutral/not_recommended)
core_overall_raw	number	Overall raw score = 0.55 × execution_raw + 0.45 × grounding_raw
core_overall_display	number	Overall display score (capped at 74 on integrity fail)
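The relationship between the two overall scores can be sketched in Python as follows. This assumes that "capped at 74 on integrity fail" means the display score is the smaller of the raw score and 74 when integrity_label is fail, and that no other adjustment or rounding is applied; neither detail is confirmed by the table. The input values below are arbitrary, not real model scores.

```python
EXECUTION_WEIGHT = 0.55
GROUNDING_WEIGHT = 0.45
INTEGRITY_FAIL_CAP = 74.0


def core_overall_raw(execution_raw, grounding_raw):
    """Uncapped weighted score, per the formula in the field table."""
    return EXECUTION_WEIGHT * execution_raw + GROUNDING_WEIGHT * grounding_raw


def core_overall_display(execution_raw, grounding_raw, integrity_label):
    """Display score: the raw score, capped at 74 when integrity fails.

    Assumption: the cap is a simple min() and applies only to "fail".
    """
    raw = core_overall_raw(execution_raw, grounding_raw)
    if integrity_label == "fail":
        return min(raw, INTEGRITY_FAIL_CAP)
    return raw
```

For example, with execution_raw = 80 and grounding_raw = 60 the raw score is 0.55 × 80 + 0.45 × 60 = 71, which is below the cap and is therefore unchanged even on an integrity fail.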

v5 Compatibility Fields (sunset after 2026-06-30)

Field	Status	Description
coding	deprecated · sunset 2026-06-30	Coding score (legacy), migrate to execution_raw
knowledge	deprecated · sunset 2026-06-30	Knowledge score (v5), v6 split into multiple dimensions
longctx	deprecated · sunset 2026-06-30	Long context score (v5), migrate to grounding_raw
overall	deprecated · sunset 2026-06-30	Overall score (legacy), migrate to core_overall_display
official_coding	deprecated · sunset 2026-06-30	Official coding score (legacy), migrate to execution_raw
official_knowledge	deprecated · sunset 2026-06-30	Official knowledge score (v5), v6 split into multiple dimensions
official_longctx	deprecated · sunset 2026-06-30	Official long context score (v5), migrate to grounding_raw
official_overall	deprecated · sunset 2026-06-30	Official overall score (legacy), migrate to core_overall_display
shadow_*	deprecated · sunset 2026-06-30	All shadow_ prefixed fields are deprecated
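The migration targets in the table above can be captured as a lookup, sketched here in Python. Fields the table describes as split into multiple dimensions (knowledge, official_knowledge) and the shadow_ fields have no single replacement, so this hypothetical helper returns None for them rather than guessing.

```python
# Legacy field -> v6 replacement, taken from the compatibility table.
LEGACY_TO_V6 = {
    "coding": "execution_raw",
    "longctx": "grounding_raw",
    "overall": "core_overall_display",
    "official_coding": "execution_raw",
    "official_longctx": "grounding_raw",
    "official_overall": "core_overall_display",
}

# Deprecated fields for which the table names no single v6 target.
NO_DIRECT_REPLACEMENT = {"knowledge", "official_knowledge"}


def migrate_field(name):
    """Map a field name to its v6 equivalent.

    Returns None for deprecated fields without a single replacement,
    and returns non-legacy names unchanged.
    """
    if name in NO_DIRECT_REPLACEMENT or name.startswith("shadow_"):
        return None
    return LEGACY_TO_V6.get(name, name)
```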

Field Disambiguation

integrity_label vs integrity_raw: integrity_label is a tier label (pass/warn/fail), integrity_raw is the 0-100 raw score. Use label for business logic, raw for trend analysis.

core_overall_display vs core_overall_raw: display is the frontend score (capped at 74 on integrity fail), raw is the uncapped weighted score. Sort by display.

Widget: Leaderboard Card

Displays Top N model rankings, scores, and ranking changes.

Embed Code

Configuration Attributes

Attribute	Description	Default Value
data-type	leaderboard	—
data-limit	Display model count	5
data-theme	dark or light	dark

Widget: Model Badge

A compact badge widget showing the model name, overall score, and ranking. Similar to a GitHub stars badge.

Embed Code

Configuration Attributes

Attribute	Description
data-type	badge
data-model	Model slug (required), e.g. deepseek-v3, claude-opus-4.6

Widget: Change Bulletin

Displays the models with the largest gains and drops this period, plus incident count.

Embed Code

WDCD Endpoints (Experimental)

Endpoints related to WDCD (Winzheng Dynamic Contextual Decay). These endpoints are experimental, and their interfaces may change.

GET /yz-index/api/v1/dcd

WDCD leaderboard with 3-round scores and main leaderboard comparison.

Parameter	Description
run_id	Optional. Evaluation run ID; defaults to the latest published run
format	Optional. json (default) or csv

GET /yz-index/api/v1/dcd/runs

WDCD evaluation history (last 50 runs), including participating models and average scores.

GET /yz-index/api/v1/dcd/cases

Constraint violation case collection with dialogue summaries and scoring details.

Parameter	Description
subtype	Optional. Filter by scenario: data_boundary / resource_limit / business_rule / security / engineering
model	Optional. Filter by model slug
limit	Optional. 1-100, default 20
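The filters above can be validated on the client before a request is sent. The Python sketch below builds the query string for the cases endpoint and rejects values outside the documented ranges; the helper name dcd_cases_url is hypothetical, and because the endpoint is experimental the accepted values may change.

```python
from urllib.parse import urlencode

DCD_CASES_URL = "https://www.winzheng.com/yz-index/api/v1/dcd/cases"

# Scenario subtypes listed in the parameter table.
DCD_SUBTYPES = {
    "data_boundary",
    "resource_limit",
    "business_rule",
    "security",
    "engineering",
}


def dcd_cases_url(subtype=None, model=None, limit=20):
    """Build the cases URL, validating subtype and limit client-side."""
    if subtype is not None and subtype not in DCD_SUBTYPES:
        raise ValueError(f"unknown subtype: {subtype!r}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {}
    if subtype is not None:
        params["subtype"] = subtype
    if model is not None:
        params["model"] = model
    params["limit"] = limit
    return DCD_CASES_URL + "?" + urlencode(params)
```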

GET /yz-index/api/v1/dcd/decay

3-round constraint retention curve: pass rates and decay coefficients from R1→R2→R3 per model.

GET /yz-index/api/v1/dcd/matrix

5-category constraint scenario score matrix with average scores and hardest/easiest scenarios.

GET /yz-index/api/v1/dcd/models/{slug}

Single model WDCD details: current scores, scenario performance, historical trends.

Parameter	Description
slug	Required. Model slug (path parameter or ?slug= query parameter)

All WDCD endpoints: CORS open, 1-hour cache, rate limit 60 requests/min/IP.

Available Model Slugs

Model Name	slug	Provider
Claude Opus 4.7	claude-opus-4.7	claude
Claude Sonnet 4.6	claude-sonnet-4.6	claude
GPT-5.5	gpt-5.5	gpt
GPT-o3	gpt-o3	gpt
Grok 4	grok-4	grok
Gemini 2.5 Pro	gemini-2.5-pro	gemini
Gemini 3.1 Pro	gemini-3.1-pro	gemini
DeepSeek V4 Pro	deepseek-v4-pro	deepseek
GLM-4.6	glm-4.6	zhipu
Qwen3 Max	qwen3-max	qwen
豆包 Pro	doubao-pro	doubao
