Coding LLM Benchmark

Pricing & capacity

Avg cost: $1.77 per run
Input price: $5.00 / 1M tokens
Output price: $25.00 / 1M tokens
Context window: 200K tokens
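
As a rough illustration of how the listed per-token prices translate into a per-run cost, here is a minimal sketch. The function name `run_cost` and the token counts are hypothetical, chosen only for illustration; the page does not report how many tokens a benchmark run actually consumes, and the sketch ignores anything beyond the two listed prices (caching discounts, for instance).

```python
# Prices copied from the listing above (USD per 1M tokens).
INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 25.00


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one run at the listed prices.

    Hypothetical helper: it only multiplies token counts by the two
    listed prices and is not the benchmark's own accounting.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical token counts (assumption, not taken from the benchmark).
example = run_cost(input_tokens=200_000, output_tokens=30_800)
print(f"${example:.2f}")
```

Because output tokens cost five times as much as input tokens, runs that generate long outputs (extended reasoning, for example) would be expected to dominate the bill under this simple model.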

Capabilities

Vision: Yes
Reasoning: Yes
Tool calls: Yes
Cursor: Yes
OpenRouter: Yes

Agent scores

Overall: 60.6%
Issue Resolution: 76.6%
Frontend: 41.2%
Greenfield: 37.5%
Testing: 78.5%
Information Gathering: 69.1%
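
The page does not say how the overall score is derived. The sketch below checks one plausible reading, an unweighted mean of the five category scores, against the listed value. The variable names are illustrative, and the equal weighting is an assumption rather than the benchmark's documented method.

```python
from statistics import mean

# Category scores copied from the list above (percent).
category_scores = {
    "Issue Resolution": 76.6,
    "Frontend": 41.2,
    "Greenfield": 37.5,
    "Testing": 78.5,
    "Information Gathering": 69.1,
}
listed_overall = 60.6

# Assumption: every category carries equal weight.
unweighted = mean(category_scores.values())
print(f"unweighted mean: {unweighted:.1f}% (listed overall: {listed_overall}%)")

# Strongest and weakest categories by score.
best = max(category_scores, key=category_scores.get)
worst = min(category_scores, key=category_scores.get)
print(f"strongest: {best}, weakest: {worst}")
```

Under this assumption the rounded mean agrees with the listed overall figure, which is consistent with equal weighting but does not confirm it.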

Notes

Top-tier issue resolution (76.6%) and testing (78.5%), with weaker frontend (41.2%) and greenfield (37.5%) results. Best suited for teams that use extended thinking for complex debugging workflows.