Coding LLM Benchmark

Pricing & capacity

Avg cost: $0.80 per run
Input price: $2.00 / 1M tokens
Output price: $12.00 / 1M tokens
Context window: 10M tokens
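
To relate the per-token prices to the average cost per run, a rough sketch of the arithmetic may help. The page does not report token usage per run, so the token counts below are purely hypothetical values chosen for illustration, and `run_cost` is an illustrative helper name. Real runs may also involve caching or other billing details not listed here.

```python
# Prices taken from the listing above (USD per 1M tokens).
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 12.00


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one run in USD from token counts."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


# Hypothetical token counts (NOT from the source): one of many possible
# mixes that would land near the listed $0.80 average.
example = run_cost(input_tokens=250_000, output_tokens=25_000)
print(f"${example:.2f}")
```

Because output tokens cost six times as much as input tokens, a run's cost depends heavily on how much the model generates, not only on how much context it reads.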

Capabilities

Vision: Yes
Reasoning: Yes
Tool calls: Yes
Cursor: No
OpenRouter: No

Agent scores

Overall: 55.7%
Issue Resolution: 75.4%
Frontend: 44.1%
Greenfield: 18.8%
Testing: 64.0%
Information Gathering: 76.4%
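
The Overall figure is consistent with an unweighted mean of the five category scores above. The page does not state how Overall is computed, so the averaging rule is an assumption; the sketch below only checks that the arithmetic matches.

```python
from statistics import mean

# Category scores (percent) copied from the listing above.
category_scores = {
    "Issue Resolution": 75.4,
    "Frontend": 44.1,
    "Greenfield": 18.8,
    "Testing": 64.0,
    "Information Gathering": 76.4,
}

# Assumption: Overall is the plain (unweighted) average of the categories.
overall = mean(category_scores.values())
print(f"{overall:.1f}%")

# Identify the strongest and weakest categories.
best = max(category_scores, key=category_scores.get)
worst = min(category_scores, key=category_scores.get)
print(best, worst)
```

Under this assumption each category carries equal weight, so the low Greenfield score pulls the Overall figure down as much as the high Information Gathering score lifts it.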

Notes

Top non-Anthropic model on OpenHands, with a 55.7% overall average. Strongest in information gathering (76.4%) and issue resolution (75.4%); weakest in greenfield (18.8%). 10M-token context window.