Coding LLM Benchmark

eXalt Value presents

The LLM Coding Leaderboard

Compare coding models by score and price

Find the ideal coding model

Describe what you need to build, and the agent will suggest models from the leaderboard.

Top models across programming benchmarks

[Charts: one panel per category, each plotting Score (%) per model on a 30% to 85% axis]

Overall score: Leaderboard average

Issue Resolution: Fixing GitHub bugs

Frontend: UI with visual context

Greenfield: Apps from scratch

Testing: Test generation and quality

Information Gathering: Research and retrieval

Cost / Performance

Overall Cost/Performance

Average score vs. average cost per problem (USD). Lower-right is better value.
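One way to read a score-versus-cost chart like this is to look for models that no other model beats on both axes at once. The sketch below illustrates that idea with a simple Pareto-front filter. It is not the leaderboard's own method: the `Point` type, the `pareto_front` helper, and all data values are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Point(NamedTuple):
    name: str
    score: float  # average benchmark score, in percent
    cost: float   # average cost per problem, in USD


def pareto_front(points: list[Point]) -> list[Point]:
    """Keep the points that no other point dominates.

    A point q dominates p when q scores at least as high, costs at most
    as much, and is strictly better on at least one of the two.
    """
    front = []
    for p in points:
        dominated = any(
            q.score >= p.score
            and q.cost <= p.cost
            and (q.score > p.score or q.cost < p.cost)
            for q in points
        )
        if not dominated:
            front.append(p)
    # Cheapest first, so the list reads from low cost to high cost.
    return sorted(front, key=lambda p: p.cost)


# Hypothetical values for illustration only; NOT taken from the leaderboard.
points = [
    Point("model-a", 70.0, 0.50),
    Point("model-b", 60.0, 0.10),
    Point("model-c", 55.0, 0.30),
    Point("model-d", 72.0, 1.20),
]

front = pareto_front(points)
print([p.name for p in front])
```

Here `model-c` drops out because `model-b` scores higher at a lower cost; the remaining three each represent a different trade-off between price and score.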

Explore

Model                   Value       In $/1M  Out $/1M  LiveCode  Aider   SWE     BFCL    Votes ↓  Context
Claude Opus 4.6         -           $5.00    $25.00    n/a       n/a     n/a     77.47%  1561     200K
Claude Opus 4.5         Low         $5.00    $25.00    73.8%     89.4%   80.9%   73.24%  1469     200K
MiniMax M2.5            -           $0.30    $1.20     n/a       n/a     n/a     57.51%  1453     1M
Gemini 3 Pro            Mid         $2.00    $12.00    79.7%     n/a     76.2%   66.46%  1444     10M
Kimi K2 Thinking        Good value  $0.40    $1.75     83.1%     59.1%   71.3%   59.42%  1442     256K
Gemini 3 Flash          Good value  $0.50    $3.00     79.7%     n/a     n/a     60.61%  1441     1M
GPT-5.2                 Low         $1.75    $14.00    66.9%     n/a     80.0%   63.01%  1395     400K
GPT-5                   Good value  $1.25    $10.00    84.6%     88.0%   74.9%   66.21%  1393     400K
Claude Sonnet 4.5       Low         $3.00    $15.00    59.0%     n/a     82.0%   60.67%  1386     200K
DeepSeek V3.2 Thinking  Best value  $0.27    $1.10     89.6%     74.2%   n/a     62.11%  1371     128K
GPT-5.1 Codex           Mid         $1.25    $10.00    84.9%     n/a     76.3%   65.18%  1328     200K
DeepSeek V3.2           Best value  $0.27    $0.41     59.3%     70.2%   n/a     52.56%  1315     128K
Claude Haiku 4.5        -           $1.00    $5.00     n/a       n/a     73.3%   54.84%  1305     200K
Mistral Large 3         -           $2.00    $6.00     n/a       n/a     n/a     39.17%  1223     131K
Gemini 2.5 Pro          Mid         $1.25    $10.00    69.0%     82.2%   59.6%   54.41%  1205     1M
Claude Sonnet 4.6       -           $3.00    $15.00    n/a       n/a     n/a     n/a     n/a      1M
GPT-5.2 Codex           -           $1.75    $14.00    n/a       n/a     n/a     n/a     n/a      400K
Gemini 3.1 Pro          -           $2.00    $12.00    n/a       n/a     n/a     n/a     n/a      10M
Kimi K2.5               -           $0.40    $1.75     n/a       n/a     n/a     n/a     n/a      256K
MiniMax M2.1            -           $0.23    $0.90     n/a       n/a     n/a     n/a     n/a      197K
Qwen3 Coder 480B        -           $0.90    $0.90     n/a       n/a     n/a     n/a     n/a      262K
GLM 4.7                 -           $0.38    $1.75     n/a       n/a     n/a     n/a     n/a      203K
GPT-5 Mini              Best value  $0.25    $2.00     83.8%     n/a     n/a     58.29%  n/a      200K
OpenAI o3               Good value  $2.00    $8.00     80.8%     81.3%   69.1%   68.09%  n/a      200K
Grok 4                  Low         $3.00    $15.00    79.0%     79.6%   75.0%   62.9%   n/a      256K
Gemini 2.5 Flash        Best value  $0.15    $0.60     63.5%     55.1%   n/a     45.18%  n/a      1M
GPT-4.1                 Mid         $2.00    $8.00     52.0%     52.4%   55.0%   50.18%  n/a      1M
Value tiers: Best, Good, Mid, Low
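The "In $/1M" and "Out $/1M" columns can be turned into a rough per-request cost. The sketch below assumes those columns are USD per one million input and output tokens, which the page does not state explicitly. The prices are copied from the table; the `request_cost` helper and the workload size (20,000 input tokens, 2,000 output tokens) are hypothetical.

```python
# (input price, output price) in USD per 1M tokens, copied from the table.
# Reading the columns as per-token pricing is an assumption.
PRICES = {
    "Claude Opus 4.5": (5.00, 25.00),
    "GPT-5": (1.25, 10.00),
    "DeepSeek V3.2": (0.27, 0.41),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 20,000 input tokens and 2,000 output tokens.
for model in PRICES:
    cost = request_cost(model, 20_000, 2_000)
    print(f"{model}: ${cost:.5f}")
```

For this workload the estimate is about $0.15 for Claude Opus 4.5, $0.045 for GPT-5, and $0.00622 for DeepSeek V3.2. Real bills depend on how many tokens a model actually consumes per task, which list prices alone do not capture.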