Model Category
Agents
Mistral Small 4
Efficient instruct
Reasoning 82.0
Coding 86.0
256K tokens
Claude Opus 4.8
High-autonomy agentic coding
Reasoning 96.0
Coding 96.0
1M tokens
Devstral 2
Code agents
Reasoning 82.0
Coding 92.0
256K tokens
Claude Sonnet 4.6
Balanced production use where you want strong reasoning
Reasoning 91.0
Coding 92.0
1M tokens
GPT-5.4 mini
High-volume agent backends
Reasoning 86.0
Coding 90.0
400K tokens