Model Category
Reasoning
Mistral Small 4
Efficient instruct
Reasoning 82.0
Coding 86.0
256k tokens
DeepSeek-V4-Flash
Very low-cost long-context API workflows where DeepSeek compatibility and economics are attractive.
Reasoning 84.0
Coding 85.0
1M tokens
Claude Opus 4.8
High-autonomy agentic coding
Reasoning 96.0
Coding 96.0
1M tokens
Gemini 3.1 Pro Preview
Large-context reasoning
Reasoning 94.0
Coding 95.0
1M tokens
Qwen3-235B-A22B
Teams that want a serious open-weight frontier alternative with strong multilingual and agentic behavior.
Reasoning 90.0
Coding 91.0
128k tokens
DeepSeek V4 Pro
Cost-sensitive advanced reasoning and coding where context size still matters.
Reasoning 89.0
Coding 89.0
1M tokens
Claude Opus 4.7
Complex reasoning
Reasoning 95.0
Coding 95.0
1M tokens
GPT-5.5
High-stakes coding
Reasoning 96.0
Coding 97.0
1.05M tokens