The roster

Models

The roster comprises 40 models from 19 vendors, all accessed through OpenRouter: current flagships, mid-tier and small models, plus a handful of legacy and oddball entries. Each model predicted its own complete tournament (group scores, bracket, champion) before kickoff. The roster was also frozen before kickoff; any model added later would appear as an unranked exhibition entry.

Model	Model ID	Vendor	Tier	Knowledge cutoff	Input / Output price (USD per 1M tokens)
Jamba Large 1.7	ai21/jamba-large-1.7	AI21	flagship	unknown	$2 / $8
Qwen 2.5 72B	qwen/qwen-2.5-72b-instruct	Alibaba	legacy	unknown	$0.36 / $0.4
Qwen3.6 Flash	qwen/qwen3.6-flash	Alibaba	small	unknown	$0.1875 / $1.125
Qwen3.7 Max	qwen/qwen3.7-max	Alibaba	flagship	unknown	$1.25 / $3.75
Qwen3.7 Plus	qwen/qwen3.7-plus	Alibaba	mid	unknown	$0.4 / $1.6
Nova 2 Lite	amazon/nova-2-lite-v1	Amazon	mid	unknown	$0.3 / $2.5
Claude 3 Haiku	anthropic/claude-3-haiku	Anthropic	legacy	unknown	$0.25 / $1.25
Claude Fable 5	anthropic/claude-fable-5	Anthropic	flagship	unknown	$10 / $50
Claude Haiku 4.5	anthropic/claude-haiku-4.5	Anthropic	small	2025-02	$1 / $5
Claude Opus 4.8	anthropic/claude-opus-4.8	Anthropic	mid	unknown	$5 / $25
Command A	cohere/command-a	Cohere	flagship	unknown	$2.5 / $10
DeepSeek V4 Flash	deepseek/deepseek-v4-flash	DeepSeek	small	unknown	$0.0983 / $0.1966
DeepSeek V4 Pro	deepseek/deepseek-v4-pro	DeepSeek	flagship	unknown	$0.435 / $0.87
Gemini 3.1 Flash Lite	google/gemini-3.1-flash-lite	Google	small	unknown	$0.25 / $1.5
Gemini 3.1 Pro Preview	google/gemini-3.1-pro-preview	Google	flagship	unknown	$2 / $12
Gemini 3.5 Flash	google/gemini-3.5-flash	Google	mid	unknown	$1.5 / $9
Gemma 2 27B	google/gemma-2-27b-it	Google	legacy	unknown	$0.65 / $0.65
Mercury 2	inception/mercury-2	Inception	oddball	unknown	$0.25 / $0.75
Llama 3 70B	meta-llama/llama-3-70b-instruct	Meta	legacy	unknown	$0.51 / $0.74
Llama 4 Maverick	meta-llama/llama-4-maverick	Meta	flagship	2024-08	$0.15 / $0.6
Llama 4 Scout	meta-llama/llama-4-scout	Meta	small	2024-08	$0.1 / $0.3
WizardLM-2 8x22B	microsoft/wizardlm-2-8x22b	Microsoft	oddball	unknown	$0.62 / $0.62
MiniMax M3	minimax/minimax-m3	MiniMax	flagship	unknown	$0.3 / $1.2
Mistral Medium 3.5	mistralai/mistral-medium-3-5	Mistral	flagship	unknown	$1.5 / $7.5
Mistral Small 4	mistralai/mistral-small-2603	Mistral	small	unknown	$0.15 / $0.6
Kimi K2.6	moonshotai/kimi-k2.6	Moonshot	flagship	unknown	$0.68 / $3.41
Hermes 3 405B	nousresearch/hermes-3-llama-3.1-405b	Nous Research	oddball	unknown	$1 / $1
Nemotron 3 Ultra	nvidia/nemotron-3-ultra-550b-a55b	NVIDIA	flagship	unknown	$0.5 / $2.5
GPT-3.5 Turbo	openai/gpt-3.5-turbo	OpenAI	legacy	unknown	$0.5 / $1.5
GPT-4	openai/gpt-4	OpenAI	legacy	unknown	$30 / $60
GPT-4o	openai/gpt-4o	OpenAI	legacy	unknown	$2.5 / $10
GPT-5.4 Mini	openai/gpt-5.4-mini	OpenAI	small	2025-08	$0.75 / $4.5
GPT-5.4 Nano	openai/gpt-5.4-nano	OpenAI	small	unknown	$0.2 / $1.25
GPT-5.5	openai/gpt-5.5	OpenAI	flagship	2025-12	$5 / $30
GPT-5.5 Pro	openai/gpt-5.5-pro	OpenAI	flagship	unknown	$30 / $180
Hunyuan A13B	tencent/hunyuan-a13b-instruct	Tencent	oddball	unknown	$0.14 / $0.57
Grok 4.20	x-ai/grok-4.20	xAI	mid	unknown	$1.25 / $2.5
Grok 4.3	x-ai/grok-4.3	xAI	flagship	unknown	$1.25 / $2.5
GLM 4.7 Flash	z-ai/glm-4.7-flash	Z.AI	small	unknown	$0.06 / $0.4
GLM 5.1	z-ai/glm-5.1	Z.AI	flagship	unknown	$0.98 / $3.08

Knowledge cutoffs differ between models, and most are listed as unknown. That asymmetry is part of what the benchmark measures, so it is shown rather than corrected for. Full snapshot details are in data/roster.json.
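
To make the price column concrete, here is a minimal sketch of how the two per-million-token rates translate into the cost of a single call. The function name and the token counts are hypothetical, chosen only for illustration; the rates are the GPT-5.5 figures from the table ($5 in, $30 out). It assumes billing is simply linear in tokens and ignores any caching, batching or provider-specific adjustments.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Estimate the USD cost of one call from per-million-token prices.

    Prices are passed as strings (e.g. "0.1875") so that Decimal keeps
    them exact instead of inheriting binary floating-point error.
    """
    input_cost = Decimal(input_tokens) * Decimal(str(in_price_per_m))
    output_cost = Decimal(output_tokens) * Decimal(str(out_price_per_m))
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Hypothetical call: 20,000 prompt tokens and 4,000 completion tokens
# at the GPT-5.5 rates listed in the table ($5 / $30 per 1M tokens).
cost = estimate_cost_usd(20_000, 4_000, "5", "30")
print(cost)  # 0.22
```

Because output tokens are priced well above input tokens for most entries, a long prediction with a short prompt can cost more than the input rate alone would suggest.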