Have your own benchmark results? Add them to the leaderboard.
Cost and quality metrics for frontier LLMs evaluated at 5× context compression. How much do you save when you compress before sending?
v0 data: Pricing is hand-verified from each provider's pricing page where one exists. A small number of rows (Llama 4 Maverick, Mistral Large 3) are sourced from authoritative aggregators because no canonical page is published; see “Pricing sources” below for the full list. Last verified . Quality scores (HumanEval, RULER, LLM-judge) are not yet collected and show “—” until live benchmark runs complete.
| Model | Provider | Input $/1M | Output $/1M | Context | HumanEval@5x | RULER@5x | LLM-Judge@5x | $/Quality |
|---|---|---|---|---|---|---|---|---|
| DeepSeek V4-Flash (source↗) | DeepSeek | $0.14 | $0.28 | 128K | — | — | — | — |
| GPT-5.4 nano (source↗) | OpenAI | $0.20 | $1.25 | 128K | — | — | — | — |
| Gemini 3.1 Flash-Lite (source↗) | Google | $0.25 | $1.50 | 1M | — | — | — | — |
| Llama 4 Maverick (source↗) | Meta | $0.27 | $0.85 | 524K | — | — | — | — |
| Gemini 3 Flash (source↗) | Google | $0.50 | $3.00 | 1M | — | — | — | — |
| Mistral Large 3 (source↗) | Mistral | $0.50 | $1.50 | 128K | — | — | — | — |
| GPT-5.4 mini (source↗) | OpenAI | $0.75 | $4.50 | 128K | — | — | — | — |
| Claude Haiku 4.5 (source↗) | Anthropic | $1.00 | $5.00 | 200K | — | — | — | — |
| Gemini 3.1 Pro (source↗) | Google | $1.25 | $10.00 | 1M | — | — | — | — |
| Grok 4.3 (source↗) | xAI | $1.25 | $2.50 | 131K | — | — | — | — |
| Claude Sonnet 4.6 (source↗) | Anthropic | $3.00 | $15.00 | 200K | — | — | — | — |
| GPT-5.5 (source↗) | OpenAI | $5.00 | $30.00 | 128K | — | — | — | — |
| Claude Opus 4.7 (source↗) | Anthropic | $15.00 | $75.00 | 1M | — | — | — | — |
Quality scores will be measured at exactly 5× compression using the gotcontext production compressor once live benchmark runs complete. Costs already reflect post-compression token counts.
HumanEval pass@1, RULER long-context subset, and LLM-as-judge (0-100) measure how much quality survives compression. Higher is better.
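The page does not define how the $/Quality column is computed. The sketch below shows one possible reading, not the leaderboard's formula: a blended price per 1M tokens divided by the mean of the three quality scores. The function name `cost_per_quality`, the `output_share` weighting, and the assumption that all three scores are on a 0-100 scale are hypothetical.

```python
from statistics import mean


def cost_per_quality(input_price, output_price, scores, output_share=0.25):
    """Blended USD per 1M tokens divided by the mean quality score.

    Assumed definition for illustration only; the leaderboard does not
    publish its formula. `scores` are expected on a 0-100 scale and
    `output_share` is the assumed fraction of billed tokens that are output.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    blended = (1.0 - output_share) * input_price + output_share * output_price
    quality = mean(scores)
    if quality <= 0:
        raise ValueError("mean quality score must be positive")
    return blended / quality
```

Under this reading, lower values are better: a model earns a low $/Quality either by being cheap or by keeping its scores high after compression. No value can be computed for any row until the quality scores are collected.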
Input and output prices in USD per 1M tokens, verified against each provider's official pricing page where one exists, with the remaining rows sourced from authoritative aggregators. All rows include a source link.
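To make the per-1M-token prices concrete, here is a minimal sketch of the cost arithmetic for a single request with and without compression. It assumes that compression shrinks only the input tokens, that the ratio is exactly 5×, and that output length is unchanged; the helper names `request_cost` and `savings` are illustrative and not part of any gotcontext API. The two prices are copied from the table above.

```python
# (input, output) prices in USD per 1M tokens, copied from the table.
PRICES = {
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def request_cost(model, input_tokens, output_tokens, compression=1.0):
    """USD cost of one request.

    `compression` divides the input token count only (assumption);
    output tokens are billed in full.
    """
    if compression <= 0:
        raise ValueError("compression ratio must be positive")
    in_price, out_price = PRICES[model]
    billed_input = input_tokens / compression
    return (billed_input * in_price + output_tokens * out_price) / 1_000_000


def savings(model, input_tokens, output_tokens, compression=5.0):
    """USD saved on one request by compressing the input before sending."""
    before = request_cost(model, input_tokens, output_tokens)
    after = request_cost(model, input_tokens, output_tokens, compression)
    return before - after
```

For example, a 100,000-token prompt with a 1,000-token reply on Claude Sonnet 4.6 costs about $0.315 uncompressed and about $0.075 at 5×, a saving of roughly $0.24 per request under these assumptions. Because output tokens are not compressed, the saving is always less than 80% of the total bill.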
Use our calculator to estimate token savings across any model in this benchmark.