Have your own benchmark results? Add them to the leaderboard.

v0 pricing data — quality scores pending

AI Model Compression Benchmark

Cost and quality metrics for frontier LLMs evaluated at 5× context compression. How much do you save when you compress before sending?

v0 data: Pricing is hand-verified against each provider's pricing page where one exists. A small number of rows (Llama 4 Maverick, Mistral Large 3) are sourced from authoritative aggregators because no canonical pricing page is published; see “Pricing sources” below for the full list. Last verified . Quality scores (HumanEval, RULER, LLM-judge) have not been collected yet and are displayed as “—” until live benchmark runs complete.

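| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | HumanEval@5x | RULER@5x | LLM-Judge@5x | $/Quality |
|---|---|---|---|---|---|---|---|---|
| DeepSeek V4-Flash (source↗) | DeepSeek | $0.14 | $0.28 | 128K | — | — | — | — |
| GPT-5.4 nano (source↗) | OpenAI | $0.20 | $1.25 | 128K | — | — | — | — |
| Gemini 3.1 Flash-Lite (source↗) | Google | $0.25 | $1.50 | 1M | — | — | — | — |
| Llama 4 Maverick (source↗) | Meta | $0.27 | $0.85 | 524K | — | — | — | — |
| Gemini 3 Flash (source↗) | Google | $0.50 | $3.00 | 1M | — | — | — | — |
| Mistral Large 3 (source↗) | Mistral | $0.50 | $1.50 | 128K | — | — | — | — |
| GPT-5.4 mini (source↗) | OpenAI | $0.75 | $4.50 | 128K | — | — | — | — |
| Claude Haiku 4.5 (source↗) | Anthropic | $1.00 | $5.00 | 200K | — | — | — | — |
| Gemini 3.1 Pro (source↗) | Google | $1.25 | $10.00 | 1M | — | — | — | — |
| Grok 4.3 (source↗) | xAI | $1.25 | $2.50 | 131K | — | — | — | — |
| Claude Sonnet 4.6 (source↗) | Anthropic | $3.00 | $15.00 | 200K | — | — | — | — |
| GPT-5.5 (source↗) | OpenAI | $5.00 | $30.00 | 128K | — | — | — | — |
| Claude Opus 4.7 (source↗) | Anthropic | $15.00 | $75.00 | 1M | — | — | — | — |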

13 of 13 models shown


Methodology

Compression ratio

Quality scores will be measured at exactly 5× compression using the gotcontext production compressor once live benchmark runs complete. Costs already reflect post-compression token counts.
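
To make the savings arithmetic concrete, here is a minimal Python sketch of how a 5× ratio could change the cost of a single request. It is an illustration under stated assumptions, not the gotcontext compressor or the page's calculator: the function name `request_cost`, the token counts, and the assumption that compression shrinks only the input (prompt) tokens while output tokens are unchanged are all hypothetical. The prices are taken from the Claude Sonnet 4.6 row of the table.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price, ratio=1.0):
    """Return the USD cost of one request.

    Prices are in USD per 1M tokens. `ratio` is the compression ratio
    applied to the input only (assumption: output length is unaffected).
    """
    compressed_input = input_tokens / ratio
    return (compressed_input * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 100,000 prompt tokens, 1,000 completion tokens.
# Prices from the Claude Sonnet 4.6 row: $3.00 input, $15.00 output per 1M tokens.
baseline = request_cost(100_000, 1_000, 3.00, 15.00)              # no compression
compressed = request_cost(100_000, 1_000, 3.00, 15.00, ratio=5.0)  # 5x compression
saving = 1 - compressed / baseline

print(f"baseline ${baseline:.3f}, compressed ${compressed:.3f}, saving {saving:.1%}")
```

Under these assumptions the saving is below the 80% that a 5× cut in input tokens would suggest on its own, because output tokens are billed at the full rate either way.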

Quality metrics

HumanEval pass@1, RULER long-context subset, and LLM-as-judge (0-100) measure how much quality survives compression. Higher is better.
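
For readers unfamiliar with pass@1, the sketch below shows the standard unbiased pass@k estimator commonly used with HumanEval, which reduces to the fraction of correct samples when k = 1. It is only an illustration of the metric: the helper names and the sample counts are hypothetical, and nothing here describes how this benchmark's own harness will be implemented.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the unit tests, k: budget.
    """
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results):
    """Mean pass@1 over problems; `results` is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)


# Hypothetical counts for three problems, 10 samples each.
example = [(10, 3), (10, 10), (10, 0)]
print(f"pass@1 = {benchmark_pass_at_1(example):.3f}")
```

Running the same estimator on uncompressed and 5×-compressed prompts would give one way to see how much of the score survives compression.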

Pricing

Input and output prices in USD per 1M tokens, verified against each provider's official pricing page where one exists, with the remaining rows sourced from authoritative aggregators. All rows include a source link.

See how much you save with compression

Use our calculator to estimate token savings across any model in this benchmark.
