What do the gotcontext.ai plans cost?

gotcontext.ai has five tiers: Free ($0/mo, 1,000 compressions, 1 seat), Pro ($49/mo, 50,000 compressions, 1 seat), Team ($99/mo, 100,000 compressions pooled, unlimited seats), Business ($199/mo, 500,000 compressions, unlimited seats, compliance controls), and Enterprise Dedicated ($499/mo, reserved capacity, single-tenant, custom MSA).

Do all plans include the same MCP tools?

All paid plans (Pro, Team, Business, Enterprise Dedicated) include all 155 MCP tools. Plans differ only on monthly compression volume, embedding fidelity tier, and enterprise wraparound features — not on which tools are available. The Free tier includes select tools including gc_lookup.

How much can I save on LLM API costs?

We state a conservative ~50% typical token reduction (40-60% by document size; the live /v1/global-savings average runs higher). At Claude Sonnet 4 pricing ($3/MTok input), a Pro-tier user running 50,000 compressions of average-sized documents saves roughly $375/month in model costs — enough to pay for the plan several times over. Actual savings vary by document size and model.

Is there a free trial or money-back guarantee?

The Free tier requires no credit card and gives you 1,000 compressions per month to evaluate the service. There is no time-limited free trial on paid plans, but you can upgrade, downgrade, or cancel any time — no annual commitment required on monthly plans.

What is the Team plan seat limit?

The Team plan ($99/mo) and Business plan ($199/mo) have unlimited seats — the entire engineering org shares one compression quota with no per-seat fees. The Free plan is capped at 1 seat; the Pro plan is also 1 seat. Enterprise Dedicated is also unlimited seats.

Plans and pricing

Monthly plans for the MCP compression gateway.

Name: gotcontext MCP Compression Gateway
Brand: gotcontext.ai

A compression is one POST /v1/compress call (or one MCP tool call that wraps it), capped at the per-tier document size below. Five plans: from a free developer tier through reserved-capacity Enterprise. All MCP tools included on every paid plan.

Prices in USD. Annual contracts on Business and above. Procurement artifacts, sub-processor list, and DPA are in Procurement below.

What compression saves in practice

Pro · $49/mo

~$375/mo in token cost

Claude Sonnet 4 at $3/MTok input

at 50% avg reduction, 5K avg doc

Team · $99/mo

~$750/mo in token cost

Claude Sonnet 4 at $3/MTok input

pooled across unlimited seats

Business · $199/mo

~$3,750/mo in token cost

Claude Sonnet 4 at $3/MTok input

metered overage past 500K limit

Enterprise Dedicated

Custom ROI projection

Any model, any scale

we model your traffic, you see the delta

Estimates use a conservative 50% typical reduction (see live data at /v1/global-savings) × Claude Sonnet 4 list rate ($3/MTok input) × 5K-token average document × the plan's included monthly compressions. Your actual savings depend on doc mix and model. Use the calculator below to project your volume.
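The arithmetic behind the three figures above can be reproduced in a few lines. This is a minimal sketch of the stated formula, not the site's calculator; the function name and parameter names are illustrative, and the defaults are the assumptions quoted in the estimate (50% reduction, 5K-token documents, $3/MTok input).

```python
def monthly_savings_usd(compressions, avg_doc_tokens=5_000,
                        reduction=0.50, usd_per_mtok=3.00):
    """Estimated input-token spend avoided per month, in USD.

    Hypothetical helper: defaults mirror the assumptions stated on the page.
    """
    tokens_before = compressions * avg_doc_tokens   # tokens sent without compression
    tokens_saved = tokens_before * reduction        # tokens removed by compression
    return tokens_saved / 1_000_000 * usd_per_mtok  # priced per million input tokens


# Included monthly volumes from the plan cards.
for plan, volume in [("Pro", 50_000), ("Team", 100_000), ("Business", 500_000)]:
    print(f"{plan}: ~${monthly_savings_usd(volume):,.0f}/mo")
```

Swapping in your own average document size, reduction rate, or model price shows how sensitive the estimate is to each input.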

Tier estimator

What does your usage cost?

Drag the slider to your expected monthly compression volume. We’ll recommend a tier and project your monthly cost. Real reduction depends on document mix; the calculator gives the floor.

Monthly compressions: 50K (slider range: 500 to 5M)

Recommended tier: Pro
Effective monthly cost: $49/mo
Tier limits: Up to 50K compressions / month · 1 MB max doc size · 30 days payload retention · 1 seat.

Start Pro

Numbers are calculator estimates. Real reduction depends on document mix, fidelity, and downstream model.
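For readers who want the estimator's result without the slider, here is a rough sketch of one way to map volume to a tier. The selection rule (cheapest plan whose included quota covers the volume) and the prorated overage billing are assumptions, since the page does not document the calculator's logic. Enterprise Dedicated is left out because its price is quoted as "from $499/mo".

```python
# (name, included compressions per month, flat USD per month), from the plan cards.
TIERS = [
    ("Free", 1_000, 0),
    ("Pro", 50_000, 49),
    ("Team", 100_000, 99),
    ("Business", 500_000, 199),
]
BUSINESS_OVERAGE_USD_PER_1K = 0.50  # metered overage rate listed for Business


def recommend_tier(monthly_compressions):
    """Return (tier name, estimated monthly cost in USD) for a given volume."""
    for name, included, price in TIERS:
        if monthly_compressions <= included:
            return name, float(price)
    # Past 500K: Business plus metered overage.
    # Assumption: overage is prorated per compression rather than per started 1,000.
    extra = monthly_compressions - 500_000
    return "Business", 199 + extra / 1_000 * BUSINESS_OVERAGE_USD_PER_1K


print(recommend_tier(50_000))   # the slider's default position
print(recommend_tier(650_000))  # 150K compressions past the Business quota
```

At high volumes the metered Business cost eventually passes the Enterprise Dedicated starting price, which is the point where a sales conversation makes more sense than this sketch.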

Free: 1 seat · Pro: 1 seat · Team, Business, Enterprise Dedicated: unlimited seats, flat price

Free

Solo dev · try before you buy

$0/mo

For individuals validating semantic compression on their own inputs.

Start free

Works with Claude Code, Cursor, and any MCP client

Included

1,000 compressions / month (hard-stop at 1,200)
100 KB max document size
1 concurrent compression slot
17 core MCP tools (compression + filter_cli + search_semantic)
14-day payload retention
Community support · no SLA

Pro

Individual developer · solo AI engineering

$49/mo

For individual developers running 50k context compressions per month.

Start 14-day free trial

14 days free, then $49/mo. Cancel anytime: one click, no questions.

Included

50,000 compressions / month (hard-stop at 60,000)
1 MB max document size
2 concurrent compression slots · 60 req/min rate limit
All MCP tools (compression, memory, code analysis, multimodal, ACE)
Accelerated ONNX embedding tier (3-5× throughput)
30-day payload retention
Email support · 2 business-day first response · no SLA
Card payment · 14-day refund on monthly · annual saves 20%

Team

Recommended for 10 to 50 engineers

$99/mo

Shared compression budget across your engineering org. Unlimited seats, no per-seat add-ons.

Get Team: $99/mo

Included

100,000 compressions / month, pooled across unlimited seats
5 MB max document size · hard-stop at 120,000
4 concurrent compression slots · 300 req/min rate limit
All MCP tools + async batch queue + compression projects
RBAC roles: owner, admin, operator, viewer
GitHub integration + advanced analytics + CSV export
90-day payload retention
Email support · 1 business-day first response · no SLA
Card + ACH payment · 14-day refund on monthly · annual saves 20%

Business

Growth-stage company · compliance + self-hosted

$199/mo

Shared infrastructure. Unlimited seats. 99.5% SLA. Metered overage past 500k. Includes SSO (SAML 2.0 + OIDC), audit-log export, DPA, and self-hosted Docker.

Get Business: $199/mo

Included

500,000 compressions / month pooled · metered overage $0.50 per 1,000 (auto-billed)
10 MB max document size
8 concurrent compression slots · 500 req/min rate limit
SBERT embedding tier (highest semantic fidelity)
Self-hosted Docker: data plane in your VPC (= BYOK answer)
SSO via SAML 2.0 + OIDC (Okta, Entra ID, Auth0, Keycloak)
Audit-log export (NDJSON + CSV)
1-year payload retention · zero-retention mode in self-hosted
99.5% monthly SLA with 10/25/50% credit schedule
Priority email + Slack-connect · 1 BD first response
Card + ACH + Wire + PO + Invoice · annual invoicing · DPA + IP indemnity + custom MSA

Enterprise Dedicated

Fortune 500 · reserved capacity · single-tenant

$499/mo

Reserved capacity pool. Your traffic never shares a process with another customer.

Talk to sales: from $499/mo

Included

Everything in Business
Single-tenant capacity (4-8 dedicated nodes, your VPC region of choice)
20+ guaranteed concurrent compressions at peak
No noisy-neighbor: your queue is isolated from shared traffic
Custom rate limits, no /v1/compress throughput cap
Configurable payload retention · zero-retention mode available
Data residency available on request (US today; EU on roadmap H2 2026)
99.9% monthly SLA · custom credit schedule · status.gotcontext.ai
Dedicated channel + named CSM · 4h P1 first response
Quarterly business review with usage + capacity plan
On roadmap (H2 2026): BYOK / SCIM provisioning / EU + APAC region

Plans differ on volume and fidelity, not capability. All 155 MCP tools ship on every paid plan: compression, semantic memory, code analysis, multimodal, and ACE workflows.

View all 155 tools →

How it compares

Why not just run LLMLingua?

The obvious question. LLMLingua is free and open source. The table below covers what you give up when you self-host vs using a managed MCP gateway.

Comparison: gotcontext vs LLMLingua, Langfuse, and per-token APIs (Cohere/Voyage)
Dimension	gotcontext	LLMLingua (OSS)	Langfuse ($0 to $29)	Cohere/Voyage Compact
MCP gateway built in	✓ Native: Claude Code, Cursor, any MCP client	Build it yourself	Not a compression tool	API call, not MCP-native
Compression engine	Semantic (ONNX + PageRank). Local, no LLM API call.	Prompt-compression (token-level)	No compression; tracing only	Embedding model reranking
Setup time	< 5 min: add MCP server URL to claude_desktop_config.json	Python env, GPU recommended, write integration (see LLMLingua docs)	~10 min (SDK + API key)	~5 min (API key + write call)
Maintenance burden	Zero: managed infra, version upgrades automatic	Model updates, infra, embedding drift (your ops team)	Low (managed SaaS)	Low (managed SaaS)
Self-hosted option	Business and above: data plane in your VPC	Always self-hosted (that's the product)	$0 self-host or $29/mo cloud	Cloud API only
Pricing model	Per-compression flat (not per-token): predictable at scale	Free (your infra cost)	Free tier / $29 team	Per 1M tokens (variable)

LLMLingua and Langfuse are open source projects we respect. This comparison reflects their architectures, not a claim of superiority. Choose what matches your deployment model and team capacity.

Feature comparison by tier

Limits, support, and security across all five plans.

Feature comparison by tier: Free, Pro, Team, Business, and Enterprise Dedicated
Specification	Free	Pro	Team	Business	Enterprise Dedicated
Compression
Monthly compressions	1,000	50,000	100,000	500,000	Unlimited (within capacity pool)
Max document size	100 KB	1 MB	5 MB	10 MB	Custom (negotiable)
Overage policy	Hard-stop	Hard-stop at 60K	Hard-stop at 120K	Metered $0.50 / 1K	Contractual
Batch ingestion	—	Included	Included	Included	Included
Async batch queue	—	—	Included	Included	Included
Compression projects	—	—	Included	Included	Included
Seats, retention, SLA
Seats	1	1	Unlimited (pooled quota)	Unlimited (pooled quota)	Unlimited
Payload retention	14 days	30 days	90 days	1 year	Configurable + zero-retention mode
Monthly uptime SLA	—	—	—	99.5%	99.9%
Service-credit schedule	—	—	—	10 / 25 / 50%	Custom terms
Status page	—	—	—	status.gotcontext.ai	status.gotcontext.ai + custom
Embeddings
Standard compression	Included	Included	Included	Included	Included
Accelerated compression (3-5x faster)	—	Included	Included	Included	Included
Custom embedding models	—	—	—	—	Included
Security & control
API key management	Included	Included	Included	Included	Included
API rate limit	10 req/min	60 req/min	300 req/min	500 req/min	Custom
MCP Server tool access	17 core compression tools	All MCP tools	All MCP tools	All MCP tools	All MCP tools
Fidelity Profiles	Included	Included	Included	Included	Included
Prompt Cache Audit	—	Included	Included	Included	Included
Advanced analytics & CSV export	—	—	Included	Included	Included
Teams	—	—	Included	Included	Included
Webhooks	—	Included	Included	Included	Included
Audit-log export (NDJSON/CSV)	—	—	—	Included	Included
SSO via SAML 2.0 + OIDC	—	—	—	Included	Included
Self-hosted Docker (data plane in your VPC)	—	—	—	Included	Included
BYOK (via self-hosted = your VPC)	—	—	—	Included	Included
SCIM provisioning	—	—	—	—	On roadmap H2 2026
Customer-managed encryption keys (cloud)	—	—	—	—	On roadmap H2 2026
Data residency (US / EU / APAC)	US	US	US	US	US (EU + APAC on roadmap)
DPA, IP indemnity, custom MSA	—	—	—	Included	Included
Support & billing
Support	Community	Email · 2 BD	Email · 1 BD	Priority email + Slack-connect	Dedicated channel + named CSM
P1 first response	—	—	—	1 business day	4 hours
Payment methods	—	Card	Card + ACH	Card + ACH + Wire + PO + Invoice	Custom (invoice / PO / ACH / Wire)
Annual discount	—	20% (2.4 months free)	20% (2.4 months free)	Annual invoice only	Custom contract
Refund policy	—	14-day money-back	14-day money-back	Annual prorated within 30 days	Per contract
Platform
Command Palette (Cmd+K)	Included	Included	Included	Included	Included
Activity Feed	Included	Included	Included	Included	Included
Dark/Light Theme	Included	Included	Included	Included	Included
CSV Export	Included	Included	Included	Included	Included
Queue Monitor (real-time SSE)	—	Included	Included	Included	Included
Webhook Notifications	—	Included	Included	Included	Included
Usage Analytics	—	Included	Included	Included	Included
GitHub Integration	—	—	Included	Included	Included
RBAC Roles	—	—	Included	Included	Included
Shared Projects	—	—	Included	Included	Included
MCP Tool Compression	—	—	Included	Included	Included
SSO / SAML	—	—	—	Included	Included
Audit Trail	—	—	—	Included	Included
Dedicated Support	—	—	—	Included	Included
Custom Integrations	—	—	—	Included	Included

Savings

Project your savings.

The ~50% figure on the landing hero is a conservative typical saving (live data at /v1/global-savings runs higher), useful as a directional signal, not a projection of your savings. Real reduction depends on your document mix, fidelity choice, and downstream model. Per-model breakdowns (Opus 4.7 vs Gemini Flash vs GPT-5.5) live at /savings-by-model.

Want a number for your own traffic? Use the tier estimator above to project your monthly cost by compression volume, or contact us with 7 days of usage data and we’ll model the monthly delta against your raw token cost across any model.

Compliance and procurement

Built for procurement review

The artifacts a Fortune-500 vendor risk team will ask for, ready before the call.

SOC 2
Type I in progress, target Q3 2026. Not yet certified; stated honestly.
DPA
Available on request, emailed within one business day; GDPR Art. 28 conformant.
Sub-processors
Cloudflare · Fly · Supabase · Upstash · Clerk · Polar · Resend · Sentry · PostHog. Full list with 30-day change notice.
Self-hosted Docker
Business and Enterprise Dedicated. Data plane in your VPC; control plane SaaS. Operates as the BYOK answer.
Audit log export
NDJSON + CSV; 90-day retention on Business, configurable on Enterprise Dedicated.
Status page
status.gotcontext.ai with 90-day rolling uptime per component. Required reading before signing any SLA tier.
Liability cap
Negotiable on annual contracts; default capped at 12 months of fees in MSA template.
On roadmap (H2 2026)
SCIM provisioning · cloud BYOK / CMEK · EU + APAC data residency · SOC 2 Type II close.

Frequently asked questions

Answers before the call

Anything not covered here? Use the contact form below.

One compression is one POST /v1/compress request (or one MCP tool call that wraps it), capped at the per-tier document size: 100 KB Free, 1 MB Pro, 5 MB Team, 10 MB Business, custom on Enterprise Dedicated. A single 30 KB design doc, a 500 KB GitHub diff, and a 2 MB transcript all count as one compression each, regardless of how many tokens are saved.
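The per-tier size caps and the "one document, one compression" rule can be checked with a short sketch. The helper name is hypothetical, and treating KB and MB as decimal units (1 KB = 1,000 bytes) is an assumption, since the page does not say whether the caps are decimal or binary.

```python
KB = 1_000       # assumption: decimal units; the page does not specify
MB = 1_000 * KB

# Max document size per tier, in ascending order (Enterprise Dedicated is custom).
MAX_DOC_BYTES = [
    ("Free", 100 * KB),
    ("Pro", 1 * MB),
    ("Team", 5 * MB),
    ("Business", 10 * MB),
]


def smallest_tier_for(doc_bytes):
    """Return the lowest tier whose size cap admits a document of this size."""
    for name, cap in MAX_DOC_BYTES:
        if doc_bytes <= cap:
            return name
    return "Enterprise Dedicated"  # custom cap, negotiated per contract


# The three example documents from the answer above.
docs = {"design doc": 30 * KB, "GitHub diff": 500 * KB, "transcript": 2 * MB}
for label, size in docs.items():
    print(f"{label}: fits {smallest_tier_for(size)} and above")

# Each document is one request, so usage is counted per document, not per token.
print(f"compressions used: {len(docs)}")
```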

gotcontext chunks the document, embeds each chunk with a local ONNX sentence-transformer (no external embedding API call), builds a similarity graph, scores nodes with PageRank, and returns the top-K chunks in document order as a compressed skeleton. Token reduction depends heavily on document size, fidelity setting, and embedding tier. See /v1/global-savings for the live rolling average across production traffic and use the Savings section above to project against your own usage.
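The pipeline described above (chunk, embed, similarity graph, PageRank, top-K in document order) can be illustrated with a standard-library sketch. It is not gotcontext's implementation: a bag-of-words vector stands in for the ONNX sentence-transformer embedding, chunks are single sentences, and every function name and parameter here is illustrative.

```python
import math
import re
from collections import Counter


def chunk(text):
    """Split text into sentence-sized chunks (a stand-in for real chunking)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(chunk_text):
    """Stand-in embedding: word counts instead of a sentence-transformer vector."""
    return Counter(re.findall(r"[a-z0-9]+", chunk_text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over a similarity matrix."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                scores[i] * weights[i][j] / out_sum[i]
                for i in range(n) if out_sum[i] > 0
            )
            for j in range(n)
        ]
    return scores


def compress(text, top_k=2):
    """Keep the top_k highest-ranked chunks, returned in document order."""
    chunks = chunk(text)
    if not chunks:
        return []
    vecs = [embed(c) for c in chunks]
    n = len(chunks)
    weights = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    scores = pagerank(weights)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [chunks[i] for i in sorted(ranked)]  # restore document order


sample = ("Cache keys use tenant id. Cache keys expire after a day. "
          "Lunch was good. Tenant cache keys are hashed.")
print(compress(sample, top_k=2))
```

Chunks that share vocabulary with many others accumulate rank, while an off-topic chunk with no similar neighbors keeps only the baseline score and is dropped first. A real sentence embedding would capture semantic similarity rather than shared words.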

No. gotcontext is a compression layer. We don’t fine-tune or train any models on customer inputs. The compression engine is deterministic ONNX inference plus PageRank; there is no learned component that sees your text. Inputs to /v1/compress are processed in-memory; payloads are not retained beyond the request-response lifecycle (Business/Enterprise: zero-retention mode in self-hosted Docker keeps prompts entirely inside your VPC).

Free, Pro, and Team are best-effort with no contractual SLA, which matches the industry norm for self-serve tiers (Vercel Pro, Cloudflare Workers, Supabase Pro: no SLA either). Business carries a 99.5% monthly uptime SLA with a 10% / 25% / 50% service-credit schedule (claim-based, capped at one month of fees). Enterprise Dedicated has a 99.9% SLA with custom credit terms. A public status page with 90-day uptime history runs at status.gotcontext.ai; it is required reading before signing for any tier with an SLA.
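To put the two SLA percentages in concrete terms, the sketch below converts them into a monthly downtime budget. It assumes a 30-day month and that all downtime counts toward the SLA; the actual measurement window and exclusions would be defined in the contract, and the function name is illustrative.

```python
def allowed_downtime_minutes(sla_percent, days_in_month=30):
    """Minutes of downtime per month that still meet the given uptime SLA."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


for tier, sla in [("Business", 99.5), ("Enterprise Dedicated", 99.9)]:
    print(f"{tier} ({sla}%): {allowed_downtime_minutes(sla):.1f} min/month")
```

Under these assumptions the Business SLA tolerates about 3.6 hours of downtime per month and the Enterprise Dedicated SLA about 43 minutes.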

Contact sales

Enterprise volume and self-hosted

Compliance reviews, custom SLAs, dedicated capacity, on-prem deployments. Tell us about your use case and we’ll respond within one business day.
