gotcontext + rtk: stack two compression layers, kill 90%+ of your token bill

rtk-ai/rtk handles structured command output (git, test, lint). gotcontext handles unstructured text (docs, plans, KB). They sit at different layers of the agent stack and stack to ~90% joint reduction. Here is the math, the 90-second setup for both, and why we are recommending a "competitor".

By James Hollingsworth (Contributor) · 6 min read · ~692 words

Two compression layers that don't compete

If you run an agentic coding loop (Claude Code, Cursor, Codex, Gemini CLI), your tokens come from two distinct sources, and no single compression strategy fixes both.

Source 1: structured command output. Every git status, pytest, tsc, ls, grep your agent runs returns text the LLM has to read. A single 30-minute coding session typically routes ~118K tokens through the Bash tool, and most of it is whitespace, repeated headers, and noise.

Source 2: unstructured documentation. Specs, plans, READMEs, conversation history, ingested files. This is what gets piped into the agent's context window every time you ask "what does this codebase do?"

Compressing each source needs a different mechanism. rtk (rtk-ai/rtk, 44k stars, Apache 2.0) handles source 1. gotcontext handles source 2. Together they kill 90%+ of total token volume.

Where each one operates

| | rtk | gotcontext |
|---|---|---|
| Domain | Structured command output | Arbitrary text |
| Method | Deterministic per-command filters | Semantic chunking + PageRank importance scoring |
| Architecture | Local CLI hook | Remote API + MCP gateway |
| Where it sits | Between agent's Bash tool and the shell | Between agent and your KB / docs / specs |
| License | Apache 2.0 | Proprietary (free + paid plans) |
| Setup | `brew install rtk && rtk init -g` | MCP config + API key |

rtk rewrites Bash commands transparently: when your agent calls git status, the hook intercepts the call and runs rtk git status instead. The agent never sees the rewrite; it just gets compressed output.
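To make the interception idea concrete, here is a minimal sketch of a command rewriter in Python. It is not rtk's implementation: the allowlist and the function name `rewrite_command` are hypothetical, and rtk's real matching rules may differ. It only illustrates the shape of "agent asks for one command, hook runs a wrapped one".

```python
import shlex

# Hypothetical allowlist for this sketch only; rtk's real supported
# command set and matching rules may differ.
WRAPPED_COMMANDS = {"git", "pytest", "tsc", "ls", "grep"}


def rewrite_command(command: str) -> str:
    """Prefix a shell command with `rtk` when its first word is on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors: leave the command untouched.
        return command
    if not tokens or tokens[0] == "rtk":
        # Empty input, or the command is already wrapped.
        return command
    if tokens[0] in WRAPPED_COMMANDS:
        return "rtk " + command
    return command


print(rewrite_command("git status"))      # rtk git status
print(rewrite_command("rtk git status"))  # rtk git status
print(rewrite_command("echo hello"))      # echo hello
```

Commands outside the allowlist pass through unchanged, which is why a hook of this kind can be installed globally without affecting the rest of the shell traffic.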

gotcontext exposes MCP tools (ingest_context, compress_codebase, gc_kb_query, gc_blast_radius, etc.) that your agent calls explicitly when it needs to compress arbitrary content.
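For readers who have not looked at MCP on the wire, an explicit tool call is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message in Python. The tool name `ingest_context` comes from the list above, but the argument key `text` is a hypothetical placeholder; the real argument names come from the schema each tool publishes.

```python
import json
from itertools import count

# JSON-RPC requests need an id so responses can be matched to requests.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# "text" is a hypothetical argument name; check the tool's published schema.
message = build_tool_call("ingest_context", {"text": "Design doc text goes here."})
print(message)
```

In practice your agent's MCP client builds these messages for you; the point is that the call is explicit and named, unlike a transparent Bash hook.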

The two systems literally cannot conflict: they intercept different points in the agent's data flow.

The math, joint impact

rtk's published numbers per 30-min Claude Code session (rtk-ai/rtk README):

| Operation | Standard (tokens) | rtk (tokens) | Savings |
|---|---|---|---|
| git family | 17,100 | 3,720 | -78% |
| Test runners | 39,000 | 3,900 | -90% |
| Lint / build | 6,000 | 1,200 | -80% |
| File reads / search | 56,000 | 15,200 | -73% |
| Total dev commands | ~118K | ~24K | -80% |

gotcontext's typical results on documentation + code corpora:

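| Document Type | Original (tokens) | Compressed (tokens) | Savings |
|---|---|---|---|
| API docs | 7,200 | 1,440 | -80% |
| Source code (500 lines) | 4,200 | 1,260 | -70% |
| Large codebase (50 files) | 48,000 | 7,200 | -85% |

If your agent burns 50K tokens on commands and 50K on context per session, the raw total is 100K tokens. Applying rtk's ~80% to the command half and gotcontext's ~85% to the context half leaves ~10K + ~7.5K = 17.5K tokens. That is an 82.5% joint reduction and, at a flat per-token price, the same reduction in spend.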
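The arithmetic above can be checked, and rerun for your own mix, with a few lines of Python. This is a sketch: the function name `joint_reduction` is ours, and the 80% and 85% rates are the headline figures from the two tables, not measurements of your workload.

```python
def joint_reduction(sources):
    """Combine per-source savings into one figure.

    sources: list of (raw_tokens, savings_rate) pairs, rate in [0, 1].
    Returns (compressed_total_tokens, overall_reduction).
    """
    raw_total = sum(raw for raw, _ in sources)
    compressed_total = sum(raw * (1 - rate) for raw, rate in sources)
    return compressed_total, 1 - compressed_total / raw_total


# 50K command tokens at 80% savings, 50K context tokens at 85% savings.
compressed, reduction = joint_reduction([(50_000, 0.80), (50_000, 0.85)])
print(f"{compressed:,.0f} tokens left, {reduction:.1%} saved")
# 17,500 tokens left, 82.5% saved
```

The joint figure is a token-weighted average of the per-source rates, so it moves toward whichever source dominates your sessions.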

Setup: 90 seconds for both

rtk

```bash
brew install rtk   # or curl install: see rtk-ai.app
rtk init -g        # installs Claude Code hook
# restart Claude Code
```

That's it. Your next git status will route through rtk transparently. The hook only fires on Bash tool calls. Read, Grep, Glob builtins don't pass through, so call rtk read / rtk grep directly when you want them compressed.

gotcontext

Add to your Claude Code MCP config (~/.claude/claude_desktop_config.json):

```json
{
  "mcpServers": {
    "gotcontext": {
      "url": "https://api.gotcontext.ai/mcp",
      "headers": {
        "Authorization": "Bearer gc_live_YOUR_KEY"
      }
    }
  }
}
```

Get a key at gotcontext.ai/sign-up. Free tier covers 1,000 compressions/month, no card required.
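If you would rather not paste the key into a file by hand, a small script can generate the same config from an environment variable. This is a sketch under assumptions: the variable name `GOTCONTEXT_API_KEY` and the function `build_mcp_config` are hypothetical, and the script only prints the JSON rather than writing to your config file.

```python
import json
import os

MCP_URL = "https://api.gotcontext.ai/mcp"  # endpoint from the config above


def build_mcp_config(api_key: str) -> dict:
    """Return the mcpServers entry shown above with the given key filled in."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "mcpServers": {
            "gotcontext": {
                "url": MCP_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


# GOTCONTEXT_API_KEY is a hypothetical variable name chosen for this sketch;
# the placeholder matches the one in the JSON above.
key = os.environ.get("GOTCONTEXT_API_KEY", "gc_live_YOUR_KEY")
print(json.dumps(build_mcp_config(key), indent=2))
```

Merge the printed block into your existing config rather than overwriting the file, so other MCP servers you have registered are preserved.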

Why we're recommending a "competitor"

rtk and gotcontext sit at different layers of the agent's stack. rtk's hook intercepts the Bash tool; gotcontext serves MCP tools. We can't compress your git status from the API side because we never see it. rtk can't compress your 200KB design doc because it doesn't have a semantic graph engine.

The token-savings space is big enough that two products can serve it without colliding. What hurts users is choosing one and missing the other half. So: install both.

Operational notes

  • rtk is local, gotcontext is remote. rtk runs on your machine; no data leaves. gotcontext requires sending content to our API for compression. If your KB is sensitive, review what each source would send off-machine before you ingest it.
  • rtk init -g writes hooks to your Claude Code config. Inspect ~/.claude/settings.json after install if you want to see exactly what changed.
  • gotcontext's free tier is real: 1,000 compressions/month with no credit card. Pro is $49/mo for 50K compressions + 100+ MCP tools.
  • Licensing differs. rtk is fully open source under Apache 2.0; gotcontext's server is proprietary, but its client SDKs are MIT-licensed.

TL;DR

  • rtk kills ~80% of your dev-command token bill (git, test, lint, build, ls, cat, grep)
  • gotcontext kills 70-85% of your documentation / context-window token bill, depending on document type
  • They cannot conflict: different layers of the agent stack
  • Install both for a joint reduction of about 82% on an even command/context split, more if test output dominates your sessions
  • 90 seconds total setup

    Cite this

    Researchers, analysts, or journalists referencing this post can use either format below.

    BibTeX
    @misc{rtk-companion-token-savings-2026,
      title  = {gotcontext + rtk: stack two compression layers, kill 90%+ of your token bill},
      author = {James Hollingsworth},
      year   = {2026},
      month  = {May},
      url    = {https://www.gotcontext.ai/blog/rtk-companion-token-savings},
      note   = {gotcontext.ai engineering blog.},
    }
    APA
    Hollingsworth, J. (2026, May 8). gotcontext + rtk: stack two compression layers, kill 90%+ of your token bill. gotcontext.ai. https://www.gotcontext.ai/blog/rtk-companion-token-savings
