Glossary

Blast radius

Pro+

The ranked set of files, symbols, and call chains that a given code change could affect.

The gc_blast_radius MCP tool takes a focus symbol (function name, class, module) and returns a compressed context window containing the symbol's definition, its direct callers, and the caller graph one level out — sorted by relevance score. The intent is to answer "what else do I need to read before touching this?" in a single MCP call instead of N file reads.

Internally, blast radius is computed by tensor-grep (the AST-aware symbol indexer embedded in the API container), then the ranked results are passed through the compression engine so the returned context fits a tight token budget.

Available on Pro+ plans. The companion tool gc_callers returns only the direct caller list without the full ranked context.
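As a rough illustration, the two tools could be invoked with a standard MCP tools/call envelope. This is a minimal sketch: the envelope shape follows the MCP specification, but the argument name "symbol" and the example value "parse_config" are assumptions, not the documented schema. Check the tools/list output for the real input schema.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "symbol" is a hypothetical argument name used for illustration only.
blast = build_tool_call("gc_blast_radius", {"symbol": "parse_config"})
callers = build_tool_call("gc_callers", {"symbol": "parse_config"})

print(json.dumps(blast, indent=2))
```

The same builder works for either tool; only the tool name changes, which matches the idea that gc_callers is a narrower variant of the same query.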

BM25

A term-frequency ranking algorithm used alongside semantic embeddings to improve retrieval precision in search and blast-radius queries.

BM25 (Best Match 25) is a bag-of-words relevance function from information retrieval. It scores a document by how often each query term appears in it, normalized by document length and weighted by how rare the term is across the corpus (inverse document frequency, IDF). gotcontext uses BM25 alongside SBERT vector similarity in a hybrid retrieval pipeline: the two result rankings are fused via Reciprocal Rank Fusion (RRF, k=60) to produce the final order.
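The fusion step can be sketched in a few lines. This is the textbook form of Reciprocal Rank Fusion with k=60, not gotcontext's actual implementation; the document ids and the two input rankings are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from the two retrievers.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_c", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_c']
```

Because RRF works on ranks rather than raw scores, the BM25 and vector scores never need to be put on a common scale.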

BM25 is intentionally scoped per-document at query time to avoid cross-document IDF pollution — a document's term frequencies contribute to its own score only, not to a shared global index. This keeps scores stable as the corpus grows.

In the MCP tool catalogue, BM25 contributes to search_semantic and gc_blast_radius result ranking.

Fidelity

A 1–5 scale controlling how aggressively the engine compresses. Higher fidelity = more detail kept, fewer tokens saved.

Fidelity controls the quality–compression trade-off:

Level | Name | What it preserves
1 | skeleton | Structure only: headings, function signatures, class names. Bodies and prose stripped.
2 | outline | Signatures + short docstrings. Good for architecture review.
3 | standard | Default. Signatures + full docstrings + key prose paragraphs.
4 | detailed | Most content retained; only boilerplate and repetitive spans removed.
5 | verbatim | Minimal compression, used for content where every word matters (e.g. legal text).

Pass fidelity as an integer in the request body, or use the string name as an alias. The POST /v1/recommend endpoint returns the optimal level for a given input and downstream model.
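A client can normalize either form to an integer before sending a request. The helper below is a hypothetical client-side convenience, not part of the API; only the level numbers, names, and the default of 3 come from the levels listed above.

```python
FIDELITY_LEVELS = {
    "skeleton": 1,
    "outline": 2,
    "standard": 3,
    "detailed": 4,
    "verbatim": 5,
}


def normalize_fidelity(value=3):
    """Return the fidelity level as an int, accepting 1-5 or a level name."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("fidelity must be an int or a level name")
    if isinstance(value, int):
        if 1 <= value <= 5:
            return value
        raise ValueError(f"fidelity must be between 1 and 5, got {value}")
    if isinstance(value, str):
        name = value.strip().lower()
        if name in FIDELITY_LEVELS:
            return FIDELITY_LEVELS[name]
        raise ValueError(f"unknown fidelity name: {value!r}")
    raise TypeError("fidelity must be an int or a level name")


print(normalize_fidelity("outline"))  # 2
print(normalize_fidelity())           # 3
```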

gc_ API key

A bearer token that authenticates REST and MCP requests to the gotcontext.ai gateway.

Every request to api.gotcontext.ai must carry an Authorization: Bearer gc_… header. Keys are minted in Settings → API Keys and tied to a specific project. The gc_ prefix is literal; it distinguishes platform keys from Clerk session JWTs, which are also accepted on the same endpoints.
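A small helper can catch a mis-pasted key before a request ever leaves the client. This is a sketch for platform keys only: the function name is hypothetical, the key shown is a placeholder, and the prefix check would need to be relaxed if you also send Clerk session JWTs.

```python
def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for a gc_ platform key."""
    if not api_key.startswith("gc_"):
        raise ValueError("expected a platform key starting with 'gc_'")
    return {"Authorization": f"Bearer {api_key}"}


# Placeholder value, not a real key.
headers = auth_headers("gc_example_not_a_real_key")
print(headers)
```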

Keys are HMAC-signed with a server-side secret. Revoking a key via the dashboard or DELETE /v1/keys/:id invalidates it immediately. There is no expiry by default; set one when minting if your workflow requires rotation.
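To make "HMAC-signed" concrete, here is a generic illustration of how a server can mint and verify a signed token with Python's standard library. The key layout (gc_<id>_<signature>), the SHA-256 digest, and both function names are assumptions made for this sketch; the actual gotcontext key format is not documented here.

```python
import hashlib
import hmac
import secrets


def mint_key(server_secret: bytes) -> str:
    """Mint a token of the hypothetical form gc_<id>_<hex signature>."""
    key_id = secrets.token_hex(8)  # hex only, so it never contains "_"
    signature = hmac.new(server_secret, key_id.encode(), hashlib.sha256).hexdigest()
    return f"gc_{key_id}_{signature}"


def verify_key(key: str, server_secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    if not key.startswith("gc_"):
        return False
    key_id, sep, signature = key[len("gc_"):].partition("_")
    if not sep:
        return False
    expected = hmac.new(server_secret, key_id.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature.encode(), expected.encode())


secret = b"demo-secret-for-illustration-only"
key = mint_key(secret)
print(verify_key(key, secret))        # True
print(verify_key(key + "0", secret))  # False
```

A signature check alone cannot express revocation or expiry; those require a server-side lookup of the key's current state, which this sketch leaves out.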

Knowledge Hub (KB)

Pro+

A per-project document store with compressed retrieval — upload files or URLs, then query them with gc_kb_* MCP tools.

The Knowledge Hub lets you build a private, queryable corpus inside a project. Upload files, URLs, or raw text via the dashboard or the gc_kb_ingest MCP tool. The engine chunks, embeds, and stores every document; subsequent queries use hybrid BM25 + vector retrieval against that per-project corpus.

The seven gc_kb_* MCP tools cover the full lifecycle: gc_kb_ingest, gc_kb_query, gc_kb_list, gc_kb_get, gc_kb_edit, gc_kb_diff, and gc_kb_delete.

Query results return raw_text (the original chunk) rather than the compressed skeleton, because compressed text with [HIDDEN] markers is not useful to a model that cannot expand them.

MCP gateway

The Streamable HTTP endpoint at api.gotcontext.ai/mcp that exposes the full tool catalogue over the Model Context Protocol.

The MCP gateway is the primary integration surface for AI coding clients (Claude Code, Cursor, Codex CLI, Gemini CLI). Clients connect by pointing their MCP config at https://api.gotcontext.ai/mcp with an Authorization: Bearer gc_… header. The protocol is JSON-RPC 2.0 over Streamable HTTP (MCP spec 2025-03-26).

The gateway routes every tools/call request through plan gating, usage accounting, and the compression engine, then returns results as MCP TextContent blocks. There is no separate REST call needed from the client.

The tool catalogue has two profiles: core and full. Append ?profile=core to the MCP URL to get the 7-tool essential set (~2 K tokens in tools/list vs ~38 K for full).
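Building the URL programmatically avoids typos in the query string. In this sketch the endpoint and the two profile values come from the text above; the function name and the shape of the client config dictionary are hypothetical, since each MCP client has its own config format.

```python
from urllib.parse import urlencode

MCP_URL = "https://api.gotcontext.ai/mcp"
VALID_PROFILES = ("core", "full")


def mcp_url(profile: str = "full") -> str:
    """Return the MCP endpoint URL with an explicit profile parameter."""
    if profile not in VALID_PROFILES:
        raise ValueError(f"profile must be one of {VALID_PROFILES}, got {profile!r}")
    return f"{MCP_URL}?{urlencode({'profile': profile})}"


# Hypothetical client config; the key shown is a placeholder.
client_config = {
    "url": mcp_url("core"),
    "headers": {"Authorization": "Bearer gc_example_not_a_real_key"},
}
print(client_config["url"])  # https://api.gotcontext.ai/mcp?profile=core
```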

Profile (core vs full)

A query parameter on the MCP URL that narrows the tools/list response to either 7 essential tools (core) or the full catalogue (full).

The full MCP tool catalogue contains 179 tools. Sending the full tools/list response to the model on every session costs ~38 K tokens (≈$0.11 at $3 per 1 M input tokens). The ?profile=core URL parameter cuts this to ~2 K tokens by returning only the seven most commonly used tools:

ingest_context
read_skeleton
search_semantic
modulate_region
get_stats
list_documents
delete_document

The default is ?profile=full to preserve backward compatibility with existing integrations. The gotcontext Claude Code plugin ships with ?profile=core so new installs default to lightweight discovery.
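The cost figures above are simple arithmetic and easy to re-run with your own model's pricing. The token counts and the $3 per 1 M rate are the approximate values quoted in this entry; the function name is illustrative.

```python
def tools_list_cost_usd(tokens: int, usd_per_million: float = 3.0) -> float:
    """Input cost of sending a tools/list payload once."""
    return tokens * usd_per_million / 1_000_000


full_cost = tools_list_cost_usd(38_000)  # approximate size of the full profile
core_cost = tools_list_cost_usd(2_000)   # approximate size of the core profile

print(f"full: ${full_cost:.3f} per session")   # full: $0.114 per session
print(f"core: ${core_cost:.3f} per session")   # core: $0.006 per session
print(f"saved: ${full_cost - core_cost:.3f}")  # saved: $0.108
```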

Project

Pro+

An isolation boundary for API keys, usage counters, budget alerts, and Knowledge Hub documents.

Every account starts with a "Default" project. On Pro and higher, you can create additional projects to partition usage by product, environment, or customer. Each project gets its own usage counters, budget cap, and alert thresholds independent of other projects on the same account.

API keys are bound to a project at mint time. A key can only record usage against its assigned project and access Knowledge Hub documents within it. Legacy keys minted before project support shipped operate under the user-scoped rollup (equivalent to the Default project).

Manage projects at /dashboard/projects.

Semantic compression

Token reduction that preserves meaning by keeping the highest-information spans and collapsing the rest.

Semantic compression is not truncation — it scores every span in the input using ONNX/SBERT embeddings and BM25 term frequency, then writes a skeleton that keeps the high-signal spans verbatim and replaces low-signal spans with compact placeholders or omits them entirely, depending on the chosen fidelity level.

The engine runs locally inside the API container — no calls to OpenAI, Anthropic, or any other provider occur during compression. Latency is sub-second for documents up to ~100 KB.

Effective savings depend on document size and fidelity. Short inputs (<200 tokens) can produce negative savings because the skeleton header adds overhead. The POST /v1/recommend endpoint advises whether compression is worthwhile for a given input.
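The negative-savings case is easy to see with numbers. The sketch below uses the usual definition of savings (one minus the ratio of compressed to original tokens); the token counts, including the 20-token header overhead, are invented for illustration.

```python
def savings_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Fraction of tokens saved; negative when the output is larger."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 1 - compressed_tokens / original_tokens


# Large document: the header overhead is negligible.
large = savings_ratio(original_tokens=10_000, compressed_tokens=3_000)

# Short input: a hypothetical 20-token header outweighs what was removed.
header_overhead = 20
short = savings_ratio(original_tokens=150, compressed_tokens=140 + header_overhead)

print(f"large document: {large:.1%}")
print(f"short input:    {short:.1%}")
```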

Skeleton

The compressed output document produced by the engine — a structured digest of the original that fits in fewer tokens.

A skeleton is the output of a compression call. It opens with a === SEMANTIC SKELETON === header that signals to downstream models that the content has been processed, followed by the retained spans in their original structural order. Low-signal spans are replaced with [HIDDEN] markers or dropped entirely based on fidelity.
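Client code can use the header and markers to tell a skeleton from raw text. In this sketch the header string and the [HIDDEN] marker come from the description above; the assumption that the header sits alone on the first line, the sample text, and the function name are all illustrative.

```python
SKELETON_HEADER = "=== SEMANTIC SKELETON ==="
HIDDEN_MARKER = "[HIDDEN]"


def inspect_skeleton(text: str) -> dict:
    """Report whether text looks like a skeleton and how many spans are hidden."""
    lines = text.lstrip().splitlines()
    is_skeleton = bool(lines) and lines[0].strip() == SKELETON_HEADER
    body = "\n".join(lines[1:]) if is_skeleton else text
    return {
        "is_skeleton": is_skeleton,
        "hidden_spans": body.count(HIDDEN_MARKER),
        "body": body,
    }


# Made-up sample; real skeleton layout may differ.
sample = (
    "=== SEMANTIC SKELETON ===\n"
    "def load(path):\n"
    "[HIDDEN]\n"
    "class Cache:\n"
    "[HIDDEN]\n"
)
info = inspect_skeleton(sample)
print(info["is_skeleton"], info["hidden_spans"])  # True 2
```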

When using the MCP gateway, the read_skeleton tool fetches a pre-compressed skeleton for a previously ingested document without re-running the compression engine. The ingest_context + read_skeleton pair is the canonical MCP workflow: ingest once, read many times.

Tool-result limit

The maximum character count a single MCP tools/call response will return before the engine strips metadata and paginates.

MCP clients embed tool responses directly into the model context window. An unbounded tool result from a large document would consume the entire context budget. gotcontext enforces a soft limit of 40,000 characters and a hard limit of 49,000 characters per tools/call response (sized to stay under Claude Code's ~50K per-tool cap).

When a response exceeds the soft limit, the engine first strips non-essential metadata to shrink it. If it still exceeds the hard limit, the response is either paginated with a continuation token or truncated (proportionally or head-first), depending on the configured strategy. These limits are server-side and apply to every tool; there is no per-call argument to lower them.
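The two-stage behavior can be sketched as follows. Only the 40,000 and 49,000 character limits come from the text above; modelling the response as a metadata string plus a body, and using a character offset as the continuation token, are simplifying assumptions. The truncation strategies are not shown.

```python
SOFT_LIMIT = 40_000  # characters
HARD_LIMIT = 49_000  # characters


def fit_response(body: str, metadata: str, offset: int = 0) -> dict:
    """Return one page of a tool response that respects both limits."""
    remaining = body[offset:]
    candidate = metadata + remaining
    if len(candidate) <= SOFT_LIMIT:
        return {"content": candidate, "next_offset": None}
    # Over the soft limit: strip metadata first.
    if len(remaining) <= HARD_LIMIT:
        return {"content": remaining, "next_offset": None}
    # Still over the hard limit: return a page plus a continuation offset.
    return {"content": remaining[:HARD_LIMIT], "next_offset": offset + HARD_LIMIT}


# Walk every page of a made-up 120,000-character body.
body = "x" * 120_000
pages = []
offset = 0
while offset is not None:
    result = fit_response(body, metadata="m" * 500, offset=offset)
    pages.append(result["content"])
    offset = result["next_offset"]

print(len(pages), [len(p) for p in pages])  # 3 [49000, 49000, 22500]
```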

If a read_skeleton result feels truncated, try search_semantic with a targeted query instead — it returns only the relevant chunks, so the same token budget covers more signal.