Compression
/v1/compress
Compress any text document using graph-based semantic compression. Achieves 80 to 95% token reduction on medium-to-large documents. Optionally supply a query to guide the compressor toward the sections most relevant to your question.
Request body
{
"text": string, // required — document to compress (min 1 char)
"fidelity": string, // optional — "abstract" | "outline" | "balanced" | "detailed" | "raw"
// default: "balanced"
"query": string|null, // optional — query-guided mode; prioritises relevant sections
"cost_model": string|null // optional — model name for cost estimate (e.g. "claude-opus-4")
}
Response
{
"compressed": string, // compressed skeleton text
"stats": {
"original_tokens": number,
"compressed_tokens": number,
"savings_pct": number, // e.g. 87.4
"compression_ratio": number, // e.g. 7.9
"estimated_cost_saved": string|null // e.g. "$0.042" — only when cost_model supplied
}
}
curl -X POST https://api.gotcontext.ai/v1/compress \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"text": "Transformer models fundamentally changed NLP...",
"fidelity": "balanced",
"query": "attention mechanism",
"cost_model": "claude-sonnet-4-6"
}'
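The same call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_compress_request` and `check_stats` are hypothetical, and the formulas in `check_stats` are an assumption inferred from the example values (87.4% savings and a ratio of 7.9 are consistent with ratio = original / compressed).

```python
import json
import urllib.request

API_URL = "https://api.gotcontext.ai/v1/compress"


def build_compress_request(api_key, text, fidelity="balanced", query=None, cost_model=None):
    """Build (but do not send) a POST request for /v1/compress."""
    if not text:
        raise ValueError("text must contain at least 1 character")
    payload = {"text": text, "fidelity": fidelity}
    # Optional fields are left out of the body when not supplied.
    if query is not None:
        payload["query"] = query
    if cost_model is not None:
        payload["cost_model"] = cost_model
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def check_stats(stats, tolerance=0.1):
    """Sanity-check a stats object.

    ASSUMPTION (not stated in the reference):
      savings_pct       = 100 * (1 - compressed_tokens / original_tokens)
      compression_ratio = original_tokens / compressed_tokens
    """
    original = stats["original_tokens"]
    compressed = stats["compressed_tokens"]
    if original <= 0 or compressed <= 0:
        return False
    expected_savings = 100.0 * (1 - compressed / original)
    expected_ratio = original / compressed
    return (
        abs(stats["savings_pct"] - expected_savings) <= tolerance
        and abs(stats["compression_ratio"] - expected_ratio) <= tolerance
    )


request = build_compress_request(
    "gc_your_key_here",
    "Transformer models fundamentally changed NLP...",
    query="attention mechanism",
)
# To send it: with urllib.request.urlopen(request) as resp: body = json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.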
Code Compression
/v1/compress-code
AST-aware code compression. Parses function and class boundaries, extracts imports and docstrings, and ranks symbols by PageRank on the dependency graph. Returns a skeleton that preserves signatures and docstrings. For source code, this gives significantly better results than plain-text compression.
Request body
{
"code": string, // required — source code to compress (min 1 char)
"language": string|null, // optional — hint: "python"|"javascript"|"typescript"|"java"|"go"|"rust"|"cpp"
// auto-detected from content when omitted
"fidelity": string // optional, same levels as /v1/compress, default "balanced"
}
Response
{
"compressed": string,
"stats": {
"original_tokens": number,
"compressed_tokens": number,
"savings_pct": number,
"language_detected": string // e.g. "python", "javascript", "unknown"
}
}
curl -X POST https://api.gotcontext.ai/v1/compress-code \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"code": "def process(items):\n ...",
"language": "python",
"fidelity": "balanced"
}'
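When compressing files from disk, the language hint can be derived from the file extension and omitted when unknown, so the server falls back to auto-detection. The sketch below is illustrative only: `build_code_payload` and the extension mapping are assumptions made for this example, limited to the hint values listed above.

```python
from pathlib import Path

# Hint values come from the request schema; the extension mapping is an
# assumption made for this sketch.
EXTENSION_HINTS = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".java": "java",
    ".go": "go",
    ".rs": "rust",
    ".cpp": "cpp",
    ".cc": "cpp",
    ".hpp": "cpp",
}


def build_code_payload(path, code, fidelity="balanced"):
    """Build the JSON body for /v1/compress-code from a file path and its source."""
    if not code:
        raise ValueError("code must contain at least 1 character")
    payload = {"code": code, "fidelity": fidelity}
    hint = EXTENSION_HINTS.get(Path(path).suffix.lower())
    if hint is not None:
        # Only send a hint we are confident about; otherwise rely on auto-detection.
        payload["language"] = hint
    return payload


payload = build_code_payload("src/app.py", "def process(items):\n    ...")
```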
Code Context Ranking (blast-radius + BM25) v1.5.0
/v1/compress-code/structural
Structural code-context compression. Submit a file bundle and an optional focus symbol; the server runs tensor-grep blast-radius and BM25 on the sandboxed files and returns a context list ranked by Reciprocal Rank Fusion. Intended for PR-diff-scale code payloads (≤1000 files, ≤512 KB each, ≤5 MB total). Measured 34% token reduction on a 10-file corpus with focus_symbol=cache_lookup versus naive full-bundle submission. See the smoke benchmark at benchmarks/blast_radius_smoke.py.
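For readers unfamiliar with Reciprocal Rank Fusion, the general technique can be sketched as follows: each signal produces its own best-first ranking, and an item's fused score is the sum of 1 / (k + rank) over the rankings that contain it. This is a generic illustration, not the server's implementation. The constant k = 60 is the value commonly used for RRF and is an assumption here, and the file names and input rankings are invented.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first rankings: score(item) = sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    signals = defaultdict(list)
    for signal_name, ranked_items in rankings.items():
        for rank, item in enumerate(ranked_items, start=1):
            scores[item] += 1.0 / (k + rank)
            signals[item].append(signal_name)
    # Highest fused score first; ties are broken by path for determinism.
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return [
        {
            "path": item,
            "score": round(scores[item], 4),
            "rank": position,
            "contributing_signals": signals[item],
        }
        for position, item in enumerate(ordered, start=1)
    ]


# Invented rankings from two signals, best match first.
rankings = {
    "bm25": ["src/app.py", "src/utils.py", "src/cache.py"],
    "graph_distance": ["src/app.py", "src/cache.py"],
}
fused = reciprocal_rank_fusion(rankings)
```

A file that appears near the top of both rankings outranks one that appears in only a single ranking, which is why `src/cache.py` ends up above `src/utils.py` in this example.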
Request body
{
"files": [
{ "path": "src/app.py", "content": "def handle_request(): ..." },
{ "path": "src/utils.py", "content": "..." }
],
"focus_symbol": "handle_request", // optional — focus blast-radius on this symbol
"query": "error handling", // optional — BM25 query (defaults to focus_symbol)
"top_k": 25 // optional — cap on ranked_context length (1-500, default 50)
}
Response
{
"ranked_context": [
{
"path": "src/app.py",
"score": 0.031,
"rank": 1,
"contributing_signals": ["bm25", "graph_distance"]
}
],
"stats": {
"files_in": 10,
"files_ranked": 5,
"symbols_in": 23,
"degraded": false
},
"message": null // non-null only on degraded paths (tg missing, timeout, etc.)
}
curl -X POST https://api.gotcontext.ai/v1/compress-code/structural \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"files": [
{"path":"src/app.py","content":"def handle_request(): pass"},
{"path":"src/utils.py","content":"..."}
],
"focus_symbol": "handle_request",
"top_k": 25
}'
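The limits stated for this endpoint (≤1000 files, ≤512 KB each, ≤5 MB total, top_k between 1 and 500) can be checked on the client before a bundle is sent. The following is a rough sketch under two assumptions that the reference does not confirm: sizes are measured on UTF-8 encoded content, and KB and MB are binary units (1024-based). The helper name `validate_bundle` is hypothetical.

```python
MAX_FILES = 1000
MAX_FILE_BYTES = 512 * 1024         # assumes 1 KB = 1024 bytes
MAX_TOTAL_BYTES = 5 * 1024 * 1024   # assumes 1 MB = 1024 * 1024 bytes


def validate_bundle(files, top_k=50):
    """Return a list of problems; an empty list means the bundle fits the stated limits."""
    problems = []
    if not 1 <= top_k <= 500:
        problems.append(f"top_k={top_k} is outside the range 1-500")
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files exceeds the limit of {MAX_FILES}")
    total = 0
    for entry in files:
        # Assumption: the limit applies to the UTF-8 encoded size of "content".
        size = len(entry["content"].encode("utf-8"))
        total += size
        if size > MAX_FILE_BYTES:
            problems.append(f"{entry['path']} is {size} bytes (limit {MAX_FILE_BYTES})")
    if total > MAX_TOTAL_BYTES:
        problems.append(f"bundle is {total} bytes (limit {MAX_TOTAL_BYTES})")
    return problems


bundle = [
    {"path": "src/app.py", "content": "def handle_request(): pass"},
    {"path": "src/utils.py", "content": "..."},
]
issues = validate_bundle(bundle, top_k=25)
```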
Batch Compression (synchronous)
/v1/batch-compress
Compress up to 50 documents in a single call. Documents are processed concurrently, at most 4 at a time, to avoid saturating the embedding model. Each document may have its own fidelity and query. Failed documents are reported inline, so the batch request itself always returns HTTP 200.
Request body
{
"documents": [ // required — 1 to 50 items
{
"text": string, // required
"fidelity": string, // optional, default "balanced"
"query": string|null // optional
}
]
}
Response
{
"results": [
{
"compressed": string,
"original_tokens": number,
"compressed_tokens": number,
"savings_pct": number,
"compression_ratio": number,
"error": string|null // set when this document failed; other fields are 0
}
],
"summary": {
"total_documents": number,
"successful": number,
"failed": number,
"total_tokens_in": number,
"total_tokens_saved": number,
"avg_savings_pct": number,
"avg_compression_ratio": number
}
}
curl -X POST https://api.gotcontext.ai/v1/batch-compress \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"documents": [
{"text": "First document...", "fidelity": "balanced"},
{"text": "Second document...", "query": "neural networks"},
{"text": "Third document...", "fidelity": "outline"}
]
}'
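Because the batch returns 200 even when individual documents fail, callers need to inspect each result's error field, and larger workloads have to be split into calls of at most 50 documents. A minimal sketch of both steps follows. The helper names are hypothetical, and it assumes results are returned in the same order as the submitted documents, which the reference does not state explicitly.

```python
def chunk_documents(documents, size=50):
    """Split a document list into batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [documents[i:i + size] for i in range(0, len(documents), size)]


def split_results(response):
    """Separate successes from failures in a /v1/batch-compress response.

    ASSUMPTION: results[i] corresponds to documents[i] of the request.
    """
    succeeded, failed = [], []
    for index, result in enumerate(response["results"]):
        if result.get("error"):
            failed.append((index, result["error"]))
        else:
            succeeded.append((index, result["compressed"]))
    return succeeded, failed


# Invented response, shaped like the schema above, with one failed document.
sample_response = {
    "results": [
        {"compressed": "skeleton A", "savings_pct": 85.0, "error": None},
        {"compressed": "", "savings_pct": 0, "error": "document could not be processed"},
    ]
}
succeeded, failed = split_results(sample_response)
```

Failed entries keep their original index, so the caller can retry just those documents in a follow-up batch.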
Fidelity Advisor
/v1/recommend
Analyse a document and recommend the optimal fidelity level. The recommendation considers document size and, optionally, the target model's context window. Use it to pick the compression level automatically before calling /v1/compress.
Request body
{
"text": string, // required — document to analyse
"model": string|null, // optional — target model (e.g. "claude-sonnet-4-6")
"context_window": number|null // optional — override context window size in tokens
}
Response
{
"recommended_fidelity": string, // e.g. "balanced"
"estimated_ratio": number, // fraction of tokens kept (0.0–1.0)
"estimated_output_tokens": number,
"original_tokens": number,
"reasoning": string // human-readable explanation
}
curl -X POST https://api.gotcontext.ai/v1/recommend \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"text": "Your long document...",
"model": "claude-sonnet-4-6"
}'
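The recommend-then-compress flow described for this endpoint might be wired together as in the sketch below. It only shows how a recommendation feeds into the /v1/compress body and makes no network calls. The helper names and the sample recommendation values are invented for illustration.

```python
def compress_payload_from_recommendation(text, recommendation, query=None):
    """Build a /v1/compress body using the fidelity returned by /v1/recommend."""
    payload = {"text": text, "fidelity": recommendation["recommended_fidelity"]}
    if query is not None:
        payload["query"] = query
    return payload


def fits_context(recommendation, context_window, reserve_tokens=0):
    """True when the estimated output, plus a reserve for the prompt, fits the window."""
    return recommendation["estimated_output_tokens"] + reserve_tokens <= context_window


# Invented example of a /v1/recommend response.
recommendation = {
    "recommended_fidelity": "outline",
    "estimated_ratio": 0.1,
    "estimated_output_tokens": 12000,
    "original_tokens": 120000,
    "reasoning": "Document is large relative to the target context window.",
}

payload = compress_payload_from_recommendation("Your long document...", recommendation)
has_room = fits_context(recommendation, context_window=200000, reserve_tokens=4000)
```

The `reserve_tokens` argument is a design choice of this sketch: it leaves headroom for the prompt and the model's answer, since the compressed document is rarely the only content in the context window.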