Projects
Organize compression workloads into projects. Each project tracks its own usage stats, making it easy to attribute token savings across teams or applications.
POST /v1/projects
Create a compression project.
Request body
{
"name": string, // required — project name (1-100 chars)
"description": string|null // optional — project description
}
Response
{
"id": string,
"name": string,
"description": string|null,
"created_at": string,
"stats": { "compressions": 0, "tokens_saved": 0 }
}
curl -X POST https://api.gotcontext.ai/v1/projects \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{"name": "backend-docs", "description": "API documentation compression"}'
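As a rough Python counterpart to the curl call above, the sketch below builds the same request with the standard library but does not send it. The helper name `build_create_project_request` is hypothetical. The base URL and headers are copied from the curl example, and the length check mirrors the documented 1-100 character limit on `name`.

```python
import json
import urllib.request

API_BASE = "https://api.gotcontext.ai"  # base URL taken from the curl examples


def build_create_project_request(api_key, name, description=None):
    """Build (but do not send) a POST /v1/projects request. Hypothetical helper."""
    # The request schema documents a 1-100 character name, so check it client-side.
    if not 1 <= len(name) <= 100:
        raise ValueError("name must be 1-100 characters")
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/projects",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_create_project_request(
    "gc_your_key_here", "backend-docs", "API documentation compression"
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Keeping construction separate from sending makes the payload easy to inspect or unit-test before any network call is made.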
GET /v1/projects
List all projects for the authenticated user.
Response
{
"projects": [
{
"id": string,
"name": string,
"description": string|null,
"created_at": string,
"stats": {
"compressions": number,
"tokens_saved": number
}
}
]
}
curl https://api.gotcontext.ai/v1/projects \
-H "Authorization: Bearer gc_your_key_here"
GET /v1/projects/{id}
Get project detail with usage statistics.
Response
{
"id": string,
"name": string,
"description": string|null,
"created_at": string,
"updated_at": string,
"stats": {
"compressions": number,
"tokens_saved": number,
"avg_savings_pct": number
}
}
curl https://api.gotcontext.ai/v1/projects/YOUR_PROJECT_ID \
-H "Authorization: Bearer gc_your_key_here"
PUT /v1/projects/{id}
Update a project's name or description.
Request body
{
"name": string|null, // optional — new name
"description": string|null // optional — new description
}
Response
{
"id": string,
"name": string,
"description": string|null,
"updated_at": string
}
curl -X PUT https://api.gotcontext.ai/v1/projects/YOUR_PROJECT_ID \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{"name": "backend-docs-v2"}'
DELETE /v1/projects/{id}
Delete a project. Compression history is retained but unlinked.
Response
{
"success": true,
"id": string
}
curl -X DELETE https://api.gotcontext.ai/v1/projects/YOUR_PROJECT_ID \
-H "Authorization: Bearer gc_your_key_here"
Batch Queue
Submit large compression jobs asynchronously. The batch queue processes documents in the background and returns results when complete, ideal for bulk ingestion pipelines.
POST /v1/batch-queue
Submit an async batch compression job. Returns 202 Accepted with a job ID for polling.
Request body
{
"documents": [ // required — 1 to 500 items
{
"text": string, // required
"fidelity": string, // optional, default "balanced"
"query": string|null // optional
}
],
"project_id": string|null, // optional — associate with a project
"webhook_url": string|null // optional — POST results on completion
}
Response
{
"job_id": string,
"status": "queued",
"documents_count": number,
"created_at": string
}
curl -X POST https://api.gotcontext.ai/v1/batch-queue \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"documents": [
{"text": "First document..."},
{"text": "Second document...", "fidelity": "outline"}
]
}'
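Because a single request accepts between 1 and 500 documents, larger corpora have to be split across several jobs. The sketch below shows one way to build the request bodies. The helper name `build_batch_payloads` is hypothetical, and applying one `fidelity` to every document is a simplification made for the example.

```python
MAX_DOCS_PER_JOB = 500  # documented limit: 1 to 500 items per request


def build_batch_payloads(texts, fidelity="balanced", project_id=None):
    """Split texts into request bodies of at most 500 documents each."""
    payloads = []
    for start in range(0, len(texts), MAX_DOCS_PER_JOB):
        chunk = texts[start:start + MAX_DOCS_PER_JOB]
        payload = {"documents": [{"text": t, "fidelity": fidelity} for t in chunk]}
        if project_id is not None:
            # Optional field: only sent when the caller wants project attribution.
            payload["project_id"] = project_id
        payloads.append(payload)
    return payloads


payloads = build_batch_payloads([f"doc {i}" for i in range(1200)])
print([len(p["documents"]) for p in payloads])  # [500, 500, 200]
```

Each payload would then be submitted as its own POST /v1/batch-queue request, yielding one job ID per chunk.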
GET /v1/batch-queue
List batch jobs for the authenticated user.
Response
{
"jobs": [
{
"job_id": string,
"status": "queued" | "processing" | "completed" | "failed",
"documents_count": number,
"completed_count": number,
"created_at": string,
"completed_at": string|null
}
]
}
curl https://api.gotcontext.ai/v1/batch-queue \
-H "Authorization: Bearer gc_your_key_here"
GET /v1/batch-queue/{id}
Get job status and progress.
Response
{
"job_id": string,
"status": "queued" | "processing" | "completed" | "failed",
"documents_count": number,
"completed_count": number,
"failed_count": number,
"created_at": string,
"completed_at": string|null,
"progress_pct": number // 0.0 - 100.0
}
curl https://api.gotcontext.ai/v1/batch-queue/YOUR_JOB_ID \
-H "Authorization: Bearer gc_your_key_here"
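Since a job is submitted with 202 Accepted and finishes later, a client usually polls this status endpoint until `status` is `completed` or `failed`. The following is a minimal polling sketch. `wait_for_job` and the `fetch_status` callback are hypothetical names, the interval and timeout defaults are arbitrary, and the HTTP call is replaced by canned responses so the example runs offline.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}  # from the documented status values


def wait_for_job(fetch_status, job_id, interval_s=2.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status or the timeout elapses.

    fetch_status is a caller-supplied function (hypothetical) that performs
    GET /v1/batch-queue/{id} and returns the decoded JSON body as a dict.
    """
    waited = 0.0
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still {job['status']} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for the real HTTP call, so the sketch runs without network access.
_responses = iter([
    {"job_id": "job_1", "status": "queued", "progress_pct": 0.0},
    {"job_id": "job_1", "status": "processing", "progress_pct": 50.0},
    {"job_id": "job_1", "status": "completed", "progress_pct": 100.0},
])
final = wait_for_job(lambda _id: next(_responses), "job_1", sleep=lambda s: None)
print(final["status"])  # completed
```

If a `webhook_url` was supplied at submission time, polling may be unnecessary; the loop above is for clients that cannot receive callbacks.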
GET /v1/batch-queue/{id}/results
Retrieve completed batch results. Only available when status is 'completed'.
Response
{
"job_id": string,
"results": [
{
"compressed": string,
"original_tokens": number,
"compressed_tokens": number,
"savings_pct": number,
"error": string|null
}
],
"summary": {
"total_documents": number,
"successful": number,
"failed": number,
"total_tokens_saved": number,
"avg_savings_pct": number
}
}
curl https://api.gotcontext.ai/v1/batch-queue/YOUR_JOB_ID/results \
-H "Authorization: Bearer gc_your_key_here"
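To show how the per-document entries relate to the `summary` block, the sketch below recomputes a few totals on the client. It rests on two assumptions that the reference does not state: a failed document carries a non-null `error`, and tokens saved equals `original_tokens - compressed_tokens`. The helper name and the sample values are made up for illustration.

```python
def summarize_results(results):
    """Recompute success/failure counts and tokens saved from result entries."""
    ok = [r for r in results if r["error"] is None]
    failed = [r for r in results if r["error"] is not None]
    # Assumption: savings per document is original minus compressed tokens.
    tokens_saved = sum(r["original_tokens"] - r["compressed_tokens"] for r in ok)
    return {
        "total_documents": len(results),
        "successful": len(ok),
        "failed": len(failed),
        "total_tokens_saved": tokens_saved,
    }


# Illustrative entries shaped like the documented response, not real output.
sample_results = [
    {"compressed": "...", "original_tokens": 1000, "compressed_tokens": 200,
     "savings_pct": 80.0, "error": None},
    {"compressed": "...", "original_tokens": 500, "compressed_tokens": 100,
     "savings_pct": 80.0, "error": None},
    {"compressed": "", "original_tokens": 0, "compressed_tokens": 0,
     "savings_pct": 0.0, "error": "example failure"},
]
print(summarize_results(sample_results))
# {'total_documents': 3, 'successful': 2, 'failed': 1, 'total_tokens_saved': 1200}
```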
Analytics
Detailed analytics for compression usage across projects. View per-project breakdowns, track trends over time, and export data for reporting.
GET /v1/analytics/summary
Per-project usage breakdown for the current billing period.
Response
{
"period": string, // "YYYY-MM"
"total_compressions": number,
"total_tokens_saved": number,
"projects": [
{
"project_id": string,
"project_name": string,
"compressions": number,
"tokens_saved": number,
"avg_savings_pct": number
}
]
}
curl https://api.gotcontext.ai/v1/analytics/summary \
-H "Authorization: Bearer gc_your_key_here"
GET /v1/analytics/trends
Daily or weekly compression trends. Use query parameters to control the window.
Response
{
"granularity": "daily" | "weekly",
"data": [
{
"date": string, // "YYYY-MM-DD"
"compressions": number,
"tokens_saved": number,
"avg_savings_pct": number
}
]
}
curl "https://api.gotcontext.ai/v1/analytics/trends?granularity=daily&days=30" \
-H "Authorization: Bearer gc_your_key_here"
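The query string in the curl example can be assembled programmatically. The sketch below uses only the two parameters that appear in that example, `granularity` and `days`; whether the endpoint accepts others is not covered here, and `trends_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

API_BASE = "https://api.gotcontext.ai"  # base URL taken from the curl examples


def trends_url(granularity="daily", days=30):
    """Build the GET /v1/analytics/trends URL for a given window."""
    # The response schema lists exactly these two granularity values.
    if granularity not in ("daily", "weekly"):
        raise ValueError("granularity must be 'daily' or 'weekly'")
    if days < 1:
        raise ValueError("days must be positive")
    query = urlencode({"granularity": granularity, "days": days})
    return f"{API_BASE}/v1/analytics/trends?{query}"


print(trends_url())
# https://api.gotcontext.ai/v1/analytics/trends?granularity=daily&days=30
```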
GET /v1/analytics/export
Export analytics data as CSV for the specified date range.
Response
Content-Type: text/csv
date,project,compressions,tokens_in,tokens_saved,savings_pct
2026-04-01,backend-docs,142,284000,248000,87.3
2026-04-01,frontend-app,89,178000,151300,85.0
...
curl "https://api.gotcontext.ai/v1/analytics/export?start=2026-04-01&end=2026-04-14" \
-H "Authorization: Bearer gc_your_key_here" \
-o analytics.csv
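Once the CSV is downloaded, it can be aggregated with the standard library. The sketch below totals `tokens_saved` per project, using the two sample rows from the response above as input. The function name is hypothetical, and a real export would be read from `analytics.csv` rather than from an inline string.

```python
import csv
import io
from collections import defaultdict

# The header and rows are copied from the sample response above.
SAMPLE_CSV = """date,project,compressions,tokens_in,tokens_saved,savings_pct
2026-04-01,backend-docs,142,284000,248000,87.3
2026-04-01,frontend-app,89,178000,151300,85.0
"""


def tokens_saved_by_project(csv_text):
    """Sum the tokens_saved column for each project in an export."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["project"]] += int(row["tokens_saved"])
    return dict(totals)


print(tokens_saved_by_project(SAMPLE_CSV))
# {'backend-docs': 248000, 'frontend-app': 151300}
```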
Usage
GET /v1/usage
Monthly compression statistics for the authenticated user. Returns compression counts, token totals, plan limit, and the next reset timestamp.
Response
{
"period": string, // "YYYY-MM", e.g. "2026-04"
"compressions_used": number,
"compressions_limit": number, // varies by plan — see plan field
"pct_used": number, // 0.0–100.0
"tokens_in": number, // total original tokens this month
"tokens_saved": number, // total tokens eliminated this month
"resets_at": string, // ISO 8601 UTC, midnight 1st of next month
"plan": string, // free | pro | team | enterprise
"rate_limit_per_minute": number // varies by plan
}
curl https://api.gotcontext.ai/v1/usage \
-H "Authorization: Bearer gc_your_key_here"
Rate Limits
Call GET /v1/usage to check your current consumption. When you hit the rate limit, the API responds with HTTP 429 and a Retry-After header. Wait the number of seconds given in that header before retrying.
Prompt-Cache Friendliness Score
POST /v1/audit-cache
Audit how cache-friendly a prompt is for a specific AI provider. Returns a cacheability score, whether the prompt is cache-friendly, actionable recommendations to improve cache hit rates, and estimated savings.
Request body
{
"text": string, // required — prompt or document text to audit (min 1 char)
"provider": string // optional — "anthropic" | "openai" | "google"
// default: "anthropic"
}
Response
{
"provider": string,
"cache_friendly": boolean,
"score": number, // 0.0 - 1.0 cacheability score
"recommendations": [string], // actionable suggestions
"estimated_savings_pct": number // estimated cache hit savings
}
curl -X POST https://api.gotcontext.ai/v1/audit-cache \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"text": "You are a helpful assistant that...",
"provider": "anthropic"
}'
Context-Window Utilization Check
POST /v1/check-budget
Check how much of a model's context window a text would consume. Returns token estimates, percentage used, a status indicator (OK / WARNING / CRITICAL), and a recommendation on whether to compress.
Request body
{
"text": string, // required — text to check against budget (min 1 char)
"context_window": number, // optional — target context window in tokens
// default: 200000
"model": string // optional — target model for cost estimation
// default: "claude-opus-4"
}
Response
{
"estimated_tokens": number,
"context_window": number,
"pct_used": number, // e.g. 42.5
"status": string, // "OK" | "WARNING" | "CRITICAL"
"recommendation": string // human-readable guidance
}
curl -X POST https://api.gotcontext.ai/v1/check-budget \
-H "Authorization: Bearer gc_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"text": "Your long document or codebase...",
"context_window": 200000,
"model": "claude-opus-4"
}'
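A client might act on the response along the lines sketched below. Treating both WARNING and CRITICAL as a signal to compress is a policy chosen for this example rather than behavior defined by the API. The helper names and the sample response values are hypothetical; the sample matches the `42.5` figure used in the schema comment.

```python
def should_compress(budget):
    """Decide from a /v1/check-budget response whether to compress first."""
    # Policy of this sketch, not of the API: anything other than OK triggers compression.
    return budget["status"] in ("WARNING", "CRITICAL")


def tokens_remaining(budget):
    """Tokens left in the context window after the checked text, never negative."""
    return max(budget["context_window"] - budget["estimated_tokens"], 0)


# Illustrative response shaped like the documented schema.
sample_budget = {
    "estimated_tokens": 85000,
    "context_window": 200000,
    "pct_used": 42.5,
    "status": "OK",
    "recommendation": "example text",
}
print(should_compress(sample_budget), tokens_remaining(sample_budget))  # False 115000
```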
Semantic Cache
Beyond compression, we operate a per-account semantic cache: an embedding-similarity index of the last 100 baseline calls. When a new prompt is close enough to a cached one, we return the prior compressed result instead of re-running the pipeline. Cache hits are an additional reduction and are not metered against the compression quota.
The cache warms up over the first ~100 baseline calls. Typical hit rates after week 1 land in the 15 to 25% range. The per-tenant similarity threshold is tunable via POST /v1/settings/semantic-cache-threshold (Team and Enterprise). Hit telemetry shows up at Billing → Cache-Adjusted Savings.
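To make the idea concrete, here is a toy illustration of an embedding-similarity cache with a fixed-size window and a tunable threshold. It is not the service's implementation: the use of cosine similarity, the 0.9 default threshold, the class name, and the two-dimensional example vectors are all assumptions made for the sketch. Only the window of 100 entries comes from the description above.

```python
import math
from collections import deque


class SemanticCache:
    """Toy similarity cache; all internals are assumptions for illustration."""

    def __init__(self, threshold=0.9, capacity=100):
        self.threshold = threshold              # hypothetical default value
        self.entries = deque(maxlen=capacity)   # oldest entries drop off automatically

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def store(self, embedding, compressed):
        self.entries.append((embedding, compressed))

    def lookup(self, embedding):
        """Return the cached result of the most similar entry, or None on a miss."""
        best = max(self.entries, key=lambda e: self._cosine(embedding, e[0]), default=None)
        if best is not None and self._cosine(embedding, best[0]) >= self.threshold:
            return best[1]
        return None


cache = SemanticCache()
cache.store([1.0, 0.0], "compressed A")
print(cache.lookup([0.99, 0.1]))  # compressed A (near-duplicate prompt)
print(cache.lookup([0.0, 1.0]))   # None (unrelated prompt)
```

Raising the threshold trades hit rate for stricter matching, which is the kind of knob the per-tenant threshold setting exposes.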