gotcontext

Python SDK for the gotcontext.ai semantic compression API.

Reduce LLM token usage by compressing text and code before sending it to language models.

Installation

pip install gotcontext

Quick start

from gotcontext import GotContext

gc = GotContext(api_key="gc_live_...")

# Compress text
result = gc.compress("Your long document text here...", fidelity="balanced")
print(result.compressed)
print(f"Saved {result.tokens_saved} tokens ({result.savings_pct}%)")

Text compression

result = gc.compress(
    "Your long text here...",
    fidelity="balanced",       # abstract | outline | balanced | detailed | raw
    query="key findings",      # optional: prioritise sections relevant to query
    cost_model="claude-sonnet-4-6",  # optional: estimate cost savings
)

print(result.compressed)                # compressed text
print(result.stats.original_tokens)     # original token count
print(result.stats.compressed_tokens)   # compressed token count
print(result.stats.savings_pct)         # percentage saved
print(result.stats.compression_ratio)   # compression ratio

Code compression

result = gc.compress_code(
    code="def hello():\n    print('Hello, world!')\n",
    language="python",   # optional: auto-detected when omitted
    fidelity="balanced",
)

print(result.compressed)
print(result.stats.language_detected)

Batch compression

Compress up to 50 documents in a single call:

result = gc.batch_compress([
    {"text": "First document...", "fidelity": "balanced"},
    {"text": "Second document...", "fidelity": "outline", "query": "key metrics"},
    {"text": "Third document...", "fidelity": "abstract"},
])

for item in result.results:
    if item.error:
        print(f"Failed: {item.error}")
    else:
        print(f"Saved {item.savings_pct}%")

print(f"Total saved: {result.summary.total_tokens_saved} tokens")
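Because a single call accepts at most 50 documents, larger collections need to be split before they are sent. The following is a minimal sketch of that split in plain Python; the chunked helper and the MAX_BATCH_SIZE constant are illustrative names, not part of the SDK.

```python
from typing import Iterator

MAX_BATCH_SIZE = 50  # per-call document limit stated above


def chunked(items: list[dict], size: int = MAX_BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 120 placeholder documents in the same dict shape used by batch_compress.
docs = [{"text": f"Document {i}...", "fidelity": "balanced"} for i in range(120)]
batches = list(chunked(docs))
batch_sizes = [len(batch) for batch in batches]
```

Each element of batches can then be passed to gc.batch_compress(...) in turn.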

Usage stats

usage = gc.get_usage()
print(f"{usage.compressions_used}/{usage.compressions_limit} compressions used")
print(f"{usage.tokens_saved:,} tokens saved this month")

Compression history

events = gc.get_usage_events(page=1, page_size=10)
for event in events.events:
    print(f"{event.created_at}: {event.tokens_saved} tokens saved ({event.fidelity})")
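To walk the full history rather than a single page, the page and page_size parameters can be driven from a loop. The sketch below is an assumption-laden illustration: iter_events is a hypothetical helper, and it assumes that a page shorter than page_size marks the end of the history, which this README does not confirm. A stand-in fetch function replaces the API so the example runs offline.

```python
from typing import Callable, Iterator, Sequence


def iter_events(fetch_page: Callable[[int, int], Sequence], page_size: int = 10) -> Iterator:
    """Yield events page by page until a short or empty page is returned.

    Assumption: a page with fewer than `page_size` items is the last one.
    """
    page = 1
    while True:
        events = fetch_page(page, page_size)
        yield from events
        if len(events) < page_size:
            break
        page += 1


# Stand-in for the API: 23 fake events served in pages.
_FAKE_EVENTS = list(range(23))


def fake_fetch(page: int, page_size: int) -> list:
    start = (page - 1) * page_size
    return _FAKE_EVENTS[start:start + page_size]


all_events = list(iter_events(fake_fetch, page_size=10))
```

With the real client, fetch_page could be written as lambda page, size: gc.get_usage_events(page=page, page_size=size).events.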

Passing model attribution (MCP)

When calling the gotcontext MCP gateway via the official MCP Python SDK, pass the calling model's name in _meta.model so the billing dashboard can attribute cost savings per model. The meta_for_call helper builds that payload for you:

from gotcontext.mcp_helpers import meta_for_call

result = await session.call_tool(
    "ingest_context",
    {"text": doc, "file_id": "doc-1"},
    meta=meta_for_call(model="claude-opus-4.6"),
)

When the server does not recognise the model name, attribution falls back to the resolver chain (api_key.default_model -> plan heuristic). See docs/model-attribution.md for the full resolution chain.

Anthropic prompt cache (cache_breakpoints)

The /v1/compress response includes a cache_breakpoints array describing where Anthropic's prompt cache should be anchored. The apply_anthropic_breakpoints helper stamps the matching cache_control marker on the Anthropic messages payload for you. It has zero dependencies and is non-mutating: it returns a new payload and leaves the input untouched.

from gotcontext import GotContext, apply_anthropic_breakpoints

gc = GotContext(api_key="gc_live_...")
compressed = gc.compress(long_doc, fidelity="balanced")

messages = [
    {"role": "user", "content": [{"type": "text", "text": compressed.compressed}]},
    {"role": "user", "content": [{"type": "text", "text": user_question}]},
]
messages = apply_anthropic_breakpoints(
    messages=messages,
    breakpoints=compressed.cache_breakpoints,
)
# Pass ``messages`` straight into ``anthropic_client.messages.create(...)``.
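For readers unfamiliar with the marker itself: Anthropic's prompt caching is enabled by adding a cache_control field with the value {"type": "ephemeral"} to a content block. The sketch below shows that stamping step in a non-mutating way. It is not the SDK's implementation: stamp_cache_control is a hypothetical function, and the (message index, block index) position format is an assumption, since the actual shape of cache_breakpoints is not described here.

```python
import copy


def stamp_cache_control(messages: list[dict], positions: list[tuple[int, int]]) -> list[dict]:
    """Return a copy of `messages` with cache_control set on the chosen blocks.

    `positions` holds (message_index, block_index) pairs; this format is an
    assumption made for illustration only.
    """
    stamped = copy.deepcopy(messages)  # leave the caller's payload untouched
    for msg_idx, block_idx in positions:
        stamped[msg_idx]["content"][block_idx]["cache_control"] = {"type": "ephemeral"}
    return stamped


original = [
    {"role": "user", "content": [{"type": "text", "text": "compressed document"}]},
    {"role": "user", "content": [{"type": "text", "text": "question"}]},
]
marked = stamp_cache_control(original, [(0, 0)])
```

Only the first block of the first message in marked carries the marker; original is unchanged.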

Async client

import asyncio
from gotcontext import AsyncGotContext

async def main():
    async with AsyncGotContext(api_key="gc_live_...") as gc:
        result = await gc.compress("Your long text here...")
        print(f"Saved {result.tokens_saved} tokens")

asyncio.run(main())

Error handling

from gotcontext import GotContext, AuthError, RateLimitError, ValidationError

gc = GotContext(api_key="gc_live_...")

try:
    result = gc.compress("Hello")
except AuthError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except ValidationError as e:
    print(f"Invalid request: {e}")

All errors include the request_id from response headers for support debugging:

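from gotcontext import GotContextError  # assumed to be exported alongside the other error classes

try:
    result = gc.compress("Hello")
except GotContextError as e:
    print(f"Error: {e}")
    print(f"Status: {e.status_code}")
    print(f"Request ID: {e.request_id}")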

Configuration

gc = GotContext(
    api_key="gc_live_...",
    base_url="https://api.gotcontext.ai",  # default
    timeout=30.0,                           # request timeout in seconds
    max_retries=3,                          # retries for 429/5xx errors
)

The client automatically retries on rate-limit (429) and server errors (5xx) with exponential backoff. The Retry-After header is respected when present.
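As a rough illustration of the retry policy described above, the sketch below computes a wait time per attempt: a Retry-After value takes priority when the server supplies one, otherwise the delay doubles on each attempt up to a cap. The base delay, the cap, and the optional jitter are assumptions chosen for the example; the client's actual constants are not documented here, and retry_delay is a hypothetical name.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[float] = None,
                base: float = 0.5, cap: float = 30.0, jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return max(0.0, retry_after)  # server hint wins over the computed delay
    delay = min(cap, base * (2 ** attempt))  # exponential growth, capped
    if jitter:
        delay = random.uniform(0, delay)  # spread retries from many clients
    return delay


# Deterministic schedule for max_retries=3, with jitter disabled.
schedule = [retry_delay(n, jitter=False) for n in range(3)]
```

With these assumed constants the three retries wait 0.5, 1.0, and 2.0 seconds when no Retry-After header is present.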

Context manager

Both clients support context managers for clean resource cleanup:

with GotContext(api_key="gc_live_...") as gc:
    result = gc.compress("text")
# Connection pool closed automatically

Fidelity levels

Level      Compression   Use case
abstract   ~95%          Maximum compression, key points only
outline    ~90%          High compression, structure preserved
balanced   ~85%          Default -- good balance of detail and savings
detailed   ~60%          Light compression, most detail preserved
raw        0%            No compression, pass-through
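The approximate percentages above can be turned into a quick budget estimate before calling the API. This sketch assumes the Compression column is the share of tokens removed, which is consistent with raw being 0% and pass-through. The figures are the table's rough values, so the result is an estimate only, and estimate_compressed_tokens is a hypothetical helper rather than an SDK function.

```python
# Approximate share of tokens removed per fidelity level, taken from the table.
APPROX_COMPRESSION = {
    "abstract": 0.95,
    "outline": 0.90,
    "balanced": 0.85,
    "detailed": 0.60,
    "raw": 0.0,
}


def estimate_compressed_tokens(original_tokens: int, fidelity: str = "balanced") -> int:
    """Estimate how many tokens remain after compression at `fidelity`."""
    saved = round(original_tokens * APPROX_COMPRESSION[fidelity])
    return original_tokens - saved


estimate = estimate_compressed_tokens(10_000, "balanced")
```

For a 10,000-token document at balanced fidelity the estimate is about 1,500 tokens; actual results are reported in result.stats.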
