Lightweight cost-attribution wrapper for Anthropic, OpenAI, and Google Gemini Python SDKs

affixly-surge-sdk

Lightweight cost-attribution wrapper for the Anthropic, OpenAI, and Google Gemini Python SDKs. Track AI spend by product line, feature, and customer with a one-line import change — no proxy, no infrastructure, no code rewrite.

PyPI distribution name: affixly-surge-sdk. Python import name: surge_sdk. They differ because surge-sdk was already taken on PyPI by an unrelated project — the import name we control stays clean.

Using Node.js / TypeScript? See affixly-surge-sdk on npm — same interface, same event shape (source).

Install

pip install affixly-surge-sdk

Install alongside whichever provider SDK you use:

pip install "affixly-surge-sdk[anthropic]"   # Anthropic (Claude)
pip install "affixly-surge-sdk[openai]"      # OpenAI (GPT)
pip install "affixly-surge-sdk[gemini]"      # Google Gemini
pip install "affixly-surge-sdk[all]"         # All three

Quick start

from surge_sdk import anthropic, configure

configure(
    surge_api_url="https://your-surge-backend-url",
    surge_api_key="surge_sk_your_key_here",
    product_line="my-app",
)

client = anthropic.Anthropic(api_key="sk-ant-...")
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
# Tracked automatically. No further code changes needed.

Get your surge_api_key from your Surge dashboard at Settings → SDK → Generate API key.
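
To keep the key out of source control, the same settings can be read from the environment. The following is a minimal sketch, not part of the SDK: the environment variable names (SURGE_API_URL, SURGE_API_KEY, SURGE_PRODUCT_LINE) and the helper surge_settings_from_env are assumptions made for this example. Only the configure() keyword arguments come from the quick start above.

```python
import os


def surge_settings_from_env(env=os.environ):
    """Build keyword arguments for configure() from environment variables.

    The variable names are assumptions made for this example; the SDK
    does not define them.
    """
    required = ("SURGE_API_URL", "SURGE_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "surge_api_url": env["SURGE_API_URL"],
        "surge_api_key": env["SURGE_API_KEY"],
        # Falls back to the product line used in the quick start.
        "product_line": env.get("SURGE_PRODUCT_LINE", "my-app"),
    }


# Usage (requires affixly-surge-sdk to be installed):
#   from surge_sdk import configure
#   configure(**surge_settings_from_env())
```

Failing fast on a missing variable is a design choice of this sketch; it surfaces a misconfigured deployment at startup rather than at the first tracked call.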

Per-call tags

Attribute spend to a specific feature or customer:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[...],
    surge_tags={"feature": "summarize", "customer_id": "cust_abc123"},
)

Model overrides

Redirect calls to a different model than the call site declares — useful for multi-tenant plan tiering (Starter → Haiku, Business → Opus) without touching every call site:

# Global rule — applies to every call the SDK intercepts
configure(
    surge_api_url="...",
    model_overrides={
        "claude-opus-4-6": "claude-sonnet-4-6",   # all Opus calls become Sonnet
    },
)

# Per-call rule — wins over the global map
response = client.messages.create(
    model="claude-opus-4-6",                  # intent declared in code
    max_tokens=1024,
    messages=[...],
    surge_model=get_tenant_model(tenant_id),  # runtime tier resolution
    surge_tags={"feature": "chat", "customer_id": str(tenant_id)},
)

The dashboard logs both the requested and actual model on every override, plus a "Savings from model overrides" card showing the cost delta over time.
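
The precedence described above (per-call surge_model first, then the global model_overrides map, then the model declared at the call site) can be sketched in a few lines. This is an illustration of the rule, not the SDK's code: resolve_model, PLAN_MODELS, and TENANT_PLANS are hypothetical names, get_tenant_model is one possible shape for the helper used in the example above, and treating a surge_model of None as "no per-call override" is an assumption.

```python
# Hypothetical plan-to-model table; substitute the model IDs your tiers use.
PLAN_MODELS = {
    "starter": "your-haiku-model-id",  # placeholder, not a real model ID
    "business": "claude-opus-4-6",
}

# Hypothetical stand-in for a tenant database lookup.
TENANT_PLANS = {1: "starter", 2: "business"}


def get_tenant_model(tenant_id):
    """Return the model for a tenant's plan, or None if the tenant is unknown."""
    return PLAN_MODELS.get(TENANT_PLANS.get(tenant_id))


def resolve_model(declared, surge_model=None, model_overrides=None):
    """Pick the model to send: per-call override, then global map, then declared."""
    if surge_model:
        return surge_model
    # No per-call override: consult the global map, defaulting to the declared model.
    return (model_overrides or {}).get(declared, declared)


global_map = {"claude-opus-4-6": "claude-sonnet-4-6"}
print(resolve_model("claude-opus-4-6", model_overrides=global_map))
print(resolve_model("claude-opus-4-6", get_tenant_model(2), global_map))
```

The first call follows the global map; the second returns the tenant's model because the per-call value wins.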

How it works

  • The wrapper intercepts messages.create() (or the equivalent for OpenAI / Gemini), reads token counts from the response, and POSTs a usage event to your Surge backend on a background thread.
  • Your AI calls go directly to the provider — no proxy, no added latency.
  • If Surge is unreachable, the report is dropped silently. Your application is never affected.
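
The three points above describe a common pattern: wrap the call, read usage from the response, and report on a background thread while swallowing every error. The sketch below shows that pattern using only the standard library. It is not the SDK's implementation: track, post_usage_event, the event fields, and the Authorization header scheme are all assumptions made for illustration. The usage.input_tokens and usage.output_tokens attributes are the ones found on Anthropic message responses.

```python
import json
import threading
import urllib.request


def post_usage_event(url, api_key, event, timeout=2.0):
    """POST one usage event as JSON; swallow every failure so callers never see it."""
    try:
        request = urllib.request.Request(
            url,
            data=json.dumps(event).encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "Authorization": "Bearer " + api_key,  # header scheme is an assumption
            },
            method="POST",
        )
        urllib.request.urlopen(request, timeout=timeout).close()
    except Exception:
        pass  # report dropped silently, mirroring the behavior described above


def track(create_fn, report):
    """Wrap a provider call so usage is reported after the response arrives."""
    def wrapper(*args, surge_tags=None, **kwargs):
        # surge_tags is consumed here and never forwarded to the provider.
        response = create_fn(*args, **kwargs)  # goes straight to the provider
        try:
            usage = response.usage
            event = {  # hypothetical event shape
                "model": kwargs.get("model"),
                "input_tokens": usage.input_tokens,
                "output_tokens": usage.output_tokens,
                "tags": surge_tags or {},
            }
            # Report off the calling thread so the caller is not delayed.
            threading.Thread(target=report, args=(event,), daemon=True).start()
        except Exception:
            pass  # tracking must never break the application
        return response
    return wrapper


# Usage with a hypothetical client and backend URL:
#   report = lambda event: post_usage_event("https://surge.example/events", "key", event)
#   client.messages.create = track(client.messages.create, report)
```

Because the provider call happens before any tracking work and the report runs on a daemon thread, a slow or failing backend costs the caller nothing beyond starting a thread.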

Supported providers

| Provider | Import | What's tracked |
| --- | --- | --- |
| Anthropic | from surge_sdk import anthropic | messages.create(), messages.create(stream=True), messages.stream() (context manager) |
| OpenAI | from surge_sdk import openai | chat.completions.create(), chat.completions.create(stream=True) |
| Google Gemini | from surge_sdk import gemini as genai | models.generate_content(), models.generate_content_stream() |

Both sync and async clients are supported for Anthropic and OpenAI. Gemini support is sync-only, matching the upstream SDK's wrapping surface.

Streaming note for OpenAI: the SDK forces stream_options.include_usage=true on streaming calls so the final chunk carries cumulative usage. Callers iterating raw chunks will see one extra final chunk with usage populated — same shape as if you'd set it yourself.

Documentation

Full guide: see docs/getting-started.md.

License

MIT — see LICENSE.

