
graphn — Python SDK


The official Python SDK for Graphn. Import any LLM — from HuggingFace or your own S3 bucket — into your workspace, get an OpenAI-compatible inference endpoint, and call it from Python in a handful of lines, without standing up a single GPU yourself.

v0.1.x scope — This release line covers custom-model import (HuggingFace + S3) and OpenAI-compatible inference end-to-end. A lot of the broader Graphn platform (agents, knowledge bases, workflows, evals, datasets, guardrails, billing, full BYO-inference CRUD, etc.) is not yet exposed through this SDK. Those surfaces will be added in subsequent minor releases as their HTTP APIs stabilize. See Scope below for the exact list.

import graphn

with graphn.Client() as c:
    model = c.custom_models.create(
        name="my-llama",
        huggingface_model_id="Qwen/Qwen3-0.6B",
        weight_source="huggingface",
    )
    c.custom_models.wait_until_ready(model.id)

    resp = c.chat.completions.create(
        model=model.id,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)

That's it. Cold-start, retry, OpenAI-compatible serialization, and the gateway's routing prefix are all handled for you — pass the bare model.id everywhere.

Install

pip install graphn

Requires Python 3.10+. Tested on 3.10, 3.11, 3.12, 3.13.

Authentication

The SDK reads credentials from the environment by default:

export GRAPHN_API_KEY=gn_...           # required
export GRAPHN_WORKSPACE_ID=ws_...      # required
export GRAPHN_BASE_URL=https://cp.graphn.ai      # optional
export GRAPHN_INFERENCE_URL=https://model.graphn.ai  # optional

Or pass them explicitly:

client = graphn.Client(api_key="gn_...", workspace_id="ws_...")

Get an API key from the Graphn dashboard.

Scope

What's in the box (v0.1.x)

Module                    What it does
client.custom_models      Import models from HuggingFace, S3 presigned URLs, or S3 + IAM role; list, get, refresh, wake, delete, wait_until_ready, validate
client.secrets            CRUD for workspace secrets (HuggingFace tokens, etc.); see the sketch below
client.chat.completions   OpenAI-compatible chat completions, streaming + non-streaming, with auto-wake on cold start
client.models             List models served by the gateway
client.tts                Text-to-speech: list voices, synthesize
client.imported_models    Discover and probe BYO inference endpoints (read-only, no full CRUD yet)

Both graphn.Client and graphn.AsyncClient exist with identical APIs.
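
The secrets module is the one surface above with no example elsewhere in this README, so here is a minimal sketch. The field names (name, value) and the returned id attribute are assumptions; check the API reference for the exact signature.

import graphn

with graphn.Client() as c:
    # Store a HuggingFace token so gated imports can reference it.
    # Field names here are assumptions -- consult the secrets API docs.
    secret = c.secrets.create(name="hf-token", value="hf_...")
    print(secret.id)  # pass this as hf_token_secret_id when importing gated models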

What's not in the box yet

The Graphn platform is broader than what's exposed here. The following surfaces exist on the platform but do not have SDK coverage in v0.1.x — file an issue on the SDK repo to vote on what you need next:

  • Agents — defining, running, and inspecting agent workflows
  • Knowledge bases / RAG — corpus management, retrieval, indexing
  • Workflows — long-running Temporal-backed orchestration
  • Evals & datasets — eval suites, dataset upload, run results
  • Guardrails — policy authoring and inference-time enforcement
  • Imported models (BYO inference) — full CRUD (only listing and probing are exposed today)
  • Usage & billing — usage stats, GPU-hour reporting beyond the read-only client.custom_models.gpu_hours() helper
  • Workspace / member / API-key administration

Until they land here, those endpoints can be hit via raw HTTP using your gn_... API key. The control plane is documented at graphn.ai/docs/api and the OpenAPI 3.1 spec is mirrored at voltagepark/graphn-openapi.
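
For example, a sketch of hitting an uncovered control-plane endpoint with httpx. The auth headers mirror what the SDK sends (see the configuration reference below); the /agents path is purely illustrative (check graphn.ai/docs/api for the real routes).

import os
import httpx

# Illustrative only: the /agents path is a placeholder, not a documented route.
resp = httpx.get(
    f"https://cp.graphn.ai/workspaces/{os.environ['GRAPHN_WORKSPACE_ID']}/agents",
    headers={
        "Authorization": f"Bearer {os.environ['GRAPHN_API_KEY']}",
        "X-Workspace-Id": os.environ["GRAPHN_WORKSPACE_ID"],
    },
)
resp.raise_for_status()
print(resp.json())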

The 80% recipe: import a model and chat with it

import graphn

with graphn.Client() as c:
    # 1. Import the model. Use a workspace secret for gated HF repos.
    model = c.custom_models.create(
        name="my-llama",
        huggingface_model_id="meta-llama/Llama-3.1-8B-Instruct",
        weight_source="huggingface",
        hf_token_secret_id="sec_...",  # optional, only for gated models
    )

    # 2. Wait for the deployment to be live.
    c.custom_models.wait_until_ready(model.id, timeout=1800)

    # 3. Chat. The first call will cold-start the model — the SDK
    #    transparently calls wake() and retries until it serves.
    resp = c.chat.completions.create(
        model=model.id,
        messages=[{"role": "user", "content": "Tell me a joke."}],
        wake_timeout=600,             # max time to wait for cold start
    )
    print(resp.choices[0].message.content)

Importing from S3

If your weights aren't on HuggingFace — fine-tunes, internal models, licensed checkpoints — import them straight from S3. Two flavors, both of which still require huggingface_model_id (see callout below).

huggingface_model_id is required for S3 imports too. It's the canonical identifier for the model — the name the inference endpoint advertises for the deployment. Use the upstream org/model-name your weights are based on (e.g. Qwen/Qwen3-0.6B, meta-llama/Llama-3.1-8B-Instruct). This is the same "Model ID" field the web UI requires for S3 imports. Omitting it raises graphn.ValidationError client-side; passing it with a mismatched archive surfaces as a deploy failure on the model record.
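
As a quick illustration of the client-side check (only the exception type is documented above; the message text will vary):

try:
    c.custom_models.create(
        name="my-finetune",
        weight_source="s3_presigned",
        s3_url="https://my-bucket.s3.amazonaws.com/weights.tar.gz?X-Amz-...",
        # huggingface_model_id missing -> rejected before any HTTP call
    )
except graphn.ValidationError as e:
    print(f"rejected client-side: {e}")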

Presigned URL (no AWS credentials shared with Graphn):

model = c.custom_models.create(
    name="my-finetune",
    weight_source="s3_presigned",
    huggingface_model_id="meta-llama/Llama-3.1-8B-Instruct",
    s3_url="https://my-bucket.s3.amazonaws.com/llama-3.1-8b.tar.gz?X-Amz-Algorithm=...",
    gpu_count=1,
)

Package the weights as a single .tar.gz archive whose top level is the model directory (the same layout huggingface-cli download produces). Generate the URL with aws s3 presign s3://my-bucket/llama-3.1-8b.tar.gz or the AWS SDK; Graphn pulls weights through the URL on import. The URL only needs to be live for the import window (allow at least a few minutes for the download), not for the model's lifetime.
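
If you'd rather script that than use the CLI, a sketch with boto3 (bucket and key names are placeholders; a one-hour expiry comfortably covers the import window for most weight sizes):

import tarfile

import boto3

# Archive the model directory so the top level is the directory itself.
with tarfile.open("llama-3.1-8b.tar.gz", "w:gz") as tar:
    tar.add("llama-3.1-8b", arcname="llama-3.1-8b")

s3 = boto3.client("s3")
s3.upload_file("llama-3.1-8b.tar.gz", "my-bucket", "llama-3.1-8b.tar.gz")

# Presigned GET URL for Graphn's importer to pull the archive through.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "llama-3.1-8b.tar.gz"},
    ExpiresIn=3600,  # one hour; only needs to outlive the import window
)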

IAM role assumption (for buckets you control, longer-lived credentials):

model = c.custom_models.create(
    name="my-finetune",
    weight_source="s3_assume_role",
    huggingface_model_id="meta-llama/Llama-3.1-8B-Instruct",
    s3_url="s3://my-bucket/llama-3.1-8b.tar.gz",
    s3_role_arn="arn:aws:iam::123456789012:role/GraphnImport",
    gpu_count=1,
)

The role's trust policy must allow Graphn's importer principal to sts:AssumeRole; ask support for the principal ARN to put in your trust policy. Graphn re-assumes on every import / refresh, so rotating credentials underneath is safe.
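
For reference, a sketch of installing such a trust policy with boto3; GRAPHN_IMPORTER_PRINCIPAL_ARN is a placeholder for the ARN support gives you:

import json

import boto3

GRAPHN_IMPORTER_PRINCIPAL_ARN = "arn:aws:iam::...:role/..."  # from Graphn support

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": GRAPHN_IMPORTER_PRINCIPAL_ARN},
        "Action": "sts:AssumeRole",
    }],
}
boto3.client("iam").update_assume_role_policy(
    RoleName="GraphnImport",
    PolicyDocument=json.dumps(trust_policy),
)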

Everything past the create call — wait_until_ready, chat completions, auto-wake, addressing by model.id — is identical regardless of weight source. See examples/import_from_s3.py for an end-to-end runnable script.

Streaming

stream = c.chat.completions.create(
    model=model.id,
    messages=[{"role": "user", "content": "Count to ten."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Async

import asyncio
import graphn

async def main() -> None:
    async with graphn.AsyncClient() as c:
        async for m in c.custom_models.list():
            print(m.id, m.name, m.status)

        resp = await c.chat.completions.create(
            model="cm_abc123",
            messages=[{"role": "user", "content": "Hi!"}],
        )
        print(resp.choices[0].message.content)

asyncio.run(main())

Cold starts and auto-wake

Graphn custom models default to scale-to-zero: a model with no traffic for cooldown_seconds is descheduled, and the first request afterwards has to wait for the gateway to spin up a fresh replica (typically 60–600 seconds depending on weight size).

Without help, the first chat request after a cold period returns:

503 Service Unavailable: Model is scaled to zero and is now warming up.

The SDK detects this, calls POST /custom-models/{id}/wake to nudge the autoscaler, and retries with exponential backoff until the model serves or wake_timeout (default 180s) elapses. You don't have to do anything, but the knobs are there if you want them:

# Disable auto-wake — you handle the 503 yourself.
c.chat.completions.create(model=..., messages=[...], auto_wake=False)

# Give the warm-up more headroom (e.g. for large models).
c.chat.completions.create(model=..., messages=[...], wake_timeout=900)
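
With auto-wake disabled you can also warm the model proactively using the documented wake() and wait_until_ready() helpers, roughly what the SDK's retry loop does internally (this assumes wait_until_ready covers wake-ups as well as first deploys):

# Warm the model ahead of a latency-sensitive call, then serve with
# auto-wake disabled so a cold model fails fast instead of retrying.
c.custom_models.wake(model.id)
c.custom_models.wait_until_ready(model.id, timeout=900)
resp = c.chat.completions.create(
    model=model.id,
    messages=[{"role": "user", "content": "Hello!"}],
    auto_wake=False,
)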

See docs/cold-starts.md for the full story.

Drop-in for openai

The chat path is OpenAI-compatible all the way down — under the hood we delegate to the official openai Python SDK, configured against the Graphn gateway. So tools, structured outputs, multi-modal inputs, function calling, etc. all work out of the box.
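
For instance, standard OpenAI-style function calling should pass straight through (whether the model actually emits tool calls depends on the model you imported):

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = c.chat.completions.create(
    model=model.id,
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)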

If you already have OpenAI-shaped code and just want to point it at a Graphn model:

from openai import OpenAI

client = OpenAI(
    api_key="gn_...",
    base_url="https://model.graphn.ai/v1",
    default_headers={"X-Workspace-Id": "ws_..."},
)
resp = client.chat.completions.create(
    model="custom:cm_...",  # raw openai client => you type the prefix
    messages=[{"role": "user", "content": "Hello!"}],
)

The reason to use graphn.Client instead is everything around the chat call: lifecycle management, secrets, auto-wake, bare-cm_ addressing without the wire prefix, typed responses, and a stable URL contract.

More examples

See the examples/ directory in the repository for runnable end-to-end scripts, including examples/import_from_s3.py.

Configuration reference

Argument          Default                    Notes
api_key           $GRAPHN_API_KEY            Bearer token starting with gn_. Required.
workspace_id      $GRAPHN_WORKSPACE_ID       Path parameter + X-Workspace-Id header. Required.
base_url          https://cp.graphn.ai       Control-plane host.
inference_url     https://model.graphn.ai    Inference / OpenAI-compatible host.
timeout           60.0                       Per-request HTTPX timeout (seconds).
max_retries       2                          Retries on connect failures, 429, and 5xx.
default_headers   {}                         Extra headers added to every request.
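
All of these are keyword arguments to graphn.Client (and AsyncClient). Putting a few together:

client = graphn.Client(
    api_key="gn_...",
    workspace_id="ws_...",
    timeout=120.0,    # slow networks / large responses
    max_retries=5,    # beyond the default 2
    default_headers={"X-Request-Source": "batch-job"},  # illustrative header
)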

Generating clients in other languages

The OpenAPI 3.1 spec is the source of truth; it's mirrored at voltagepark/graphn-openapi. Point your favorite generator at it. We test against openapi-generator 6.0+, openapi-python-client 0.21+, and oapi-codegen 2.0+.

Contributing

git clone https://github.com/voltagepark/graphn-sdk-python
cd graphn-sdk-python
python -m venv .venv && source .venv/bin/activate
pip install -e '.[dev]'

ruff check src tests
pytest

Regenerate the typed transport from the upstream spec after a spec change:

./scripts/regenerate.sh

See CHANGELOG.md for release notes.

License

Apache 2.0 — see LICENSE.
