Thin client and batch execution helpers for OpenAI Responses API workloads.

tokenrail

tokenrail is a small Python library for running OpenAI Responses API jobs with a client.responses.create(...)-style surface.

It focuses on:

  • thread-based OpenAI batch execution
  • client-side RPM / TPM submit throttling
  • per-model token / cost monitoring with ETA progress reporting
  • resumable JSONL and per-request result writing

The package is fully typed (PEP 561) and supports Python 3.10+.

Installation

uv add tokenrail
# or
pip install tokenrail

To install from the Git repository instead of PyPI (for example, to pin a tag or track an unreleased revision), add a source entry:

[tool.uv.sources]
tokenrail = { git = "https://github.com/takumi0shibata/tokenrail", tag = "v1.0.0" }

Set your OpenAI API key in the consuming project before use:

export OPENAI_API_KEY=...

Quick start

from tokenrail import BatchExecutor, ResultsJsonlSink, PerRequestJsonSink, RailClient, RollingMetricsMonitor
from tokenrail.executor import batch_items_from_queries

client = RailClient.openai(max_retries=6)

queries = {
    "1": [{"role": "user", "content": "Summarize this paper in 3 bullets."}],
    "2": [{"role": "user", "content": "Extract the key assumptions."}],
}

items = batch_items_from_queries(
    queries,
    model="gpt-5.4-mini-2026-03-17",
    reasoning_effort="medium",
    verbosity="low",
)

# Write only the projected fields of every response to a single JSONL file.
result_sink = ResultsJsonlSink(
    "out/results.jsonl",
    projector=lambda response: {
        "id": response.id,
        "text": response.output_text,
        "model": response.model,
        "usage": response.usage.to_dict(),
    },
)
# Save the raw output of each query.
per_request_sink = PerRequestJsonSink("out/")

executor = BatchExecutor(
    client=client,
    max_workers=16,
    max_rpm=500,
    max_tpm=200_000,
    sinks=[result_sink, per_request_sink],
    monitor=RollingMetricsMonitor(),
)

stats = executor.run(items)
print(stats.to_dict())

Configuration notes

  • max_retries configures the OpenAI Python SDK client's built-in retry behavior. tokenrail does not add its own retry loop on top.
  • max_rpm and max_tpm are optional client-side submit limits. When a limit is set, BatchExecutor waits before submitting more work rather than exceeding the configured rate.
  • Request failures are captured as error records (written to sinks and counted in stats) rather than raised, so one failing item does not abort the batch.
  • base_url is passed through to the OpenAI Python SDK for callers that need an SDK-level custom endpoint.
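The client-side submit limits described above can be illustrated with a sliding-window limiter. This is a minimal sketch of the general technique, not tokenrail's actual implementation: the class name SlidingWindowLimiter, its parameters, and the 60-second window are assumptions made for illustration, and the sketch is meant for a single submitting thread (it takes no lock).

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Hypothetical limiter: allow at most `limit` units per `window` seconds."""

    def __init__(
        self,
        limit: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if limit <= 0:
            raise ValueError("limit must be positive")
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._events = deque()  # (timestamp, units) pairs, oldest first
        self._used = 0          # units currently counted inside the window

    def _evict(self, now: float) -> None:
        # Drop events that are at least `window` seconds old.
        while self._events and now - self._events[0][0] >= self.window:
            _, units = self._events.popleft()
            self._used -= units

    def acquire(self, units: int = 1) -> float:
        """Block until `units` fit in the window; return the seconds waited."""
        if units > self.limit:
            raise ValueError("a single request exceeds the window limit")
        waited = 0.0
        while True:
            now = self._clock()
            self._evict(now)
            if self._used + units <= self.limit:
                self._events.append((now, units))
                self._used += units
                return waited
            # Wait until the oldest counted event leaves the window, then retry.
            delay = self.window - (now - self._events[0][0])
            self._sleep(delay)
            waited += delay


if __name__ == "__main__":
    # One limiter per budget: requests per minute and tokens per minute.
    rpm = SlidingWindowLimiter(limit=500)
    tpm = SlidingWindowLimiter(limit=200_000)
    rpm.acquire()        # one request
    tpm.acquire(1_200)   # assumed token estimate for that request
```

Calling acquire on both limiters before each submit makes the submitter wait instead of exceeding either budget. The clock and sleep arguments are injectable so the behavior can be checked without real waiting.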

Resume behavior

BatchExecutor reads completed ids from the first configured sink before it starts. Re-running the same job with the same output path skips records that are already present, then writes only the remaining requests.

If you use a custom projector with ResultsJsonlSink, make sure it keeps an "id" field — resume relies on it.
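The resume idea can be sketched independently of the library: collect the ids already present in a JSONL file, then keep only the requests that are missing. The helper names load_completed_ids and pending are hypothetical, and the sketch assumes each record stores the request key under "id"; how tokenrail itself lays out records and matches ids is not specified here.

```python
import json
from pathlib import Path


def load_completed_ids(path: str) -> set:
    """Collect the "id" of every well-formed record in a JSONL file."""
    p = Path(path)
    if not p.exists():
        return set()  # first run: nothing has completed yet
    done = set()
    with p.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # Skip a partially written line, e.g. after an interrupted run.
                continue
            if isinstance(record, dict) and "id" in record:
                done.add(str(record["id"]))
    return done


def pending(queries: dict, completed: set) -> dict:
    """Return only the queries whose key is not in `completed`."""
    return {key: value for key, value in queries.items() if key not in completed}
```

A record whose projector drops "id" would never be found by load_completed_ids, so the request would run again on every restart. That is the reason for keeping the field.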

Cost tracking

  • Costs are estimated from a checked-in per-model pricing table (tokenrail.catalog). Models without a pricing entry get cost=None; prices may lag behind OpenAI's official pricing page, which is always authoritative.
  • OpenAI cost allocation is inferred from billing.payer in the response body. When payer == "openai", the nominal request cost is counted as OpenAI-covered rather than developer-billed.
  • reasoning_effort is gated to gpt-5 / o-series style models in the checked-in capability registry.
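The cost rules above can be sketched as a table lookup plus a payer split. Everything named here is hypothetical: the PRICING table, its placeholder prices (in USD per million tokens), the model name "example-model", and the CostTotals class are illustrations, not the contents of tokenrail.catalog or the library's API. The sketch also assumes that any payer other than "openai" means the developer is billed.

```python
from dataclasses import dataclass
from typing import Optional

# HYPOTHETICAL prices in USD per 1M tokens, for illustration only.
PRICING = {"example-model": {"input": 1.00, "output": 4.00}}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the estimated cost in USD, or None when the model has no price entry."""
    entry = PRICING.get(model)
    if entry is None:
        return None
    total = input_tokens * entry["input"] + output_tokens * entry["output"]
    return total / 1_000_000


@dataclass
class CostTotals:
    developer_billed: float = 0.0
    openai_covered: float = 0.0
    unpriced_requests: int = 0

    def add(self, cost: Optional[float], payer: Optional[str]) -> None:
        if cost is None:
            # No pricing entry: count the request, but add nothing to the totals.
            self.unpriced_requests += 1
        elif payer == "openai":
            self.openai_covered += cost
        else:
            # Assumption: every other payer value is developer-billed.
            self.developer_billed += cost
```

Keeping unpriced requests in a separate counter avoids silently treating a missing price as zero cost.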

Development

uv sync
uv run pytest
uv run ruff check src tests

License

MIT
