
Thin OpenAI Chat Completions invoker with SQLite caching, usage/cost tracking, and reproducible Markdown dumps.


GPT Invoker

A practical, self-contained wrapper around the OpenAI Chat Completions API designed for reproducibility and traceability.

  • SQLite-backed response cache keyed by prompt + model arguments
  • Token usage accounting with rough price estimation per model
  • Human-readable markdown dumps for inspection and debugging
  • Optional structured append-only JSON log
  • Convenience helpers to extract code/JSON/Java/invariant blocks from responses

This module focuses on simple, reproducible calls and auditing; it does not try to expose the entire OpenAI surface area.

Table of Contents

  • Features
  • Requirements
  • Installation
  • Environment Setup
  • Quick Start
  • Programmatic Usage
  • Caching
  • Logging and Dumps
  • API Overview
  • Token Usage and Pricing
  • Troubleshooting
  • Testing
  • Packaging and Publishing
  • License

Features

  • Content-addressed cache to avoid repeated network calls for identical inputs
  • Deterministic model-args hashing across max_tokens, temperature, and optional top_p
  • Markdown dump files that capture inputs, outputs, and metadata
  • Compact, append-only JSON lines log for automated analysis
  • Utilities to extract common response structures (code blocks, JSON, Java, invariants)

Requirements

  • Python 3.10 or newer
  • uv for environment and dependency management

If you don't have uv installed:

# Install uv (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Restart your shell or source your profile if needed, then verify
uv --version

Installation

# Recommended one-step: create .venv and install dependencies from pyproject.toml
uv sync

# Optional: ensure a specific Python version via uv, then sync
uv python install 3.10
uv sync --python 3.10

Notes:

  • uv sync creates a local .venv/ and installs dependencies declared in pyproject.toml.
  • You typically do not need to manually activate the venv; use uv run instead.

Environment Setup

This module loads environment variables from a .env file located next to gpt_invoker.py (and also from your shell environment).

Supported variables:

  • GPT_INVOKER_API_KEY (required): API key for the target API host.
  • GPT_INVOKER_API_HOST (optional): Base URL for the API. Default: https://api.openai.com.
  • GPT_INVOKER_DEF_MODEL (optional): Default model if not passed explicitly. Default: gpt-5.

Example .env file:

GPT_INVOKER_API_KEY=sk-...your_api_key...
# GPT_INVOKER_API_HOST=https://api.openai.com
# GPT_INVOKER_DEF_MODEL=gpt-5

Quick Start

Run the included demo to verify your setup:

# Preferred: run without activating; uv finds and uses .venv automatically
uv run python gpt_invoker.py

# Alternatively, activate then run
source .venv/bin/activate
python gpt_invoker.py

It will print the model’s reply and a usage summary.

Programmatic Usage

from gpt_invoker import GPTInvoker

invoker = GPTInvoker(
    # You can also set these via environment variables
    # api_key="...",               # If not provided, read from GPT_INVOKER_API_KEY
    # api_host="https://api.openai.com",
    model="gpt-5",
    max_tokens=1024,
    temperature=0.0,
    read_from_cache=True,
    write_to_cache=True,
)

messages = [
    {"role": "user", "content": "Write a haiku about the sea."},
]

text = invoker.generate(messages)
print(text)

# Token usage and estimated cost (rough)
print(invoker.usage)

Bypass cache for a specific call:

full = invoker.generate_all_res(messages, ignore_cache=True)
print(full.choices[0].message.content)

Streaming text:

streamed_text = invoker.generate_inner_stream(messages)
print(streamed_text)

Extract helpers:

resp = invoker.generate([
    {"role": "user", "content": "Return a JSON object with a title and tags."}
])

maybe_json = invoker.extract_json(resp)  # dict/list parsed from a fenced `json` block
# Other helpers: extract_code, extract_java, extract_invs
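The behavior of extract_json can be pictured as finding the first fenced `json` block and parsing it. This is a simplified stand-in, assuming a regex over triple-backtick fences; the real helper's implementation may differ:

```python
import json
import re

def extract_first_json_block(text: str):
    """Return the parsed contents of the first fenced ```json block, or None.
    Illustrative sketch, not the actual GPTInvoker.extract_json code."""
    match = re.search(r"```json\s*\n(.*?)```", text, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))

reply = 'Here you go:\n```json\n{"title": "Tides", "tags": ["sea", "haiku"]}\n```\nDone.'
print(extract_first_json_block(reply))
```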

Caching

  • Backend: SQLite, located at gpt_cache.db in the same folder by default.
  • Keying: Prompt content + model arguments (including temperature, max_tokens, and optionally top_p).
  • Controls:
    • read_from_cache (bool, default True)
    • write_to_cache (bool, default True)
    • gpt_cache_path (str): custom path to the SQLite file
  • Threading: The connection is opened with check_same_thread set according to sqlite3.threadsafety. If the underlying SQLite build is not fully thread-safe, a warning is logged.
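The content-addressed keying above can be pictured as hashing a canonical serialization of the messages plus the model arguments. A minimal sketch, assuming SHA-256 over sorted JSON; the actual key derivation in gpt_invoker.py may differ:

```python
import hashlib
import json

def cache_key(messages, model, max_tokens, temperature, top_p=None):
    """Deterministic cache key: canonical JSON of all inputs, then a hash.
    Illustrative only; not the module's actual implementation."""
    payload = {
        "messages": messages,
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if top_p is not None:
        payload["top_p"] = top_p
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()

msgs = [{"role": "user", "content": "Write a haiku about the sea."}]
k1 = cache_key(msgs, "gpt-5", 1024, 0.0)
k2 = cache_key(msgs, "gpt-5", 1024, 0.0)  # identical inputs -> same key
k3 = cache_key(msgs, "gpt-5", 1024, 0.5)  # changed temperature -> different key
print(k1 == k2, k1 == k3)
```

Sorting keys and using a compact separator makes the serialization deterministic, which is what lets identical calls hit the cache.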

Logging and Dumps

Two logging forms are available:

  1. Markdown dumps (human-readable):

    • Location: gpt_dump/YYYY-MM-DD/HH-<C|N|E>/...
    • C = cache hit, N = fresh network call, E = error
    • Includes input messages, response metadata, and choices content
    • Configure with gpt_dump_folder (set to None to disable)
  2. Structured log (JSONL-like):

    • Location: gpt_log.txt by default
    • Append-only; one JSON object per line with timestamp, inputs, outputs, and flags
    • Controlled by gpt_log_path (set to None to disable)
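Because the structured log is one JSON object per line, it is easy to scan programmatically. A minimal sketch; the field names (timestamp, is_error) follow the README's description and may not match the actual schema exactly:

```python
import json

# Stand-in for the lines of gpt_log.txt; in practice you would read the file.
sample_log_lines = [
    '{"timestamp": "2024-01-01T00:00:00", "is_error": false}',
    '{"timestamp": "2024-01-01T00:01:00", "is_error": true}',
    '{"timestamp": "2024-01-01T00:02:00", "is_error": false}',
]

records = [json.loads(line) for line in sample_log_lines]
errors = [r for r in records if r.get("is_error")]
print(len(records), len(errors))
```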

API Overview

Constructor:

GPTInvoker(
    api_key: str | None = None,
    model: str | None = None,
    top_p: float | None = None,
    max_tokens: int = 4096,
    temperature: float = 0.0,
    api_host: str | None = None,
    gpt_cache_path: str | None = None,
    read_from_cache: bool = True,
    write_to_cache: bool = True,
    gpt_log_path: str | None = None,
    write_gpt_log: bool = True,
    gpt_dump_folder: str | None = None,
    dump_gpt_log: bool = True,
)

Key methods:

  • generate(messages) -> str: Returns the text content of the single choice; raises if the response has zero or multiple choices, or if the finish reason is not stop.
  • generate_all_res(messages, ignore_cache=False) -> ChatCompletion: Returns the full response object, consulting cache unless bypassed.
  • generate_inner(messages) -> ChatCompletion: Raw API call with current model args.
  • generate_inner_stream(messages) -> str: Streams and concatenates content.
  • extract_code(text) -> str: First fenced code block.
  • extract_json(text) -> Any: Parse first fenced json block.
  • extract_java(text) -> str: First fenced java block.
  • extract_invs(text) -> list[str]: All <invariant>...</invariant> blocks.

Token Usage and Pricing

  • invoker.usage tracks prompt/completion tokens and cache-hit tokens separately.
  • Pricing is a rough estimate using a static table embedded in the code. Treat it as indicative only.
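The rough estimation can be sketched as a lookup into a static per-model price table. The prices below are placeholders for illustration, not the module's actual table:

```python
# Hypothetical (input, output) USD prices per million tokens.
PRICE_PER_MTOK = {
    "gpt-5": (1.25, 10.00),
}
DEFAULT_TIER = (1.00, 4.00)  # fallback for unknown model names

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Indicative cost estimate; unknown models fall back to a default tier."""
    inp, out = PRICE_PER_MTOK.get(model, DEFAULT_TIER)
    return (prompt_tokens * inp + completion_tokens * out) / 1_000_000

print(estimate_cost("gpt-5", 1000, 200))
print(estimate_cost("unknown-model", 1000, 200))
```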

Troubleshooting

  • Missing API key: Set GPT_INVOKER_API_KEY or pass api_key to GPTInvoker.
  • Invalid model name: Usage estimates fall back to a default pricing tier with a warning; API calls still use the model you provided.
  • Cache not working: Ensure the process has write permissions to the folder. You can disable caching via read_from_cache=False and/or write_to_cache=False.
  • Threadsafety warning: SQLite may warn if sqlite3.threadsafety != 3. If you use multiple threads, consider one invoker instance per thread or add your own synchronization.
  • Base URL: If targeting a different host or gateway, set GPT_INVOKER_API_HOST.
  • Error diagnostics: Failures are recorded as -E dumps under gpt_dump/ with traceback text. The JSON log also records is_error=True.
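One way to follow the "one invoker per thread" suggestion is thread-local storage. A self-contained sketch using a stand-in factory; in practice the factory would construct a GPTInvoker:

```python
import threading

_local = threading.local()

def get_invoker(factory):
    """Return a per-thread instance, creating it lazily via factory().
    In real use, factory could be: lambda: GPTInvoker(model="gpt-5")."""
    if not hasattr(_local, "invoker"):
        _local.invoker = factory()
    return _local.invoker

created = []          # one entry per constructed instance
results = []          # per-thread: did repeated calls return the same object?

def factory():
    obj = object()
    created.append(obj)
    return obj

def worker():
    a = get_invoker(factory)
    b = get_invoker(factory)
    results.append(a is b)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(results), len(created))  # same instance within a thread; one per thread
```

Each thread then owns its own SQLite connection, sidestepping the threadsafety warning.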

Testing

Run the test suite:

uv run pytest -q

Highlights:

  • Unit tests cover cache overwrite behavior, dump logging branches, usage accounting, streaming, and error paths.
  • Coverage reports are generated in htmlcov/ and coverage.xml when configured.

Packaging and Publishing (PyPI)

This project uses a modern pyproject.toml configuration with setuptools. The module is a single file (gpt_invoker.py) exposed via py-modules.

Build (using uv + build backend):

uv build

Artifacts will be in dist/.

Upload to PyPI (requires twine and PyPI credentials):

uv tool install twine  # or: uv run python -m pip install twine
twine upload dist/*

Versioning:

  • Keep project.version in pyproject.toml in sync with __version__ in gpt_invoker.py.
  • Use tags/releases in your VCS as needed.
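A version-sync check along these lines could run in CI. File reading is stubbed with sample strings here; a real check would read pyproject.toml and gpt_invoker.py:

```python
import re

# Stand-ins for the relevant lines of the two files.
pyproject_snippet = 'version = "0.2.1"'
module_snippet = '__version__ = "0.2.1"'

pv = re.search(r'version\s*=\s*"([^"]+)"', pyproject_snippet).group(1)
mv = re.search(r'__version__\s*=\s*"([^"]+)"', module_snippet).group(1)
print(pv == mv)
```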

License

MIT License. See LICENSE for details.

