
Single-endpoint GenAI SDK (multi-provider, multimodal)

Project description

genai-calling


Chinese documentation: README_ZH.md

One interface for calling multimodal models, with four ways to use it: Skill, MCP, CLI, or SDK.

Features

  • Multi-provider: OpenAI, Google (Gemini), Anthropic (Claude), Aliyun (DashScope/Bailian), Volcengine (Doubao/Ark), Tuzi
  • Multimodal: text/image/audio/video input and output (model-dependent)
  • Unified API: a single Client.generate() for all providers
  • Streaming: generate_stream() for incremental output
  • Tool calling: function tools (model/provider-dependent)
  • JSON Schema output: structured output (model/provider-dependent)
  • MCP Server: Streamable HTTP and SSE transport
  • Security: SSRF protection, DNS pinning, download limits, Bearer token auth (MCP)

Installation

pip install genai-calling

The PyPI distribution is genai-calling; the Python import package is gravtice.
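
Because the distribution and import names differ, a quick smoke test after installing:

# Sketch: confirm the package installed under its import name, gravtice.
from gravtice import Client

print(Client)  # the zero-parameter client used throughout the quickstart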

For development:

pip install -e .
# or (recommended)
uv sync --group dev

Skill (External Repository)

The standalone skill is no longer bundled in this repository.

Preferred install name:

npx skills add gravtice/nous-skills -s genai-calling

Legacy catalogs may still expose the old entry name:

npx skills add gravtice/nous-skills -s nous-genai

Skill repository: https://github.com/gravtice/nous-skills

Configuration (Env Vars, Zero-parameter)

Configuration is managed via environment variables.

You can set env vars in two ways:

  1. Runtime env vars (inline or exported in shell)
  2. Env files (.env.local, .env.production, .env.development, .env.test) and the global fallback ~/.genai-calling/.env

Runtime example (inline):

GENAI_CALLING_OPENAI_API_KEY=... uv run genai --model openai:gpt-4o-mini --prompt "Hello"

When env files are used, the SDK, CLI, and MCP server load them automatically with the following priority (high to low):

.env.local > .env.production > .env.development > .env.test > ~/.genai-calling/.env

Process env vars override both project and global env files (the loader uses os.environ.setdefault()).

Use ~/.genai-calling/.env for user-wide shared defaults such as API keys. Keep worktree-specific settings such as ports in project-local .env.local.
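
The precedence can be written out as a short sketch; this is illustrative only, not the SDK's actual loader, but it mirrors the documented setdefault() behavior:

# Illustrative only: replicate the documented precedence with os.environ.setdefault().
import os
from pathlib import Path

# Highest priority first; with setdefault(), earlier files (and anything already
# in the process environment) win over later files.
ENV_FILES = [
    Path(".env.local"),
    Path(".env.production"),
    Path(".env.development"),
    Path(".env.test"),
    Path.home() / ".genai-calling" / ".env",
]

for env_file in ENV_FILES:
    if not env_file.is_file():
        continue
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())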

Minimal .env.local (OpenAI only):

GENAI_CALLING_OPENAI_API_KEY=...
GENAI_CALLING_TIMEOUT_MS=120000

See docs/CONFIGURATION.md for all options, or copy .env.example to .env.local.

Quickstart

CLI (fastest, unified API, agent-friendly)

# List available models by capabilities (out=text/image/audio/video/embedding)
uv run genai model available --all

# Text generation
uv run genai --model openai:gpt-4o-mini --prompt "Hello"

# Image understanding (image -> text)
uv run genai --model openai:gpt-4o-mini --prompt "Describe this image" --image-path ./examples/demo_image.png

# Image generation (text -> image file)
uv run genai --model openai:gpt-image-1 --prompt "A red cube on white background, minimal" --output-path ./out.png

# Speech-to-text (audio -> text)
uv run genai --model openai:whisper-1 --audio-path ./examples/demo_tts.mp3

# Text-to-speech (text -> audio file)
uv run genai --model openai:tts-1 --prompt "Hello from genai-calling" --output-path ./out.mp3

# Video generation (text -> video; async style)
uv run genai --model openai:sora-2 --prompt "A paper boat sailing on a rain puddle, cinematic" --no-wait
# ...later
uv run genai --model openai:sora-2 --job-id "<job_id>" --output-path ./out.mp4 --timeout-ms 600000

SDK: Text generation

from gravtice import Client, GenerateRequest, Message, OutputSpec, Part

client = Client()
resp = client.generate(
    GenerateRequest(
        model="openai:gpt-4o-mini",
        input=[Message(role="user", content=[Part.from_text("Hello!")])],
        output=OutputSpec(modalities=["text"]),
    )
)
print(resp.output[0].content[0].text)

SDK: Streaming

import sys
from gravtice import Client, GenerateRequest, Message, OutputSpec, Part

client = Client()
req = GenerateRequest(
    model="openai:gpt-4o-mini",
    input=[Message(role="user", content=[Part.from_text("Tell me a joke")])],
    output=OutputSpec(modalities=["text"]),
)
for ev in client.generate_stream(req):
    if ev.type == "output.text.delta":
        sys.stdout.write(str(ev.data.get("delta", "")))
        sys.stdout.flush()
print()

SDK: Image understanding

from gravtice import Client, GenerateRequest, Message, OutputSpec, Part, PartSourcePath
from gravtice import detect_mime_type

path = "./cat.png"
mime = detect_mime_type(path) or "application/octet-stream"

client = Client()
resp = client.generate(
    GenerateRequest(
        model="openai:gpt-4o-mini",
        input=[
            Message(
                role="user",
                content=[
                    Part.from_text("Describe this image"),
                    Part(type="image", mime_type=mime, source=PartSourcePath(path=path)),
                ],
            )
        ],
        output=OutputSpec(modalities=["text"]),
    )
)
print(resp.output[0].content[0].text)

SDK: List available models

from gravtice import Client

client = Client()
print(client.list_all_available_models())

Providers

  • openai: GPT-4, DALL·E, Whisper, TTS
  • google: Gemini, Imagen, Veo
  • anthropic: Claude
  • aliyun: DashScope / Bailian (OpenAI-compatible + AIGC)
  • volcengine: Ark / Doubao (OpenAI-compatible)
  • tuzi-web / tuzi-openai / tuzi-google / tuzi-anthropic: Tuzi adapters

Binary output

Binary Part.source is a tagged union:

  • Input: bytes/path/base64/url/ref (MCP forbids bytes/path)
  • Output: url/base64/ref (SDK does not auto-download to disk)

If you need to write output to a file, see examples/demo.py (_write_binary()), or reuse Client.download_to_file(), the built-in safe downloader.
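
A hedged sketch of persisting a generated image follows. The output modality name, the source attribute names (url, data), and the download_to_file(url, dest_path) signature are assumptions here; examples/demo.py is the canonical reference.

import base64
from gravtice import Client, GenerateRequest, Message, OutputSpec, Part

client = Client()
resp = client.generate(
    GenerateRequest(
        model="openai:gpt-image-1",
        input=[Message(role="user", content=[Part.from_text("A red cube on white background")])],
        output=OutputSpec(modalities=["image"]),
    )
)

part = resp.output[0].content[0]
src = part.source
if getattr(src, "url", None):            # URL-backed output (assumed attribute name)
    client.download_to_file(src.url, "./out.png")  # assumed signature: (url, dest_path)
elif getattr(src, "data", None):         # base64-backed output (assumed attribute name)
    with open("./out.png", "wb") as f:
        f.write(base64.b64decode(src.data))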

CLI & MCP Server

# CLI
uv run genai --model openai:gpt-4o-mini --prompt "Hello"
uv run genai model available --all

# Tuzi Chirp music
uv run genai --model tuzi-web:chirp-v3-5 --prompt "Lo-fi hiphop beat, 30s" --no-wait
# ...later
uv run genai --model tuzi-web:chirp-v3-5 --job-id "<job_id>" --output-path demo_suno.mp3 --timeout-ms 600000

# MCP Server
uv run genai-mcp-server                    # Streamable HTTP: /mcp, SSE: /sse
uv run genai-mcp-cli tools                 # Debug CLI
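
For a quick connectivity check against the SSE endpoint, here is a sketch using the official mcp Python SDK; the host, port, and absence of a Bearer token are assumptions, so adapt them to your deployment.

# Sketch: list the server's tools over the /sse transport (assumed local URL/port).
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    async with sse_client("http://127.0.0.1:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())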

Security

  • SSRF protection: rejects private/loopback URLs by default (GENAI_CALLING_ALLOW_PRIVATE_URLS=1 to allow)
  • DNS pinning: mitigates DNS rebinding
  • Download limit: 128MiB per URL by default (GENAI_CALLING_URL_DOWNLOAD_MAX_BYTES)
  • Bearer token auth: for MCP server
  • Token rules: fine-grained access control
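
These controls are driven by the environment variables above. A small sketch of tightening the download cap before constructing a client; when the SDK reads these values is not documented here, so setting them before any gravtice import is the safe assumption:

import os

# Lower the per-URL download cap to 32 MiB; SSRF protection stays on by default.
os.environ.setdefault("GENAI_CALLING_URL_DOWNLOAD_MAX_BYTES", str(32 * 1024 * 1024))
# os.environ["GENAI_CALLING_ALLOW_PRIVATE_URLS"] = "1"  # opt in only if you must fetch private/loopback URLs

from gravtice import Client

client = Client()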

Testing

uv run python -m pytest tests/ -v

Docs

See the docs/ directory (for example, docs/CONFIGURATION.md) for detailed documentation.

License

MIT


Download files

Download the file for your platform.

Source Distribution

genai_calling-0.1.6.tar.gz (110.5 kB)


Built Distribution


genai_calling-0.1.6-py3-none-any.whl (104.9 kB)


File details

Details for the file genai_calling-0.1.6.tar.gz.

File metadata

  • Download URL: genai_calling-0.1.6.tar.gz
  • Upload date:
  • Size: 110.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for genai_calling-0.1.6.tar.gz

  • SHA256: 3cc51cb401b535f045d6b88b8b33ef12c7ba92f04475cc7232614c16645df2f8
  • MD5: c699357355ba2281d33d5c2a0f625a6e
  • BLAKE2b-256: 8008868615295163ed9b2ce90e876a9c717f1b1d591b0a7656d8a552a6c380db

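To check a downloaded sdist against the SHA256 digest above, a standard-library verification (adjust the path to wherever you saved the file):

# Verify the sdist against the published SHA256 digest.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "3cc51cb401b535f045d6b88b8b33ef12c7ba92f04475cc7232614c16645df2f8"

digest = hashlib.sha256(Path("genai_calling-0.1.6.tar.gz").read_bytes()).hexdigest()
assert digest == EXPECTED_SHA256, f"hash mismatch: {digest}"
print("sdist hash verified")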

Provenance

The following attestation bundles were made for genai_calling-0.1.6.tar.gz:

Publisher: publish.yml on gravtice/genai-calling

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file genai_calling-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: genai_calling-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 104.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for genai_calling-0.1.6-py3-none-any.whl

  • SHA256: 4ed7aa57f731b0077d42d3adcd7d2d53cfce01a3c7b4aa55d3c4be2f208d9b80
  • MD5: 12df7febf3f617082b102ad81fc6b48e
  • BLAKE2b-256: 11ebe292b5d3f0a62733e86c431ac748278ef024a832045ef76db0ac16be09e9


Provenance

The following attestation bundles were made for genai_calling-0.1.6-py3-none-any.whl:

Publisher: publish.yml on gravtice/genai-calling

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
