
Single-endpoint GenAI SDK (multi-provider, multimodal)


genai-calling


Chinese documentation: README_ZH.md

One interface for calling multimodal models, usable in four ways: as a Skill, via MCP, from the CLI, or as an SDK.

Features

  • Multi-provider: OpenAI, Google (Gemini), Anthropic (Claude), Aliyun (DashScope/Bailian), Volcengine (Doubao/Ark), Tuzi
  • Multimodal: text/image/audio/video input and output (model-dependent)
  • Unified API: a single Client.generate() for all providers
  • Streaming: generate_stream() for incremental output
  • Tool calling: function tools (model/provider-dependent)
  • JSON Schema output: structured output (model/provider-dependent)
  • MCP Server: Streamable HTTP and SSE transport
  • Security: SSRF protection, DNS pinning, download limits, Bearer token auth (MCP)

Installation

pip install genai-calling

The distribution is named genai-calling, but Python imports use the gravtice.genai package.

For development:

pip install -e .
# or (recommended)
uv sync --group dev

Configuration (Env Vars, Zero-parameter)

Configuration is managed via environment variables.

You can set env vars in two ways:

  1. Runtime env vars (inline or exported in shell)
  2. Env files (.env.local, .env.production, .env.development, .env.test) and the global fallback ~/.genai-calling/.env

Runtime example (inline):

OPENAI_API_KEY=... uv run genai --model openai:gpt-4o-mini --prompt "Hello"

When env files are used, the SDK, CLI, and MCP server load them automatically with the following priority (high -> low):

.env.local > .env.production > .env.development > .env.test > ~/.genai-calling/.env

Process env vars override both project and global env files: the loader uses os.environ.setdefault(), so variables already set in the process are never overwritten.

Use ~/.genai-calling/.env for user-wide shared defaults such as API keys. Keep worktree-specific settings such as ports in project-local .env.local.
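The precedence above can be sketched as a loader that walks the files from highest to lowest priority and applies os.environ.setdefault(), so earlier files and the live process environment always win (a minimal sketch, assuming simple KEY=VALUE files; the real loader's file discovery may differ):

```python
import os

# Highest to lowest priority; setdefault() means the first writer wins,
# and variables already present in the process environment are never touched.
ENV_FILES = [
    ".env.local",
    ".env.production",
    ".env.development",
    ".env.test",
    os.path.expanduser("~/.genai-calling/.env"),
]

def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    pairs = {}
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                pairs[key.strip()] = value.strip()
    except FileNotFoundError:
        pass
    return pairs

def load_env(files=ENV_FILES):
    for path in files:
        for key, value in parse_env_file(path).items():
            os.environ.setdefault(key, value)
```

Because setdefault() never replaces an existing value, the first file in the list that defines a key wins, and anything exported in the shell beats every file.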

Minimal .env.local (OpenAI only):

OPENAI_API_KEY=...
GENAI_CALLING_TIMEOUT_MS=120000

See docs/CONFIGURATION.md for all options, or copy .env.example to .env.local.

Quickstart

CLI (fastest, unified API, agent-friendly)

# List available models by capabilities (out=text/image/audio/video/embedding)
uv run genai model available --all

# Text generation
uv run genai --model openai:gpt-4o-mini --prompt "Hello"

# Image understanding (image -> text)
uv run genai --model openai:gpt-4o-mini --prompt "Describe this image" --image-path ./examples/demo_image.png

# Image generation (text -> image file)
uv run genai --model openai:gpt-image-1 --prompt "A red cube on white background, minimal" --output-path ./out.png

# Speech-to-text (audio -> text)
uv run genai --model openai:whisper-1 --audio-path ./examples/demo_tts.mp3

# Text-to-speech (text -> audio file)
uv run genai --model openai:tts-1 --prompt "Hello from genai-calling" --output-path ./out.mp3

# Video generation (text -> video; async style)
uv run genai --model openai:sora-2 --prompt "A paper boat sailing on a rain puddle, cinematic" --no-wait
# ...later
uv run genai --model openai:sora-2 --job-id "<job_id>" --output-path ./out.mp4 --timeout-ms 600000

SDK: Text generation

from gravtice.genai import Client, GenerateRequest, Message, OutputSpec, Part

client = Client()
resp = client.generate(
    GenerateRequest(
        model="openai:gpt-4o-mini",
        input=[Message(role="user", content=[Part.from_text("Hello!")])],
        output=OutputSpec(modalities=["text"]),
    )
)
print(resp.output[0].content[0].text)

SDK: Streaming

import sys
from gravtice.genai import Client, GenerateRequest, Message, OutputSpec, Part

client = Client()
req = GenerateRequest(
    model="openai:gpt-4o-mini",
    input=[Message(role="user", content=[Part.from_text("Tell me a joke")])],
    output=OutputSpec(modalities=["text"]),
)
for ev in client.generate_stream(req):
    if ev.type == "output.text.delta":
        sys.stdout.write(str(ev.data.get("delta", "")))
        sys.stdout.flush()
print()

SDK: Image understanding

from gravtice.genai import (
    Client,
    GenerateRequest,
    Message,
    OutputSpec,
    Part,
    PartSourcePath,
    detect_mime_type,
)

path = "./cat.png"
mime = detect_mime_type(path) or "application/octet-stream"

client = Client()
resp = client.generate(
    GenerateRequest(
        model="openai:gpt-4o-mini",
        input=[
            Message(
                role="user",
                content=[
                    Part.from_text("Describe this image"),
                    Part(type="image", mime_type=mime, source=PartSourcePath(path=path)),
                ],
            )
        ],
        output=OutputSpec(modalities=["text"]),
    )
)
print(resp.output[0].content[0].text)

SDK: List available models

from gravtice.genai import Client

client = Client()
print(client.list_all_available_models())

Providers

  • openai: GPT-4, DALL·E, Whisper, TTS
  • google: Gemini, Imagen, Veo
  • anthropic: Claude
  • aliyun: DashScope / Bailian (OpenAI-compatible + AIGC)
  • volcengine: Ark / Doubao (OpenAI-compatible)
  • tuzi-web / tuzi-openai / tuzi-google / tuzi-anthropic: Tuzi adapters

Binary output

Binary Part.source is a tagged union:

  • Input: bytes/path/base64/url/ref (MCP forbids bytes/path)
  • Output: url/base64/ref (SDK does not auto-download to disk)

If you need to write output to a file, see examples/demo.py (_write_binary()), or use Client.download_to_file(), the built-in safe downloader.
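Saving an output part by hand follows directly from the tagged union: check which source variant is present and handle each case (a sketch with a stand-in dataclass; the attribute names below are illustrative, not the SDK's exact field names):

```python
import base64
from dataclasses import dataclass
from typing import Optional

# Stand-in for the tagged union on an output Part.source.
# Field names here are illustrative only.
@dataclass
class BinarySource:
    url: Optional[str] = None
    base64_data: Optional[str] = None
    ref: Optional[str] = None

def write_binary(source: BinarySource, path: str) -> None:
    """Persist an output part: inline base64 is decoded locally; URLs should
    go through a safe downloader (e.g. Client.download_to_file); refs are
    provider-side handles that must be resolved by the provider."""
    if source.base64_data is not None:
        with open(path, "wb") as fh:
            fh.write(base64.b64decode(source.base64_data))
    elif source.url is not None:
        raise NotImplementedError("download via the SDK's safe downloader")
    else:
        raise ValueError("ref sources cannot be materialized directly")
```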

CLI & MCP Server

# CLI
uv run genai --model openai:gpt-4o-mini --prompt "Hello"
uv run genai model available --all

# Tuzi Chirp music
uv run genai --model tuzi-web:chirp-v3-5 --prompt "Lo-fi hiphop beat, 30s" --no-wait
# ...later
uv run genai --model tuzi-web:chirp-v3-5 --job-id "<job_id>" --output-path demo_suno.mp3 --timeout-ms 600000

# MCP Server
uv run genai-mcp-server                    # Streamable HTTP: /mcp, SSE: /sse
uv run genai-mcp-cli tools                 # Debug CLI
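Talking to the server by hand is ordinary HTTP: Streamable HTTP clients POST JSON-RPC to the /mcp endpoint with the Bearer token in the Authorization header. A standard-library sketch of building such a request (GENAI_MCP_TOKEN is a hypothetical env var name chosen for illustration, not the server's documented setting):

```python
import json
import os
import urllib.request

def build_mcp_request(base_url: str, method: str, params: dict) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 POST aimed at the MCP Streamable HTTP endpoint.
    GENAI_MCP_TOKEN is a hypothetical variable name for the Bearer token."""
    body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": method, "params": params})
    headers = {"Content-Type": "application/json"}
    token = os.environ.get("GENAI_MCP_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url}/mcp", data=body.encode(), headers=headers, method="POST"
    )
```

Sending the request is then a urllib.request.urlopen() call; genai-mcp-cli wraps this kind of exchange for debugging.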

Security

  • SSRF protection: rejects private/loopback URLs by default (GENAI_CALLING_ALLOW_PRIVATE_URLS=1 to allow)
  • DNS pinning: mitigates DNS rebinding
  • Download limit: 128MiB per URL by default (GENAI_CALLING_URL_DOWNLOAD_MAX_BYTES)
  • Bearer token auth: for MCP server
  • Token rules: fine-grained access control
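The SSRF guard boils down to resolving the host and rejecting private, loopback, link-local, and reserved addresses before any bytes are fetched. An illustrative sketch, not the library's actual implementation (the real guard also pins the resolved IP for the subsequent connection to defeat DNS rebinding):

```python
import ipaddress
import os
import socket
from urllib.parse import urlparse

def is_url_allowed(url: str) -> bool:
    """Reject URLs whose host resolves to a private/loopback/link-local/
    reserved address, unless GENAI_CALLING_ALLOW_PRIVATE_URLS=1 opts out."""
    if os.environ.get("GENAI_CALLING_ALLOW_PRIVATE_URLS") == "1":
        return True
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
```

Checking every resolved address (not just the first) matters: a hostname with one public and one private A record would otherwise slip through.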

Testing

uv run python -m pytest tests/ -v

Docs

See docs/CONFIGURATION.md for configuration options.

License

MIT

Download files

Source Distribution: genai_calling-0.1.7.tar.gz (110.5 kB), uploaded via twine/6.1.0 on CPython/3.13.7 using Trusted Publishing.

  • SHA256: 7ec714dda79849015231015037a48494ff201121cc9d17e82e5e75cab553b69c
  • MD5: 7e92eef6cce55db542aa85fed41e4b27
  • BLAKE2b-256: bf4326dd2fac6f9c9506e54915ea5fbac4a711f462f0b5c66fd7f547945c4410

Built Distribution: genai_calling-0.1.7-py3-none-any.whl (104.6 kB), uploaded via twine/6.1.0 on CPython/3.13.7 using Trusted Publishing.

  • SHA256: ed6548be9cbc39c7511f088c38fd4588c46c7bb4a50c801c532e8075bf0816a0
  • MD5: 78df851e964bf37359c8e9abc673a74d
  • BLAKE2b-256: 516b5007f18dd1f40ee61d8f018ca5fb3ac6ab091490240a012d8857d6139099

Provenance: attestation bundles for both files were published by publish.yml on gravtice/genai-calling. Values shown reflect the state when the release was signed and may no longer be current.
