Multi-model unified interface SDK - one API to call every AI model

🤖 AGN-SDK

Unified API | 5+ Providers | Async-First | Production-Ready | Type-Safe

A unified SDK that calls AI models from multiple providers through one API: text chat, image generation, video creation, speech synthesis, speech recognition, and text embeddings.

Built with async-first design, full type safety, and a pluggable adapter architecture. If you're familiar with the OpenAI API, you can use AGN-SDK immediately.

✨ Features

Capabilities

Capability	Description	Status
💬 Chat Completion	Multi-turn conversations with AI models	✅ Stable
🖼️ Image Generation	Text-to-image generation	✅ Stable
🎬 Video Creation	Async video generation with polling	✅ Stable
🔊 Speech Synthesis	Text-to-speech generation (Edge TTS / ElevenLabs / Cartesia)	✅ Stable
🎤 Speech Recognition	Audio transcription (Deepgram / AssemblyAI)	✅ Stable
📊 Embeddings	Text embedding vectors (OpenAI / Gemini / Agnes / aggregators)	✅ Stable

Architecture Highlights

Unified Interface — One API to rule all AI providers (OpenAI, Azure, Anthropic, Gemini, etc.)
Async-First Design — Full async/await support, built on httpx and anyio
Adapter Pattern — Add new providers by implementing a single adapter class
Type Safety — All data models defined with Pydantic v2, full type hints throughout
Production-Ready — Built-in retry logic, error mapping, parameter normalization
OpenAI Compatible — Use OpenAI API patterns directly, minimal learning curve
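
The README does not show how the built-in retry logic works. As a rough illustration of retrying a flaky async call with exponential backoff, here is a standalone sketch. `TransientError`, `retry_async`, `make_flaky`, and the delay parameters are hypothetical names invented for this example; they are not the SDK's API.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (timeout, HTTP 429/5xx)."""


async def retry_async(func, *, attempts=3, base_delay=0.01, max_delay=1.0):
    """Call the zero-argument coroutine function `func`, retrying on TransientError.

    The delay doubles after each failed attempt, is capped at `max_delay`,
    and gets random jitter so concurrent clients do not retry in lockstep.
    """
    for attempt in range(attempts):
        try:
            return await func()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            await asyncio.sleep(delay + random.uniform(0, delay / 2))


def make_flaky(fail_times):
    """Build a coroutine function that fails `fail_times` times, then succeeds."""
    calls = {"count": 0}

    async def flaky():
        calls["count"] += 1
        if calls["count"] <= fail_times:
            raise TransientError("temporary failure")
        return calls["count"]  # number of calls it took to succeed

    return flaky
```

Only errors classified as transient are retried here; anything else propagates immediately, which is usually what you want for authentication or validation failures.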

📦 Supported Providers

V1.0 (Stable)

Provider	Chat	Image	Video	Base URL
Agnes AI	✅	✅	✅	`https://api.agnes.ai/v1`
OpenAI	✅	✅	—	`https://api.openai.com/v1`
Azure OpenAI	✅	✅	—	Azure endpoint

V1.1+ (Coming Soon)

Provider	Chat	Image	Video
Anthropic (Claude)	✅	—	—
Google Gemini	✅	✅	—
Runway	—	—	✅
Pika	—	—	✅
Stability AI	—	✅	—
ByteDance Seedance	✅	✅	✅

Audio & Embedding Providers (Stable)

Provider	TTS	ASR	Embed	Notes
Edge TTS	✅	—	—	Free, no API key, 100+ neural voices
ElevenLabs	✅	—	—	High-quality multilingual voices
Cartesia	✅	—	—	Ultra-low-latency Sonic TTS
Deepgram	—	✅	—	Nova-2/Nova-3, fastest ASR
AssemblyAI	—	✅	—	Enterprise ASR with diarization
OpenAI	—	—	✅	text-embedding-3-small/large
Google Gemini	—	—	✅	gemini-embedding-001
Agnes AI	—	—	✅	Unified embeddings
SiliconFlow / Together / Fireworks / Cloudflare	—	—	✅	Aggregator-hosted embedding models

📦 Project Structure

agn-sdk/
├── agn/                              # SDK core code
│   ├── __init__.py                   # SDK entry point
│   ├── client.py                     # Unified client (API layer)
│   ├── router.py                     # Router (routing layer)
│   ├── adapters/                     # Adapter implementations
│   │   ├── base.py                   # BaseAdapter abstract class
│   │   ├── factory.py                # Adapter factory
│   │   ├── agnes.py                  # Agnes AI adapter
│   │   ├── openai.py                 # OpenAI adapter
│   │   ├── azure.py                  # Azure OpenAI adapter
│   │   └── ...                       # More adapters
│   ├── core/                         # Core utilities
│   │   ├── http_client.py            # Async HTTP client
│   │   ├── retry.py                  # Retry mechanism
│   │   ├── errors.py                 # Error definitions
│   │   ├── config.py                 # Configuration
│   │   └── utils.py                  # Utilities
│   └── models/                       # Pydantic data models
│       ├── common.py                 # Common structures
│       ├── chat.py                   # Chat models
│       ├── image.py                  # Image models
│       ├── video.py                  # Video models
│       └── options.py                # Unified options
├── docs/                             # Documentation
│   ├── 01-overview.md                # Project overview
│   ├── 02-architecture.md            # Architecture design
│   └── 03-api-reference.md           # API reference
├── tests/                            # Test suite
├── examples/                         # Usage examples
├── pyproject.toml                    # Project config
└── README.md                         # Project docs (English)

🚀 Quick Start

Get started in 3 steps:

Step 1: Install

# From PyPI
pip install agn-sdk

# Or install from source (development mode)
git clone https://github.com/your-org/agn-sdk.git
cd agn-sdk
pip install -e .

Step 2: Configure API Key

# Option A: environment variables (recommended)
export AGN_API_KEY='your-api-key'
export AGN_BASE_URL='https://api.agnes.ai/v1'  # Provider-specific base URL

# Option B: .env file (auto-loaded)
echo "AGN_API_KEY=your-api-key" > .env
echo "AGN_BASE_URL=https://api.agnes.ai/v1" >> .env

# Option C: pass the values in code (Python, not shell)
from agn import Client
client = Client(provider="agnes", api_key="your-key", base_url="https://api.agnes.ai/v1")
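
The README does not say which option wins when several are set. The sketch below shows one plausible resolution order, assuming explicit arguments take precedence over `AGN_API_KEY` and `AGN_BASE_URL`. `resolve_settings` is a hypothetical helper written for this example, not part of the SDK, and it does not reproduce the `.env` auto-loading.

```python
import os


def resolve_settings(api_key=None, base_url=None, env=None):
    """Return (api_key, base_url), preferring explicit arguments over the environment.

    Assumption for this sketch: explicit arguments win over the
    AGN_API_KEY / AGN_BASE_URL environment variables.
    """
    env = os.environ if env is None else env
    # `is not None` keeps an explicit empty key (as used with Edge TTS) intact.
    key = api_key if api_key is not None else env.get("AGN_API_KEY")
    url = base_url if base_url is not None else env.get("AGN_BASE_URL")
    if key is None:
        raise ValueError("No API key: pass api_key or set AGN_API_KEY")
    return key, url
```

Passing `env` as a plain dict makes the helper easy to exercise in tests without touching the real process environment.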

Step 3: Call AI Models

import asyncio
from agn import Client

async def main():
    # Create client
    client = Client(
        provider="agnes",
        api_key="your-api-key",
        base_url="https://api.agnes.ai/v1",
    )
    
    # 💬 Chat Completion
    response = await client.chat(
        model="claude-3-opus",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"}
        ],
        temperature=0.7,
    )
    print(response.choices[0].message.content)
    
    # 🖼️ Image Generation
    result = await client.image_generate(
        model="dall-e-3",
        prompt="A beautiful sunset over the ocean",
        size="1024x1024",
        quality="hd",
    )
    print(result.data[0].url)
    
    # 🎬 Video Creation (async with polling)
    task = await client.video_create(
        model="video-gen-1",
        prompt="A cat walking in the garden",
        width=1280,
        height=720,
        num_frames=121,
    )
    
    # Poll video status until complete
    while True:
        status = await client.video_poll(task.task_id)
        print(f"Status: {status.status}, Progress: {status.progress}%")
        if status.status in ("completed", "failed"):
            break
        await asyncio.sleep(5)  # wait between polls instead of busy-looping

    if status.status == "completed":
        print(f"Video URL: {status.video_url}")

if __name__ == "__main__":
    asyncio.run(main())

✨ That's it! You now have a unified interface to all supported AI providers.
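
A polling loop like the one in step 3 runs forever if a task never reaches a terminal state. One way to bound it is a small helper with a deadline, sketched below. `wait_for_video` and `fake_poller` are hypothetical names for this example, the default interval and timeout are arbitrary assumptions, and the terminal status values `"completed"` and `"failed"` are taken from the loop above.

```python
import asyncio
import time
from types import SimpleNamespace


async def wait_for_video(poll, task_id, *, interval=5.0, timeout=600.0):
    """Poll until the task reports "completed" or "failed", or the timeout passes.

    `poll` is any coroutine function that takes a task ID and returns an
    object with a `.status` attribute, for example `client.video_poll`.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = await poll(task_id)
        if status.status in ("completed", "failed"):
            return status
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"task {task_id} still {status.status!r} after {timeout}s")
        await asyncio.sleep(interval)


def fake_poller(states):
    """Test double: yields each status in `states`, repeating the last one."""
    remaining = list(states)

    async def poll(task_id):
        state = remaining.pop(0) if len(remaining) > 1 else remaining[0]
        return SimpleNamespace(status=state, video_url=f"https://example.com/{task_id}.mp4")

    return poll
```

With a real client, the call would look like `status = await wait_for_video(client.video_poll, task.task_id)`, followed by a check of `status.status` before reading `status.video_url`.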

📖 Complete Usage Reference

Chat Completion

response = await client.chat(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    temperature=0.7,        # Randomness (0.0-2.0)
    max_tokens=1000,        # Max response tokens
    top_p=1.0,              # Nucleus sampling
    frequency_penalty=0.0,   # Repetition penalty
    presence_penalty=0.0,    # Topic diversity
    stream=False,            # Streaming response
)
print(response.choices[0].message.content)
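
For multi-turn conversations, the caller keeps the message list and resends it on each call. The sketch below shows one way to maintain that history in plain Python. `append_turn` and `trim_history` are hypothetical helpers, and the `"assistant"` role follows the OpenAI message convention, which is assumed to apply here because the SDK describes itself as OpenAI compatible.

```python
def append_turn(history, user_text, assistant_text):
    """Return a new message list with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


def trim_history(history, max_messages):
    """Keep all system messages plus the most recent `max_messages` others.

    Trimming bounds the prompt size as a conversation grows.
    """
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    if max_messages <= 0:
        return system
    return system + rest[-max_messages:]
```

In practice, `assistant_text` would be `response.choices[0].message.content` from the previous call, and the trimmed list would be passed as `messages` on the next one.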

Image Generation

result = await client.image_generate(
    model="dall-e-3",
    prompt="A futuristic city with flying cars",
    size="1024x1024",       # 1024x1024, 1024x1792, 1792x1024
    quality="hd",           # standard or hd
    style="vivid",          # vivid or natural (DALL-E 3)
    n=1,                    # Number of images
)
print(result.data[0].url)   # or result.data[0].b64_json

Video Creation

# Create video task
task = await client.video_create(
    model="video-gen-1",
    prompt="A dramatic sword fight scene",
    width=1280,
    height=720,
    num_frames=121,         # Must satisfy 8n+1 (e.g., 33, 49, 81, 121, 241)
    frame_rate=24,
    seed=42,                # Optional: for reproducibility
)
print(f"Task ID: {task.task_id}")

# Poll until the task reaches a terminal state (requires `import asyncio`)
status = await client.video_poll(task.task_id)
while status.status not in ("completed", "failed"):
    await asyncio.sleep(5)
    status = await client.video_poll(task.task_id)

print(f"Video URL: {status.video_url}")
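
The `num_frames` value must have the form 8n+1, which is easy to get wrong when starting from a clip length in seconds. The helpers below are a small sketch for checking and rounding frame counts. The function names are hypothetical, and the lower bound of 9 frames (n >= 1) is an assumption; the README states only the 8n+1 rule.

```python
def is_valid_frame_count(num_frames):
    """True when num_frames equals 8n + 1 for some integer n >= 1."""
    return num_frames >= 9 and (num_frames - 1) % 8 == 0


def nearest_valid_frame_count(num_frames):
    """Round to a closest 8n + 1 value, never going below 9."""
    n = max(1, round((num_frames - 1) / 8))
    return 8 * n + 1


def frames_for_duration(seconds, frame_rate=24):
    """Pick a valid frame count close to `seconds` of video at `frame_rate`."""
    return nearest_valid_frame_count(round(seconds * frame_rate))
```

For example, five seconds at 24 fps is 120 frames, which is not valid; the nearest valid count is 121, the value used in the call above.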

Speech Synthesis (TTS)

# Edge TTS — free, no API key required (install: pip install agn-sdk[edge-tts])
edge_client = Client(provider="edge-tts", api_key="")
result = await edge_client.speech(
    model="edge-tts",
    input="Hello, this is synthesized speech.",
    voice="xiaoxiao",          # short name or full ID: zh-CN-XiaoxiaoNeural
    response_format="mp3",     # mp3 / wav / ogg / pcm
    rate="+10%",               # optional: speed adjustment
)
with open("out.mp3", "wb") as f:
    f.write(result.audio_data)

# OpenAI TTS — uses alloy/echo/nova voices
result = await client.speech(
    model="tts-1",
    input="The quick brown fox jumps over the lazy dog.",
    voice="alloy",
    response_format="mp3",
    speed=1.0,
)

Speech Recognition (ASR)

# Deepgram Nova-2 (fastest) — accepts file path / URL / bytes / base64
result = await client.transcribe(
    model="nova-2",
    file="./meeting.wav",
    language="zh",             # optional: auto-detected if omitted
    smart_format=True,         # optional: punctuation + number formatting
)
print(result.text)
for seg in result.segments or []:
    print(f"[{seg.start:.2f}-{seg.end:.2f}] {seg.text}")

# AssemblyAI — enterprise ASR with speaker diarization
result = await client.transcribe(
    model="best",
    file="./interview.mp3",
    speaker_labels=True,
    sentiment_analysis=True,
)
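
The segment fields read in the Deepgram example (`start`, `end`, `text`) are enough to build subtitles. The sketch below converts such segments to SubRip (SRT) text. `Segment`, `srt_timestamp`, and `segments_to_srt` are hypothetical names for this example, and it assumes `start` and `end` are expressed in seconds, which the README does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Stand-in with the three fields the transcript loop reads (hypothetical)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SubRip layout)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n{seg.text.strip()}"
        )
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

With a real result, `segments_to_srt(result.segments or [])` would produce the subtitle text, assuming the segment objects expose the same three attributes.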

Text Embeddings

# Single text or batch — returns unified EmbeddingResult
result = await client.embed(
    model="text-embedding-3-small",
    input=["hello world", "machine learning"],
)
vectors = result.get_embeddings()   # list[list[float]]
print(len(vectors), len(vectors[0]))
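
Embedding vectors are usually compared with cosine similarity. The sketch below works directly on the `list[list[float]]` shape returned by `get_embeddings()` and uses only the standard library. `cosine_similarity` and `most_similar` are helper names chosen for this example, not SDK functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def most_similar(query, vectors):
    """Return the index of the vector closest to `query` by cosine similarity."""
    return max(range(len(vectors)), key=lambda i: cosine_similarity(query, vectors[i]))
```

For the two-item batch above, `cosine_similarity(vectors[0], vectors[1])` gives a single score for how close "hello world" and "machine learning" are in embedding space.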

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                    API Layer (Client)                   │
│            chat() / image_generate() / video_create()   │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                  Router Layer                           │
│          Model routing, load balancing, fallback        │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                 Adapter Layer                           │
│    BaseAdapter → AgnesAdapter / OpenAIAdapter / ...     │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                   Core Layer                            │
│      HTTP client, retry, errors, config, utils          │
└─────────────────────────────────────────────────────────┘

API Layer — Unified Client class, user-facing interface
Router Layer — Model selection, routing, load balancing
Adapter Layer — Provider-specific implementations, parameter mapping, response normalization
Core Layer — Shared utilities (HTTP, retry, errors, config)
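
The diagram lists fallback as a router responsibility without showing how it behaves. A minimal sketch of the idea is below: try providers in order and return the first success. `call_with_fallback`, `AllProvidersFailed`, and the two demo handlers are hypothetical names for this example; the real router in `agn/router.py` may work quite differently.

```python
import asyncio


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed (hypothetical)."""


async def call_with_fallback(providers, request):
    """Try each (name, handler) pair in order and return the first success.

    Returns a (provider_name, result) tuple so the caller can see which
    provider actually served the request.
    """
    errors = {}
    for name, handler in providers:
        try:
            return name, await handler(request)
        except Exception as exc:  # a real router would catch narrower error types
            errors[name] = exc
    raise AllProvidersFailed(errors)


async def always_down(request):
    """Demo handler that simulates an unavailable provider."""
    raise RuntimeError("provider unavailable")


async def echo_upper(request):
    """Demo handler that simulates a working provider."""
    return request.upper()
```

Collecting the per-provider errors, rather than keeping only the last one, makes it easier to see why the whole chain failed.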

📋 Adapter Development

Adding a new AI provider is straightforward:

1. Create adapter: inherit BaseAdapter and implement the required methods
2. Register factory: call AdapterFactory.register("provider_name", YourAdapter)
3. Declare capabilities: set the supported_capabilities list

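from agn.adapters.base import BaseAdapter
from agn.adapters.factory import AdapterFactory
# Capabilities and ChatMessage must also be imported from the SDK;
# their module paths are not shown in this README.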

class NewProviderAdapter(BaseAdapter):
    provider_type = "newprovider"
    provider_name = "New Provider"
    supported_capabilities = [Capabilities.CHAT, Capabilities.IMAGE_GENERATE]
    
    async def start(self) -> None:
        # Initialize HTTP client
        ...
    
    async def chat(self, model: str, messages: list[ChatMessage], **kwargs):
        # Implement chat logic
        ...
    
    # ... implement other methods

AdapterFactory.register("newprovider", NewProviderAdapter)
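
To make the register-then-create flow concrete without depending on the SDK, here is a self-contained sketch of the same pattern. `SketchAdapter`, `SketchFactory`, and `EchoAdapter` are hypothetical stand-ins written for this example; the real `BaseAdapter` and `AdapterFactory` may have different signatures and more required methods.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchAdapter(ABC):
    """Stand-in base class: every adapter must provide an async chat()."""
    provider_type = ""
    supported_capabilities = ()

    @abstractmethod
    async def chat(self, model, messages, **kwargs):
        ...


class SketchFactory:
    """Stand-in factory: maps provider names to adapter classes."""
    _registry = {}

    @classmethod
    def register(cls, name, adapter_cls):
        if not issubclass(adapter_cls, SketchAdapter):
            raise TypeError(f"{adapter_cls.__name__} is not a SketchAdapter")
        cls._registry[name] = adapter_cls

    @classmethod
    def create(cls, name, **options):
        try:
            adapter_cls = cls._registry[name]
        except KeyError:
            raise ValueError(
                f"unknown provider {name!r}; registered: {sorted(cls._registry)}"
            ) from None
        return adapter_cls(**options)


class EchoAdapter(SketchAdapter):
    """Toy adapter that returns the last message with a prefix."""
    provider_type = "echo"
    supported_capabilities = ("chat",)

    def __init__(self, prefix=""):
        self.prefix = prefix

    async def chat(self, model, messages, **kwargs):
        return self.prefix + messages[-1]["content"]


SketchFactory.register("echo", EchoAdapter)
```

Because callers only name a provider string, a new provider can be added by registering one more class, with no changes to the code that creates adapters.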

🧪 Development

# Clone and setup
git clone https://github.com/your-org/agn-sdk.git
cd agn-sdk
python -m venv venv
source venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Code formatting
black agn/

# Linting
ruff check agn/

# Type checking
mypy agn/

📜 License

MIT License

Release history

1.3.3 (this version): Jun 27, 2026
1.3.2: Jun 27, 2026
1.3.1: Jun 27, 2026
1.3.0: Jun 27, 2026
1.1.1: Jun 26, 2026
1.1.0: Jun 26, 2026
1.0.0: Jun 25, 2026

Download files

Source Distribution

agn_sdk-1.3.3.tar.gz (198.6 kB)

Uploaded Jun 27, 2026 Source

Built Distribution

agn_sdk-1.3.3-py3-none-any.whl (143.3 kB)

Uploaded Jun 27, 2026 Python 3

File details

Details for the file agn_sdk-1.3.3.tar.gz.

File metadata

Download URL: agn_sdk-1.3.3.tar.gz
Upload date: Jun 27, 2026
Size: 198.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for agn_sdk-1.3.3.tar.gz
Algorithm	Hash digest
SHA256	`ead85af680cd99300249a304efb46a82b05d762dd0181079489193f040baa896`
MD5	`be3917228d2c274e94a22b456b7a31bf`
BLAKE2b-256	`db2e14431b778e3eb34b806c7745e75248dad777f4029a0159708937519bdb30`

File details

Details for the file agn_sdk-1.3.3-py3-none-any.whl.

File metadata

Download URL: agn_sdk-1.3.3-py3-none-any.whl
Upload date: Jun 27, 2026
Size: 143.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for agn_sdk-1.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3bf7353d969291eae94ccf1f7085429d40524b2f8435d514284b9b0678335616`
MD5	`9e7da43578b0f01371b24f394fe94c0e`
BLAKE2b-256	`61fdc55b113b3e213f94493435d167a77c00c59e136574e94d08ea83b48c82a1`
