Unified multi-model SDK: call every AI model through one API

🤖 AGN-SDK

Unified API | 5+ Providers | Async-First | Production-Ready | Type-Safe



A unified SDK for calling AI models from multiple providers through one API, covering text chat, image generation, video creation, speech synthesis, speech recognition, and text embeddings.

Built with async-first design, full type safety, and a pluggable adapter architecture. If you're familiar with the OpenAI API, you can use AGN-SDK immediately.


✨ Features

Capabilities

Capability | Description | Status
--- | --- | ---
💬 Chat Completion | Multi-turn conversations with AI models | ✅ Stable
🖼️ Image Generation | Text-to-image generation | ✅ Stable
🎬 Video Creation | Async video generation with polling | ✅ Stable
🔊 Speech Synthesis | Text-to-speech generation (Edge TTS / ElevenLabs / Cartesia) | ✅ Stable
🎤 Speech Recognition | Audio transcription (Deepgram / AssemblyAI) | ✅ Stable
📊 Embeddings | Text embedding vectors (OpenAI / Gemini / Agnes / aggregators) | ✅ Stable

Architecture Highlights

  • Unified Interface — One API to rule all AI providers (OpenAI, Azure, Anthropic, Gemini, etc.)
  • Async-First Design — Full async/await support, built on httpx and anyio
  • Adapter Pattern — Add new providers by implementing a single adapter class
  • Type Safety — All data models defined with Pydantic v2, full type hints throughout
  • Production-Ready — Built-in retry logic, error mapping, parameter normalization
  • OpenAI Compatible — Use OpenAI API patterns directly, minimal learning curve
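
The built-in retry logic mentioned above is not shown in this README. As a rough illustration of the general idea only, here is a minimal sketch of async retry with exponential backoff. `TransientError` and `retry_async` are hypothetical names for this example and are not part of the AGN-SDK API.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. HTTP 429 or 503)."""


async def retry_async(func, *, attempts=3, base_delay=0.5, max_delay=8.0):
    """Call an async function, retrying on TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # The delay doubles on each attempt, is capped at max_delay,
            # and is scaled by random jitter so clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))


async def demo():
    calls = {"n": 0}

    async def flaky():
        # Fails twice, then succeeds on the third call.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TransientError("temporary failure")
        return "ok"

    result = await retry_async(flaky, attempts=5, base_delay=0.01)
    return result, calls["n"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Only errors that are safe to repeat should be retried this way; authentication or validation failures would normally be raised immediately.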

📦 Supported Providers

V1.0 (Stable)

Provider Chat Image Video Base URL
Agnes AI https://api.agnes.ai/v1
OpenAI https://api.openai.com/v1
Azure OpenAI Azure endpoint

V1.1+ (Coming Soon)

Provider Chat Image Video
Anthropic (Claude)
Google Gemini
Runway
Pika
Stability AI
ByteDance Seedance

Audio & Embedding Providers (Stable)

Provider TTS ASR Embed Notes
Edge TTS Free, no API key, 100+ neural voices
ElevenLabs High-quality multilingual voices
Cartesia Ultra-low-latency Sonic TTS
Deepgram Nova-2/Nova-3, fastest ASR
AssemblyAI Enterprise ASR with diarization
OpenAI text-embedding-3-small/large
Google Gemini gemini-embedding-001
Agnes AI Unified embeddings
SiliconFlow / Together / Fireworks / Cloudflare Aggregator-hosted embedding models

📦 Project Structure

agn-sdk/
├── agn/                              # SDK core code
│   ├── __init__.py                   # SDK entry point
│   ├── client.py                     # Unified client (API layer)
│   ├── router.py                     # Router (routing layer)
│   ├── adapters/                     # Adapter implementations
│   │   ├── base.py                   # BaseAdapter abstract class
│   │   ├── factory.py                # Adapter factory
│   │   ├── agnes.py                  # Agnes AI adapter
│   │   ├── openai.py                 # OpenAI adapter
│   │   ├── azure.py                  # Azure OpenAI adapter
│   │   └── ...                       # More adapters
│   ├── core/                         # Core utilities
│   │   ├── http_client.py            # Async HTTP client
│   │   ├── retry.py                  # Retry mechanism
│   │   ├── errors.py                 # Error definitions
│   │   ├── config.py                 # Configuration
│   │   └── utils.py                  # Utilities
│   └── models/                       # Pydantic data models
│       ├── common.py                 # Common structures
│       ├── chat.py                   # Chat models
│       ├── image.py                  # Image models
│       ├── video.py                  # Video models
│       └── options.py                # Unified options
├── docs/                             # Documentation
│   ├── 01-overview.md                # Project overview
│   ├── 02-architecture.md            # Architecture design
│   └── 03-api-reference.md           # API reference
├── tests/                            # Test suite
├── examples/                         # Usage examples
├── pyproject.toml                    # Project config
└── README.md                         # Project docs (English)

🚀 Quick Start

Get started in 3 steps:

Step 1: Install

# From PyPI
pip install agn-sdk

# Or install from source (development mode)
git clone https://github.com/your-org/agn-sdk.git
cd agn-sdk
pip install -e .

Step 2: Configure API Key

# Option A: environment variables (recommended)
export AGN_API_KEY='your-api-key'
export AGN_BASE_URL='https://api.agnes.ai/v1'  # Provider-specific base URL

# Option B: .env file (auto-loaded)
echo "AGN_API_KEY=your-api-key" > .env
echo "AGN_BASE_URL=https://api.agnes.ai/v1" >> .env

# Option C: pass in code (Python, not shell)
from agn import Client
client = Client(provider="agnes", api_key="your-key", base_url="https://api.agnes.ai/v1")
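
The README does not say which option wins when more than one is set. The sketch below shows one plausible precedence, assuming explicit arguments override the `AGN_API_KEY` / `AGN_BASE_URL` environment variables. `Settings` and `resolve_settings` are hypothetical helpers for illustration, not SDK functions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str


def resolve_settings(api_key=None, base_url=None, env=None):
    """Explicit arguments win; otherwise fall back to AGN_API_KEY / AGN_BASE_URL.

    This precedence is an assumption made for the example.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("AGN_API_KEY")
    url = base_url or env.get("AGN_BASE_URL")
    if not key:
        raise ValueError("No API key: pass api_key or set AGN_API_KEY")
    if not url:
        raise ValueError("No base URL: pass base_url or set AGN_BASE_URL")
    # Strip a trailing slash so request paths can be appended consistently.
    return Settings(api_key=key, base_url=url.rstrip("/"))


if __name__ == "__main__":
    fake_env = {"AGN_API_KEY": "env-key", "AGN_BASE_URL": "https://api.agnes.ai/v1/"}
    print(resolve_settings(env=fake_env))
    print(resolve_settings(api_key="explicit-key", env=fake_env))
```

Note that keyless providers such as Edge TTS pass `api_key=""`, so a real implementation would need to relax the key check for them.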

Step 3: Call AI Models

import asyncio
from agn import Client

async def main():
    # Create client
    client = Client(
        provider="agnes",
        api_key="your-api-key",
        base_url="https://api.agnes.ai/v1",
    )
    
    # 💬 Chat Completion
    response = await client.chat(
        model="claude-3-opus",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"}
        ],
        temperature=0.7,
    )
    print(response.choices[0].message.content)
    
    # 🖼️ Image Generation
    result = await client.image_generate(
        model="dall-e-3",
        prompt="A beautiful sunset over the ocean",
        size="1024x1024",
        quality="hd",
    )
    print(result.data[0].url)
    
    # 🎬 Video Creation (async with polling)
    task = await client.video_create(
        model="video-gen-1",
        prompt="A cat walking in the garden",
        width=1280,
        height=720,
        num_frames=121,
    )
    
    # Poll video status until complete, pausing between requests
    while True:
        status = await client.video_poll(task.task_id)
        print(f"Status: {status.status}, Progress: {status.progress}%")
        if status.status in ("completed", "failed"):
            break
        await asyncio.sleep(5)

    if status.status == "completed":
        print(f"Video URL: {status.video_url}")
    else:
        print("Video generation failed")

if __name__ == "__main__":
    asyncio.run(main())

That's it! You now have a unified interface to all supported AI providers.
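
The video polling loop in Step 3 can be wrapped in a reusable helper with a timeout. The following is a minimal sketch, assuming only that the poll function is an async callable returning an object with a `.status` attribute, as `client.video_poll` does above. `wait_for_video` is a hypothetical helper, and the `"queued"` status in the demo is made up for illustration.

```python
import asyncio
import time
from types import SimpleNamespace


async def wait_for_video(poll, task_id, *, interval=5.0, timeout=600.0):
    """Poll until the task reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = await poll(task_id)
        if status.status in ("completed", "failed"):
            return status
        # Stop early if the next sleep would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"Task {task_id} still {status.status!r} after {timeout}s")
        await asyncio.sleep(interval)


async def demo():
    # Fake poller standing in for client.video_poll: completes on the third call.
    states = iter(["queued", "in_progress", "completed"])

    async def fake_poll(task_id):
        return SimpleNamespace(status=next(states), video_url="https://example.com/v.mp4")

    return await wait_for_video(fake_poll, "task-123", interval=0.01, timeout=1.0)


if __name__ == "__main__":
    final = asyncio.run(demo())
    print(final.status, final.video_url)
```

With a real client the call would look like `await wait_for_video(client.video_poll, task.task_id)`.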


📖 Complete Usage Reference

Chat Completion

response = await client.chat(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    temperature=0.7,        # Randomness (0.0-2.0)
    max_tokens=1000,        # Max response tokens
    top_p=1.0,              # Nucleus sampling
    frequency_penalty=0.0,   # Repetition penalty
    presence_penalty=0.0,    # Topic diversity
    stream=False,            # Streaming response
)
print(response.choices[0].message.content)
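
For multi-turn conversations, the caller keeps the message history and sends it again on each request. Here is a small sketch of one way to manage that list. The `Conversation` class is hypothetical and not part of the SDK; it only builds the `messages` argument shown above.

```python
class Conversation:
    """Keeps the running message list for a multi-turn chat."""

    def __init__(self, system_prompt=None, max_messages=20):
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self._system = system_prompt
        self._history = []  # user and assistant messages, oldest first
        self._max_messages = max_messages

    def add(self, role, content):
        self._history.append({"role": role, "content": content})

    def messages(self):
        """Return the system prompt (if any) plus the most recent messages."""
        recent = self._history[-self._max_messages:]
        head = [{"role": "system", "content": self._system}] if self._system else []
        return head + recent


if __name__ == "__main__":
    chat = Conversation("You are a helpful assistant.", max_messages=2)
    chat.add("user", "Hi")
    chat.add("assistant", "Hello!")
    chat.add("user", "Explain quantum computing in simple terms.")
    for message in chat.messages():
        print(message)
```

After each reply, append `response.choices[0].message.content` with the `"assistant"` role and pass `chat.messages()` as `messages=` on the next call. Truncating by message count is a simplification; a token-based budget would be more precise.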

Image Generation

result = await client.image_generate(
    model="dall-e-3",
    prompt="A futuristic city with flying cars",
    size="1024x1024",       # 1024x1024, 1024x1792, 1792x1024
    quality="hd",           # standard or hd
    style="vivid",          # vivid or natural (DALL-E 3)
    n=1,                    # Number of images
)
print(result.data[0].url)   # or result.data[0].b64_json
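
When a result carries `b64_json` instead of a URL, the payload has to be decoded before it can be written to disk. This is a minimal sketch using only the standard library. `save_b64_image` is a hypothetical helper, and the demo uses a stand-in object rather than a real SDK response.

```python
import base64
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_b64_image(item, path):
    """Decode `item.b64_json` and write the raw bytes to `path`."""
    payload = getattr(item, "b64_json", None)
    if not payload:
        raise ValueError("Item has no b64_json payload; download item.url instead")
    data = base64.b64decode(payload)
    Path(path).write_bytes(data)
    return len(data)


def demo():
    # Stand-in for result.data[0]; the bytes are not a real image.
    fake_bytes = b"not-a-real-image"
    item = SimpleNamespace(url=None, b64_json=base64.b64encode(fake_bytes).decode("ascii"))
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "image.png"
        written = save_b64_image(item, target)
        return written, target.read_bytes() == fake_bytes


if __name__ == "__main__":
    print(demo())
```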

Video Creation

# Create video task
task = await client.video_create(
    model="video-gen-1",
    prompt="A dramatic sword fight scene",
    width=1280,
    height=720,
    num_frames=121,         # Must satisfy 8n+1 (e.g., 33, 49, 81, 121, 241)
    frame_rate=24,
    seed=42,                # Optional: for reproducibility
)
print(f"Task ID: {task.task_id}")

# Poll until the task reaches a terminal state (requires `import asyncio`)
status = await client.video_poll(task.task_id)
while status.status not in ("completed", "failed"):
    await asyncio.sleep(5)
    status = await client.video_poll(task.task_id)

if status.status == "completed":
    print(f"Video URL: {status.video_url}")
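
The `num_frames` constraint (8n+1) is easy to get wrong by one. The sketch below checks a value and rounds an arbitrary request to the nearest valid count. Both helpers are hypothetical, and requiring n >= 1 is an assumption, since the README does not say whether a single frame is accepted.

```python
def is_valid_num_frames(num_frames: int) -> bool:
    """True when num_frames has the form 8n + 1 with n >= 1."""
    return num_frames >= 9 and (num_frames - 1) % 8 == 0


def nearest_valid_num_frames(requested: int) -> int:
    """Round to the nearest value of the form 8n + 1 (ties round up)."""
    n = max(1, (requested - 1 + 4) // 8)
    return 8 * n + 1


if __name__ == "__main__":
    for frames in (33, 100, 120, 121):
        fixed = nearest_valid_num_frames(frames)
        # Clip length follows from the frame count and frame_rate=24.
        print(frames, is_valid_num_frames(frames), fixed, f"{fixed / 24:.2f}s")
```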

Speech Synthesis (TTS)

# Edge TTS — free, no API key required (install: pip install agn-sdk[edge-tts])
edge_client = Client(provider="edge-tts", api_key="")
result = await edge_client.speech(
    model="edge-tts",
    input="Hello, this is synthesized speech.",
    voice="xiaoxiao",          # short name or full ID: zh-CN-XiaoxiaoNeural
    response_format="mp3",     # mp3 / wav / ogg / pcm
    rate="+10%",               # optional: speed adjustment
)
with open("out.mp3", "wb") as f:
    f.write(result.audio_data)

# OpenAI TTS — uses alloy/echo/nova voices
result = await client.speech(
    model="tts-1",
    input="The quick brown fox jumps over the lazy dog.",
    voice="alloy",
    response_format="mp3",
    speed=1.0,
)

Speech Recognition (ASR)

# Deepgram Nova-2 (fastest) — accepts file path / URL / bytes / base64
result = await client.transcribe(
    model="nova-2",
    file="./meeting.wav",
    language="zh",             # optional: auto-detected if omitted
    smart_format=True,         # optional: punctuation + number formatting
)
print(result.text)
for seg in result.segments or []:
    print(f"[{seg.start:.2f}-{seg.end:.2f}] {seg.text}")

# AssemblyAI — enterprise ASR with speaker diarization
result = await client.transcribe(
    model="best",
    file="./interview.mp3",
    speaker_labels=True,
    sentiment_analysis=True,
)
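
Transcription segments with `start`, `end`, and `text` fields map naturally onto subtitle formats. As an illustration, this sketch converts such segments to SRT text. The helper functions are hypothetical, and the demo segments are stand-ins for `result.segments`.

```python
from types import SimpleNamespace


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build numbered SRT cues from objects with .start, .end and .text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{timing}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo_segments = [
        SimpleNamespace(start=0.0, end=1.25, text=" Hello everyone. "),
        SimpleNamespace(start=1.25, end=3.5, text="Let's get started."),
    ]
    print(segments_to_srt(demo_segments))
```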

Text Embeddings

# Single text or batch — returns unified EmbeddingResult
result = await client.embed(
    model="text-embedding-3-small",
    input=["hello world", "machine learning"],
)
vectors = result.get_embeddings()   # list[list[float]]
print(len(vectors), len(vectors[0]))
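
A common next step with embedding vectors is comparing them by cosine similarity. The sketch below works on plain `list[float]` values such as those returned by `get_embeddings()`. The function names are hypothetical and the demo vectors are toy values, not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = [1.0, 0.0]
    docs = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
    for index, score in rank_by_similarity(query, docs):
        print(index, round(score, 4))
```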

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                    API Layer (Client)                   │
│            chat() / image_generate() / video_create()   │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                  Router Layer                           │
│          Model routing, load balancing, fallback        │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                 Adapter Layer                           │
│    BaseAdapter → AgnesAdapter / OpenAIAdapter / ...     │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                   Core Layer                            │
│      HTTP client, retry, errors, config, utils          │
└─────────────────────────────────────────────────────────┘

  • API Layer: Unified Client class, user-facing interface
  • Router Layer: Model selection, routing, load balancing
  • Adapter Layer: Provider-specific implementations, parameter mapping, response normalization
  • Core Layer: Shared utilities (HTTP, retry, errors, config)
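
The Router Layer's fallback behaviour is not documented here. To make the concept concrete, the sketch below tries providers in order and returns the first successful reply. It is not the SDK's `router.py`: `ProviderError`, `call_with_fallback`, and the two fake adapters are hypothetical.

```python
import asyncio


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


async def call_with_fallback(adapters, model, messages):
    """Try each (name, adapter) pair in order; return the first success."""
    errors = []
    for name, adapter in adapters:
        try:
            return name, await adapter(model, messages)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember why this provider failed
    raise ProviderError("all providers failed: " + "; ".join(errors))


async def demo():
    async def primary(model, messages):
        raise ProviderError("rate limited")

    async def backup(model, messages):
        return f"echo: {messages[-1]['content']}"

    return await call_with_fallback(
        [("primary", primary), ("backup", backup)],
        "any-model",
        [{"role": "user", "content": "Hello!"}],
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```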

📋 Adapter Development

Adding a new AI provider is straightforward:

  1. Create adapter — Inherit BaseAdapter, implement required methods
  2. Register factory — Call AdapterFactory.register("provider_name", YourAdapter)
  3. Declare capabilities — Set supported_capabilities list
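
from agn.adapters.base import BaseAdapter
from agn.adapters.factory import AdapterFactory
# Capabilities and ChatMessage must also be imported; their module paths are
# not shown in this README (ChatMessage presumably lives in agn/models/chat.py).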

class NewProviderAdapter(BaseAdapter):
    provider_type = "newprovider"
    provider_name = "New Provider"
    supported_capabilities = [Capabilities.CHAT, Capabilities.IMAGE_GENERATE]
    
    async def start(self) -> None:
        # Initialize HTTP client
        ...
    
    async def chat(self, model: str, messages: list[ChatMessage], **kwargs):
        # Implement chat logic
        ...
    
    # ... implement other methods

AdapterFactory.register("newprovider", NewProviderAdapter)
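
To show how a register-then-create factory of this kind typically works, here is a self-contained sketch. The `BaseAdapter` and `AdapterFactory` below are simplified stand-ins written for this example, not the SDK's classes, and the `create` method and `EchoAdapter` are assumptions.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseAdapter(ABC):
    """Simplified stand-in for the SDK's BaseAdapter."""

    provider_type = ""

    @abstractmethod
    async def chat(self, model, messages, **kwargs):
        ...


class AdapterFactory:
    """Maps provider names to adapter classes (stand-in, not the SDK's factory)."""

    _registry = {}

    @classmethod
    def register(cls, name, adapter_cls):
        if not issubclass(adapter_cls, BaseAdapter):
            raise TypeError("adapter_cls must subclass BaseAdapter")
        cls._registry[name] = adapter_cls

    @classmethod
    def create(cls, name, **kwargs):
        # `create` is a hypothetical method name for this sketch.
        try:
            adapter_cls = cls._registry[name]
        except KeyError:
            known = sorted(cls._registry)
            raise ValueError(f"Unknown provider {name!r}; registered: {known}") from None
        return adapter_cls(**kwargs)


class EchoAdapter(BaseAdapter):
    """Toy adapter that returns the last message unchanged."""

    provider_type = "echo"

    async def chat(self, model, messages, **kwargs):
        return messages[-1]["content"]


AdapterFactory.register("echo", EchoAdapter)

if __name__ == "__main__":
    adapter = AdapterFactory.create("echo")
    print(asyncio.run(adapter.chat("any-model", [{"role": "user", "content": "hi"}])))
```

Keeping the registry keyed by provider name is what lets a client select an adapter from a plain string such as `provider="agnes"`.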

🧪 Development

# Clone and setup
git clone https://github.com/your-org/agn-sdk.git
cd agn-sdk
python -m venv venv
source venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Code formatting
black agn/

# Linting
ruff check agn/

# Type checking
mypy agn/

📜 License

MIT License
