Multi-model unified interface SDK: one API for calling all AI models
🤖 AGN-SDK
Unified API | 5+ Providers | Async-First | Production-Ready | Type-Safe
A unified SDK that calls all AI models through one API — whether it's text chat, image generation, video creation, or speech synthesis.
Built with async-first design, full type safety, and a pluggable adapter architecture. If you're familiar with the OpenAI API, you can use AGN-SDK immediately.
✨ Features
Capabilities
| Capability | Description | Status |
|---|---|---|
| 💬 Chat Completion | Multi-turn conversations with AI models | ✅ Stable |
| 🖼️ Image Generation | Text-to-image generation | ✅ Stable |
| 🎬 Video Creation | Async video generation with polling | ✅ Stable |
| 🔊 Speech Synthesis | Text-to-speech generation (Edge TTS / ElevenLabs / Cartesia) | ✅ Stable |
| 🎤 Speech Recognition | Audio transcription (Deepgram / AssemblyAI) | ✅ Stable |
| 📊 Embeddings | Text embedding vectors (OpenAI / Gemini / Agnes / aggregators) | ✅ Stable |
Architecture Highlights
- Unified Interface: One API to rule all AI providers (OpenAI, Azure, Anthropic, Gemini, etc.)
- Async-First Design: Full async/await support, built on `httpx` and `anyio`
- Adapter Pattern: Add new providers by implementing a single adapter class
- Type Safety: All data models defined with Pydantic v2, full type hints throughout
- Production-Ready: Built-in retry logic, error mapping, parameter normalization
- OpenAI Compatible: Use OpenAI API patterns directly, minimal learning curve
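To make the "built-in retry logic" item concrete, here is a minimal sketch of retrying an async call with exponential backoff and jitter. It is not the SDK's own `retry.py`; `TransientError`, `retry_async`, and all parameter names are hypothetical and chosen only for illustration.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (timeout, HTTP 429/5xx)."""


async def retry_async(func, *, attempts=3, base_delay=0.5, max_delay=8.0):
    """Await `func()` until it succeeds or `attempts` calls have failed."""
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(random.uniform(0, delay))
```

Wrapping a request would look like `await retry_async(lambda: client.chat(...))`, assuming the call's retryable failures are translated into `TransientError` first.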
📦 Supported Providers
V1.0 (Stable)
| Provider | Chat | Image | Video | Base URL |
|---|---|---|---|---|
| Agnes AI | ✅ | ✅ | ✅ | https://api.agnes.ai/v1 |
| OpenAI | ✅ | ✅ | — | https://api.openai.com/v1 |
| Azure OpenAI | ✅ | ✅ | — | Azure endpoint |
V1.1+ (Coming Soon)
| Provider | Chat | Image | Video |
|---|---|---|---|
| Anthropic (Claude) | ✅ | — | — |
| Google Gemini | ✅ | ✅ | — |
| Runway | — | — | ✅ |
| Pika | — | — | ✅ |
| Stability AI | — | ✅ | — |
| ByteDance Seedance | ✅ | ✅ | ✅ |
Audio & Embedding Providers (Stable)
| Provider | TTS | ASR | Embed | Notes |
|---|---|---|---|---|
| Edge TTS | ✅ | — | — | Free, no API key, 100+ neural voices |
| ElevenLabs | ✅ | — | — | High-quality multilingual voices |
| Cartesia | ✅ | — | — | Ultra-low-latency Sonic TTS |
| Deepgram | — | ✅ | — | Nova-2/Nova-3, fastest ASR |
| AssemblyAI | — | ✅ | — | Enterprise ASR with diarization |
| OpenAI | — | — | ✅ | text-embedding-3-small/large |
| Google Gemini | — | — | ✅ | gemini-embedding-001 |
| Agnes AI | — | — | ✅ | Unified embeddings |
| SiliconFlow / Together / Fireworks / Cloudflare | — | — | ✅ | Aggregator-hosted embedding models |
📦 Project Structure
agn-sdk/
├── agn/                      # SDK core code
│   ├── __init__.py           # SDK entry point
│   ├── client.py             # Unified client (API layer)
│   ├── router.py             # Router (routing layer)
│   ├── adapters/             # Adapter implementations
│   │   ├── base.py           # BaseAdapter abstract class
│   │   ├── factory.py        # Adapter factory
│   │   ├── agnes.py          # Agnes AI adapter
│   │   ├── openai.py         # OpenAI adapter
│   │   ├── azure.py          # Azure OpenAI adapter
│   │   └── ...               # More adapters
│   ├── core/                 # Core utilities
│   │   ├── http_client.py    # Async HTTP client
│   │   ├── retry.py          # Retry mechanism
│   │   ├── errors.py         # Error definitions
│   │   ├── config.py         # Configuration
│   │   └── utils.py          # Utilities
│   └── models/               # Pydantic data models
│       ├── common.py         # Common structures
│       ├── chat.py           # Chat models
│       ├── image.py          # Image models
│       ├── video.py          # Video models
│       └── options.py        # Unified options
├── docs/                     # Documentation
│   ├── 01-overview.md        # Project overview
│   ├── 02-architecture.md    # Architecture design
│   └── 03-api-reference.md   # API reference
├── tests/                    # Test suite
├── examples/                 # Usage examples
├── pyproject.toml            # Project config
└── README.md                 # Project docs (English)
🚀 Quick Start
Get started in 3 steps:
Step 1: Install
# From PyPI (coming soon)
pip install agn-sdk
# Or install from source (development mode)
git clone https://github.com/your-org/agn-sdk.git
cd agn-sdk
pip install -e .
Step 2: Configure API Key
# Option A — Environment variable (Recommended)
export AGN_API_KEY='your-api-key'
export AGN_BASE_URL='https://api.agnes.ai/v1' # Provider-specific base URL
# Option B — .env file (auto-loaded)
echo "AGN_API_KEY=your-api-key" > .env
echo "AGN_BASE_URL=https://api.agnes.ai/v1" >> .env
# Option C — Pass via code
client = Client(provider="agnes", api_key="your-key", base_url="https://api.agnes.ai/v1")
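The three options above amount to resolving two settings from either code or the environment. The sketch below shows one plausible precedence (explicit arguments first, then `AGN_API_KEY` / `AGN_BASE_URL`); the SDK's actual resolution order is not documented here, and `Settings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str


def load_settings(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Settings:
    """Resolve settings; explicit arguments win over environment variables."""
    key = api_key or env.get("AGN_API_KEY")
    url = base_url or env.get("AGN_BASE_URL")
    if not key:
        raise ValueError("AGN_API_KEY is not set and no api_key was passed")
    if not url:
        raise ValueError("AGN_BASE_URL is not set and no base_url was passed")
    # Strip a trailing slash so paths can be appended consistently.
    return Settings(api_key=key, base_url=url.rstrip("/"))
```

Failing early with a clear message tends to be easier to debug than an authentication error from the provider. Note that this sketch rejects an empty key, so it would not suit key-less providers such as Edge TTS.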
Step 3: Call AI Models
import asyncio

from agn import Client


async def main():
    # Create client
    client = Client(
        provider="agnes",
        api_key="your-api-key",
        base_url="https://api.agnes.ai/v1",
    )

    # 💬 Chat Completion
    response = await client.chat(
        model="claude-3-opus",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
        temperature=0.7,
    )
    print(response.choices[0].message.content)

    # 🖼️ Image Generation
    result = await client.image_generate(
        model="dall-e-3",
        prompt="A beautiful sunset over the ocean",
        size="1024x1024",
        quality="hd",
    )
    print(result.data[0].url)

    # 🎬 Video Creation (async with polling)
    task = await client.video_create(
        model="video-gen-1",
        prompt="A cat walking in the garden",
        width=1280,
        height=720,
        num_frames=121,
    )

    # Poll video status until complete, pausing between requests
    while True:
        status = await client.video_poll(task.task_id)
        print(f"Status: {status.status}, Progress: {status.progress}%")
        if status.status in ("completed", "failed"):
            break
        await asyncio.sleep(5)
    print(f"Video URL: {status.video_url}")


if __name__ == "__main__":
    asyncio.run(main())
✨ That's it! You now have a unified interface to all supported AI providers.
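The polling loop in Step 3 runs until the task finishes, however long that takes. A small helper with a time limit can make this safer; the sketch below assumes only that the poll function returns an object with a `status` attribute whose terminal values are `"completed"` and `"failed"`, as in the example. `wait_for_video` is a hypothetical helper, not part of the SDK.

```python
import asyncio
import time


async def wait_for_video(poll, task_id, *, interval=5.0, timeout=600.0):
    """Poll until the task reaches a terminal state or the time limit is hit.

    `poll` is any coroutine function that takes a task ID and returns an
    object with a `.status` attribute, for example `client.video_poll`.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = await poll(task_id)
        if status.status in ("completed", "failed"):
            return status
        # Stop if the next sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        await asyncio.sleep(interval)
```

With the client from Step 3 this would be called as `status = await wait_for_video(client.video_poll, task.task_id)`.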
📖 Complete Usage Reference
Chat Completion
response = await client.chat(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    temperature=0.7,        # Randomness (0.0-2.0)
    max_tokens=1000,        # Max response tokens
    top_p=1.0,              # Nucleus sampling
    frequency_penalty=0.0,  # Repetition penalty
    presence_penalty=0.0,   # Topic diversity
    stream=False,           # Streaming response
)
print(response.choices[0].message.content)
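For multi-turn conversations, the `messages` list has to carry earlier turns, since each call is independent. The helpers below are a sketch of keeping that history in plain dictionaries; `add_turn` and `trim_history` are hypothetical names, and trimming by message count is a simplification (real limits are measured in tokens).

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], user_text: str, assistant_text: str) -> List[Message]:
    """Return a new history with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


def trim_history(history: List[Message], max_messages: int) -> List[Message]:
    """Keep all system messages plus the most recent non-system messages."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    if max_messages <= 0:
        return system
    return system + rest[-max_messages:]
```

After each call, `response.choices[0].message.content` would be passed as `assistant_text`, and the trimmed history sent as `messages` on the next call.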
Image Generation
result = await client.image_generate(
    model="dall-e-3",
    prompt="A futuristic city with flying cars",
    size="1024x1024",  # 1024x1024, 1024x1792, 1792x1024
    quality="hd",      # standard or hd
    style="vivid",     # vivid or natural (DALL-E 3)
    n=1,               # Number of images
)
print(result.data[0].url) # or result.data[0].b64_json
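When the result carries `b64_json` rather than a URL, the payload has to be decoded before it can be saved. A minimal sketch using only the standard library follows; `save_b64_image` is a hypothetical helper, and it assumes the field holds plain base64 without a `data:` URI prefix.

```python
import base64
from pathlib import Path


def save_b64_image(b64_json: str, path: str) -> int:
    """Decode a base64 image payload, write it to `path`, return bytes written."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently skipping them.
    data = base64.b64decode(b64_json, validate=True)
    return Path(path).write_bytes(data)
```

For example, `save_b64_image(result.data[0].b64_json, "city.png")`, assuming the image was requested in base64 form.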
Video Creation
# Create video task
task = await client.video_create(
    model="video-gen-1",
    prompt="A dramatic sword fight scene",
    width=1280,
    height=720,
    num_frames=121,  # Must satisfy 8n+1 (e.g., 33, 49, 81, 121, 241)
    frame_rate=24,
    seed=42,         # Optional: for reproducibility
)
print(f"Task ID: {task.task_id}")

# Poll until the task reaches a terminal state (requires `import asyncio`)
status = await client.video_poll(task.task_id)
while status.status not in ("completed", "failed"):
    await asyncio.sleep(5)
    status = await client.video_poll(task.task_id)
print(f"Video URL: {status.video_url}")
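Because `num_frames` must have the form 8n+1, it can help to validate or round a requested frame count before submitting a task. The helpers below are a sketch; the function names are hypothetical, and any minimum or maximum frame count a provider enforces is not modeled.

```python
def is_valid_frame_count(num_frames: int) -> bool:
    """True when num_frames equals 8n + 1 for some integer n >= 0."""
    return num_frames >= 1 and (num_frames - 1) % 8 == 0


def nearest_valid_frame_count(num_frames: int) -> int:
    """Round to the closest 8n + 1 value; ties round up, minimum is 1."""
    n = max(0, (num_frames + 3) // 8)
    return 8 * n + 1
```

A 5-second clip at 24 fps is 120 frames, which `nearest_valid_frame_count(5 * 24)` rounds to 121, the value used in the example above.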
Speech Synthesis (TTS)
# Edge TTS — free, no API key required (install: pip install agn-sdk[edge-tts])
edge_client = Client(provider="edge-tts", api_key="")
result = await edge_client.speech(
    model="edge-tts",
    input="Hello, this is synthesized speech.",
    voice="xiaoxiao",       # short name or full ID: zh-CN-XiaoxiaoNeural
    response_format="mp3",  # mp3 / wav / ogg / pcm
    rate="+10%",            # optional: speed adjustment
)
with open("out.mp3", "wb") as f:
    f.write(result.audio_data)
# OpenAI TTS — uses alloy/echo/nova voices
result = await client.speech(
    model="tts-1",
    input="The quick brown fox jumps over the lazy dog.",
    voice="alloy",
    response_format="mp3",
    speed=1.0,
)
Speech Recognition (ASR)
# Deepgram Nova-2 (fastest) — accepts file path / URL / bytes / base64
result = await client.transcribe(
    model="nova-2",
    file="./meeting.wav",
    language="zh",      # optional: auto-detected if omitted
    smart_format=True,  # optional: punctuation + number formatting
)
print(result.text)
for seg in result.segments or []:
    print(f"[{seg.start:.2f}-{seg.end:.2f}] {seg.text}")
# AssemblyAI — enterprise ASR with speaker diarization
result = await client.transcribe(
    model="best",
    file="./interview.mp3",
    speaker_labels=True,
    sentiment_analysis=True,
)
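Transcription segments with `start`, `end`, and `text` map naturally onto subtitle cues. The sketch below renders them as SRT; it assumes the timestamps are in seconds, which the README does not state explicitly, and `Segment` is a hypothetical stand-in for the SDK's segment model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        times = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{times}\n{seg.text.strip()}")
    return "\n\n".join(cues) + ("\n" if cues else "")
```

Any object with the same three attributes works, so `segments_to_srt(result.segments or [])` should apply directly if the assumption about units holds.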
Text Embeddings
# Single text or batch — returns unified EmbeddingResult
result = await client.embed(
    model="text-embedding-3-small",
    input=["hello world", "machine learning"],
)
vectors = result.get_embeddings() # list[list[float]]
print(len(vectors), len(vectors[0]))
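Embedding vectors are usually compared with cosine similarity. Since `get_embeddings()` returns plain lists of floats, this can be sketched with the standard library alone; `cosine_similarity` is a helper written for this example, not an SDK function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

With the batch above, `cosine_similarity(vectors[0], vectors[1])` gives a score between -1 and 1, where higher values indicate more similar texts.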
🏗️ Architecture Overview
┌─────────────────────────────────────────────────────────┐
│                   API Layer (Client)                    │
│       chat() / image_generate() / video_create()        │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                      Router Layer                       │
│         Model routing, load balancing, fallback         │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                      Adapter Layer                      │
│    BaseAdapter → AgnesAdapter / OpenAIAdapter / ...     │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                       Core Layer                        │
│        HTTP client, retry, errors, config, utils        │
└─────────────────────────────────────────────────────────┘
- API Layer: Unified `Client` class, user-facing interface
- Router Layer: Model selection, routing, load balancing
- Adapter Layer: Provider-specific implementations, parameter mapping, response normalization
- Core Layer: Shared utilities (HTTP, retry, errors, config)
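The fallback behavior attributed to the Router Layer can be pictured as trying providers in order until one succeeds. The sketch below is a toy illustration of that idea, not the SDK's `router.py`; `call_with_fallback` and the two demo handlers are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, Tuple

Handler = Callable[[str], Awaitable[str]]


async def call_with_fallback(
    handlers: Sequence[Tuple[str, Handler]], prompt: str
) -> Tuple[str, str]:
    """Try each (name, handler) in order; return (name, reply) on first success."""
    errors = []
    for name, handler in handlers:
        try:
            return name, await handler(prompt)
        except Exception as exc:  # a real router would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


async def primary(prompt: str) -> str:
    # Simulates a provider outage.
    raise ConnectionError("primary unavailable")


async def backup(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("primary", primary), ("backup", backup)]
    print(asyncio.run(call_with_fallback(chain, "hi")))
```

Collecting the per-provider errors keeps the final exception useful when every provider in the chain fails.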
📋 Adapter Development
Adding a new AI provider is straightforward:
- Create adapter: Inherit `BaseAdapter`, implement required methods
- Register factory: Call `AdapterFactory.register("provider_name", YourAdapter)`
- Declare capabilities: Set the `supported_capabilities` list
from agn.adapters.base import BaseAdapter
from agn.adapters.factory import AdapterFactory
# Capabilities and ChatMessage must also be imported from the SDK;
# their module paths are not shown in this README.


class NewProviderAdapter(BaseAdapter):
    provider_type = "newprovider"
    provider_name = "New Provider"
    supported_capabilities = [Capabilities.CHAT, Capabilities.IMAGE_GENERATE]

    async def start(self) -> None:
        # Initialize HTTP client
        ...

    async def chat(self, model: str, messages: list[ChatMessage], **kwargs):
        # Implement chat logic
        ...

    # ... implement other methods


AdapterFactory.register("newprovider", NewProviderAdapter)
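The register-then-create flow behind `AdapterFactory.register(...)` is a standard registry pattern. The self-contained toy below shows the mechanics without depending on the SDK; `ToyAdapter`, `ToyFactory`, and `EchoAdapter` are hypothetical, and the real `AdapterFactory` may differ in signatures and behavior.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class ToyAdapter(ABC):
    """Illustrative base class; not the SDK's BaseAdapter."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        ...


class ToyFactory:
    """Maps provider names to adapter classes."""

    _registry: Dict[str, Type[ToyAdapter]] = {}

    @classmethod
    def register(cls, name: str, adapter_cls: Type[ToyAdapter]) -> None:
        if name in cls._registry:
            raise ValueError(f"provider {name!r} is already registered")
        cls._registry[name] = adapter_cls

    @classmethod
    def create(cls, name: str) -> ToyAdapter:
        adapter_cls = cls._registry.get(name)
        if adapter_cls is None:
            raise ValueError(f"unknown provider {name!r}")
        return adapter_cls()


class EchoAdapter(ToyAdapter):
    def chat(self, prompt: str) -> str:
        return f"echo: {prompt}"


ToyFactory.register("echo", EchoAdapter)
```

Client code then needs only the provider name, as in `ToyFactory.create("echo").chat("hi")`, which is what lets new providers be added without touching the client.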
🧪 Development
# Clone and setup
git clone https://github.com/your-org/agn-sdk.git
cd agn-sdk
python -m venv venv
source venv/bin/activate
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Code formatting
black agn/
# Linting
ruff check agn/
# Type checking
mypy agn/
📜 License
MIT License