Unified access layer for AI completions (LLM), embedding (semantic), image, video, and voice services
ai-api-unified 2.5.3
ai-api-unified is a unified Python library for AI completions, embeddings, image generation, video generation, and voice. Application code targets stable base interfaces and factory entry points while concrete providers are selected at runtime from environment configuration.
Author: Dave Thomas
Install name: ai-api-unified
Import path: ai_api_unified
Python: >=3.11,<3.14
License: MIT
The current package architecture is registry-backed and lazy-loaded:
- provider SDKs are optional extras, not base dependencies
- providers are resolved only when a factory selects them
- package `__init__` modules export stable interfaces only
- missing provider selectors are configuration errors, not implicit fallbacks
Overview
Use this library when you want one consistent interface across multiple AI providers without binding application code to a single SDK. The library currently covers:
- text completions
- embeddings
- image generation
- video generation
- text-to-speech and selected speech-to-text flows
The public entry points are the stable base interfaces and factories:
- `AIFactory.get_ai_completions_client()`
- `AIFactory.get_ai_embedding_client()`
- `AIFactory.get_ai_images_client()`
- `AIFactory.get_ai_video_client()`
- `AIVoiceFactory.create()`
Capabilities
| Capability | Stable interface | Engines | Required extra(s) |
|---|---|---|---|
| Completions | `AIBaseCompletions` | `openai`, `google-gemini`, Bedrock-routed aliases such as `nova`, `anthropic`, `llama`, `mistral`, `cohere`, `ai21`, `rerank` | `openai`, `google_gemini`, `bedrock` |
| Embeddings | `AIBaseEmbeddings` | `openai`, `titan`, `google-gemini` | `openai`, `bedrock`, `google_gemini` |
| Images | `AIBaseImages` | `openai`, `google-gemini`, `nova-canvas`, and Bedrock image aliases | `openai`, `google_gemini`, `bedrock` |
| Videos | `AIBaseVideos` | `openai`, `google-gemini`, `nova-reel`, and Bedrock video aliases | `openai`, `google_gemini`, `bedrock` |
| Voice TTS | `AIVoiceBase` | `openai`, `google`, `azure`, `elevenlabs` | `openai`, `google_gemini`, `azure_tts`, `elevenlabs` |
| Voice STT | `AIVoiceBase` | provider-specific support such as Google and ElevenLabs | `google_gemini`, `elevenlabs` |
Default model guidance in the checked-in OSS env files:
- Google completions: `gemini-2.5-flash`
- Google embeddings: `gemini-embedding-001`
- Google images: `imagen-4.0-generate-001`
- Google videos: `veo-3.1-lite-generate-preview`
- Google voice: `gemini-2.5-pro-tts`
Installation
Python Requirements
This package requires Python >=3.11,<3.14.
Install as a Dependency
Base package only:

```shell
poetry add ai-api-unified
```

Install with one or more provider extras:

```shell
poetry add 'ai-api-unified[google_gemini]'
poetry add 'ai-api-unified[openai]'
poetry add 'ai-api-unified[bedrock,google_gemini]'
poetry add 'ai-api-unified[google_gemini,video_frames]'
poetry add 'ai-api-unified[openai,video_frames]'
```
Install in a Local Clone
Base install:

```shell
poetry install
```

Common local development installs:

```shell
poetry install --with dev
poetry install --extras "google_gemini"
poetry install --extras "openai"
poetry install --extras "google_gemini" --extras "video_frames" --with dev
poetry install --extras "openai" --extras "video_frames" --with dev
poetry install --all-extras --with dev
```
Optional Extras
| Extra | Installs |
|---|---|
| `openai` | OpenAI completions, embeddings, images, and voice |
| `google_gemini` | Google Gemini completions, embeddings, images, and Google voice |
| `bedrock` | AWS Bedrock completions, Titan embeddings, Bedrock image providers, and Bedrock video providers |
| `video_frames` | Optional frame extraction helpers backed by ImageIO + Pillow |
| `azure_tts` | Azure Cognitive Services TTS |
| `elevenlabs` | ElevenLabs TTS and STT |
| `middleware-pii-redaction` | Presidio + spaCy + usaddress; install the required spaCy model separately |
| `middleware-pii-redaction-small` | Compatibility alias for PII redaction deps; pair with a separate `en_core_web_sm` install |
| `middleware-pii-redaction-large` | Compatibility alias for PII redaction deps; pair with a separate `en_core_web_lg` install |
| `similarity_score` | NumPy-based similarity helpers |
| `dev` | Optional dev dependencies from `[project.optional-dependencies]` |
Environment File
Copy env_template to .env and fill in only the providers you use.
The OSS template now defaults to Google API-key auth:
```shell
COMPLETIONS_ENGINE=google-gemini
EMBEDDING_ENGINE=google-gemini
IMAGE_ENGINE=google-gemini
VIDEO_ENGINE=google-gemini
AI_VOICE_ENGINE=google
GOOGLE_GEMINI_API_KEY=...
GOOGLE_AUTH_METHOD=api_key
COMPLETIONS_MODEL_NAME=gemini-2.5-flash
EMBEDDING_MODEL_NAME=gemini-embedding-001
IMAGE_MODEL_NAME=imagen-4.0-generate-001
VIDEO_MODEL_NAME=veo-3.1-lite-generate-preview
DEFAULT_GEMINI_TTS_MODEL=gemini-2.5-pro-tts
```
Leave EMBEDDING_DIMENSIONS unset unless you deliberately want a provider-specific override. The library now preserves provider defaults instead of forcing a generic value.
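Because missing selectors are configuration errors rather than implicit fallbacks, it can help to fail fast at startup. A minimal sketch (not part of the library; the list of selectors you require is your own choice):

```python
import os

# Adjust this list to the capabilities your application actually uses.
REQUIRED_SELECTORS = ["COMPLETIONS_ENGINE", "EMBEDDING_ENGINE"]


def check_selectors(env: dict[str, str]) -> list[str]:
    """Return the selector names that are missing or empty in the given env."""
    return [name for name in REQUIRED_SELECTORS if not env.get(name)]


missing = check_selectors(dict(os.environ))
if missing:
    print("Missing engine selectors:", ", ".join(missing))
```

Running a check like this before the first factory call turns a mid-request `ValueError` into a clear startup message.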
Smoke Test
```shell
python -c "import ai_api_unified; print(ai_api_unified.__version__)"
```
Quickstart
The examples below assume the Google API-key-first OSS defaults shown above. The same APIs work with OpenAI, Bedrock, Azure, or ElevenLabs by changing env selectors and installing the matching extras.
Completions
```python
from ai_api_unified import AIFactory, AIBaseCompletions

client: AIBaseCompletions = AIFactory.get_ai_completions_client()
response: str = client.send_prompt("Say hello in one short sentence.")
print(response)
```
Embeddings
```python
from ai_api_unified import AIFactory, AIBaseEmbeddings

client: AIBaseEmbeddings = AIFactory.get_ai_embedding_client()
result: dict[str, object] = client.generate_embeddings("hello world")
embedding = result.get("embedding")
print(len(embedding) if embedding else None)
```
Image Generation
```python
from ai_api_unified import AIFactory, AIBaseImageProperties, AIBaseImages

client: AIBaseImages = AIFactory.get_ai_images_client()
images: list[bytes] = client.generate_images(
    "A watercolor skyline at sunrise.",
    AIBaseImageProperties(width=1024, height=1024, format="png", num_images=1),
)
with open("generated_image.png", "wb") as generated_file:
    generated_file.write(images[0])
```
Video Generation
The blocking convenience path is:
```python
from pathlib import Path

from ai_api_unified import AIFactory, AIBaseVideoProperties, AIBaseVideos

client: AIBaseVideos = AIFactory.get_ai_video_client()
result = client.generate_video(
    "A cinematic tracking shot of a neon train crossing the desert at dusk.",
    AIBaseVideoProperties(output_dir=Path("./generated_videos")),
)

video_bytes: bytes = result.artifacts[0].read_bytes()
frames: list[bytes] = AIBaseVideos.extract_image_frames_from_video_buffer(
    video_bytes,
    time_offsets_seconds=[0.0, 1.0],
)
AIBaseVideos.save_image_buffers_as_files(
    frames,
    output_dir=Path("./generated_frames"),
)
```
If you want explicit job control instead of the blocking wrapper:
```python
from ai_api_unified import AIFactory, AIBaseVideos

client: AIBaseVideos = AIFactory.get_ai_video_client()
job = client.submit_video_generation(
    "A stop-motion paper city waking up at sunrise."
)
job = client.wait_for_video_generation(job)
result = client.download_video_result(job)
print(result.job.status, result.artifacts[0].file_path)
```
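Under the hood, waiting on a video job is a poll-until-terminal loop bounded by a timeout, which is what `VIDEO_TIMEOUT_SECONDS` and `VIDEO_POLL_INTERVAL_SECONDS` configure. A standalone sketch of that pattern (the status strings here are illustrative; the library's job objects define their own states):

```python
import time
from collections.abc import Callable


def wait_until_done(
    get_status: Callable[[], str],
    timeout_seconds: float = 1200.0,
    poll_interval_seconds: float = 10.0,
) -> str:
    """Poll get_status() until a terminal state is seen or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in ("succeeded", "failed"):  # illustrative terminal states
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds}s")
        time.sleep(poll_interval_seconds)
```

The explicit `submit` / `wait` / `download` API lets you run this loop yourself (or persist the job ID between processes) instead of blocking inside `generate_video`.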
Kick Off Google Gemini Video Generation
Environment:
```shell
VIDEO_ENGINE=google-gemini
VIDEO_MODEL_NAME=veo-3.1-lite-generate-preview
GOOGLE_GEMINI_API_KEY=...
GOOGLE_AUTH_METHOD=api_key
```
Code:
```python
from pathlib import Path

from ai_api_unified import AIFactory, AIBaseVideoProperties, AIBaseVideos

client: AIBaseVideos = AIFactory.get_ai_video_client()
result = client.generate_video(
    "A cinematic dolly shot of a red vintage train moving through a desert at sunset.",
    AIBaseVideoProperties(
        output_dir=Path("./generated_videos/google"),
        timeout_seconds=1200,
        poll_interval_seconds=10,
    ),
)
print(result.artifacts[0].file_path)
```
Kick Off OpenAI Video Generation
Environment:
```shell
VIDEO_ENGINE=openai
VIDEO_MODEL_NAME=sora-2
OPENAI_API_KEY=...
```
Code:
```python
from pathlib import Path

from ai_api_unified import AIFactory, AIBaseVideoProperties, AIBaseVideos

client: AIBaseVideos = AIFactory.get_ai_video_client()
result = client.generate_video(
    "A wide cinematic shot of gentle ocean waves meeting a rocky coastline at golden hour.",
    AIBaseVideoProperties(
        output_dir=Path("./generated_videos/openai"),
        timeout_seconds=1200,
        poll_interval_seconds=10,
    ),
)
print(result.artifacts[0].file_path)
```
Frame extraction requires the optional `video_frames` extra.
Voice
```python
from ai_api_unified import AIVoiceBase, AIVoiceFactory

voice: AIVoiceBase = AIVoiceFactory.create()
audio_bytes: bytes = voice.text_to_speech("Hello from ai-api-unified")
with open("out.wav", "wb") as output_file:
    output_file.write(audio_bytes)
```
Configuration
Required Engine Selectors
There is no implicit default provider. Set the selector for each capability you use.
| Environment variable | Valid values |
|---|---|
| `COMPLETIONS_ENGINE` | `openai`, `google-gemini`, Bedrock-routed aliases such as `nova`, `anthropic`, `llama`, `mistral`, `cohere`, `ai21`, `rerank` |
| `EMBEDDING_ENGINE` | `openai`, `titan`, `google-gemini` |
| `IMAGE_ENGINE` | `openai`, `google-gemini`, `nova-canvas`, `bedrock`, `nova` |
| `VIDEO_ENGINE` | `openai`, `google-gemini`, `bedrock`, `nova`, `nova-reel` |
| `AI_VOICE_ENGINE` | `openai`, `google`, `azure`, `elevenlabs` |
Common Model Settings
| Environment variable | Notes |
|---|---|
| `COMPLETIONS_MODEL_NAME` | Optional completions model override |
| `EMBEDDING_MODEL_NAME` | Optional embeddings model override |
| `IMAGE_MODEL_NAME` | Optional image model override |
| `VIDEO_MODEL_NAME` | Optional video model override |
| `VIDEO_OUTPUT_DIR` | Optional local output directory for materialized video artifacts |
| `VIDEO_POLL_INTERVAL_SECONDS` | Optional default poll interval for video job waits |
| `VIDEO_TIMEOUT_SECONDS` | Optional default timeout for blocking video generation |
| `BEDROCK_VIDEO_OUTPUT_S3_URI` | Required for Nova Reel unless provided per request |
| `DEFAULT_GEMINI_TTS_MODEL` | Optional Google voice model override |
| `EMBEDDING_DIMENSIONS` | Optional embeddings dimension override; leave unset for provider defaults |
| `AI_API_GEO_RESIDENCY` | Optional geo hint; `US`, `USA`, or `United States` normalize to US routing where supported |
| `AI_MIDDLEWARE_CONFIG_PATH` | Optional YAML config path for observability and PII middleware |
Provider Authentication
OpenAI
Required:
OPENAI_API_KEY
Common optional settings:
- `OPENAI_BASE_URL`
- `COMPLETIONS_MODEL_NAME`
- `EMBEDDING_MODEL_NAME`
- `IMAGE_MODEL_NAME`
- `VIDEO_MODEL_NAME`
- `EMBEDDING_DIMENSIONS`
- `AI_API_GEO_RESIDENCY`
AWS Bedrock and Titan
Required:
- `AWS_REGION`
- standard AWS credentials in environment or runtime IAM context
Common optional settings:
- `COMPLETIONS_MODEL_NAME`
- `EMBEDDING_MODEL_NAME`
- `IMAGE_MODEL_NAME`
- `VIDEO_MODEL_NAME`
- `BEDROCK_VIDEO_OUTPUT_S3_URI`
- `EMBEDDING_DIMENSIONS`
- `AI_API_GEO_RESIDENCY`

For current Bedrock model IDs, see the AWS documentation.
Google-backed Providers
Required for the default OSS path:
- `GOOGLE_GEMINI_API_KEY`
Default auth mode:
```shell
GOOGLE_AUTH_METHOD=api_key
```
Optional service-account mode:
```shell
GOOGLE_AUTH_METHOD=service_account
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json
GOOGLE_PROJECT_ID=<gcp-project-id>
GOOGLE_LOCATION=us-central1
```
This auth policy applies across Google completions, embeddings, images, videos, and voice.
Azure TTS
Required:
- `MICROSOFT_COGNITIVE_SERVICES_API_KEY`
- `MICROSOFT_COGNITIVE_SERVICES_REGION`
Optional:
- `MICROSOFT_COGNITIVE_SERVICES_ENDPOINT`
- `AI_VOICE_LANGUAGE`
ElevenLabs
Required:
- `ELEVEN_LABS_API_KEY`
Lazy Loading and Imports
Prefer the stable interfaces and factories exported from the root package:
```python
from ai_api_unified import (
    AIFactory,
    AIVoiceFactory,
    AIBaseCompletions,
    AIBaseEmbeddings,
    AIBaseImages,
    AIBaseVideos,
    AIVoiceBase,
)
```
Factory entry points:
- `AIFactory.get_ai_completions_client()`
- `AIFactory.get_ai_embedding_client()`
- `AIFactory.get_ai_images_client()`
- `AIFactory.get_ai_video_client()`
- `AIVoiceFactory.create()`
Concrete providers are no longer re-exported from package __init__.py modules. If you need a concrete class directly, import it from its implementation module, for example:
```python
from ai_api_unified.completions.ai_google_gemini_completions import (
    GoogleGeminiCompletions,
)
```
Typical factory failure modes:
- unsupported engine selector: `ValueError`
- selected provider extra is not installed: `AiProviderDependencyUnavailableError`
- provider load/runtime failure: `AiProviderRuntimeError`
Middleware
Middleware is configured by YAML referenced through AI_MIDDLEWARE_CONFIG_PATH.
Observability
Observability middleware is built into the base package and does not require a separate extra.
Features:
- metadata-only input, output, and error events
- request-scoped correlation via `set_observability_context(...)`
- coverage for completions, embeddings, images, videos, and text-to-speech
Minimal config:
```yaml
middleware:
  - name: 'observability'
    enabled: true
```
See docs/observability_middleware_example.yaml and docs/observability_middleware_design.md.
PII Redaction
PII redaction is optional and requires one of:
- `middleware-pii-redaction`
- `middleware-pii-redaction-small`
- `middleware-pii-redaction-large`
Minimal config:
```yaml
middleware:
  - name: 'pii_redaction'
    enabled: true
    settings:
      direction: 'input_only'
```
Notes:
- `strict_mode: true` enables fail-closed behavior
- `balanced`, `high_accuracy`, and `low_memory` detection profiles are supported
- install a matching spaCy model separately, for example `poetry run python -m spacy download en_core_web_sm` for `balanced` or `poetry run python -m spacy download en_core_web_lg` for `high_accuracy`
- `middleware-pii-redaction-small` and `middleware-pii-redaction-large` are compatibility aliases for the same Python dependency set; the spaCy model is still installed as a separate build/runtime asset
- for no-egress images and Lambda-style deployments, install the spaCy model wheel into the build artifact instead of relying on runtime downloads
- recognizer customization is configured in YAML, not hard-coded in provider implementations
See docs/pii_redaction_design.md for the fuller contract and deployment tradeoffs.
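As a rough illustration of what input-side (`direction: 'input_only'`) redaction does, the sketch below replaces detected entities with type tags before a prompt leaves the process. The real middleware uses Presidio recognizers configured in YAML, not hand-rolled regexes like these:

```python
import re

# Toy patterns for illustration only; Presidio's recognizers are far more robust.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_input(prompt: str) -> str:
    """Replace detected PII with type tags before the prompt reaches a provider."""
    prompt = _EMAIL.sub("<EMAIL>", prompt)
    return _SSN.sub("<SSN>", prompt)


print(redact_input("Contact alice@example.com, SSN 123-45-6789."))
```

In strict (fail-closed) mode, a detection or engine failure would block the request instead of passing unredacted text through.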
Structured Responses
Use AIStructuredPrompt together with strict_schema_prompt(...) when you want schema-validated structured output.
```python
from copy import deepcopy
from typing import Any

from ai_api_unified import (
    AIFactory,
    AIStructuredPrompt,
    StructuredResponseTokenLimitError,
)


class ContactExtraction(AIStructuredPrompt):
    name: str | None = None
    city: str | None = None

    @staticmethod
    def get_prompt() -> str:
        return "Extract the person's name and city from: Alice lives in Paris."

    @classmethod
    def model_json_schema(cls) -> dict[str, Any]:
        schema = deepcopy(super().model_json_schema())
        schema["properties"] = {
            "name": {"type": "string"},
            "city": {"type": "string"},
        }
        schema["required"] = ["name", "city"]
        return schema


client = AIFactory.get_ai_completions_client()
try:
    result = client.strict_schema_prompt(
        prompt=ContactExtraction.get_prompt(),
        response_model=ContactExtraction,
        max_response_tokens=2048,
    )
    print(result.name, result.city)
except StructuredResponseTokenLimitError as exc:
    print(exc)
```
Key behavior:
- structured prompts are provider-agnostic at the call site
- the library validates the response against your output schema
- undersized or truncated structured responses raise `StructuredResponseTokenLimitError`
AIStructuredPrompt.send_structured_prompt(...) is also available when you want the prompt to live on the model instance itself.
Testing
Regular test run:

```shell
poetry run pytest -m "not nonmock"
```

Live provider tests:

```shell
poetry run pytest -m nonmock -s -vv
```
Notes for live tests:
- install the matching provider extras first
- configure credentials in `.env`
- some tests may skip when a provider account lacks quota, a paid image tier, or an enabled cloud service
Release and PyPI Publishing
The OSS repository publishes to public PyPI only.
Before publishing:
- Bump the version in `pyproject.toml`.
- Bump the version in `src/ai_api_unified/__version__.py`.
- Ensure the working tree is clean.
- Ensure your PyPI token is configured for Poetry.
Publish with the checked-in script:

```shell
./publish.sh
```
The script:
- checks for uncommitted changes
- confirms the version
- removes old build artifacts
- builds the wheel and sdist locally
- fails before upload if built metadata contains direct URL requirements that PyPI rejects
- runs `poetry publish` only after metadata validation passes
After publishing, tag and push the release:
```shell
git tag v<version>
git push origin v<version>
```
Troubleshooting
- `COMPLETIONS_ENGINE must be configured explicitly` or similar: set the required engine selector for that capability.
- `AiProviderDependencyUnavailableError`: install the extra for the selected provider.
- Google auth errors: the OSS default is `GOOGLE_AUTH_METHOD=api_key`. If you switch to `service_account`, make sure `GOOGLE_APPLICATION_CREDENTIALS` points to a valid local JSON credential file and set `GOOGLE_PROJECT_ID` and `GOOGLE_LOCATION` when required.
- Unexpected embeddings dimensions: leave `EMBEDDING_DIMENSIONS` unset unless you intentionally want a non-default size.
- Google image generation or TTS failures in live tests can reflect account/service state rather than library bugs, for example a disabled cloud API or missing paid-plan access.
- `AI_API_GEO_RESIDENCY=US` is a best-effort routing hint. Only providers that expose regional routing controls can honor it directly.
License
This project is released under the MIT License.
File details

Details for the file `ai_api_unified-2.5.3.tar.gz`.

File metadata
- Download URL: ai_api_unified-2.5.3.tar.gz
- Upload date:
- Size: 165.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.13.6 Darwin/25.4.0

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e20f0d1ec2c8a9f4d4e8881cc1bdb7aa4fbe8851d108884133c0edf8fb2e5678` |
| MD5 | `2f42017174909afbed64056f1d043316` |
| BLAKE2b-256 | `826abd967876bec6b6c3a8cc50d01ec7f0e807ed6040c4fc4d8bd10124ff4ed1` |
File details

Details for the file `ai_api_unified-2.5.3-py3-none-any.whl`.

File metadata
- Download URL: ai_api_unified-2.5.3-py3-none-any.whl
- Upload date:
- Size: 199.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.13.6 Darwin/25.4.0

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5b7832c48cbdd8f8b1f5004ead949400758c325366830b4a6b4ca27ebde5b69a` |
| MD5 | `c9b30cd810544dae8c33553d3587e87c` |
| BLAKE2b-256 | `c000b81909307af9b321d5e4e9e16f8c09a452b34ea373664c7051ba8b95f53b` |