Unified multimodal multi-provider Python toolkit for text, audio, image, and video generation.

easy-ai-clients

easy-ai-clients is a typed, multimodal Python library that wraps multiple AI providers behind a single public API for:

Text generation — chat, instructions, reasoning, tool use, batch, optional image inputs (vision)
Speech transcription — audio to text with word timings and speaker diarization
Speech synthesis — text to spoken audio
Music generation — text prompt to instrumental audio
Image generation — text to image
Image transformation — prompt + input image to new image
Image composition — prompt + base image + reference image to a new image
Image editing — inpainting with optional mask
Video generation — text/image/audio to video
Lip sync — animate a face image with an audio track

All operations are available in synchronous and asynchronous variants. Provider names are resolved through a flexible alias system, so "gemini", "google", and "google-ai" all map to the same adapter.

Requirements

Python 3.12 or higher
Dependencies installed automatically: httpx, pydantic, Pillow

Install

pip install easy-ai-clients

For local development (includes test and lint tools):

pip install -e ".[dev]"

Checking the version

import easy_ai_clients

print(easy_ai_clients.__version__)

Configuration

Credentials are resolved lazily, in this order:

1. Explicit credentials={...} values passed to a call or to EasyAiClient(...).
2. Environment variables from the current process.

easy-ai-clients does not auto-load .env files, keeping imports side-effect free.
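
The resolution order above can be pictured with a small stand-alone sketch. It does not use the library's internals: `resolve_credential` is a hypothetical helper, the built-in `LookupError` stands in for the library's `MissingCredentialError`, the `environ` parameter exists only to make the example testable, and treating an empty string as "not set" is an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_credential(
    name: str,
    explicit: Mapping[str, str] | None = None,
    environ: Mapping[str, str] | None = None,
) -> str:
    """Return the first non-empty value for `name`: explicit mapping, then environment."""
    env = os.environ if environ is None else environ
    for source in (explicit or {}, env):
        value = source.get(name)
        if value:  # assumption: an empty string counts as missing
            return value
    # Hypothetical error type; the real library raises MissingCredentialError.
    raise LookupError(f"No value found for {name}")


# An explicit value takes precedence over the environment.
print(resolve_credential("OPENAI_API_KEY", {"OPENAI_API_KEY": "sk-explicit"}, {"OPENAI_API_KEY": "sk-env"}))
# Without an explicit value, the environment mapping is consulted.
print(resolve_credential("OPENAI_API_KEY", None, {"OPENAI_API_KEY": "sk-env"}))
```

Because the lookup only happens when the function is called, nothing is read at import time, which matches the "lazy" behaviour described above.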

Environment variable setup

cp .env.example .env
# Edit .env and fill in only the providers you plan to use

# Or export the variables directly in the current shell
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="..."

If your application uses python-dotenv, load it in the application entrypoint — not inside the library:

from dotenv import load_dotenv

load_dotenv()

See docs/configuration.md for full details and per-call credential override patterns.

Quickstart

Text generation

from easy_ai_clients.text import generate

result = generate(
    provider="openai",
    instructions="Write a short slogan about rain in one sentence.",
    credentials={"OPENAI_API_KEY": "sk-..."},
)

print(result.text)
print(result.cost_usd)

Text generation with image input (vision)

Pass one or more images via input_images to describe them, transcribe text in the image, or answer questions grounded in visual content. Accepts local paths, raw bytes, or base64 strings.

from easy_ai_clients.text import generate

result = generate(
    provider="openai",
    instructions="Describe the image in one concise paragraph.",
    input_images=["/path/to/photo.jpg"],
    credentials={"OPENAI_API_KEY": "sk-..."},
)

print(result.text)

Vision-capable providers include openai, google, anthropic, and any provider whose selected model supports image inputs.
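
As an illustration of what accepting "local paths, raw bytes, or base64 strings" can involve, here is a stand-alone sketch that normalizes all three forms to a base64 string. The helper names `to_base64` and `_is_existing_file` are hypothetical, and the rule "an existing file wins over base64 text" is an assumption; the library may disambiguate differently.

```python
from __future__ import annotations

import base64
import binascii
from pathlib import Path


def _is_existing_file(candidate: str) -> bool:
    """Return True only if `candidate` names a regular file."""
    try:
        return Path(candidate).is_file()
    except (OSError, ValueError):
        # A long base64 string is usually not a valid file name.
        return False


def to_base64(image: str | bytes | Path) -> str:
    """Normalize a path, raw bytes, or base64 text to a base64 string."""
    if isinstance(image, bytes):
        return base64.b64encode(image).decode("ascii")
    if isinstance(image, Path) or _is_existing_file(image):
        return base64.b64encode(Path(image).read_bytes()).decode("ascii")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(image, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("expected an existing file path, bytes, or base64 text") from exc
    return image


print(to_base64(b"abc"))   # raw bytes are encoded
print(to_base64("YWJj"))   # valid base64 text is passed through unchanged
```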

Speech transcription

from easy_ai_clients.audio import transcribe

result = transcribe(
    provider="deepgram",
    audio="/path/to/recording.mp3",
    credentials={"DEEPGRAM_API_KEY": "..."},
)

print(result.text)

for word in result.words:
    print(word.text, word.start_seconds, word.end_seconds)

for segment in result.speaker_segments:
    print(segment.speaker, segment.text)
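
Word timings like these are often turned into subtitles. The sketch below groups words into SubRip (SRT) cues. `Word` is a hypothetical stand-in that only mirrors the three attributes used in the loop above (`text`, `start_seconds`, `end_seconds`); `srt_timestamp` and `words_to_srt` are illustrative helpers, not part of the library.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in with the attributes used in the loop above."""

    text: str
    start_seconds: float
    end_seconds: float


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp form used by SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group consecutive words into numbered cues of at most `max_words` words."""
    cues = []
    for index in range(0, len(words), max_words):
        chunk = words[index:index + max_words]
        start = srt_timestamp(chunk[0].start_seconds)
        end = srt_timestamp(chunk[-1].end_seconds)
        line = " ".join(word.text for word in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{line}")
    return "\n\n".join(cues) + "\n"


sample = [Word("Hello", 0.0, 0.4), Word("world", 0.5, 1.25)]
print(words_to_srt(sample))
```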

Speech synthesis

import base64

from easy_ai_clients.audio import synthesize

result = synthesize(
    provider="elevenlabs",
    text="Hello, this is a test of text-to-speech synthesis.",
    credentials={"ELEVENLABS_API_KEY": "..."},
)

# The audio comes back base64-encoded; decode it before writing to disk.
audio_bytes = base64.b64decode(result.audio_base64)
with open("output.mp3", "wb") as f:
    f.write(audio_bytes)
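
The decode-and-write step recurs in the audio and image examples of this Quickstart, so it can be convenient to factor it into a helper. `save_base64` below is a hypothetical convenience function, not something the library exports; it relies only on the standard library.

```python
from __future__ import annotations

import base64
import tempfile
from pathlib import Path


def save_base64(payload: str, destination: str | Path) -> Path:
    """Decode a base64 string and write the resulting bytes to `destination`."""
    path = Path(destination)
    # Create missing parent folders so nested output paths work.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(payload))
    return path


# Demonstration with placeholder bytes instead of a real provider response.
with tempfile.TemporaryDirectory() as folder:
    encoded = base64.b64encode(b"fake audio").decode("ascii")
    written = save_base64(encoded, Path(folder) / "out" / "output.mp3")
    print(written.name, written.read_bytes())
```

With a real result, the call would look like `save_base64(result.audio_base64, "output.mp3")` or `save_base64(result.image_base64, "output.png")`.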

Music generation

import base64

from easy_ai_clients.audio import compose

result = compose(
    provider="stability",
    prompt="Upbeat lo-fi hip-hop, 80 BPM, mellow piano and drums",
    duration_seconds=30,
    credentials={"STABILITY_API_KEY": "..."},
)

# The audio comes back base64-encoded; decode it before writing to disk.
audio_bytes = base64.b64decode(result.audio_base64)
with open("music.mp3", "wb") as f:
    f.write(audio_bytes)

Image generation

import base64

from easy_ai_clients.image import generate

result = generate(
    provider="openai",
    prompt="A cinematic lighthouse in a storm, dramatic lighting",
    credentials={"OPENAI_API_KEY": "sk-..."},
)

# The image comes back base64-encoded; decode it before writing to disk.
image_bytes = base64.b64decode(result.image_base64)
with open("output.png", "wb") as f:
    f.write(image_bytes)

Image transformation

Takes an existing image and transforms it according to a prompt.

import base64

from easy_ai_clients.image import transform

result = transform(
    provider="stability",
    prompt="Turn this photo into a watercolor painting",
    image="/path/to/photo.jpg",
    strength=0.75,
    credentials={"STABILITY_API_KEY": "..."},
)

# The image comes back base64-encoded; decode it before writing to disk.
image_bytes = base64.b64decode(result.image_base64)
with open("transformed.png", "wb") as f:
    f.write(image_bytes)

Image composition

Combines a base image with a reference image using a prompt. Use it to apply the style, pose, or scene of the reference image to the subject of the base image. Typical prompts: "render the person of image 1 in the pose of image 2", "image 1 in the style of image 2", "place the subject of image 1 in the scene of image 2".

import base64

from easy_ai_clients.image import compose

result = compose(
    provider="google",
    prompt="Render the person of image 1 in the painting style of image 2",
    image="/path/to/subject.jpg",
    reference_image="/path/to/style_reference.jpg",
    credentials={"GOOGLE_API_KEY": "..."},
)

# The image comes back base64-encoded; decode it before writing to disk.
image_bytes = base64.b64decode(result.image_base64)
with open("composed.png", "wb") as f:
    f.write(image_bytes)

Image editing

Edits specific regions of an image using a prompt. Pass a mask to limit edits to the masked area.

import base64

from easy_ai_clients.image import edit

result = edit(
    provider="openai",
    prompt="Replace the sky with a dramatic sunset",
    image="/path/to/photo.png",
    mask="/path/to/mask.png",  # black pixels = edited (inpainted), white pixels = preserved
    credentials={"OPENAI_API_KEY": "sk-..."},
)

# The image comes back base64-encoded; decode it before writing to disk.
image_bytes = base64.b64decode(result.image_base64)
with open("edited.png", "wb") as f:
    f.write(image_bytes)

Video generation

Video files are written directly to a local path. Long-running jobs are polled until completion.

from easy_ai_clients.video import generate

result = generate(
    provider="runway",
    prompt="A slow pan across a misty mountain valley at dawn",
    output_path="output.mp4",
    credentials={"RUNWAYML_API_SECRET": "..."},
)

print(result.output_path)  # pathlib.Path to the written file

To generate a video driven by both a reference image and an audio track:

from easy_ai_clients.video import generate

result = generate(
    provider="heygen",
    image="/path/to/avatar.jpg",
    audio="/path/to/voiceover.mp3",
    output_path="avatar_video.mp4",
    credentials={"HEYGEN_API_KEY": "..."},
)
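
The statement that long-running jobs are polled until completion can be pictured with a generic polling loop. This is a sketch of the general technique, not the library's implementation: `poll_until_done`, the status strings, and the default intervals are all assumptions, and the built-in `TimeoutError` stands in for the library's own timeout exception. The injectable `sleep` and `clock` parameters exist only so the loop can be exercised without waiting.

```python
from __future__ import annotations

import time
from collections.abc import Callable

# Assumed terminal states; real providers use their own status vocabularies.
TERMINAL_STATES = {"succeeded", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_seconds: float = 2.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `fetch_status` repeatedly until it reports a terminal state or time runs out."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_seconds} seconds")
        sleep(interval_seconds)


# Simulated job that finishes on the third status check; sleeping is skipped.
statuses = iter(["queued", "running", "succeeded"])
print(poll_until_done(lambda: next(statuses), sleep=lambda _: None))
```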

Lip sync

Animate a still image or avatar portrait using a supplied audio track.

from easy_ai_clients.video import lipsync

result = lipsync(
    provider="heygen",
    image="/path/to/face.jpg",
    audio="/path/to/speech.mp3",
    output_path="lipsync.mp4",
    credentials={"HEYGEN_API_KEY": "..."},
)

print(result.output_path)

Stateful client

Use EasyAiClient to share credentials and default settings across many calls:

from easy_ai_clients import EasyAiClient

client = EasyAiClient(
    credentials={"OPENAI_API_KEY": "sk-..."},
    timeout_seconds=90,
    job_timeout_seconds=900,
    max_retries=4,
)

text_result = client.text.generate(
    provider="openai",
    instructions="Summarize this in 3 bullets.",
    context={"topic": "multimodal AI libraries"},
)

image_result = client.image.generate(
    provider="openai",
    prompt="Abstract geometric art in blue and gold",
)

composed = client.image.compose(
    provider="google",
    prompt="Put the person of image 1 in the scene of image 2",
    image="/path/to/subject.jpg",
    reference_image="/path/to/scene.jpg",
)

Async example

Every helper has an _async variant. Combine them with asyncio.gather for concurrent requests:

import asyncio

from easy_ai_clients.image import generate_async
from easy_ai_clients.text import generate_async as text_generate_async


async def main() -> None:
    text_task = text_generate_async(
        provider="openai",
        instructions="Write a one-sentence product description for a smart umbrella.",
        credentials={"OPENAI_API_KEY": "sk-..."},
    )
    image_task = generate_async(
        provider="openai",
        prompt="A smart umbrella with solar panels and an LED display",
        credentials={"OPENAI_API_KEY": "sk-..."},
    )

    text_result, image_result = await asyncio.gather(text_task, image_task)
    print(text_result.text)
    print(image_result.image_base64[:32])


asyncio.run(main())
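
By default, `asyncio.gather` propagates the first exception to the caller, so the results of the other requests are not returned. When partial results are useful, `return_exceptions=True` returns failures as ordinary values instead. The sketch below shows the pattern with `fake_request`, a hypothetical stand-in that sleeps rather than calling a provider.

```python
from __future__ import annotations

import asyncio


async def fake_request(name: str, delay: float, fail: bool = False) -> str:
    """Hypothetical stand-in for a provider call; sleeps instead of using the network."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name} ok"


async def run_all() -> list[str | BaseException]:
    # return_exceptions=True delivers failures as values, in call order,
    # so one failed request does not discard the other results.
    return await asyncio.gather(
        fake_request("text", 0.02),
        fake_request("image", 0.01, fail=True),
        return_exceptions=True,
    )


results = asyncio.run(run_all())
for result in results:
    if isinstance(result, BaseException):
        print("error:", result)
    else:
        print("ok:", result)
```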

Public API

`easy_ai_clients.text`

from easy_ai_clients.text import batch_generate, batch_generate_async, generate, generate_async
from easy_ai_clients.models import TextGenerationRequest, TextGenerationResult

Function	Description
`generate(provider, instructions, ...)`	Generate text from a single prompt. Accepts `input_images` for vision.
`generate_async(...)`	Async variant.
`batch_generate(requests, ...)`	Run multiple requests to the same provider concurrently.
`batch_generate_async(...)`	Async variant.
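
When sending many requests to one provider, it is common to cap the number in flight to stay within rate limits. The following is a generic sketch of that idea using `asyncio.Semaphore`; it is not how `batch_generate` is implemented, and `run_batch`, `fake_worker`, and `max_concurrency` are hypothetical names.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable

# Tracks how many fake requests overlap, purely for demonstration.
stats = {"active": 0, "peak": 0}


async def fake_worker(prompt: str) -> str:
    """Hypothetical stand-in for one text request."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)
    stats["active"] -= 1
    return prompt.upper()


async def run_batch(
    prompts: list[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrency: int = 3,
) -> list[str]:
    """Run `worker` for every prompt with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt: str) -> str:
        async with semaphore:
            return await worker(prompt)

    # gather returns results in the same order as the prompts.
    return await asyncio.gather(*(guarded(prompt) for prompt in prompts))


outputs = asyncio.run(run_batch(["a", "b", "c", "d", "e"], fake_worker, max_concurrency=2))
print(outputs, "peak concurrency:", stats["peak"])
```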

`easy_ai_clients.audio`

from easy_ai_clients.audio import (
    compose, compose_async,
    synthesize, synthesize_async,
    transcribe, transcribe_async,
)
from easy_ai_clients.models import (
    MusicGenerationRequest, MusicGenerationResult,
    SpeechSynthesisRequest, SpeechSynthesisResult,
    SpeechTranscriptionRequest, SpeechTranscriptionResult,
)

Function	Description
`transcribe(provider, audio, ...)`	Transcribe audio to text with word timings and speaker diarization.
`synthesize(provider, text, ...)`	Synthesize speech audio from text.
`compose(provider, prompt, ...)`	Generate instrumental music from a text prompt.

`easy_ai_clients.image`

from easy_ai_clients.image import (
    compose, compose_async,
    edit, edit_async,
    generate, generate_async,
    transform, transform_async,
)
from easy_ai_clients.models import (
    ImageCompositionRequest,
    ImageEditRequest,
    ImageGenerationRequest,
    ImageResult,
    ImageTransformationRequest,
)

Function	Description
`generate(provider, prompt, ...)`	Generate an image from a text prompt.
`transform(provider, prompt, image, ...)`	Transform an input image guided by a prompt.
`compose(provider, prompt, image, reference_image, ...)`	Combine a base image with a reference image using a prompt.
`edit(provider, prompt, image, ...)`	Edit an image region with an optional mask.

`easy_ai_clients.video`

from easy_ai_clients.video import generate, generate_async, lipsync, lipsync_async
from easy_ai_clients.models import LipSyncRequest, VideoGenerationRequest, VideoResult

Function	Description
`generate(provider, output_path, ...)`	Generate a video. Accepts optional `prompt`, `image`, and `audio`.
`lipsync(provider, image, audio, output_path, ...)`	Animate a face image with a supplied audio track.

Exceptions

from easy_ai_clients.exceptions import (
    ConfigurationError,
    EasyAiClientError,
    IncompatibleParameterError,
    InvalidParameterError,
    InvalidProviderResponseError,
    JobFailedError,
    MissingCredentialError,
    PricingUnavailableError,
    ProviderTimeoutError,
    TemporaryDownloadError,
    UnsupportedModelError,
    UnsupportedProviderError,
)

See docs/errors.md for descriptions and handling patterns.

Provider matrix

Text

Provider	Env var
`openai`	`OPENAI_API_KEY`
`groq`	`GROQ_API_KEY`
`together`	`TOGETHER_API_KEY`
`fireworks`	`FIREWORKS_API_KEY`
`deepseek`	`DEEPSEEK_API_KEY`
`openrouter`	`OPENROUTER_API_KEY`
`xai`	`XAI_API_KEY`
`mistral`	`MISTRAL_API_KEY`
`anthropic`	`ANTHROPIC_API_KEY`
`google`	`GOOGLE_API_KEY`
`cohere`	`COHERE_API_KEY`
`perplexity`	`PERPLEXITY_API_KEY`
`deepinfra`	`DEEPINFRA_API_KEY`
`huggingface`	`HUGGINGFACE_API_KEY`

Audio — transcription

Provider	Env var
`deepgram`	`DEEPGRAM_API_KEY`
`assemblyai`	`ASSEMBLYAI_API_KEY`
`speechmatics`	`SPEECHMATICS_API_KEY`
`revai`	`REVAI_API_KEY`

Audio — synthesis

Provider	Env var
`cartesia`	`CARTESIA_API_KEY`
`azure`	`AZURE_SPEECH_API_KEY`, `AZURE_SPEECH_REGION`
`hume`	`HUME_API_KEY`
`elevenlabs`	`ELEVENLABS_API_KEY`
`murf`	`MURF_API_KEY`

Audio — music

Provider	Env var
`google`	`GOOGLE_API_KEY`
`elevenlabs`	`ELEVENLABS_API_KEY`
`stability`	`STABILITY_API_KEY`
`beatoven`	`BEATOVEN_API_KEY`
`loudly`	`LOUDLY_API_KEY`

Image — generate

Provider	Env var
`openai`	`OPENAI_API_KEY`
`google`	`GOOGLE_API_KEY`
`bfl`	`BFL_API_KEY`
`ideogram`	`IDEOGRAM_API_KEY`
`stability`	`STABILITY_API_KEY`
`hedra`	`HEDRA_API_KEY`

Image — transform

Provider	Env var
`openai`	`OPENAI_API_KEY`
`google`	`GOOGLE_API_KEY`
`bfl`	`BFL_API_KEY`
`ideogram`	`IDEOGRAM_API_KEY`
`stability`	`STABILITY_API_KEY`

Image — compose

Provider	Env var
`google`	`GOOGLE_API_KEY`
`bfl`	`BFL_API_KEY`

Image — edit

Provider	Env var
`openai`	`OPENAI_API_KEY`
`google`	`GOOGLE_API_KEY`
`bfl`	`BFL_API_KEY`
`ideogram`	`IDEOGRAM_API_KEY`
`stability`	`STABILITY_API_KEY`

Video — generate without audio

Provider	Env var
`runway`	`RUNWAYML_API_SECRET`
`luma`	`LUMA_API_KEY`
`fal`	`FAL_KEY`
`hedra`	`HEDRA_API_KEY`

Video — generate with audio

Provider	Env var
`google`	`GOOGLE_API_KEY`
`heygen`	`HEYGEN_API_KEY`
`did`	`DID_API_KEY`
`hedra`	`HEDRA_API_KEY`

Video — lip sync

Provider	Env var
`heygen`	`HEYGEN_API_KEY`
`did`	`DID_API_KEY`
`hedra`	`HEDRA_API_KEY`
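
The tables above can also be used to check which providers are usable in the current environment before making any calls. The sketch below does this for the audio synthesis table. `configured_providers` is a hypothetical helper, the dictionary is copied by hand from that table, and treating an empty value as "not set" is an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Copied from the audio synthesis table above; azure needs two variables.
SYNTHESIS_ENV_VARS = {
    "cartesia": ("CARTESIA_API_KEY",),
    "azure": ("AZURE_SPEECH_API_KEY", "AZURE_SPEECH_REGION"),
    "hume": ("HUME_API_KEY",),
    "elevenlabs": ("ELEVENLABS_API_KEY",),
    "murf": ("MURF_API_KEY",),
}


def configured_providers(
    matrix: Mapping[str, tuple[str, ...]],
    environ: Mapping[str, str] | None = None,
) -> list[str]:
    """Return the providers whose required variables are all set and non-empty."""
    env = os.environ if environ is None else environ
    return [
        provider
        for provider, names in matrix.items()
        if all(env.get(name) for name in names)
    ]


# Example environment: azure is incomplete because the region is missing.
example_env = {"AZURE_SPEECH_API_KEY": "k", "ELEVENLABS_API_KEY": "e"}
print(configured_providers(SYNTHESIS_ENV_VARS, example_env))
```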

Error handling

easy-ai-clients uses typed exceptions so your code can handle failures intentionally.

from easy_ai_clients.exceptions import MissingCredentialError, UnsupportedProviderError
from easy_ai_clients.text import generate

try:
    result = generate(provider="openai", instructions="Write one sentence.")
except MissingCredentialError as exc:
    # exc.provider — the canonical provider name
    # exc.env_vars — tuple of env var names that must be set
    print(f"Missing credentials for {exc.provider}: {exc.env_vars}")
except UnsupportedProviderError as exc:
    print(f"Unknown provider: {exc}")

A MissingCredentialError for one provider does not affect other providers. Package imports and unrelated operations remain fully usable.

See docs/errors.md for the full exception hierarchy and handling recommendations.
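
Because a missing credential only affects the provider it belongs to, one possible pattern is to fall back to the next provider in a preference list. The sketch below is self-contained: the `MissingCredentialError` class defined here is a local stand-in that mirrors the `provider` and `env_vars` attributes shown above, not the library class, and `first_available` and `fake_generate` are hypothetical.

```python
from __future__ import annotations

from collections.abc import Callable


class MissingCredentialError(Exception):
    """Local stand-in mirroring the attributes shown above; not the library class."""

    def __init__(self, provider: str, env_vars: tuple[str, ...]) -> None:
        super().__init__(f"missing credentials for {provider}")
        self.provider = provider
        self.env_vars = env_vars


def first_available(providers: list[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try providers in order, skipping those whose credentials are missing."""
    skipped = []
    for provider in providers:
        try:
            return provider, call(provider)
        except MissingCredentialError as exc:
            skipped.append(exc.provider)
    raise RuntimeError(f"no configured provider among: {', '.join(skipped)}")


def fake_generate(provider: str) -> str:
    """Hypothetical stand-in for generate(); pretends only mistral is configured."""
    if provider != "mistral":
        raise MissingCredentialError(provider, (f"{provider.upper()}_API_KEY",))
    return f"text from {provider}"


provider, text = first_available(["openai", "mistral"], fake_generate)
print(provider, text)
```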

Provider aliases

Many providers accept common aliases:

Alias	Resolves to
`gemini`, `google-ai`, `imagen`, `veo`, `lyria`	`google`
`flux`, `black-forest-labs`	`bfl`
`mistralai`	`mistral`
`runwayml`, `runway-ml`	`runway`
`lumalabs`, `luma-dream-machine`	`luma`
`hey-gen`	`heygen`
`d-id`	`did`
`eleven-labs`	`elevenlabs`
`assembly-ai`	`assemblyai`
`pplx`	`perplexity`
`deep-infra`	`deepinfra`
`hugging-face`, `hf`	`huggingface`
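
Conceptually, alias resolution is a lookup from alias to canonical name. The sketch below encodes the table above as a dictionary. `resolve_provider` is a hypothetical function, not the library's resolver; lowercasing and trimming the input is an assumption, and unknown names are returned unchanged here, whereas the library raises `UnsupportedProviderError`.

```python
# Alias table copied from the section above.
ALIASES = {
    "gemini": "google",
    "google-ai": "google",
    "imagen": "google",
    "veo": "google",
    "lyria": "google",
    "flux": "bfl",
    "black-forest-labs": "bfl",
    "mistralai": "mistral",
    "runwayml": "runway",
    "runway-ml": "runway",
    "lumalabs": "luma",
    "luma-dream-machine": "luma",
    "hey-gen": "heygen",
    "d-id": "did",
    "eleven-labs": "elevenlabs",
    "assembly-ai": "assemblyai",
    "pplx": "perplexity",
    "deep-infra": "deepinfra",
    "hugging-face": "huggingface",
    "hf": "huggingface",
}


def resolve_provider(name: str) -> str:
    """Map an alias to its canonical provider name; other names pass through."""
    key = name.strip().lower()  # assumption: matching ignores case and outer spaces
    return ALIASES.get(key, key)


for candidate in ("gemini", "Flux", "hf", "openai"):
    print(candidate, "->", resolve_provider(candidate))
```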

Additional docs

docs/configuration.md — credential resolution, python-dotenv integration, per-call overrides
docs/providers.md — complete credential tables for every provider and operation
docs/errors.md — exception hierarchy, handling patterns, retry guidance

Contributing

See CONTRIBUTING.md for development setup, running tests, and the release process.

Release history

0.4.1 (Apr 25, 2026)
0.4.0 (Apr 25, 2026)
0.3.0 (Apr 20, 2026), this version
0.2.0 (Apr 18, 2026)

Download files

Source Distribution

easy_ai_clients-0.3.0.tar.gz (59.5 kB)

Uploaded Apr 20, 2026 Source

Built Distribution

easy_ai_clients-0.3.0-py3-none-any.whl (68.5 kB)

Uploaded Apr 20, 2026 Python 3

File details

Details for the file easy_ai_clients-0.3.0.tar.gz.

File metadata

Download URL: easy_ai_clients-0.3.0.tar.gz
Upload date: Apr 20, 2026
Size: 59.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for easy_ai_clients-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`3a2cc96114012355f5b27bea28df950fb8a77bb14c67850641f0eacd44d8a682`
MD5	`442a187d9594d7b2a8e006e630689070`
BLAKE2b-256	`2905a418414dfb7c6c8589437cc5b1808a3a7c11a1dae324cf1e712bf657d7dd`

File details

Details for the file easy_ai_clients-0.3.0-py3-none-any.whl.

File metadata

Download URL: easy_ai_clients-0.3.0-py3-none-any.whl
Upload date: Apr 20, 2026
Size: 68.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for easy_ai_clients-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`37b5f4b893ab2cb0ff412745ab808774bb7c90ebf329eb8076f7771df7194f79`
MD5	`e940569546823030cf65389c311c75bb`
BLAKE2b-256	`af48096e128e9e9360227f10a613298f29b69de6e7aea635762390250a5d5547`
