
Unified access layer for completion and embedding services

ai-api-unified - a Vendor-Agnostic AI Services Library

Latest version: 1.3.0

ai-api-unified is a unified, typed client for Completions, Embeddings, and Voice that lets you switch providers by changing configuration, not code. Your app targets stable base interfaces; factories select concrete providers at runtime based on environment variables. This keeps call sites clean and makes vendor swaps low-risk.

Key idea: Write to the base interfaces (completions, embeddings, voice). Change the provider via env/config only.
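The idea can be illustrated with a small, self-contained sketch of an env-driven factory. Everything here is hypothetical and only shows the pattern: the `Completions` protocol, the `EchoA`/`EchoB` providers, the `provider-a`/`provider-b` selector values, and the default engine are stand-ins, not this library's classes or behavior.

```python
import os
from typing import Callable, Mapping, Protocol


class Completions(Protocol):
    """Stand-in for a base completions interface (hypothetical)."""

    def send_prompt(self, prompt: str) -> str: ...


class EchoA:
    """Hypothetical provider A."""

    def send_prompt(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"


class EchoB:
    """Hypothetical provider B."""

    def send_prompt(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"


# Registry mapping a selector value to a constructor.
_REGISTRY: dict[str, Callable[[], Completions]] = {
    "provider-a": EchoA,
    "provider-b": EchoB,
}


def get_completions_client(env: Mapping[str, str] = os.environ) -> Completions:
    """Pick a provider from the COMPLETIONS_ENGINE selector."""
    # The fallback to "provider-a" is an assumption made for this sketch.
    engine = env.get("COMPLETIONS_ENGINE", "provider-a")
    try:
        return _REGISTRY[engine]()
    except KeyError:
        raise ValueError(f"Unknown COMPLETIONS_ENGINE: {engine!r}") from None


# The call site is identical for either provider; only the env value changes.
client = get_completions_client({"COMPLETIONS_ENGINE": "provider-b"})
print(client.send_prompt("hello"))
```

Because the call site depends only on `send_prompt`, swapping the selector value changes the provider without touching application code.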


Model Categories & Capability Matrix

Category | Base Interface | Providers (examples) | Required Poetry extra(s)
Completions | AIBaseCompletions | OpenAI, AWS Bedrock (Nova), Google Gemini | OpenAI: none; Bedrock: bedrock; Gemini: google_gemini
Embeddings | AIBaseEmbeddings | OpenAI, Amazon Titan, Google Gemini | OpenAI: none; Titan: bedrock; Gemini: google_gemini
Voice TTS | AIVoiceBase | OpenAI, Google Vertex AI Gemini TTS, Azure TTS, ElevenLabs | Google: google_gemini (Vertex Gemini SDK); Azure: azure_tts; ElevenLabs: elevenlabs
Voice STT | AIVoiceBase (if enabled) | Google (and others if configured) | Typically google_gemini

Only OpenAI works with the base package alone. All other providers require the appropriate Poetry extra(s). Install extras with: poetry add 'ai-api-unified[<extra name>]'


Table of Contents

  1. Supported Providers & Models
  2. Installation
     2.1. Python & System Requirements
     2.2. Choose Provider Extras
     2.3. Smoke Test
  3. Quickstart: Factory → Client → Use It
     3.1. Completions
     3.2. Embeddings
     3.3. Voice TTS
  4. Configuration & Environment Variables
     4.1. Global Selectors
     4.2. Provider-Specific Variables
     4.3. Geo-restricted Data Residency
  5. Vendor Setup Guides
     5.1. OpenAI
     5.2. AWS Bedrock & Amazon Titan
     5.3. Google Gemini
     5.4. Azure Cognitive Services TTS
     5.5. ElevenLabs
  6. API Programming Guide
     6.1. Factories & Clients
     6.2. Completions API
     6.3. Embeddings API
     6.4. Voice API
     6.5. Structured Prompts
     6.6. Token Counting & Cost Hints
  7. Advanced Topics
     7.1. Retries & Backoff
  8. Class Flow Diagram
  9. Repository Layout, Tests, Examples
  10. Troubleshooting
  11. Versioning & License

Supported Providers & Models

  • OpenAI

    • Completions: current GPT-4.x / 4o family
    • Embeddings: text-embedding-3-small / -large
    • Voice: OpenAI TTS
  • AWS Bedrock & Amazon Titan

    • Completions: Nova family (and other Bedrock models configured in your env)
    • Embeddings: Titan (amazon.titan-embed-text-v2:0, etc.)
  • Google Gemini

    • Completions: Gemini 1.5 / 2.x family
    • Embeddings: gemini-embedding-001
    • Voice: Google Cloud Text-to-Speech
  • Azure Cognitive Services (TTS)

  • ElevenLabs (TTS)

Exact model names are selected by env variables in this library; see Configuration and Vendor Guides.


Installation

Python & System Requirements

Use a supported Python version per your project settings. If deploying to AWS Lambda, consult Troubleshooting for wheel guidance.

Choose Provider Extras

OpenAI only: poetry add 'ai-api-unified'

Google Gemini (completions or embeddings, and Google TTS/STT): poetry add 'ai-api-unified[google_gemini]'

AWS Bedrock (Nova) and Amazon Titan (embeddings): poetry add 'ai-api-unified[bedrock]'

Azure TTS: poetry add 'ai-api-unified[azure_tts]'

ElevenLabs TTS: poetry add 'ai-api-unified[elevenlabs]'

Optional similarity helpers (NumPy, etc.): poetry add 'ai-api-unified[similarity_score]'

Multiple extras: poetry add 'ai-api-unified[bedrock,google_gemini]'

Why: non-OpenAI providers ship heavier SDKs; extras keep default installs slim.

Smoke Test

python -c "import ai_api_unified as m; print('ok')"

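Why: a quick import check confirms that the package and its dependencies resolved and installed. It does not validate credentials; the sanity checks in the Vendor Setup Guides do that.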


Quickstart: Factory → Client → Use It

Set the global selectors and provider credentials before running the examples. The required env vars are listed as comments at the top of each snippet.

Completions

# Required env (OpenAI):
#   COMPLETIONS_ENGINE=openai
#   OPENAI_API_KEY=...
# Optional:
#   COMPLETIONS_MODEL_NAME=gpt-4o-mini

from ai_api_unified.ai_factory import AIFactory
from ai_api_unified.ai_base import AIBaseCompletions

client: AIBaseCompletions = AIFactory.get_ai_completions_client()
text: str = client.send_prompt("Say hello in one short sentence.")
print(text)

What it does and why: obtains a provider-selected completions client using env settings; your code speaks only the base interface, so vendor swaps require no code changes.

Embeddings

# Required env (OpenAI example):
#   EMBEDDING_ENGINE=openai
#   OPENAI_API_KEY=...
# Optional:
#   EMBEDDING_MODEL_NAME=text-embedding-3-small
#   EMBEDDING_DIMENSIONS=1536

from ai_api_unified.ai_factory import AIFactory
from ai_api_unified.ai_base import AIBaseEmbeddings

emb: AIBaseEmbeddings = AIFactory.get_ai_embedding_client()
res: dict[str, object] = emb.generate_embeddings("hello world")
vector = res.get("embedding")
print(len(vector) if vector else None)

What it does and why: generates a vector under a stable key suitable for storage or downstream retrieval. Dimensions and model are env-configurable.

Voice TTS

# Example using Google TTS:
#   poetry add 'ai-api-unified[google_gemini]'
# Required env:
#   AI_VOICE_ENGINE=google
#   GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json
#   GOOGLE_PROJECT_ID=<gcp-project-id>
# Optional:
#   GOOGLE_LOCATION=us-central1
#   AI_VOICE_LANGUAGE=en-US

from ai_api_unified.ai_voice_factory import AIVoiceFactory
from ai_api_unified.ai_voice_base import AIVoiceBase

voice: AIVoiceBase = AIVoiceFactory.get_voice_client()
audio_bytes: bytes = voice.text_to_speech("Hello from a unified voice API")
with open("out.wav", "wb") as f:
    f.write(audio_bytes)
# or
voice.play(audio_bytes)

What it does and why: selects the configured voice engine and synthesizes audio. SDKs load only when the engine is selected.


Configuration & Environment Variables

Global Selectors

These choose the provider for each category:

  • COMPLETIONS_ENGINE = openai | nova | google-gemini
  • EMBEDDING_ENGINE = openai | titan | google-gemini
  • AI_VOICE_ENGINE = openai | google | azure | elevenlabs

Common optional knobs:

  • COMPLETIONS_MODEL_NAME (e.g., gpt-4o-mini, gemini-3.0-flash, amazon.nova-lite-v1:0)
  • EMBEDDING_MODEL_NAME (e.g., text-embedding-3-small, amazon.titan-embed-text-v2:0, gemini-embedding-001)
  • EMBEDDING_DIMENSIONS (provider-specific defaults, e.g., 1536 for OpenAI text-embedding-3-small)
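The selector values listed above can be checked at startup, before any client is created. The following is a minimal sketch, not part of the library: `check_selectors` and `ALLOWED_ENGINES` are hypothetical names, and the allowed values are copied from the list above.

```python
from typing import Mapping

# Allowed values copied from the Global Selectors list.
ALLOWED_ENGINES: dict[str, frozenset[str]] = {
    "COMPLETIONS_ENGINE": frozenset({"openai", "nova", "google-gemini"}),
    "EMBEDDING_ENGINE": frozenset({"openai", "titan", "google-gemini"}),
    "AI_VOICE_ENGINE": frozenset({"openai", "google", "azure", "elevenlabs"}),
}


def check_selectors(env: Mapping[str, str]) -> list[str]:
    """Return one message per selector that is set to an unsupported value.

    Unset selectors are skipped, since an app may use only one category.
    """
    problems: list[str] = []
    for name, allowed in ALLOWED_ENGINES.items():
        value = env.get(name)
        if value is not None and value not in allowed:
            problems.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    return problems
```

Calling `check_selectors(os.environ)` early and failing on a non-empty result turns a typo such as `COMPLETIONS_ENGINE=gemini` into a clear startup error.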

Provider-Specific Variables

OpenAI (no extra needed)

  • OPENAI_API_KEY
  • Optional: COMPLETIONS_MODEL_NAME, EMBEDDING_MODEL_NAME, EMBEDDING_DIMENSIONS

AWS Bedrock & Amazon Titan (requires extra: bedrock)

  • AWS_REGION (e.g., us-east-1)
  • Optional: COMPLETIONS_MODEL_NAME (Nova), EMBEDDING_MODEL_NAME (Titan), EMBEDDING_DIMENSIONS

Google Gemini (requires extra: google_gemini)

You can authenticate using either an API Key or Google Application Default Credentials (ADC).

Option A: API Key (Consumer-grade)

  • GOOGLE_AUTH_METHOD=api_key
  • GOOGLE_GEMINI_API_KEY (your Gemini API key)

Option B: Application Default Credentials / Vertex AI

  • GOOGLE_AUTH_METHOD=service_account (or leave unset)
  • GOOGLE_APPLICATION_CREDENTIALS (path to service account JSON)
  • Optional: GOOGLE_PROJECT_ID, GOOGLE_LOCATION

Common Options:

  • Optional: COMPLETIONS_MODEL_NAME, EMBEDDING_MODEL_NAME, EMBEDDING_DIMENSIONS
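The two Google options can be expressed as a small pre-flight check. This sketch is illustrative only: `missing_google_env` is a hypothetical helper, and treating GOOGLE_APPLICATION_CREDENTIALS as required for Option B follows the list above (Application Default Credentials can also be discovered by other means in some environments).

```python
from typing import Mapping


def missing_google_env(env: Mapping[str, str]) -> list[str]:
    """List required variables that are unset for the chosen auth method."""
    # An unset GOOGLE_AUTH_METHOD is treated as service_account, per Option B.
    method = env.get("GOOGLE_AUTH_METHOD") or "service_account"
    if method == "api_key":
        required = ["GOOGLE_GEMINI_API_KEY"]
    elif method == "service_account":
        required = ["GOOGLE_APPLICATION_CREDENTIALS"]
    else:
        raise ValueError(f"Unsupported GOOGLE_AUTH_METHOD: {method!r}")
    return [name for name in required if not env.get(name)]
```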

Azure Cognitive Services TTS (requires extra: azure_tts)

  • MICROSOFT_COGNITIVE_SERVICES_API_KEY
  • MICROSOFT_COGNITIVE_SERVICES_REGION
  • Optional: MICROSOFT_COGNITIVE_SERVICES_ENDPOINT, AI_VOICE_LANGUAGE

ElevenLabs TTS (requires extra: elevenlabs)

  • ELEVEN_LABS_API_KEY
  • Optional: voice selection variables used by your implementation

Voice (common option)

  • AI_VOICE_LANGUAGE (e.g., en-US)

Geo-restricted Data Residency

If your deployment must keep data in the U.S., set:

  • AI_API_GEO_RESIDENCY=US (or USA / United States)

Providers that support regional endpoints will route via U.S. URLs when this is set; others may log a warning if the SDK does not expose regional control.
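If application code needs to branch on the same setting, the accepted spellings can be normalized in one place. This is a sketch under assumptions: `wants_us_residency` is a hypothetical helper, and case-insensitive matching is a guess, not documented library behavior.

```python
from typing import Mapping

# Accepted spellings taken from the text above; case-insensitivity is an assumption.
_US_VALUES = frozenset({"us", "usa", "united states"})


def wants_us_residency(env: Mapping[str, str]) -> bool:
    """True when AI_API_GEO_RESIDENCY requests U.S.-only routing."""
    raw = env.get("AI_API_GEO_RESIDENCY", "")
    return raw.strip().lower() in _US_VALUES
```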


Vendor Setup Guides

OpenAI

Install: poetry add 'ai-api-unified'

Env:

COMPLETIONS_ENGINE=openai
EMBEDDING_ENGINE=openai
OPENAI_API_KEY=...
# Optional:
COMPLETIONS_MODEL_NAME=gpt-4o-mini
EMBEDDING_MODEL_NAME=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
AI_API_GEO_RESIDENCY=US   # optional

Sanity check:

from ai_api_unified.ai_factory import AIFactory
c = AIFactory.get_ai_completions_client()
print(c.send_prompt("Ping"))

Why: verifies key wiring and model selection.


AWS Bedrock & Amazon Titan

Install: poetry add 'ai-api-unified[bedrock]'

Env:

COMPLETIONS_ENGINE=nova
EMBEDDING_ENGINE=titan
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
# Optional if using temporary creds:
AWS_SESSION_TOKEN=...
# Optional:
COMPLETIONS_MODEL_NAME=amazon.nova-lite-v1:0
EMBEDDING_MODEL_NAME=amazon.titan-embed-text-v2:0
EMBEDDING_DIMENSIONS=1024
AI_API_GEO_RESIDENCY=US   # optional

Sanity check:

from ai_api_unified.ai_factory import AIFactory
c = AIFactory.get_ai_completions_client()
print(c.send_prompt("Ping from Bedrock"))

Why: confirms region and IAM credentials are valid.


Google Gemini

Install: poetry add 'ai-api-unified[google_gemini]'

Env:

Configure your preferred authentication method:

Option A: API Key (Consumer-grade)

GOOGLE_AUTH_METHOD=api_key
GOOGLE_GEMINI_API_KEY=...

Option B: Application Default Credentials / Vertex AI

GOOGLE_AUTH_METHOD=service_account
GOOGLE_APPLICATION_CREDENTIALS=/path/service_account.json
# Optional:
GOOGLE_PROJECT_ID=...
GOOGLE_LOCATION=us-central1

Common Settings:

COMPLETIONS_ENGINE=google-gemini
EMBEDDING_ENGINE=google-gemini
COMPLETIONS_MODEL_NAME=gemini-3.0-flash
EMBEDDING_MODEL_NAME=gemini-embedding-001
EMBEDDING_DIMENSIONS=3072
AI_API_GEO_RESIDENCY=US   # optional; may log warning if SDK lacks regional control

Sanity check:

from ai_api_unified.ai_factory import AIFactory
c = AIFactory.get_ai_completions_client()
print(c.send_prompt("Ping from Gemini"))

Why: verifies authentication (API key or service account, whichever you configured) and model selection.


Azure Cognitive Services TTS

Install: poetry add 'ai-api-unified[azure_tts]'

Env:

AI_VOICE_ENGINE=azure
MICROSOFT_COGNITIVE_SERVICES_API_KEY=...
MICROSOFT_COGNITIVE_SERVICES_REGION=...
# Optional:
MICROSOFT_COGNITIVE_SERVICES_ENDPOINT=...
AI_VOICE_LANGUAGE=en-US

Sanity check:

from ai_api_unified.ai_voice_factory import AIVoiceFactory
v = AIVoiceFactory.get_voice_client()
# Use a context manager so the file handle is closed after writing.
with open("azure.wav", "wb") as f:
    f.write(v.text_to_speech("Azure TTS ready"))

Why: synthesizes a short WAV to confirm credentials and region.


ElevenLabs

Install: poetry add 'ai-api-unified[elevenlabs]'

Env:

AI_VOICE_ENGINE=elevenlabs
ELEVEN_LABS_API_KEY=...
# Optional: voice selection variables if supported

Sanity check:

from ai_api_unified.ai_voice_factory import AIVoiceFactory
v = AIVoiceFactory.get_voice_client()
# Use a context manager so the file handle is closed after writing.
with open("11labs.wav", "wb") as f:
    f.write(v.text_to_speech("Testing ElevenLabs"))

Why: produces short audio to verify API key wiring.


API Programming Guide

Design principle: your code targets the base interfaces. Factories return provider-specific implementations based on env/config.

Factories & Clients

Typical entry points (check your code for the authoritative signatures):

from ai_api_unified.ai_factory import AIFactory
from ai_api_unified.ai_base import AIBaseCompletions, AIBaseEmbeddings

completions_client: AIBaseCompletions = AIFactory.get_ai_completions_client()
embedding_client:  AIBaseEmbeddings  = AIFactory.get_ai_embedding_client()

Why: centralizes provider selection and keeps your business logic provider-agnostic.
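One practical consequence is that business logic can be unit-tested without any provider. The sketch below uses hypothetical names throughout: `CompletionsLike` is a simplified stand-in for AIBaseCompletions that declares only `send_prompt(prompt)`, and `summarize` and `FakeCompletions` exist only for illustration.

```python
from typing import Protocol


class CompletionsLike(Protocol):
    """Minimal shape of a completions client (simplified stand-in)."""

    def send_prompt(self, prompt: str) -> str: ...


def summarize(client: CompletionsLike, text: str) -> str:
    """Business logic that depends only on send_prompt."""
    return client.send_prompt(f"Summarize in one sentence: {text}").strip()


class FakeCompletions:
    """Test double that records prompts and returns a canned reply."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: list[str] = []

    def send_prompt(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


fake = FakeCompletions("  A short summary.  ")
print(summarize(fake, "Long text..."))
```

In production the same `summarize` function would receive the client returned by the factory instead of the fake.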

Completions API

Common methods exposed by the base layer:

  • send_prompt(prompt: str, *, other_params: AICompletionsPromptParamsBase | None = None) -> str
  • strict_schema_prompt(prompt: str, response_model: type[AIStructuredPrompt], max_response_tokens: int = 512, *, other_params: AICompletionsPromptParamsBase | None = None) -> AIStructuredPrompt

Free-form example:

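from ai_api_unified.ai_factory import AIFactory
comp = AIFactory.get_ai_completions_client()  # provider chosen via COMPLETIONS_ENGINE
resp: str = comp.send_prompt("Say hello in Spanish.")
print(resp)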

Why: simplest way to get text output across providers.

Structured Prompting (Schema-validated)

What it is: A way to turn an LLM call into a typed function. You declare the expected output as a Pydantic model (subclassing AIStructuredPrompt), and the library checks that the output returned by the LLM conforms to your JSON schema before you ever touch the data.

Why it exists:

  • You get a strong contract for LLM outputs (types + required fields).
  • Your prompt logic, output schema, and runtime call live in one place you can unit-test.
  • It’s provider-agnostic: the same pattern works no matter which completions engine you select via env/config.

Tutorial: creating a structured prompt

  1. Define a structured prompt type. Example: NameAgeCityStructuredPrompt(AIStructuredPrompt) with:

    • Input field: input_text (the raw text to extract from).
    • Output fields: name, age, city (start as None and will be populated by the LLM if valid).
  2. Build the natural-language prompt on the instance

    • get_prompt(input_text: str) -> str returns the instruction (“Extract the name, age, and city from the following text: …”).
    • An @model_validator(mode="after") sets self.prompt from get_prompt(...) so the prompt is derived from the instance’s inputs and always stays in sync.
  3. Declare the output JSON schema

    • Override model_json_schema() to describe only the LLM’s output (not the inputs).
    • Add name (string), age (integer), city (string) and mark them required.
    • This schema is what the LLM is instructed to produce and what the library uses to validate the response.
  4. Call the LLM through the base completions client

    • structured_prompt.send_structured_prompt(ai_completions_client, NameAgeCityStructuredPrompt) sends:

      • the prompt (from step 2), and
      • the output schema (from step 3).
    • Under the hood, the client routes to the selected provider and requests schema-conformant JSON.

  5. Validation & parsing

    • The raw LLM JSON is parsed and validated against your schema.
    • On success, you get a typed instance of NameAgeCityStructuredPrompt with name/age/city populated.
  6. Use the typed result

    • Access structured_prompt_result.name, .age, .city directly—no ad-hoc JSON handling.

What guarantees you get:

  • If the model fails to provide the required fields or violates types (e.g., age isn’t an integer), the call fails fast with a validation error, rather than leaking malformed data into your pipeline.
  • The same schema + prompt pattern works across OpenAI/Gemini/Bedrock because it’s implemented in the base completions interface.
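The fail-fast behavior described above can be pictured with a standard-library sketch. This is not the library's implementation (which the text says relies on Pydantic and a JSON schema); `parse_llm_output` and `REQUIRED_FIELDS` are hypothetical names used only to show what "validate before use" means for the name/age/city example.

```python
import json
from typing import Any

# Required output fields and their expected Python types (from the tutorial).
REQUIRED_FIELDS: dict[str, type] = {"name": str, "age": int, "city": str}


def parse_llm_output(raw_json: str) -> dict[str, Any]:
    """Parse raw JSON and fail fast if a field is missing or has the wrong type."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            raise ValueError(f"field {field!r} should be {expected.__name__}")
    return data
```

A response such as `{"name": "Alice", "age": "thirty", "city": "Paris"}` raises immediately, so malformed data never reaches storage or downstream consumers.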

When to use it:

  • Any time you need extract-transform style outputs (entities, classifications, records) you plan to store or forward.
  • Anywhere a downstream consumer expects fields with types, not free-form text.

Structured Prompt code sample

import textwrap
from copy import deepcopy
from typing import Any

from pydantic import model_validator

from ai_api_unified.ai_base import AIBaseCompletions, AIStructuredPrompt

class NameAgeCityStructuredPrompt(AIStructuredPrompt):
    """Example structured prompt for testing."""

    name: str | None = None
    age: int | None = None
    city: str | None = None
    input_text: str | None = None

    @model_validator(mode="after")
    def _populate_prompt(
        self: "NameAgeCityStructuredPrompt", __: Any
    ) -> "NameAgeCityStructuredPrompt":
        object.__setattr__(
            self,
            "prompt",
            self.get_prompt(input_text=self.input_text),
        )
        return self

    @staticmethod
    def get_prompt(input_text: str) -> str:
        prompt: str = textwrap.dedent(
            f"""
            Extract the name, age, and city from the following text:
            {input_text}
            """
        ).strip()
        return prompt

    @classmethod
    def model_json_schema(cls) -> dict[str, Any]:
        """
        JSON schema for the LLM’s *output* only.
        """
        # start with a fresh copy of the base schema (deep-copied there)
        schema: dict[str, Any] = deepcopy(super().model_json_schema())
        # add the output field
        schema["properties"]["name"] = {"type": "string"}
        schema["properties"]["age"] = {"type": "integer"}
        schema["properties"]["city"] = {"type": "string"}
        schema.setdefault("required", [])
        schema["required"].append("name")
        schema["required"].append("age")
        schema["required"].append("city")
        return schema

def structured_prompt(ai_completions_client: AIBaseCompletions) -> None:

    text: str = "My name is Alice, I am 30 years old, and I live in Paris."

    # Build the prompt instance, then send it with the class used to parse the reply.
    structured_prompt = NameAgeCityStructuredPrompt(input_text=text)
    structured_prompt_result: NameAgeCityStructuredPrompt = (
        structured_prompt.send_structured_prompt(
            ai_completions_client, NameAgeCityStructuredPrompt
        )
    )
    print("\nAsking the LLM to extract name, age, and city from the following text:")
    print(text)
    print(f"\nName: {structured_prompt_result.name}")
    print(f"Age: {structured_prompt_result.age}")
    print(f"City: {structured_prompt_result.city}")

Why: requests a structured response that can be safely parsed and validated.

Embeddings API

Common methods:

  • generate_embeddings(text: str) -> dict[str, object]
  • generate_embeddings_batch(texts: list[str]) -> list[dict[str, object]]

Example:

r = emb.generate_embeddings("The quick brown fox")
vec = r.get("embedding")
print(type(vec), len(vec) if vec else None)

Why: produces vectors for retrieval, clustering, or similarity search.

Similarity score: if your project computes similarity, install the similarity_score extra. Use your existing utilities or the project’s examples to compute scores; do not re-implement here.

Voice API

TTS example:

audio = voice.text_to_speech("Unified voice makes provider swaps trivial")
with open("voice.wav", "wb") as f:
    f.write(audio)

Why: the same call works across voice providers chosen via env.

Structured Prompts

  • Define a Pydantic model that represents the desired output.
  • Call strict_schema_prompt(...) with response_model=YourModel.
  • Handle validation errors as you would in any Pydantic workflow.

Why: consistent structured outputs across providers.

Token Counting & Cost Hints

The base layer includes token counting utilities and model metadata (context limits, price hints) so you can size prompts safely and estimate costs. Use these to enforce guardrails before sending requests.
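A guardrail of this kind might look like the sketch below. It does not use the library's utilities, whose names are not shown in this document: `estimate_tokens`, `check_budget`, and `estimate_cost` are hypothetical, the four-characters-per-token ratio is only a rough rule of thumb, and the context limit and price are values the caller must supply from real model metadata.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real tokenizers vary by model and language."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def check_budget(prompt: str, context_limit: int, max_response_tokens: int) -> int:
    """Return the estimated prompt tokens, or raise if the request cannot fit."""
    prompt_tokens = estimate_tokens(prompt)
    if prompt_tokens + max_response_tokens > context_limit:
        raise ValueError(
            f"estimated {prompt_tokens} prompt tokens + {max_response_tokens} "
            f"response tokens exceeds the limit of {context_limit}"
        )
    return prompt_tokens


def estimate_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Cost estimate from a caller-supplied price hint."""
    return tokens / 1000 * price_per_1k_tokens
```

Running `check_budget` before `send_prompt` rejects oversized prompts locally instead of paying for a request the provider would refuse.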


Advanced Topics

Retries & Backoff

  • Transient errors (rate limits, 5xx, network) are retried with exponential backoff.
  • Configure attempt counts and backoff intervals via env or constructor knobs exposed by your concrete clients.

Why: smooths over brief provider outages without complicating call sites.
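For readers unfamiliar with the technique, exponential backoff with jitter can be sketched as follows. This is a generic illustration, not the library's retry code: `TransientError`, `call_with_backoff`, and all default values are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (rate limits, 5xx, network)."""


def call_with_backoff(
    fn: Callable[[], T],
    *,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so many clients do not retry in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable when max_attempts >= 1")
```

The `sleep` parameter is injectable so tests can record delays instead of actually waiting.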


Class Flow Diagram

flowchart LR
  subgraph App Code
    Caller[Your app code]
  end

  subgraph Unified Library
    AF[AIFactory] --> IFace[Base interfaces<br>Completions, Embeddings,  Voice]
    IFace --> OA[OpenAI completions client]
    IFace --> OAE[OpenAI embeddings client]
    IFace --> OAV[OpenAI TTS client]
    IFace --> GG[Google Gemini completions client]
    IFace --> GGE[Google Gemini embeddings client]
    IFace --> GGV[Google Gemini TTS client]
    IFace --> AB[AWS Bedrock completions client]
    IFace --> ABE[AWS Bedrock Titan embeddings client]
    IFace --> AZ[Azure TTS client]
    IFace --> ELV[ElevenLabs TTS client]
  end

  Caller --> AF

Why this matters: shows the provider swap enabled by configuration while your app stays focused on base interfaces.


Repository Layout, Tests, Examples

  • Explore tests for runnable patterns (completions, embeddings, voice).
  • Local dev: run poetry install --with dev, then pytest -q.

Why: tests double as examples and guard against regressions.


Troubleshooting

  • Credential issues: verify env vars and scope, and ensure credential files (e.g., GOOGLE_APPLICATION_CREDENTIALS) are mounted in containerized deployments. GOOGLE_APPLICATION_CREDENTIALS must point to a local JSON file, so if your credentials live in a secrets manager, write them to a temporary file and point the variable at that path.
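The temp-file workaround can be sketched with the standard library. `write_credentials_file` is a hypothetical helper, and how the JSON string is fetched from your secrets manager is left out on purpose.

```python
import json
import os
import tempfile


def write_credentials_file(service_account_json: str) -> str:
    """Write service-account JSON to a private temp file and return its path."""
    json.loads(service_account_json)  # fail early if the secret is not valid JSON
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(prefix="gcp-sa-", suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(service_account_json)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path
```

Call this before creating any Google client, and delete the returned path at shutdown so the key does not linger on disk.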

Publishing

To publish a new version to PyPI, follow these steps:

  1. Configure Poetry Auth (One-time setup): Ensure Poetry is configured with your PyPI API token:

    poetry config pypi-token.pypi <your-pypi-token>
    
  2. Bump the Version: Update the version number to the new release version in both:

    • pyproject.toml (e.g., version = "1.0.2")
    • src/ai_api_unified/__version__.py
  3. Run the Publish Script: Execute the included bash script. It will verify that your git working tree is clean, ask for confirmation of the version, clean out old build artifacts, and publish the new version via Poetry.

    ./publish.sh
    
  4. Tag the Release: After a successful publish, it is strongly recommended to tag the release in git:

    git tag v1.0.2
    git push origin v1.0.2
    

Versioning & License

  • Semantic versioning.
  • License as specified in the repository.
