Official Python SDK for PyAI — speech-to-text (Hear), text-to-speech (Speak), realtime voice agents (Omni), and call compliance (Trace).

Maintainers

atomsai

Project description

pyai-sdk (Python SDK)

Official Python SDK for PyAI — the all-in-one voice AI platform: lightning-fast speech-to-text, ultra-realistic text-to-speech, end-to-end realtime voice agents, and automatic call compliance. Zero third-party dependencies (standard library only); Python 3.9+.

PyAI products

Hear — Lightning-fast, telephony-native speech-to-text. Whisper-compatible transcription tuned for real phone-call audio, with live streaming partials so your app reacts mid-sentence, plus async batch transcription for big archives. POST /v1/audio/transcriptions
Speak — Ultra-realistic text-to-speech that starts speaking in tens of milliseconds. Stream lifelike, expressive voices, choose from 36 studio-quality presets, or clone any voice instantly — for free. POST /v1/audio/speech
Omni (flagship) — One API for a complete, end-to-end voice AI agent. A single WebSocket where your agent listens, thinks, and speaks — grounded in your knowledge bases and tools, with human-like turn-taking and instant barge-in — no STT, LLM, or TTS to stitch together yourself. wss://api.pyai.com/v1/omni
Trace (flagship) — The compliance API that keeps your AI agents safe. Trace automatically checks every call for HIPAA, TCPA, and PII risks (plus your own brand-voice rules), flags the exact rule broken, redacts sensitive data, and seals each call with a tamper-evident audit trail — so a risky conversation never slips through. GET /v1/trace/interactions
Cue — Realtime turn detection + knowledge-grounded context for your own stack. Bring your own LLM and voice; Cue nails the hard part — knowing the instant a speaker finishes and surfacing the right context. wss://api.pyai.com/v1/audio/transcriptions/stream
Telephony — Instant managed phone numbers for your voice agents. Provision a US number and route live calls straight into an Omni agent — no carrier contracts, no telephony glue. POST /v1/telephony/numbers

The API contract is the OpenAPI document at https://api.pyai.com/openapi.json. This SDK wraps it with typed errors, automatic retries, and realtime URL helpers.

Install

pip install pyai-sdk

Quickstart

import os
from pyai import PyAI, new_idempotency_key

pyai = PyAI(api_key=os.environ["PYAI_API_KEY"])

# Text-to-speech
audio = pyai.audio.speech(input="Hello from PyAI.", voice="stock_sarah_style2")
with open("hello.mp3", "wb") as f:  # response_format defaults to mp3
    f.write(audio)

# Voices
voices = pyai.voices.list(gender="female")

# Async transcription (safe retry with an idempotency key)
job = pyai.transcription_jobs.create(
    audio_url="https://example.com/call.wav",
    diarize=True,
    idempotency_key=new_idempotency_key(),
)
# Fetch the job's current state; call again later if it has not finished yet.
done = pyai.transcription_jobs.get(job["job_id"])
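
A single `get` call returns whatever state the job is in at that moment, so a long recording will usually need polling. The helper below is a minimal sketch of that loop, not part of the SDK. It assumes the job dict carries a `status` field that ends up as `"completed"` or `"failed"`; the field name and both values are assumptions, so check openapi.json for the real ones. `get_job` is any callable that takes a job id, for example `pyai.transcription_jobs.get`.

```python
import time


def wait_for_job(get_job, job_id, *, interval=2.0, timeout=600.0,
                 done_states=("completed", "failed"),
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_job(job_id) until its status is terminal or the timeout elapses.

    The "status" key and the done_states values are assumptions for
    illustration; adjust them to match the real job schema.
    """
    deadline = clock() + timeout
    while True:
        job = get_job(job_id)
        if job.get("status") in done_states:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job.get('status')!r} after {timeout}s"
            )
        sleep(interval)
```

With the quickstart objects this would be called as `wait_for_job(pyai.transcription_jobs.get, job["job_id"])`. Injecting `sleep` and `clock` keeps the loop testable without real waiting.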

Speak audio formats (incl. telephony G.711)

audio.speech encodes server-side into any of eight formats via response_format, so telephony callers no longer hand-roll a resampler + μ-law encoder — the audio comes back already in the shape you need:

import base64

# Twilio/SIP-ready in one param: raw 8 kHz mono μ-law, no client-side DSP.
ulaw = pyai.audio.speech(
    input="Your appointment is confirmed.",
    voice="stock_sarah_style2",
    response_format="g711_ulaw",   # -> audio/basic, forced 8 kHz
)
media_frame_payload = base64.b64encode(ulaw).decode()  # base64 payload for a Twilio media frame
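
If you would rather pace the audio in small frames than send one large payload, the buffer can be split first. G.711 at 8 kHz is one byte per sample, so 20 ms of audio is 160 bytes. The sketch below wraps each chunk in a Twilio Media Streams `media` message. The 20 ms frame size is a common choice rather than something this README specifies, `twilio_media_messages` is a hypothetical helper, and `stream_sid` is the value Twilio sends in its `start` event.

```python
import base64
import json

ULAW_BYTES_PER_MS = 8  # G.711: 8000 samples per second, 1 byte per sample


def twilio_media_messages(ulaw, stream_sid, frame_ms=20):
    """Yield JSON 'media' messages, each carrying frame_ms of mu-law audio."""
    frame_bytes = ULAW_BYTES_PER_MS * frame_ms
    for start in range(0, len(ulaw), frame_bytes):
        chunk = ulaw[start:start + frame_bytes]
        yield json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": base64.b64encode(chunk).decode("ascii")},
        })
```

Each yielded string is ready to send as a text frame on the Twilio WebSocket; the final chunk may be shorter than a full frame.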

| `response_format` | Sample rates (Hz) | Content-Type |
| --- | --- | --- |
| `mp3` (default) | 8000 / 16000 / 24000 / 48000 | `audio/mpeg` |
| `wav` | 8000 / 16000 / 24000 / 48000 | `audio/wav` |
| `opus` | 8000 / 16000 / 24000 / 48000 | `audio/ogg` |
| `aac` | 8000 / 16000 / 24000 / 48000 | `audio/aac` |
| `flac` | 8000 / 16000 / 24000 / 48000 | `audio/flac` |
| `pcm` (raw int16 LE, no header) | 8000 / 16000 / 24000 / 48000 | `audio/pcm` |
| `g711_ulaw` | 8000 (forced) | `audio/basic` |
| `g711_alaw` | 8000 (forced) | `audio/basic` |

The accepted set is exported as SPEECH_FORMATS / SPEECH_SAMPLE_RATES (and a SpeechFormat Literal for type-checkers). Any other value is a 400 unsupported_format. sample_rate is optional — omit it for the engine's native 24 kHz (g711_* is always 8 kHz); omit response_format for the default mp3. See examples/speak-telephony-formats for the full before/after.
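
The defaulting rules above can be restated as a small client-side check that runs before any request is made. This is a sketch only: the tables are local copies of the format list, not the SDK's `SPEECH_FORMATS` / `SPEECH_SAMPLE_RATES` exports, and `resolve_speech_output` is a hypothetical helper. Rejecting an unlisted sample rate locally is a design choice made here; the README does not say how the server responds to one.

```python
# Local copies of the documented formats and their Content-Types.
CONTENT_TYPES = {
    "mp3": "audio/mpeg",
    "wav": "audio/wav",
    "opus": "audio/ogg",
    "aac": "audio/aac",
    "flac": "audio/flac",
    "pcm": "audio/pcm",
    "g711_ulaw": "audio/basic",
    "g711_alaw": "audio/basic",
}
SAMPLE_RATES = (8000, 16000, 24000, 48000)


def resolve_speech_output(response_format=None, sample_rate=None):
    """Return (format, sample_rate, content_type) after applying the defaults."""
    fmt = response_format or "mp3"          # omitted format means mp3
    if fmt not in CONTENT_TYPES:
        raise ValueError(f"unsupported_format: {fmt!r}")
    if fmt.startswith("g711_"):
        rate = 8000                          # G.711 is always 8 kHz
    elif sample_rate is None:
        rate = 24000                         # engine-native rate
    elif sample_rate in SAMPLE_RATES:
        rate = sample_rate
    else:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    return fmt, rate, CONTENT_TYPES[fmt]
```

Knowing the effective format up front is handy for picking a file extension or setting a Content-Type when you relay the audio.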

Realtime (Omni)

Keys travel as a WebSocket subprotocol. Use the helpers with your preferred WS library (e.g. websockets):

url = pyai.realtime_url(product="omni", agent_id="agent_123")
subprotocol = pyai.realtime_subprotocol()

import asyncio, websockets

async def main():
    async with websockets.connect(url, subprotocols=[subprotocol]) as ws:
        async for frame in ws:
            print(frame)

asyncio.run(main())

Omni uses the native wss://api.pyai.com/v1/omni surface (the default for product="omni"); product="flow" uses /v1/realtime. The older /v2/omni/chat URL is deprecated but still works.
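
For logging or tests it can help to know which endpoint a given `product` resolves to. The function below is an illustrative mapping built only from the paths named above. It is not the SDK's `realtime_url` implementation, and passing `agent_id` as a query parameter is an assumption.

```python
from urllib.parse import urlencode

# Paths taken from the paragraph above; anything else is rejected.
REALTIME_PATHS = {"omni": "/v1/omni", "flow": "/v1/realtime"}


def sketch_realtime_url(product="omni", base="wss://api.pyai.com", **query):
    """Build a realtime URL for a product (hypothetical helper)."""
    try:
        path = REALTIME_PATHS[product]
    except KeyError:
        raise ValueError(f"unknown product: {product!r}") from None
    # Drop unset parameters so the URL stays clean.
    params = {key: value for key, value in query.items() if value is not None}
    return base + path + ("?" + urlencode(params) if params else "")
```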

Streaming speech-to-text (Hear / Cue)

The standard library has no production-grade WebSocket client, so the SDK gives you a URL builder (hear_stream_url) plus the subprotocol helper; pair them with websockets (or websocket-client). The wire protocol: stream binary PCM16/opus frames, send {"type":"commit"} to force-finalize, and read JSON frames of type partial / partial_stable / speech_final / final / error:

import asyncio, json, websockets

url = pyai.hear_stream_url(sample_rate=16000)

async def transcribe(pcm_chunks):
    async with websockets.connect(url, subprotocols=[pyai.realtime_subprotocol()]) as ws:
        async for pcm16 in pcm_chunks:
            await ws.send(pcm16)
        await ws.send(json.dumps({"type": "commit"}))
        async for frame in ws:
            print(json.loads(frame))

# mic_source() is a placeholder: supply any async iterator of PCM16 byte chunks.
asyncio.run(transcribe(mic_source()))
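
`mic_source()` is not defined in this README. One possible stand-in, assuming 16-bit mono PCM already held in memory, is an async generator that slices the buffer into fixed-length frames. The name `pcm_frames`, the mono assumption, and the 20 ms frame size are all illustrative.

```python
import asyncio


async def pcm_frames(pcm, sample_rate=16000, frame_ms=20, realtime=False):
    """Yield frame_ms slices of 16-bit mono PCM; optionally pace like a live mic."""
    bytes_per_frame = sample_rate * 2 * frame_ms // 1000  # 2 bytes per sample
    for start in range(0, len(pcm), bytes_per_frame):
        yield pcm[start:start + bytes_per_frame]
        if realtime:
            await asyncio.sleep(frame_ms / 1000)
```

With a buffer loaded from disk, `asyncio.run(transcribe(pcm_frames(buffer)))` would take the place of the call above. Setting `realtime=True` spaces the frames out to mimic live capture.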

For Cue (turn detection + KB context), send {"type": "config", "grounding": true} as the first text frame after connecting; final/speech_final frames then carry a grounding list of top KB passages.
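
Printing every frame is fine for a demo, but an application usually wants to branch on the frame type. The class below is a sketch of such a dispatcher. Only the `type` values and the `grounding` key come from this README; the `text` and `message` keys are assumptions. It also treats `speech_final` and `final` alike as closing a segment, so if the server emits both for the same utterance you would need to deduplicate.

```python
import json


class TranscriptAssembler:
    """Collect streaming transcript frames into finalized segments."""

    def __init__(self):
        self.segments = []   # finalized text, in arrival order
        self.partial = ""    # latest unfinalized hypothesis
        self.grounding = []  # KB passages from the most recent final frame

    def feed(self, raw):
        """Apply one JSON text frame and return its type."""
        frame = json.loads(raw)
        kind = frame.get("type")
        if kind in ("partial", "partial_stable"):
            self.partial = frame.get("text", "")      # "text" key is assumed
        elif kind in ("speech_final", "final"):
            text = frame.get("text", "")
            if text:
                self.segments.append(text)
            self.partial = ""
            self.grounding = frame.get("grounding", [])
        elif kind == "error":
            # "message" key is assumed; fall back to a generic description.
            raise RuntimeError(frame.get("message", "stream error"))
        return kind
```

Inside the read loop this would replace `print(json.loads(frame))` with `assembler.feed(frame)`.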

Sync STT, telephony output, and more APIs

# Synchronous speech-to-text
text = pyai.audio.transcriptions.create(file=open("call.wav", "rb"), language="en")["text"]

# Telephony-ready TTS: raw 8 kHz G.711 for Twilio/SIP, encoded server-side —
# no client-side resampler or μ-law encoder. Just base64 it into a media frame.
ulaw = pyai.audio.speech(input="Hi there", response_format="g711_ulaw")

# Voice clones (Speak)
clone = pyai.clones.create(name="Brand VO", file=open("ref.wav", "rb"))
pyai.clones.delete(clone["id"])

# Managed phone numbers (Telephony)
avail = pyai.telephony.numbers.available(area_code="415")["data"]
num = pyai.telephony.numbers.buy(phone_number=avail[0]["phone_number"], agent_id="agent_123")
pyai.telephony.numbers.assign(num["id"], "agent_123")
pyai.telephony.numbers.release(num["id"])

# Compliance (Trace)
fails = pyai.trace.interactions.list(verdict="FAIL")["data"]
pyai.trace.config.set(agent_id="agent_123", enabled=True)
exposure = pyai.trace.exposure(window_days=30)

# Per-call eval scorecard (timeline + quality metrics). Additive and forward-
# compatible — present once the engine emits them, so reading is always safe
# (call_timeline returns [] until then).
timeline = pyai.trace.call_timeline(fails[0]["id"])              # list[dict] of turns
quality = pyai.trace.interactions.get(fails[0]["id"]).get("quality_metrics")
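
Because the scorecard fields are additive, code that consumes them should tolerate their absence. A minimal sketch of that defensive read follows. `scorecard` is a hypothetical helper, and it assumes `quality_metrics`, when present, is a mapping.

```python
def scorecard(interaction, timeline):
    """Summarize one interaction without assuming optional fields exist."""
    metrics = interaction.get("quality_metrics") or {}   # missing or None -> {}
    return {
        "id": interaction.get("id"),
        "turns": len(timeline or []),   # call_timeline may return []
        "has_metrics": bool(metrics),
        "metrics": dict(metrics),
    }
```

For the objects above, `scorecard(pyai.trace.interactions.get(fails[0]["id"]), timeline)` would produce the summary; guard the `fails[0]` lookup when no call has failed.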

Reproducible runs (evals)

audio.speech and audio.transcriptions.create take optional seed and temperature for deterministic eval runs. They're forward-compatible — honored once the engine supports them and otherwise ignored — so it's always safe to pass:

pyai.audio.speech(input="Hello", voice="stock_sarah_style2", seed=42, temperature=0)
pyai.audio.transcriptions.create(file=open("call.wav", "rb"), seed=42)
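
Since `seed` and `temperature` may be ignored until the engine supports them, an eval harness may want to check whether repeated runs actually match. The sketch below hashes the output of several identical calls and compares the digests. `is_reproducible` is a hypothetical helper, and a byte-for-byte comparison is a strict test that could fail for reasons unrelated to the seed, such as container metadata.

```python
import hashlib


def is_reproducible(run, runs=3):
    """Call run() several times and report whether every output was identical."""
    digests = set()
    for _ in range(runs):
        out = run()
        # Audio comes back as bytes; other results are compared via repr().
        data = out if isinstance(out, bytes) else repr(out).encode("utf-8")
        digests.add(hashlib.sha256(data).hexdigest())
    return len(digests) == 1
```

A check against the API would look like `is_reproducible(lambda: pyai.audio.speech(input="Hello", voice="stock_sarah_style2", seed=42, temperature=0))`.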

CLI (`pyai`)

The package installs a pyai command (also python -m pyai). pyai doctor introspects your key/scopes via GET /v1/me (skipped gracefully if the route isn't deployed yet), checks endpoint liveness, runs a Speak→Hear round-trip, and prints remediation hints:

export PYAI_API_KEY=pyai_test_...
pyai doctor
# PASS  key (/v1/me)  — env=test; 3 scope(s): hear:transcribe, voice:synthesize, hear:stream
# PASS  speak→hear round-trip  — synth 45210 bytes → "the quick brown fox…"
# Diagnosis: healthy. Key, endpoint, and a Speak→Hear round-trip all work.

pyai smoke   # lighter: models + voices + speak

Errors

Failures raise PyAIError with a stable code (branch on it, not the message):

from pyai import PyAIError

try:
    pyai.audio.speech(input="hi")
except PyAIError as err:
    if err.code == "credit_exhausted":
        ...  # out of prepaid credit — add credit or use a sandbox key

Common codes: unauthorized, forbidden, credit_exhausted, rate_limit_exceeded, concurrency_limit_exceeded, idempotency_conflict. 429/5xx are retried automatically (honoring Retry-After); tune with PyAI(api_key=..., max_retries=...).
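
To make the retry behavior concrete, here is a generic sketch of a policy that retries 429/5xx and honors Retry-After. The SDK's actual backoff schedule is not documented in this README, so the exponential-with-jitter fallback, the `base` and `cap` values, and both function names are assumptions. Only the seconds form of Retry-After is handled, not the HTTP-date form.

```python
import random


def should_retry(status):
    """Retry on rate limiting and server errors only."""
    return status == 429 or 500 <= status < 600


def retry_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    A server-provided Retry-After (in seconds) wins; otherwise use capped
    exponential backoff scaled by a random factor in [0, 1).
    """
    if retry_after is not None:
        return float(retry_after)
    return min(cap, base * 2 ** attempt) * rng()
```

The jitter spreads out clients that were rate limited at the same moment, so they do not all retry together.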

Develop

python -m unittest discover -s tests -v   # no network; transport injected

Release history

- 0.2.0 (this version), Jun 17, 2026
- 0.1.2, Jun 16, 2026
- 0.1.1, Jun 16, 2026
- 0.1.0, Jun 16, 2026

Download files

Source Distribution

pyai_sdk-0.2.0.tar.gz (19.7 kB)

Uploaded Jun 17, 2026 Source

Built Distribution

pyai_sdk-0.2.0-py3-none-any.whl (16.9 kB)

Uploaded Jun 17, 2026 Python 3

File details

Details for the file pyai_sdk-0.2.0.tar.gz.

File metadata

Download URL: pyai_sdk-0.2.0.tar.gz
Upload date: Jun 17, 2026
Size: 19.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pyai_sdk-0.2.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `1659270c91b900cd2fc76e8941b4f17a6c86bcfc4de2924a795a134debe77869` |
| MD5 | `bda78adb84518cded4e61f494770dc4d` |
| BLAKE2b-256 | `f23d0570017b02fba8f0b95f2209fd004c4380f3f5d254786737dca813e589dc` |
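
A downloaded file can be checked against the published SHA256 digest with the standard library alone. The helpers below are a small sketch; the function names are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True when the file's SHA256 matches the published digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For the source distribution this would be `verify_download("pyai_sdk-0.2.0.tar.gz", "1659270c91b900cd2fc76e8941b4f17a6c86bcfc4de2924a795a134debe77869")`.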

Provenance

The following attestation bundles were made for pyai_sdk-0.2.0.tar.gz:

Publisher: publish-sdk-pypi.yml on atomsai/pyai-platform-backend

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pyai_sdk-0.2.0.tar.gz
- Subject digest: 1659270c91b900cd2fc76e8941b4f17a6c86bcfc4de2924a795a134debe77869
- Sigstore transparency entry: 1851133880
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: atomsai/pyai-platform-backend@165fac2b5bca55aa97ed6c82ec9ec7b73c0ffffd
- Branch / Tag: refs/tags/sdk-py-v0.2.0
- Owner: https://github.com/atomsai
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-sdk-pypi.yml@165fac2b5bca55aa97ed6c82ec9ec7b73c0ffffd
- Trigger Event: push

File details

Details for the file pyai_sdk-0.2.0-py3-none-any.whl.

File metadata

Download URL: pyai_sdk-0.2.0-py3-none-any.whl
Upload date: Jun 17, 2026
Size: 16.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pyai_sdk-0.2.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `c772b3680fa995140c93933fe93aeb139a978398efd171d47aedd4ff50bf3e3c` |
| MD5 | `542f74fa0baa7d01b71f577cf156fa0f` |
| BLAKE2b-256 | `3f40582fc33edaef0d49c1732a92956da84cb99a8d97ef9ed418fe94c9566ba5` |

Provenance

The following attestation bundles were made for pyai_sdk-0.2.0-py3-none-any.whl:

Publisher: publish-sdk-pypi.yml on atomsai/pyai-platform-backend

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pyai_sdk-0.2.0-py3-none-any.whl
- Subject digest: c772b3680fa995140c93933fe93aeb139a978398efd171d47aedd4ff50bf3e3c
- Sigstore transparency entry: 1851133977
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: atomsai/pyai-platform-backend@165fac2b5bca55aa97ed6c82ec9ec7b73c0ffffd
- Branch / Tag: refs/tags/sdk-py-v0.2.0
- Owner: https://github.com/atomsai
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-sdk-pypi.yml@165fac2b5bca55aa97ed6c82ec9ec7b73c0ffffd
- Trigger Event: push
