A modular Python library for voice interactions with AI systems

AbstractVoice

A modular Python library that abstracts TTS, STT, and voice cloning across multiple engines. It is designed for offline-first AI applications.

  • TTS (default): Piper (cross-platform, no system deps)
  • STT (default): faster-whisper
  • Local assistant: listen() + speak() with playback/listening control
  • Headless/server: speak_to_bytes() / speak_to_file() and transcribe_*
  • Voice cloning (optional): OpenF5, Chroma, AudioDiT, OmniVoice (engine-bound cloned voices)

Status: alpha (0.7.0). The supported integrator surface is documented in docs/api.md.

Next: docs/getting-started.md (recommended setup + first smoke tests).

Standalone vs AbstractCore / AbstractFramework

AbstractVoice can be used standalone (library + REPL), and it is also designed to be used as a capability plugin backend for AbstractCore (and therefore the wider AbstractFramework ecosystem).

Key links:

  • AbstractCore (agents/capabilities): https://abstractcore.ai and https://github.com/lpalbou/abstractcore
  • AbstractFramework (umbrella): https://github.com/lpalbou/abstractframework

Integration points (code evidence):

  • AbstractCore capability plugin entry point: pyproject.toml[project.entry-points."abstractcore.capabilities_plugins"]
    Implementation: abstractvoice/integrations/abstractcore_plugin.py
  • AbstractRuntime ArtifactStore adapter (optional, duck-typed): abstractvoice/artifacts.py

Important: AbstractVoice is a voice I/O library (TTS/STT + optional cloning). It is not an agent framework and it does not implement an LLM server. In the AbstractFramework stack, AbstractCore is the intended place to run agents and expose OpenAI-compatible endpoints; AbstractVoice is discovered as a plugin and provides the voice implementation.

flowchart LR
  App["Your app / REPL"] --> VM["abstractvoice.VoiceManager"]
  VM --> TTS["Piper TTS"]
  VM --> STT["faster-whisper STT"]
  VM --> IO["sounddevice / PortAudio"]

  subgraph AbstractFramework
    AC["AbstractCore"] -. "capability plugin" .-> VM
    AR["AbstractRuntime"] -. "optional ArtifactStore" .-> VM
  end

The shipped AbstractCore integration is via the capability plugin above. The abstractvoice REPL is a demonstrator/smoke-test harness (see docs/repl_guide.md) and includes a minimal OpenAI-compatible LLM HTTP client (abstractvoice/examples/llm_provider.py) for convenience.

Use with AbstractCore

Install AbstractVoice into the same environment as AbstractCore:

pip install abstractcore abstractvoice

AbstractCore will discover AbstractVoice via the abstractcore.capabilities_plugins entry point and use it as a voice backend. For the current AbstractCore surface (e.g. llm.voice.tts(...) / llm.audio.transcribe(...)), refer to the AbstractCore docs: https://abstractcore.ai and https://github.com/lpalbou/abstractcore.

Use with AbstractFramework

If you’re using the full AbstractFramework stack, install and run via the umbrella project and gateway tooling. Start here: https://github.com/lpalbou/abstractframework.


Install

Requires Python >=3.10 (see pyproject.toml).

pip install abstractvoice

Optional extras (feature flags):

pip install "abstractvoice[all]"

Notes:

  • abstractvoice[all] enables most optional features (incl. cloning + AEC + audio-fx), but does not include the GPU-heavy Chroma runtime, AudioDiT, or OmniVoice.
  • For the full list of extras (and platform troubleshooting), see docs/installation.md.

Explicit model downloads (recommended; never implicit in the REPL)

Some features rely on large model weights/artifacts. AbstractVoice will not download these implicitly inside the REPL (offline-first).

After installing, prefetch explicitly (cross-platform).

Recommended (most users):

abstractvoice-prefetch --piper en
abstractvoice-prefetch --stt small

Optional (voice cloning artifacts):

pip install "abstractvoice[cloning]"
abstractvoice-prefetch --openf5

# Heavy (torch/transformers):
pip install "abstractvoice[audiodit]"
abstractvoice-prefetch --audiodit

pip install "abstractvoice[omnivoice]"
abstractvoice-prefetch --omnivoice

# GPU-heavy:
pip install "abstractvoice[chroma]"
abstractvoice-prefetch --chroma

Equivalent python -m form:

python -m abstractvoice download --piper en
python -m abstractvoice download --stt small
python -m abstractvoice download --openf5   # optional; requires abstractvoice[cloning]
python -m abstractvoice download --chroma   # optional; requires abstractvoice[chroma] (GPU-heavy)
python -m abstractvoice download --audiodit # optional; requires abstractvoice[audiodit]
python -m abstractvoice download --omnivoice # optional; requires abstractvoice[omnivoice]
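If you want to script the prefetch step (for example in a provisioning script), the python -m form can be wrapped as below. This is a sketch, not part of AbstractVoice: the helper names download_command and prefetch are hypothetical, and only the flags shown above are used. It defaults to a dry run so that nothing is downloaded by accident.

```python
import subprocess
import sys
from typing import Optional


def download_command(flag: str, value: Optional[str] = None) -> list[str]:
    """Build the `python -m abstractvoice download` command for one artifact."""
    # sys.executable keeps the download in the same environment as this script.
    cmd = [sys.executable, "-m", "abstractvoice", "download", f"--{flag}"]
    if value is not None:
        cmd.append(value)
    return cmd


def prefetch(targets: list[tuple[str, Optional[str]]], dry_run: bool = True) -> list[list[str]]:
    """Print each command; when dry_run is False, run them in order."""
    commands = [download_command(flag, value) for flag, value in targets]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on the first failing download.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: prints the two recommended downloads without executing them.
prefetch([("piper", "en"), ("stt", "small")])
```

Pass dry_run=False to actually run the downloads.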

Notes:

  • --piper <lang> downloads the Piper ONNX voice for that language into ~/.piper/models.
  • --openf5 downloads about 5.4 GB. --chroma is very large, and its runtime is GPU-heavy.
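Because nothing is downloaded implicitly in the REPL, it can help to check that a Piper voice is already on disk before starting. The sketch below looks in the ~/.piper/models directory mentioned above. The .onnx file suffix and the helper name find_piper_models are assumptions, and the exact layout inside that directory is not documented here.

```python
from pathlib import Path

# Directory stated in this README; the ".onnx" suffix is an assumption.
DEFAULT_PIPER_DIR = Path.home() / ".piper" / "models"


def find_piper_models(models_dir: Path = DEFAULT_PIPER_DIR) -> list[str]:
    """Return the sorted file names of ONNX files found under `models_dir`."""
    if not models_dir.is_dir():
        return []
    # rglob also finds voices stored in subdirectories.
    return sorted(p.name for p in models_dir.rglob("*.onnx") if p.is_file())


models = find_piper_models()
if models:
    print("Piper voices found:", ", ".join(models))
else:
    print("No Piper voices found; run: abstractvoice-prefetch --piper en")
```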

Quick smoke tests

REPL (fastest end-to-end)

abstractvoice --verbose
# or (from a source checkout):
python -m abstractvoice cli --verbose

Notes:

  • Mic voice input is off by default for fast startup. Enable with --voice-mode stop (or in-session: /voice stop).
  • The REPL is offline-first: no implicit model downloads. Use the explicit download commands above.
  • The REPL is primarily a demonstrator. For production agent/server use in the AbstractFramework ecosystem, run AbstractCore and use AbstractVoice via its capability plugin (see docs/api.md → “Integrations”).

See docs/repl_guide.md.

Minimal Python

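from abstractvoice import VoiceManager

# Create a manager with the default engines (Piper TTS, faster-whisper STT).
vm = VoiceManager()
# Speak the text aloud.
vm.speak("Hello! This is AbstractVoice.")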

Public API (stable surface)

See docs/api.md for the supported integrator contract.

At a glance:

  • TTS: speak(), stop_speaking(), pause_speaking(), resume_speaking(), speak_to_bytes(), speak_to_file()
  • STT: transcribe_file(), transcribe_from_bytes()
  • Mic: listen(), stop_listening(), pause_listening(), resume_listening()
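For headless use, the byte-oriented methods can be combined into a simple TTS-to-STT round trip. The sketch below is written against any object that exposes speak_to_bytes() and transcribe_from_bytes(). The argument and return shapes are assumptions (text in, audio bytes out, transcript back), so check docs/api.md for the real signatures. FakeVoice is a hypothetical stand-in that lets the example run without models or audio hardware.

```python
from typing import Any


def tts_stt_round_trip(vm: Any, text: str) -> tuple[Any, Any]:
    """Synthesize `text` to audio in memory, then transcribe that audio.

    ASSUMPTION: vm.speak_to_bytes(text) returns audio that
    vm.transcribe_from_bytes() accepts as its only argument.
    """
    audio = vm.speak_to_bytes(text)
    transcript = vm.transcribe_from_bytes(audio)
    return audio, transcript


class FakeVoice:
    """Stand-in used only to exercise the helper; not part of AbstractVoice."""

    def speak_to_bytes(self, text: str) -> bytes:
        return text.encode("utf-8")

    def transcribe_from_bytes(self, audio: bytes) -> str:
        return audio.decode("utf-8")


audio, transcript = tts_stt_round_trip(FakeVoice(), "hello")
print(len(audio), transcript)
```

With the real library you would pass a VoiceManager instance instead of FakeVoice, assuming the Piper and STT models have been prefetched and the signatures match the assumption above.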

Documentation (minimal set)

  • Docs index: docs/README.md
  • Getting started: docs/getting-started.md
  • FAQ: docs/faq.md
  • Orientation: docs/overview.md
  • Acronyms: docs/acronyms.md
  • Public API: docs/api.md
  • REPL guide: docs/repl_guide.md
  • Install troubleshooting: docs/installation.md
  • Multilingual support: docs/multilingual.md
  • Architecture (internal): docs/architecture.md + docs/adr/
  • Model management (Piper-first): docs/model-management.md
  • Licensing notes: docs/voices-and-licenses.md

Project

  • Changelog: CHANGELOG.md
  • Contributing: CONTRIBUTING.md
  • Security: SECURITY.md
  • Acknowledgments: ACKNOWLEDGMENTS.md

License

MIT. See LICENSE.
