VoxCPM TTS model with Apple Neural Engine backend server

VoxCPMANE

A backend server that runs the VoxCPM TTS model on the Apple Neural Engine (ANE). The Core ML models are available in the Hugging Face repository.

  • 🎤 Voice Cloning: Support for custom voice prompts and cached voices
  • 📡 Streaming Support: Real-time audio streaming for low latency
  • 🎧 Server-side Playback: Direct audio playback on the server
  • 🌐 Web Interface: Interactive playground for testing

Voice Cloning

https://github.com/user-attachments/assets/02ffa400-b2fd-422e-a3ad-a0ea232a55aa

Included Voices: listen to samples

https://github.com/user-attachments/assets/28880ed2-2e21-4eb4-b0ce-18a100403e87

Installation

Prerequisites

  • macOS with Apple Silicon for ANE acceleration
  • Python 3.9-3.12
  • uv package manager (recommended)
  • pydub (required only for audio formats other than WAV in the /speech endpoint)
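
The prerequisites above can be checked from Python before installing. This is a minimal sketch and not part of voxcpmane: the helper name check_prerequisites is hypothetical, and it assumes Apple Silicon is reported as Darwin/arm64 by the platform module (a Python running under Rosetta reports x86_64 instead).

```python
import importlib.util
import platform
import sys


def check_prerequisites(system, machine, version, has_pydub):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper, not part of voxcpmane.
    """
    problems = []
    # Assumption: Apple Silicon shows up as Darwin + arm64.
    if system != "Darwin" or machine != "arm64":
        problems.append("ANE acceleration needs macOS on Apple Silicon")
    # The README lists Python 3.9-3.12 as supported.
    if not ((3, 9) <= tuple(version[:2]) <= (3, 12)):
        problems.append("Python 3.9-3.12 is required")
    # pydub is only needed for non-wav output from the /speech endpoint.
    if not has_pydub:
        problems.append("pydub is missing: /speech output is limited to wav")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(
        platform.system(),
        platform.machine(),
        sys.version_info,
        importlib.util.find_spec("pydub") is not None,
    )
    for issue in issues:
        print("warning:", issue)
    if not issues:
        print("all prerequisites satisfied")
```

The checks take plain values rather than reading the platform directly, so the same function can be exercised on any machine.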

Install with pip or uv

uv pip install voxcpmane
pip install voxcpmane

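After you start the server (the command is voxcpmane-server, as shown under Command Line Options), it listens on port 8000 by default and is reachable at http://localhost:8000. The web playground is served at the root URL.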
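
Model loading may take a moment, so a script that talks to the server may want to wait until the root URL answers. The following is a rough sketch using only the standard library; wait_for_server is a hypothetical helper, and treating any HTTP response as "server is up" is an assumption, not documented behavior.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:8000/", timeout=30.0, interval=0.5):
    """Poll `url` until it answers or `timeout` seconds pass.

    Hypothetical helper; returns True once any HTTP response arrives.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status.
            return True
        except OSError:
            # Connection refused or timed out: the server is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    if wait_for_server(timeout=3.0):
        print("playground is reachable at http://localhost:8000/")
    else:
        print("server did not respond in time")
```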

Configuration

Command Line Options

uv run voxcpmane-server --help
  • --host: Host to bind the server to (default: 0.0.0.0)
  • --port: Port to run the server on (default: 8000)
  • --cache-dir: Directory for custom voice caches (default: ~/.cache/ane_tts)
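
When the server is launched from another program, the options above can be assembled into an argument list. This sketch uses only the three documented flags; build_server_command is a hypothetical helper, and it assumes the voxcpmane-server executable is on PATH (prefix the list with "uv", "run" otherwise).

```python
import os
import shlex


def build_server_command(host="0.0.0.0", port=8000, cache_dir="~/.cache/ane_tts"):
    """Assemble the voxcpmane-server command line from the documented options."""
    return [
        "voxcpmane-server",
        "--host", str(host),
        "--port", str(port),
        # Expand "~" ourselves because no shell is involved.
        "--cache-dir", os.path.expanduser(cache_dir),
    ]


if __name__ == "__main__":
    # Bind to loopback only, on a non-default port.
    cmd = build_server_command(host="127.0.0.1", port=9000)
    print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would start the server. Binding to 127.0.0.1 instead of the default 0.0.0.0 keeps it off the local network.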

Custom Voices

You can create reusable cached voices in two ways:

  1. Via the Web Playground/API: Use the "Create Voice" tab or the POST /v1/voices endpoint.
  2. Startup Compilation: Place pairs of audio files (e.g., .wav, .mp3) and transcriptions (.txt) in the custom cache directory. The server will automatically compile them into voice caches (.npy) on startup.

Example: If you place myvoice.mp3 and myvoice.txt in the cache directory, the server will generate myvoice.npy on start, making "myvoice" available for generation.
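
Before restarting the server, it can help to see which files in the cache directory form complete pairs. The sketch below mirrors the pairing rule described above (same base name, audio plus .txt) without calling into voxcpmane. find_voice_pairs is a hypothetical helper, and the extension set is an assumption, since .wav and .mp3 are given only as examples.

```python
from pathlib import Path

# Assumption: the README names .wav and .mp3 only as examples; the set of
# extensions the server really accepts may differ.
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def find_voice_pairs(cache_dir):
    """Map voice name -> status for audio/transcript pairs in `cache_dir`.

    Hypothetical helper; it only inspects file names.
    """
    cache_dir = Path(cache_dir).expanduser()
    status = {}
    for audio in sorted(cache_dir.iterdir()):
        if audio.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        name = audio.stem
        if not audio.with_suffix(".txt").exists():
            status[name] = "missing transcription"
        elif audio.with_suffix(".npy").exists():
            status[name] = "cache file present"
        else:
            status[name] = "pending compilation"
    return status


if __name__ == "__main__":
    cache = Path("~/.cache/ane_tts").expanduser()
    if cache.is_dir():
        for name, state in find_voice_pairs(cache).items():
            print(f"{name}: {state}")
```

For the example above, a directory holding myvoice.mp3 and myvoice.txt reports "myvoice" as pending compilation until the server has written myvoice.npy.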

API Reference

The full API documentation is available in docs/API.md.

Changelog

Version 0.0.3

  • Added support for creating custom voices

Roadmap

  • Automatic prompt caching
  • Chunked long audio generation
  • Custom voices (added in version 0.0.3)

Acknowledgments
