Ingest text (txt/md/pdf/docx/images), translate, and synthesize audio.

plycast

Convert books into multilingual audiobooks with context-aware LLM translation and text-to-speech. Turn PDF, DOCX, images, or text → translate → generate audio (MP3) in one pipeline.

🚀 At a glance

  • 📚 Convert documents into translated audiobooks
  • 🌍 Multilingual support with LLM-based translation
  • 🧠 More natural, context-aware wording (better for listening)
  • 🔌 Open-source, local-first, BYOK-friendly

About

What it’s for

  • One workflow from a file on disk to translated text and audio—no need to chain many tools by hand.
  • Fits drafts, articles, chapters, or scans you’d rather listen to in another language.

How you use it

  • Terminal: run the plycast command for quick jobs.

    plycast tts examples/input/nhat-ky-nu-phap-y-c1.txt \
      --lang vi --tts system_say --voice Linh

  • Python: import plycast and call it from your own scripts.

    from pathlib import Path
    from plycast import synthesize_book

    # Synthesize a Vietnamese text file with the macOS "say" backend.
    r = synthesize_book(
        Path("examples/input/nhat-ky-nu-phap-y-c1.txt"),
        tts_language="vi",
        tts="system_say",
        voice="Linh",
        audio_format="mp3",
    )
    print(r.audio_path)  # path of the generated audio file
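
For several files at once, the terminal call above can be scripted. The sketch below uses only the tts subcommand and the --lang, --tts, and --voice flags shown above; build_tts_command and run_batch are hypothetical helper names, not part of plycast, and the second input file is made up for illustration.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List, Optional


def build_tts_command(source: Path, lang: str, tts: str,
                      voice: Optional[str] = None) -> List[str]:
    """Build the argv list for `plycast tts` (hypothetical helper)."""
    cmd = ["plycast", "tts", str(source), "--lang", lang, "--tts", tts]
    if voice:
        cmd += ["--voice", voice]
    return cmd


def run_batch(sources: List[Path], lang: str, tts: str,
              voice: Optional[str] = None,
              dry_run: bool = True) -> List[List[str]]:
    """Print each command (dry run) or execute it with subprocess."""
    commands = [build_tts_command(s, lang, tts, voice) for s in sources]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True raises CalledProcessError if plycast exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # chapter-2.txt is a made-up file name used only for this example.
    files = [Path("examples/input/nhat-ky-nu-phap-y-c1.txt"), Path("chapter-2.txt")]
    run_batch(files, lang="vi", tts="system_say", voice="Linh", dry_run=True)
```

The dry run is the default so the commands can be inspected before anything is synthesized.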

Inputs

  • Text and Markdown out of the box.
  • PDF, Word, and images (e.g. photos of text) with a little extra setup.
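
As a rough guide to what "a little extra setup" means per file type, the sketch below maps a file suffix to the setup step named in the Prerequisites and Install sections. The helper name extra_setup_for and the list of image suffixes are assumptions for illustration; plycast's own list of accepted image formats may differ.

```python
from pathlib import Path

TEXT_SUFFIXES = {".txt", ".md"}
DOC_SUFFIXES = {".pdf", ".docx"}
# Assumed image suffixes; the README only says "images".
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def extra_setup_for(path: str) -> str:
    """Return the extra setup an input file needs (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix in TEXT_SUFFIXES:
        return "none"
    if suffix in DOC_SUFFIXES:
        return "pip install 'plycast[docs]'"
    if suffix in IMAGE_SUFFIXES:
        return "tesseract binary on PATH"
    return "unknown input type"


if __name__ == "__main__":
    for name in ["notes.md", "book.pdf", "scan.png"]:
        print(name, "->", extra_setup_for(name))
```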

Translation (pick what matches you)

  • LibreTranslate (hosted or self-hosted): straightforward; good if you want to edit the text before audio.
  • Large language models: geared toward natural, listenable wording when you’re fine using a vendor API and key.

Speech

  • Uses voices already on your system where possible (e.g. the macOS say command, or espeak-ng on Linux).
  • Optional neural voices if you install that stack.
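
The backend order described in this README (system_say on macOS, Parler on other systems when installed, espeak as the fallback) can be sketched as a small function. This illustrates the documented defaults and is not plycast's actual selection code. The function name pick_tts_backend is hypothetical, and the final fall-through to text_file is an assumption.

```python
import sys


def pick_tts_backend(platform: str, parler_installed: bool,
                     espeak_available: bool) -> str:
    """Pick a --tts value following the defaults described in this README."""
    if platform == "darwin":
        return "system_say"  # built-in macOS say command
    if parler_installed:
        return "parler"      # default on non-macOS when installed
    if espeak_available:
        return "espeak"      # fallback when Parler is not installed
    return "text_file"       # assumption: last resort with no audio engine


if __name__ == "__main__":
    print(pick_tts_backend(sys.platform, parler_installed=False,
                           espeak_available=True))
```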

Installation

Published package: plycast on PyPI.

Prerequisites

Before pip install, install anything below that matches what you will use. After installing system tools, open a new terminal (or restart the IDE) so PATH updates.

  • Python 3.10+ (3.11+ recommended)

    • Where: python.org/downloads, your OS package manager, or a version manager (pyenv, conda, etc.).
    • Check: python3 --version (or python --version on Windows).
  • Virtual environment (recommended)

    • How: python3 -m venv .venv then source .venv/bin/activate (macOS/Linux) or .venv\Scripts\activate (Windows).
    • Why: keeps pip install plycast isolated from system Python.
  • Tesseract (only if you use image inputs for OCR)

    • What: the tesseract binary on your PATH (plycast installs pytesseract via pip, not the engine).
    • macOS (Homebrew): brew install tesseract tesseract-lang
    • Debian / Ubuntu: sudo apt install tesseract-ocr (add language packs, e.g. tesseract-ocr-chi-sim, as needed)
    • Windows: installer from the Tesseract at UB Mannheim wiki, or Chocolatey / winget packages named tesseract.
    • Check: tesseract --version
  • ffmpeg (only if you want mp3 / m4a / aiff after WAV-based TTS)

    • macOS (Homebrew): brew install ffmpeg
    • Debian / Ubuntu: sudo apt install ffmpeg
    • Windows: winget install ffmpeg or ffmpeg.org builds; ensure ffmpeg is on PATH.
    • Check: ffmpeg -version
  • Speech backends (pick what matches --tts):

    • system_say: macOS only — uses the built-in say command; nothing to install.
    • espeak: install espeak-ng (or legacy espeak) on your PATH.
      • Debian / Ubuntu: sudo apt install espeak-ng
      • macOS (Homebrew): brew install espeak
      • Check: espeak-ng --version or espeak --version
    • text_file: no audio engine.
  • Translation / APIs

More env vars, CLI flags, and troubleshooting → docs/QuickStart.md.
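
The "Check" commands above can also be run as a quick preflight from Python. The sketch below uses only the standard library (shutil.which) to look for the binaries named in this section. The function names and parameters are hypothetical and not part of plycast.

```python
import shutil
from typing import List, Optional


def find_tool(*names: str) -> Optional[str]:
    """Return the path of the first binary found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


def preflight(use_ocr: bool = False, want_compressed_audio: bool = False,
              tts: str = "text_file") -> List[str]:
    """List the system tools that are missing for the chosen options."""
    missing = []
    if use_ocr and not find_tool("tesseract"):
        missing.append("tesseract")
    if want_compressed_audio and not find_tool("ffmpeg"):
        missing.append("ffmpeg")
    if tts == "system_say" and not find_tool("say"):
        missing.append("say (macOS only)")
    if tts == "espeak" and not find_tool("espeak-ng", "espeak"):
        missing.append("espeak-ng or espeak")
    return missing


if __name__ == "__main__":
    problems = preflight(use_ocr=True, want_compressed_audio=True, tts="espeak")
    print("Missing tools:", ", ".join(problems) if problems else "none")
```

If a tool was just installed and still shows as missing, open a new terminal first so PATH is refreshed.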

Install with pip

  1. Activate your virtual environment (see above).

  2. Install the library and CLI:

    pip install plycast
    

    This pulls in core Python packages (Typer, Rich, Pillow, pytesseract, …). System tools (Tesseract, ffmpeg, etc.) stay separate—see Prerequisites.

  3. Optional extras (pick what you need):

    | Extra | Command | Purpose |
    |---|---|---|
    | PDF / Word | pip install 'plycast[docs]' | pypdf, python-docx for .pdf / .docx |
    | Parler TTS | pip install 'plycast[parler]' | Neural TTS (Parler-TTS, PyTorch, transformers, soundfile, …; large download). English-centric; see Voices.md. |
    | Docs + PDF/Word | pip install 'plycast[full]' | Same as [docs] today |
    | Development | pip install 'plycast[dev]' | pytest (for contributors) |

    Parler troubleshooting: if soundfile fails to install, install OS libsndfile first (macOS: brew install libsndfile; Debian/Ubuntu: sudo apt install libsndfile1), then re-run pip install 'plycast[parler]'. mp3 / m4a output still needs ffmpeg (see Prerequisites).

  4. From a git clone (editable install):

    pip install -e ".[dev]"
    

    Add [docs] or [parler] as needed, e.g. pip install -e ".[dev,docs]".
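
To see which optional extras are importable in the active environment, a check along these lines may help. pypdf, docx (the import name of python-docx), and soundfile are the real import names of those packages; parler_tts as the import name for Parler-TTS is an assumption, and the helper names are hypothetical.

```python
from importlib.util import find_spec
from typing import Dict, List

# Import names per extra. "parler_tts" is assumed, not confirmed by this README.
EXTRA_MODULES: Dict[str, List[str]] = {
    "docs": ["pypdf", "docx"],
    "parler": ["soundfile", "parler_tts"],
}


def is_importable(module_name: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def missing_modules(extra: str) -> List[str]:
    """Return the modules of an extra that are not installed."""
    return [m for m in EXTRA_MODULES[extra] if not is_importable(m)]


if __name__ == "__main__":
    for extra in EXTRA_MODULES:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"plycast[{extra}]: {status}")
```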

Features

  • Input: .txt, .md; optional .pdf / .docx (pip install 'plycast[docs]'); images via OCR, with --source-lang selecting the Tesseract language (e.g. zh).
  • Translation: text is translated in chunks. The LLM path aims for natural tone; LibreTranslate gives a free, self-hostable draft you can review and edit. Available translators: Identity, OpenAI, Anthropic, and unified LLM routing (LLMTranslator).
  • Audio (see docs/Voices.md):

    • SystemSayTTS / --tts system_say: macOS say.
    • ParlerTTS / --tts parler: neural TTS, English-only for reliable quality; optional (pip install 'plycast[parler]'); default on non-macOS when installed; seed voices plus --parler-gender, or a raw --voice.
    • EspeakTTS / --tts espeak: espeak-ng; the fallback when Parler is not installed.
    • TextFileTTS / --tts text_file: works on any platform.
  • CLI: flags for translator, languages, API keys, TTS, chunk size — docs/CLI.md (reference) and docs/QuickStart.md (tutorials).
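
The README does not describe how plycast chunks text before translation, so the following is only a minimal sketch of one common approach: group paragraphs into chunks under a character limit and hard-split any single paragraph that exceeds it. The function name chunk_text and the max_chars parameter are hypothetical and do not mirror plycast's chunk-size flag.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 2000) -> List[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars."""
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # A paragraph longer than the limit is cut into fixed-size pieces.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for i, chunk in enumerate(chunk_text(sample, max_chars=40), start=1):
        print(i, repr(chunk))
```

Splitting on paragraph boundaries keeps sentences together, which tends to matter for context-aware translation. A real implementation might prefer sentence boundaries over the hard split used here.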

Documentation

| Doc | Contents |
|---|---|
| docs/QuickStart.md | Install, prerequisites, CLI intro, LibreTranslate Docker, LLM examples, Python API, env vars, troubleshooting |
| docs/CLI.md | CLI reference: all commands and options (convert, translate, tts, inspect), outputs, translators, TTS backends, --json |
| docs/Voices.md | system_say (macOS) and espeak (Linux-friendly) voices / --voice |
| examples/README.md | Vietnamese sample: vi → en + Parler (laura) and vi + system_say (Linh); CLI + Python |
| This README | Project introduction and layout |

Contributing

A contribution workflow (guidelines, review process, and what we merge) is still being worked out, so we are not ready for pull requests or formal code contributions yet.

Issues are welcome: please open an issue on GitHub for bugs, ideas, or questions.

License

Open-source. This project is released under the MIT License.
