Ingest text (txt/md/pdf/docx/images), translate, and synthesize audio.
plycast
At a glance
- Turn stories, notes, and documents into spoken audio in another language—one path from file → translation → sound.
- Open source under the MIT License: use, change, and share freely.
About
What it’s for
- One workflow from a file on disk to translated text and audio—no need to chain many tools by hand.
- Fits drafts, articles, chapters, or scans you’d rather listen to in another language.
How you use it
- Terminal: run the `plycast` command for quick jobs.
plycast tts examples/input/nhat-ky-nu-phap-y-c1.txt \
--lang vi --tts system_say --voice Linh
- Python: after `import plycast`, use it in Python scripts.
from pathlib import Path
from plycast import synthesize_book
r = synthesize_book(
Path("examples/input/nhat-ky-nu-phap-y-c1.txt"),
tts_language="vi",
tts="system_say",
voice="Linh",
audio_format="mp3",
)
print(r.audio_path)
Inputs
- Text and Markdown out of the box.
- PDF, Word, and images (e.g. photos of text) with a little extra setup.
Translation (pick what matches you)
- LibreTranslate (hosted or self-hosted): straightforward; good if you want to edit the text before audio.
- Large language models: geared toward natural, listenable wording when you’re fine using a vendor API and key.
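For readers curious what the LibreTranslate option involves on the wire, here is a minimal sketch of a request to LibreTranslate's `/translate` endpoint using only the standard library. It is not plycast's own client code: the function names `build_translate_request` and `translate` are hypothetical, and the base URL (for example `http://localhost:5000` for a local Docker instance) is an assumption you should replace with your own server.

```python
import json
import urllib.request


def build_translate_request(base_url, text, source, target, api_key=None):
    """Build (but do not send) a POST request for LibreTranslate's /translate endpoint.

    Hypothetical helper for illustration; plycast's internal client may differ.
    """
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:
        # Only hosts that require a key need this field.
        payload["api_key"] = api_key
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(base_url, text, source, target, api_key=None, timeout=30):
    """Send the request and return the translated string from the JSON reply."""
    request = build_translate_request(base_url, text, source, target, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["translatedText"]
```

Calling `translate("http://localhost:5000", "Xin chào", "vi", "en")` would need a running server; building the request separately makes it possible to inspect the payload before anything is sent.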
Speech
- Uses voices already on your system where possible (e.g. the macOS `say` command, espeak-style voices on Linux).
- Optional neural voices if you install that stack.
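The fallback order this README describes (macOS `say` first, Parler on other systems when installed, espeak otherwise, and a text file when no engine is available) can be sketched as a small decision function. This is an illustration of the idea, not plycast's actual selection logic: `pick_tts_backend` is a hypothetical name, and the import name `parler_tts` for the Parler-TTS package is an assumption.

```python
import importlib.util
import platform
import shutil


def pick_tts_backend(system=None, parler_installed=None, espeak_on_path=None):
    """Return a --tts value following the preference order described in this README.

    Hypothetical helper: arguments left as None are detected from the current machine.
    """
    if system is None:
        system = platform.system()  # "Darwin" on macOS
    if parler_installed is None:
        # Assumption: the Parler-TTS package is importable as `parler_tts`.
        parler_installed = importlib.util.find_spec("parler_tts") is not None
    if espeak_on_path is None:
        espeak_on_path = bool(shutil.which("espeak-ng") or shutil.which("espeak"))

    if system == "Darwin":
        return "system_say"
    if parler_installed:
        return "parler"
    if espeak_on_path:
        return "espeak"
    # No audio engine available: fall back to writing text only.
    return "text_file"
```

Passing the three values explicitly, for example `pick_tts_backend("Linux", False, True)`, makes the choice easy to reason about without touching the host system.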
Installation
Published package: plycast on PyPI.
Prerequisites
Before pip install, install anything below that matches what you will use. After installing system tools, open a new terminal (or restart the IDE) so PATH updates.
- Python 3.10+ (3.11+ recommended)
  - Where: python.org/downloads, your OS package manager, or a version manager (pyenv, conda, etc.).
  - Check: `python3 --version` (or `python --version` on Windows).
- Virtual environment (recommended)
  - How: `python3 -m venv .venv` then `source .venv/bin/activate` (macOS/Linux) or `.venv\Scripts\activate` (Windows).
  - Why: keeps `pip install plycast` isolated from system Python.
- Tesseract (only if you use image inputs for OCR)
  - What: the `tesseract` binary on your `PATH` (plycast installs pytesseract via pip, not the engine).
  - macOS (Homebrew): `brew install tesseract tesseract-lang`
  - Debian / Ubuntu: `sudo apt install tesseract-ocr` (add language packs, e.g. `tesseract-ocr-chi-sim`, as needed)
  - Windows: installer from the Tesseract at UB Mannheim wiki, or Chocolatey / winget packages named `tesseract`.
  - Check: `tesseract --version`
- ffmpeg (only if you want mp3 / m4a / aiff after WAV-based TTS)
  - macOS (Homebrew): `brew install ffmpeg`
  - Debian / Ubuntu: `sudo apt install ffmpeg`
  - Windows: `winget install ffmpeg` or ffmpeg.org builds; ensure `ffmpeg` is on `PATH`.
  - Check: `ffmpeg -version`
- Speech backends (pick what matches `--tts`):
  - `system_say`: macOS only; uses the built-in `say` command; nothing to install.
  - `espeak`: install `espeak-ng` (or legacy `espeak`) on your `PATH`.
    - Debian / Ubuntu: `sudo apt install espeak-ng`
    - macOS (Homebrew): `brew install espeak`
    - Check: `espeak-ng --version` or `espeak --version`
  - `text_file`: no audio engine.
- Translation / APIs
  - LibreTranslate: run your own server (e.g. Docker; see QuickStart → Self-hosted LibreTranslate) or use a host you trust; optional API key.
  - LLM (OpenAI / Anthropic): create a key in the vendor console (OpenAI, Anthropic); export `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` (details in QuickStart).
More env vars, CLI flags, and troubleshooting → docs/QuickStart.md.
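The manual checks above can also be run in one pass from Python. The following is a minimal sketch, not part of plycast: `check_prerequisites`, `TOOLS`, and `API_KEY_VARS` are hypothetical names, and the script only reports whether each binary is on `PATH` and whether each API key variable is set, without validating versions or keys.

```python
import os
import shutil

# Binaries named in the prerequisites list; any one candidate per entry is enough.
TOOLS = {
    "tesseract": ("tesseract",),
    "ffmpeg": ("ffmpeg",),
    "espeak": ("espeak-ng", "espeak"),
    "say": ("say",),
}
API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def check_prerequisites(which=shutil.which, environ=None):
    """Return {name: bool} for optional system tools and API key variables.

    `which` and `environ` are injectable so the check can be exercised
    without depending on the real machine.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for name, binaries in TOOLS.items():
        report[name] = any(which(binary) for binary in binaries)
    for var in API_KEY_VARS:
        report[var] = bool(environ.get(var))
    return report
```

Printing `check_prerequisites()` in a fresh terminal is a quick way to confirm that `PATH` changes have taken effect. A `False` entry only matters for the features you plan to use.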
Install with pip
- Activate your virtual environment (see above).
- Install the library and CLI: `pip install plycast`
  This pulls in core Python packages (Typer, Rich, Pillow, pytesseract, …). System tools (Tesseract, ffmpeg, etc.) stay separate; see Prerequisites.
- Optional extras (pick what you need):

| Extra | Command | Purpose |
|---|---|---|
| PDF / Word | `pip install 'plycast[docs]'` | `pypdf`, `python-docx` for `.pdf` / `.docx` |
| Parler TTS | `pip install 'plycast[parler]'` | Neural TTS (Parler-TTS, PyTorch, transformers, soundfile, …; large download). English-centric; see Voices.md. |
| Docs + PDF/Word | `pip install 'plycast[full]'` | Same as `[docs]` today |
| Development | `pip install 'plycast[dev]'` | pytest (for contributors) |

  Parler troubleshooting: if `soundfile` fails to install, install OS libsndfile first (macOS: `brew install libsndfile`; Debian/Ubuntu: `sudo apt install libsndfile1`), then re-run `pip install 'plycast[parler]'`. mp3 / m4a output still needs ffmpeg (see Prerequisites).
- From a git clone (editable install): `pip install -e ".[dev]"`. Add `[docs]` or `[parler]` as needed, e.g. `pip install -e ".[dev,docs]"`.
Features
- Input: `.txt`, `.md`; optional `.pdf` / `.docx` (`pip install ".[docs]"`); images with OCR and `--source-lang` for Tesseract languages (e.g. `zh`).
- Translation: chunked text; LLM path for natural tone; LibreTranslate for a free, self-hostable draft you can review and edit; Identity, OpenAI, Anthropic, unified LLM routing (`LLMTranslator`).
- Audio: `SystemSayTTS` (macOS `say`); `ParlerTTS` / `--tts parler` (neural TTS, English-only for reliable quality; optional `pip install 'plycast[parler]'`, default on non-macOS when installed; seed voices + `--parler-gender`, or raw `--voice`); `EspeakTTS` / `--tts espeak` (espeak-ng, fallback when Parler is not installed); `TextFileTTS` / `--tts text_file` anywhere. See docs/Voices.md.
- CLI: flags for translator, languages, API keys, TTS, chunk size; see docs/CLI.md (reference) and docs/QuickStart.md (tutorials).
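To illustrate what "chunked text" can mean in practice, here is a minimal sketch of a paragraph-aware splitter that keeps each chunk under a character limit. It is an assumption about the general technique, not plycast's implementation: `chunk_text` and `max_chars` are hypothetical names, and the real chunk-size flag is documented in docs/CLI.md.

```python
def chunk_text(text, max_chars=1000):
    """Split text into chunks of at most max_chars characters.

    Paragraphs (separated by blank lines) are kept together when they fit;
    a single paragraph longer than max_chars is cut at the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks, current = [], ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # Hard-split a paragraph that cannot fit in one chunk on its own.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The paragraph does not fit alongside what is buffered: flush first.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting at paragraph boundaries keeps sentences intact for the translator, at the cost of uneven chunk sizes. A sentence-level splitter would be a natural refinement for very long paragraphs.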
Documentation
| Doc | Contents |
|---|---|
| docs/QuickStart.md | Install, prerequisites, CLI intro, LibreTranslate Docker, LLM examples, Python API, env vars, troubleshooting |
| docs/CLI.md | CLI reference: all commands and options (convert, translate, tts, inspect), outputs, translators, TTS backends, --json |
| docs/Voices.md | system_say (macOS) and espeak (Linux-friendly) voices / --voice |
| examples/README.md | Vietnamese sample: vi → en + Parler (laura) and vi + system_say (Linh) — CLI + Python |
| This README | Project introduction and layout |
Contributing
A contribution workflow (guidelines, review process, and what we merge) is in progress and not finalized yet — so we are not ready for pull requests or formal code contributions at this time.
Issues are welcome: please open an issue on GitHub for bugs, ideas, or questions.
License
Open-source. This project is released under the MIT License.