DrinkingFountain
Convert Fountain-format screenplays to audio plays using local TTS models
DrinkingFountain is a command-line tool that transforms Fountain screenplay files into fully narrated audio productions. It uses Piper TTS for high-quality, offline text-to-speech synthesis, giving you complete control over voice selection, timing, and audio output—all processed locally on your machine.
Key Features
- Local TTS: No cloud services required—everything runs on your computer
- Fountain Format: Full support for the standard screenplay format (fountain.io)
- Configurable Voices: Assign specific voices to characters via YAML config or CLI
- Flexible Timing: Adjustable pauses between lines, scenes, and headings
- Audio Control: Sample rate, channel configuration, and loudness normalization
- Voice Management: List, download, and test voice models from HuggingFace
- Smart Chunking: Automatic handling of long dialogue lines
- Multiple Output Formats: Export to WAV or MP3 (requires ffmpeg)
- Direct Playback: Play audio directly through the system's default audio device (requires simpleaudio)
Installation
Prerequisites
- Python: 3.10 or newer
- Package manager: uv (recommended) or pip
- ffmpeg: Required for MP3 export (optional if you only need WAV)
- simpleaudio: Required for audio playback through speakers
Installing ffmpeg
- macOS: brew install ffmpeg
- Linux: sudo apt-get install ffmpeg (Debian/Ubuntu) or sudo dnf install ffmpeg (Fedora)
- Windows: Download from ffmpeg.org and add to PATH
Install DrinkingFountain
Using uv (recommended):
uv sync
Using pip:
pip install -e .
Download Voice Models
At least one voice model is required. Download your first voice:
drinkingfountain voices download en_US-amy-medium
See Voice Models for more options.
Quick Start
- Create a Fountain script (e.g., script.fountain):
INT. COFFEE SHOP - DAY

JOHN
(sipping coffee)
This is pretty good.

SARAH
I know, right? The new blend is amazing.

JOHN
We should come here more often.
- Render to audio:
Option A: Save to a file (works without simpleaudio):
drinkingfountain render script.fountain -o output.wav
Option B: Play through speakers (requires simpleaudio):
drinkingfountain render script.fountain
That's it! For more control, read on.
Configuration
DrinkingFountain looks for configuration files in this order:
- Path specified with the --config option
- ./drinkingfountain.yaml (current directory)
- ~/.config/drinkingfountain/config.yaml (user config)
- If none found, defaults are used
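The lookup order above can be sketched in a few lines of Python. This is an illustration only, not the project's code: the function name resolve_config_path and its cwd/home parameters are assumptions made so the example can run on its own.

```python
from pathlib import Path
from typing import Optional


def resolve_config_path(
    cli_path: Optional[Path] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first config file that applies, or None to use defaults."""
    cwd = cwd or Path.cwd()
    home = home or Path.home()

    # 1. An explicit --config path wins outright.
    if cli_path is not None:
        return cli_path

    # 2. Project-local file, then 3. per-user file.
    candidates = [
        cwd / "drinkingfountain.yaml",
        home / ".config" / "drinkingfountain" / "config.yaml",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    # 4. Nothing found: the caller falls back to built-in defaults.
    return None
```

A file in the current directory therefore shadows the per-user file, which is handy for per-project voice mappings.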
Example Configuration
Create drinkingfountain.yaml:
# TTS backend to use (currently only "piper" is implemented)
backend: piper

# Audio output settings
audio:
  sample_rate: 22050    # 22050 or 44100 Hz
  channels: mono        # "mono" or "stereo"
  normalize: true       # Normalize loudness
  target_level: -3.0    # Target dBFS (negative value)

# Timing and pauses (in seconds)
timing:
  pause_between_lines: 0.3          # Pause after each dialogue line
  pause_after_scene_heading: 1.0    # Pause after scene heading
  pause_between_scenes: 2.0         # Pause when entering new scene

# Voice management settings
voice_management:
  bulk_download_language: en_US
  bulk_download_quality: medium
  max_concurrent_downloads: 3

# Character voice assignments
# Map character names (exactly as in script) to voice IDs
voices:
  JOHN: en_US-john-medium
  SARAH: en_US-sarah-medium
  NARRATOR: en_US-amy-medium

# Prosody adjustments for parenthetical cues
# (Note: Not yet implemented; planned for a future release)
prosody:
  (whispering):
    speed: 0.8
    pitch: 0.9
    volume: 0.6
  (shouting):
    speed: 1.2
    pitch: 1.3
    volume: 1.4
Voice Mapping
The voices section lets you assign specific Piper voice models to characters. Character names must match exactly as they appear in the Fountain script (case-sensitive).
Overrides: Explicit voice assignments in the voices section always take precedence over auto-assignment.
Auto-assignment: For characters without explicit mapping:
- A voice is randomly selected from the available voices (excluding the narrator voice if one is configured)
- The selection is cached per character and reused consistently across all scenes
- This ensures character voice consistency throughout the production
Narrator handling: If you have a NARRATOR role in your script:
- The narrator's voice is reserved and will never be auto-assigned to any character
- You must explicitly assign a voice to NARRATOR in the config if you want narration
- If only one voice is available and a narrator is detected, the narrator role is automatically disabled to avoid conflicts
Default voice: If you set a default voice via VoiceManager.set_default_voice(), it will be used for any character without explicit mapping, provided no other voices are available.
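The assignment rules in this section (explicit mapping first, then a cached random pick that never uses the narrator's voice) can be sketched as follows. VoiceAssigner is a hypothetical stand-in written for this example; it is not the project's VoiceManager and leaves out the default-voice and single-voice cases.

```python
import random
from typing import Dict, List, Optional


class VoiceAssigner:
    """Hypothetical stand-in for the project's VoiceManager."""

    def __init__(
        self,
        explicit: Dict[str, str],
        available: List[str],
        narrator_voice: Optional[str] = None,
        rng: Optional[random.Random] = None,
    ) -> None:
        self.explicit = explicit
        # The narrator's voice never enters the auto-assignment pool.
        self.pool = [v for v in available if v != narrator_voice]
        self.rng = rng or random.Random()
        self.cache: Dict[str, str] = {}

    def voice_for(self, character: str) -> str:
        # Explicit mappings always take precedence.
        if character in self.explicit:
            return self.explicit[character]
        # Reuse the earlier pick so the character sounds the same in every scene.
        if character not in self.cache:
            if not self.pool:
                raise LookupError(f"no voice available for {character}")
            self.cache[character] = self.rng.choice(self.pool)
        return self.cache[character]
```

Passing a seeded random.Random makes the auto-assignment reproducible between runs, which can help when comparing renders.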
Audio Settings
- sample_rate: Higher values mean better quality but larger files. 22050 Hz is sufficient for speech; use 44100 Hz for music or higher fidelity.
- channels: Mono uses half the storage of stereo and is perfectly fine for voice-only content.
- normalize: Ensures consistent loudness throughout the output. Recommended: true.
- target_level: Normalization target in dBFS. -3.0 dBFS leaves headroom below full scale and is a safe default.
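To make target_level concrete, here is a minimal sketch of peak normalization on float samples in the range -1.0 to 1.0. It assumes the target is a peak level; the source does not say whether DrinkingFountain normalizes by peak or by average loudness, and both function names are made up for this example.

```python
import math
from typing import List


def peak_dbfs(samples: List[float]) -> float:
    """Peak level of float samples in [-1.0, 1.0], in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0.0)
    return -math.inf if peak == 0.0 else 20.0 * math.log10(peak)


def normalize_peak(samples: List[float], target_level: float = -3.0) -> List[float]:
    """Scale samples so the loudest one sits at target_level dBFS."""
    current = peak_dbfs(samples)
    if math.isinf(current):
        return list(samples)  # pure silence: nothing to scale
    # A difference of d dB corresponds to an amplitude factor of 10^(d/20).
    gain = 10.0 ** ((target_level - current) / 20.0)
    return [s * gain for s in samples]
```

Under this assumption, moving target_level from -3.0 to -6.0 lowers every sample by a further 3 dB, about 0.71 times the amplitude.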
Timing Settings
Fine-tune the pacing of your audio production:
- pause_between_lines: Gap between consecutive dialogue lines (default: 0.3s)
- pause_after_scene_heading: Silence after a scene heading before first dialogue (default: 1.0s)
- pause_between_scenes: Extra pause when transitioning between scenes (default: 2.0s)
All timing values are in seconds and can be fractional (e.g., 0.25).
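As a rough illustration of how the three values could combine, the sketch below picks the pause placed before each element and converts seconds to a sample count. The selection rule, the Timing dataclass, and the "heading"/"line" labels are assumptions for this example; the project's mixer may combine pauses differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Timing:
    pause_between_lines: float = 0.3
    pause_after_scene_heading: float = 1.0
    pause_between_scenes: float = 2.0


def silence_samples(seconds: float, sample_rate: int = 22050) -> int:
    """Number of silent samples (per channel) for a pause of the given length."""
    return round(seconds * sample_rate)


def pauses_for(elements: List[str], timing: Timing) -> List[float]:
    """Pause in seconds placed before each element ("heading" or "line")."""
    pauses = []
    for index, kind in enumerate(elements):
        if index == 0:
            pauses.append(0.0)  # nothing precedes the first element
        elif kind == "heading":
            pauses.append(timing.pause_between_scenes)
        elif elements[index - 1] == "heading":
            pauses.append(timing.pause_after_scene_heading)
        else:
            pauses.append(timing.pause_between_lines)
    return pauses
```

At 22050 Hz, the default 0.3 s gap between lines works out to 6615 samples of silence.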
Voice Management
DrinkingFountain includes advanced voice management features that ensure consistent character voices across your entire production and simplify voice model management.
Consistent Voice Assignment
Characters now maintain the same voice across all scenes in a render. When a voice is assigned to a character (either via explicit mapping or auto-assignment), that choice is cached and reused consistently throughout the entire script. This creates a more professional and coherent listening experience, as characters don't suddenly sound different when they appear in later scenes.
Narrator Voice Isolation
The narrator's voice is automatically reserved and will never be auto-assigned to any character. This ensures that if you have a NARRATOR role in your script, its voice assignment remains exclusively yours to configure. The narrator voice is completely excluded from the pool of available voices during character auto-assignment.
Voice Caching
Voice assignments are cached per character during a render. This means:
- The first time a character appears, a voice is selected (either from explicit mapping or randomly from available voices)
- That same voice is used for all subsequent appearances of that character
- The cache is cleared between renders, allowing you to change assignments for next render
This caching happens transparently and doesn't require any configuration.
Bulk Voice Download
Download multiple voice models efficiently with the new bulk download command.
Command: drinkingfountain voices download-bulk
Downloads all available voice models for a specific language and quality from the Piper catalog.
drinkingfountain voices download-bulk [OPTIONS]
Options:
- -l, --language CODE: Language code (e.g., en_US, fr_FR). Required unless bulk_download_language is set in the config file.
- -q, --quality {x-low,low,medium,high,x-high}: Quality level. Default: medium.
- -w, --max-workers N: Maximum concurrent downloads. Default: 3.
- --stop-on-error: Stop if any download fails (default: continue on error)
- --voices-dir PATH: Directory to store voices (overrides default)
Configuration defaults:
You can set default values in drinkingfountain.yaml to avoid repeating options:
voice_management:
  bulk_download_language: en_US
  bulk_download_quality: medium
  max_concurrent_downloads: 3
With these defaults, you can simply run drinkingfountain voices download-bulk without options.
Examples:
Download all English (US) voices at medium quality:
drinkingfountain voices download-bulk --language en_US --quality medium
Download French voices with 5 concurrent workers, stopping on errors:
drinkingfountain voices download-bulk -l fr_FR -w 5 --stop-on-error
Use config defaults (if set in drinkingfountain.yaml):
drinkingfountain voices download-bulk
What it does: This command queries the Piper voice catalog, filters by the specified language and quality, and downloads all matching voice models in parallel. It's useful for setting up a complete voice library for a particular language or quality tier.
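The filter-then-download-in-parallel flow described above might look like the following sketch. It is not the project's implementation: download_bulk and its download callback are placeholders, and the catalog entries simply mirror the id/language/quality fields shown in the JSON output of voices available.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def download_bulk(
    catalog: List[Dict[str, str]],
    language: str,
    quality: str = "medium",
    max_workers: int = 3,
    stop_on_error: bool = False,
    download: Callable[[str], None] = lambda voice_id: None,  # placeholder
) -> Tuple[List[str], Dict[str, Exception]]:
    """Download every catalog entry matching language and quality."""
    wanted = [
        v["id"] for v in catalog
        if v["language"] == language and v["quality"] == quality
    ]
    done: List[str] = []
    failed: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download, voice_id): voice_id for voice_id in wanted}
        for future in as_completed(futures):
            voice_id = futures[future]
            try:
                future.result()
                done.append(voice_id)
            except Exception as exc:
                failed[voice_id] = exc
                if stop_on_error:
                    # Cancel downloads that have not started yet.
                    for pending in futures:
                        pending.cancel()
                    break
    return sorted(done), failed
```

Collecting failures instead of raising matches the documented default of continuing on error; stop_on_error trades completeness for a faster exit.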
Backward Compatibility
All new voice management features are fully backward compatible:
- Existing configuration files work unchanged
- Voice assignment overrides continue to function as before
- The narrator isolation and caching are automatic—no configuration needed
- Bulk download is an optional CLI command, not required for normal operation
You can adopt these features gradually without disrupting your existing workflow.
CLI Reference
drinkingfountain render
Render a Fountain script to audio.
drinkingfountain render SCRIPT [OPTIONS]
Arguments:
SCRIPT: Path to the Fountain file (required)
Options:
- -o, --output PATH: Output audio file path (optional). Format determined by extension (.wav or .mp3). If omitted, audio plays through the default audio device.
- --config PATH: Configuration file path
- --voices-dir PATH: Directory containing voice models (overrides default)
- --cache-dir PATH: TTS cache directory (caches synthesized audio to speed up re-runs)
- --verbose: Enable debug logging
Examples:
Save to a WAV file:
drinkingfountain render myscript.fountain -o output.wav
Play through speakers:
drinkingfountain render myscript.fountain
Save to MP3 and reuse cached synthesis from .cache (requires ffmpeg):
drinkingfountain render myscript.fountain -o output.mp3 --cache-dir .cache
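One common way a synthesis cache like the one behind --cache-dir can work is to key each audio file on a hash of the voice and the text. The sketch below shows that idea only; the function name, the SHA-256 choice, and the .wav file layout are assumptions, not the project's actual cache format.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir: Path, voice: str, text: str) -> Path:
    """Map a (voice, text) pair to a stable file name inside cache_dir."""
    # The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
    digest = hashlib.sha256(f"{voice}\0{text}".encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.wav"
```

With a scheme like this, editing one line of dialogue invalidates only that line's cached audio, so a re-render resynthesizes just the changed lines.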
drinkingfountain voices
Manage voice models.
drinkingfountain voices list
List all installed voice models.
drinkingfountain voices list [--voices-dir PATH]
Example output:
Available voices (3):
en_US-amy-medium
en_US-john-high
en_US-sarah-low
drinkingfountain voices available
List voice models available for download from Piper (not yet installed).
drinkingfountain voices available [OPTIONS]
Options:
- --format {list,json}: Output format. list shows a simple list (default). json shows detailed metadata.
- --language CODE: Filter by language code (e.g., en_US, fr_FR)
Example output (list format):
Available voices for download (3):
en_US-amy-medium
en_US-john-medium
fr_FR-henri-medium
Example output (JSON format):
[
  {
    "id": "en_US-amy-medium",
    "language": "en_US",
    "quality": "medium",
    "dataset": "libritts"
  }
]
drinkingfountain voices download
Download a voice model from HuggingFace.
drinkingfountain voices download VOICE_ID [--voices-dir PATH]
Voice ID format: {language}-{name}-{quality}
Examples:
drinkingfountain voices download en_US-amy-medium
drinkingfountain voices download en_GB-james-high
drinkingfountain voices download fr_FR-henri-medium
drinkingfountain voices download-bulk
Download all voice models for a specific language and quality from the Piper catalog.
drinkingfountain voices download-bulk [OPTIONS]
Options:
- -l, --language CODE: Language code (e.g., en_US, fr_FR). Required unless bulk_download_language is set in the config file.
- -q, --quality {x-low,low,medium,high,x-high}: Quality level. Default: medium.
- -w, --max-workers N: Maximum concurrent downloads. Default: 3.
- --stop-on-error: Stop if any download fails (default: continue on error)
- --voices-dir PATH: Directory to store voices (overrides default)
Configuration defaults:
Set defaults in drinkingfountain.yaml:
voice_management:
  bulk_download_language: en_US
  bulk_download_quality: medium
  max_concurrent_downloads: 3
Examples:
Download all English (US) voices at medium quality:
drinkingfountain voices download-bulk --language en_US --quality medium
Download French voices with 5 concurrent workers, stopping on errors:
drinkingfountain voices download-bulk -l fr_FR -w 5 --stop-on-error
Use config defaults (if set):
drinkingfountain voices download-bulk
drinkingfountain voices test
Generate sample audio with a voice.
drinkingfountain voices test VOICE_ID TEXT [--voices-dir PATH] [--output PATH]
Examples:
# Play through speakers (if simpleaudio installed)
drinkingfountain voices test en_US-amy-medium "Hello, this is a test."
# Save to file
drinkingfountain voices test en_US-amy-medium "Testing voice quality." -o test.wav
Fountain Format
DrinkingFountain supports the Fountain screenplay format—a plain-text format for writing screenplays. Fountain is human-readable, version-control friendly, and widely used in the film industry.
Basic Elements
- Scene headings: INT. LOCATION - DAY or EXT. LOCATION - NIGHT
- Character names: All caps on their own line
- Dialogue: Lines following a character name
- Parentheticals: (text) on its own line between the character name and the dialogue
- Action: Any other text (descriptions, etc.)
Example Script
FADE IN:

INT. COFFEE SHOP - DAY

A cozy corner table. JOHN (30s, tired) sips his coffee.

JOHN
This is the third cup today.

SARAH (O.S.)
You have a problem.

JOHN
(looking up)
Says who?

SARAH enters, carrying a stack of books.

SARAH
Anyone with eyes.

They both laugh as the CAMERA PANS to the rain outside.

CUT TO:

EXT. STREET - NIGHT

The rain continues. Heavy.

FADE OUT.
Note: DrinkingFountain currently processes dialogue and scene headings. Action lines and transitions are included in the script structure but not spoken (they could be enabled via future configuration).
Voice Models
Where to Find Voices
Piper voice models are hosted on HuggingFace. The official repository is: https://huggingface.co/rhasspy/piper-voices
Browse available voices by language, speaker, and quality.
Naming Convention
Voice IDs follow the pattern:
{LANGUAGE}-{NAME}-{QUALITY}
- LANGUAGE: en_US, en_GB, fr_FR, de_DE, etc. (language + region)
- NAME: Speaker name (e.g., amy, john, sarah)
- QUALITY: One of: x-low, low, medium, high, x-high
Examples:
- en_US-amy-medium (American English, medium quality)
- en_GB-james-high (British English, high quality)
- fr_FR-henri-medium (French, medium quality)
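If you need to take a voice ID apart in your own scripts, a small parser along these lines may help. It is a sketch based only on the naming pattern above; parse_voice_id and VoiceId are names invented for this example and are not part of DrinkingFountain.

```python
from typing import NamedTuple

QUALITIES = ("x-low", "low", "medium", "high", "x-high")


class VoiceId(NamedTuple):
    language: str
    name: str
    quality: str


def parse_voice_id(voice_id: str) -> VoiceId:
    """Split '{LANGUAGE}-{NAME}-{QUALITY}' into its three parts."""
    # Match the quality suffix first, longest first: "x-low" and "x-high"
    # contain a hyphen, so a plain split("-") would cut them in two.
    for quality in sorted(QUALITIES, key=len, reverse=True):
        suffix = "-" + quality
        if voice_id.endswith(suffix):
            language, sep, name = voice_id[: -len(suffix)].partition("-")
            if sep and language and name:
                return VoiceId(language, name, quality)
    raise ValueError(f"not a valid voice ID: {voice_id!r}")
```

For example, parse_voice_id("en_GB-james-high") yields the language en_GB, the speaker james, and the quality high.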
Quality Levels
- x-low: Smallest file size, lowest quality (not recommended)
- low: Small, decent quality
- medium: Good balance of quality and size (default choice)
- high: Larger, high quality
- x-high: Largest, best quality
Recommendation: Start with medium quality. If you need higher fidelity and have disk space, try high.
Listing and Downloading
List installed voices:
drinkingfountain voices list
List all voices available for download from Piper:
drinkingfountain voices available
Filter by language:
drinkingfountain voices available --language en_US
Get detailed information in JSON format:
drinkingfountain voices available --format json
Download a voice:
drinkingfountain voices download en_US-amy-medium
Voices are stored in the Piper default directory:
- Linux/macOS: ~/.local/share/piper-tts/voices/
- Windows: %APPDATA%/local/share/piper-tts/voices/
Override with --voices-dir if you want a custom location.
Troubleshooting
"No voices available" or "Voice model not found"
Solution: Download at least one voice model:
drinkingfountain voices download en_US-amy-medium
MP3 export fails with "ffmpeg not found"
Solution: Install ffmpeg (see Prerequisites). Alternatively, export to WAV:
drinkingfountain render script.fountain -o output.wav
Long dialogue lines get cut off or produce errors
Explanation: Piper TTS has a maximum text length (typically ~500 characters). DrinkingFountain automatically chunks long dialogue into smaller pieces and concatenates the audio with short pauses.
No action needed—this is handled transparently. If you encounter issues, ensure you're using the latest version.
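For readers curious what such chunking involves, here is a minimal sketch that splits at sentence ends and falls back to word boundaries for very long sentences. It is an illustration under assumed rules, not the code in utils/text_chunker.py; the 500-character default only echoes the approximate limit mentioned above.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is broken at word boundaries.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no space to break on: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut])
            sentence = sentence[cut:].lstrip()
        # Pack sentences together while they still fit in one chunk.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Breaking at sentence ends keeps the inserted pauses at places where a speaker would naturally pause anyway.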
Poor audio quality or robotic voice
Possible causes:
- Voice model quality is too low (try high or x-high)
- Voice model is corrupted or incomplete (re-download)
- Sample rate mismatch (use 22050 Hz for most Piper voices)
Solutions:
- Try a different voice: drinkingfountain voices download en_US-amy-high
- Check your audio settings: sample_rate: 22050 is recommended for Piper
- Verify the voice file exists: ls ~/.local/share/piper-tts/voices/en_US-amy-medium.onnx
"No dialogue found in script"
Cause: The Fountain file may not have properly formatted dialogue (character names not in ALL CAPS, missing blank lines).
Solution: Ensure your script follows Fountain conventions:
- Character names on their own line, in ALL CAPS
- Blank line before character name
- Dialogue lines directly after character
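A quick way to check a script against these conventions is a small detector like the one below. It is a deliberately simplified sketch, not the project's parser: find_dialogue is a made-up helper that handles only the three rules listed above plus parentheticals and name extensions such as (O.S.).

```python
from typing import List, Tuple


def find_dialogue(lines: List[str]) -> List[Tuple[str, str]]:
    """Return (character, first dialogue line) pairs from Fountain text lines."""
    found = []
    for i, line in enumerate(lines):
        cue = line.strip()
        previous_blank = i == 0 or not lines[i - 1].strip()
        has_next = i + 1 < len(lines) and bool(lines[i + 1].strip())
        is_upper = cue == cue.upper() and any(c.isalpha() for c in cue)
        is_heading = cue.startswith(("INT.", "EXT."))
        if not (previous_blank and has_next and is_upper) or is_heading:
            continue
        # Skip a parenthetical such as "(sipping coffee)" to reach the speech.
        j = i + 1
        while j < len(lines) and lines[j].strip().startswith("("):
            j += 1
        if j < len(lines) and lines[j].strip():
            # Drop extensions such as "(O.S.)" from the character name.
            found.append((cue.split("(")[0].strip(), lines[j].strip()))
    return found
```

Run on a script whose blank lines were stripped, this returns an empty list, which is the situation that produces the "No dialogue found" message.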
Audio is too quiet or too loud
Solution: Adjust normalization settings in config:
audio:
  normalize: true
  target_level: -3.0  # Try -6.0 for quieter, -1.0 for louder
Or disable normalization and adjust manually in post.
Known Limitations
Not Yet Implemented
- Prosody from parentheticals: Parenthetical cues like (whispering) or (shouting) are parsed but not yet applied to TTS output. This is planned for a future release.
- Dual dialogue: Simultaneous dialogue (two characters speaking at once using ^ notation) is not supported. Lines are processed sequentially.
- Non-dialogue speech: Action lines, transitions, and other non-dialogue elements are not synthesized. Only scene headings (if configured) and dialogue are included in the audio output.
- GUI: DrinkingFountain is CLI-only. No graphical interface is currently planned, but the CLI is designed to be scriptable.
Platform-Specific Notes
- Windows: Voice download may require additional permissions or manual download from HuggingFace if subprocess calls fail.
- ARM/Apple Silicon: Piper TTS works natively on Apple Silicon. No Rosetta needed.
- GPU acceleration: Not currently used—all synthesis runs on CPU.
Voice Model Availability
- Piper voice models are limited to what's available on HuggingFace. Not all languages/speakers are supported.
- Voice quality varies by language. English voices are most abundant and highest quality.
Development
Running Tests
Using uv:
uv run pytest
Using pytest directly:
pytest
Run with coverage:
pytest --cov=src/drinkingfountain
Pre-commit Hooks
Install pre-commit hooks to enforce code quality:
pre-commit install
This runs Ruff formatting and linting on staged files.
Project Structure
drinkingfountain/
├── src/drinkingfountain/
│   ├── __init__.py
│   ├── cli.py                  # Command-line interface (Click)
│   ├── audio/
│   │   ├── mixer.py            # Audio mixing, pauses, normalization
│   │   └── __init__.py
│   ├── config/
│   │   ├── settings.py         # Configuration dataclasses
│   │   └── __init__.py
│   ├── parser/
│   │   ├── fountain.py         # Fountain format parser
│   │   ├── script.py           # Script data structures
│   │   └── __init__.py
│   ├── tts/
│   │   ├── base.py             # TTS backend interface
│   │   ├── piper.py            # Piper TTS implementation
│   │   ├── cache.py            # Caching wrapper
│   │   └── __init__.py
│   ├── utils/
│   │   ├── text_chunker.py     # Long text splitting
│   │   └── __init__.py
│   └── voices/
│       ├── manager.py          # Voice assignment logic
│       └── __init__.py
├── tests/                      # Test suite
├── pyproject.toml              # Project metadata and dependencies
├── .pre-commit-config.yaml     # Pre-commit configuration
└── README.md                   # This file
Architecture Overview
- CLI (cli.py): Entry point, parses arguments, orchestrates the pipeline
- Parser (parser/fountain.py): Reads Fountain files into Script objects
- Config (config/settings.py): Loads YAML configuration with validation
- VoiceManager (voices/manager.py): Maps characters to voice IDs
- TTS Backend (tts/piper.py): Generates audio via Piper, handles chunking
- AudioMixer (audio/mixer.py): Combines segments, adds pauses, normalizes, exports
Adding New TTS Backends
The TTSBackend protocol (in tts/base.py) defines the interface:
from pathlib import Path
from typing import Protocol

# AudioSegment is the project's audio type; its import is not shown in the source.
class TTSBackend(Protocol):
    def is_available(self) -> bool: ...
    def list_voices(self) -> list[str]: ...
    def download_voice(self, voice: str, target_dir: Path | None) -> None: ...
    def generate_audio(self, text: str, voice: str) -> AudioSegment: ...
Implement this protocol to add support for Coqui TTS, Transformers, or cloud services.
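As a starting point, a toy backend that satisfies the four methods might look like the sketch below. Everything here is illustrative: the AudioSegment dataclass is a placeholder for the project's real audio type, SilentBackend is an invented class, and the protocol is re-declared locally only so the example runs on its own.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Protocol, runtime_checkable


@dataclass
class AudioSegment:
    """Placeholder for the project's audio type (assumption for this sketch)."""
    samples: List[float]
    sample_rate: int


@runtime_checkable
class TTSBackend(Protocol):
    def is_available(self) -> bool: ...
    def list_voices(self) -> List[str]: ...
    def download_voice(self, voice: str, target_dir: Optional[Path]) -> None: ...
    def generate_audio(self, text: str, voice: str) -> AudioSegment: ...


class SilentBackend:
    """Toy backend: emits silence proportional to the text length."""

    def is_available(self) -> bool:
        return True

    def list_voices(self) -> List[str]:
        return ["silent-voice"]

    def download_voice(self, voice: str, target_dir: Optional[Path]) -> None:
        return None  # nothing to fetch

    def generate_audio(self, text: str, voice: str) -> AudioSegment:
        if voice not in self.list_voices():
            raise ValueError(f"unknown voice: {voice}")
        rate = 22050
        # 50 ms of silence per character, as a stand-in for real synthesis.
        return AudioSegment([0.0] * (len(text) * rate // 20), rate)
```

Because the interface is a Protocol, a backend does not need to inherit from anything; providing the four methods with matching signatures is enough.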
License
MIT License. See pyproject.toml for details.
Getting Help
- Bug reports: Open an issue on the project repository
- Questions: Check the Fountain spec for script formatting questions
- Piper TTS: See Piper documentation
Happy scripting, and may your table reads be ever in tune!