DrinkingFountain

Convert Fountain-format screenplays to audio plays using local TTS models

DrinkingFountain is a command-line tool that transforms Fountain screenplay files into fully narrated audio productions. It uses Piper TTS for high-quality, offline text-to-speech synthesis, giving you complete control over voice selection, timing, and audio output—all processed locally on your machine.

Key Features

Local TTS: No cloud services required—everything runs on your computer
Fountain Format: Full support for the standard screenplay format (fountain.io)
Configurable Voices: Assign specific voices to characters via YAML config or CLI
Flexible Timing: Adjustable pauses between lines, scenes, and headings
Audio Control: Sample rate, channel configuration, and loudness normalization
Voice Management: List, download, and test voice models from HuggingFace
Smart Chunking: Automatic handling of long dialogue lines
Multiple Output Formats: Export to WAV or MP3 (requires ffmpeg)
Direct Playback: Play audio directly through the system's default audio device (requires simpleaudio)

Installation

Prerequisites

Python: 3.10 or newer
Package manager: uv (recommended) or pip
ffmpeg: Required for MP3 export (optional if you only need WAV)
simpleaudio: Required only for direct playback through speakers (optional if you always save to a file)

Installing ffmpeg

macOS: brew install ffmpeg
Linux: sudo apt-get install ffmpeg (Debian/Ubuntu) or sudo dnf install ffmpeg (Fedora)
Windows: Download from ffmpeg.org and add to PATH

Install DrinkingFountain

Using uv (recommended), from the root of a source checkout:

uv sync

Using pip, from the same directory:

pip install -e .

Download Voice Models

At least one voice model is required. Download your first voice:

drinkingfountain voices download en_US-amy-medium

See Voice Models for more options.

Quick Start

Create a Fountain script (e.g., script.fountain):

INT. COFFEE SHOP - DAY

JOHN
(sipping coffee)
This is pretty good.

SARAH
I know, right? The new blend is amazing.

JOHN
We should come here more often.

Render to audio:

Option A: Save to a file (works without simpleaudio):

drinkingfountain render script.fountain -o output.wav

Option B: Play through speakers (requires simpleaudio):

drinkingfountain render script.fountain

That's it! For more control, read on.

Configuration

DrinkingFountain looks for configuration files in this order:

Path specified with --config option
./drinkingfountain.yaml (current directory)
~/.config/drinkingfountain/config.yaml (user config)
If none found, defaults are used
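
The lookup order above can be sketched in a few lines of Python. This is only an illustration: find_config is a hypothetical helper, not part of DrinkingFountain, and it assumes that a path given with --config is used as-is without falling back to the other locations.

```python
from __future__ import annotations

from pathlib import Path


def find_config(explicit: Path | None, cwd: Path, home: Path) -> Path | None:
    """Return the first configuration file that applies, or None for defaults.

    Hypothetical helper that only mirrors the documented search order.
    """
    if explicit is not None:
        # Assumption: an explicit --config path wins and is not second-guessed.
        return explicit
    candidates = [
        cwd / "drinkingfountain.yaml",
        home / ".config" / "drinkingfountain" / "config.yaml",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # the caller falls back to built-in defaults
```

Passing cwd and home as arguments (instead of calling Path.cwd() and Path.home() inside) keeps the function easy to test against temporary directories.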

Example Configuration

Create drinkingfountain.yaml:

# TTS backend to use (currently only "piper" is implemented)
backend: piper

# Audio output settings
audio:
  sample_rate: 22050      # 22050 or 44100 Hz
  channels: mono          # "mono" or "stereo"
  normalize: true         # Normalize loudness
  target_level: -3.0      # Target dBFS (negative value)

# Timing and pauses (in seconds)
timing:
  pause_between_lines: 0.3      # Pause after each dialogue line
  pause_after_scene_heading: 1.0  # Pause after scene heading
  pause_between_scenes: 2.0      # Pause when entering new scene

# Voice management settings
voice_management:
  bulk_download_language: en_US
  bulk_download_quality: medium
  max_concurrent_downloads: 3

# Character voice assignments
# Map character names (exactly as in script) to voice IDs
voices:
  JOHN: en_US-john-medium
  SARAH: en_US-sarah-medium
  NARRATOR: en_US-amy-medium

# Prosody adjustments for parenthetical cues
# (Note: Not yet implemented—planned for future release)
prosody:
  (whispering):
    speed: 0.8
    pitch: 0.9
    volume: 0.6
  (shouting):
    speed: 1.2
    pitch: 1.3
    volume: 1.4

Voice Mapping

The voices section lets you assign specific Piper voice models to characters. Character names must match exactly as they appear in the Fountain script (case-sensitive).

Overrides: Explicit voice assignments in the voices section always take precedence over auto-assignment.

Auto-assignment: For characters without explicit mapping:

A voice is randomly selected from the available voices (excluding the narrator voice if one is configured)
The selection is cached per character and reused consistently across all scenes
This ensures character voice consistency throughout the production

Narrator handling: If you have a NARRATOR role in your script:

The narrator's voice is reserved and will never be auto-assigned to any character
You must explicitly assign a voice to NARRATOR in the config if you want narration
If only one voice is available and a narrator is detected, the narrator role is automatically disabled to avoid conflicts

Default voice: If you set a default voice via VoiceManager.set_default_voice(), it will be used for any character without explicit mapping, provided no other voices are available.
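
The assignment rules described in this section (explicit mapping first, random pick from the non-narrator pool otherwise, one cached choice per character) can be modelled roughly as follows. VoiceAssignmentSketch is a hypothetical class written for this illustration; it is not the project's VoiceManager and it leaves out the default-voice and narrator-disabling rules.

```python
from __future__ import annotations

import random


class VoiceAssignmentSketch:
    """Hypothetical model of the documented rules, not the real VoiceManager."""

    def __init__(self, explicit: dict[str, str], installed: list[str],
                 narrator_role: str = "NARRATOR", seed: int | None = None) -> None:
        self.explicit = dict(explicit)
        self.narrator_voice = self.explicit.get(narrator_role)
        # The narrator's voice is reserved: it never enters the auto-assignment pool.
        self.pool = [v for v in installed if v != self.narrator_voice]
        self.cache: dict[str, str] = {}
        self.rng = random.Random(seed)

    def voice_for(self, character: str) -> str:
        if character in self.explicit:       # explicit mapping always wins
            return self.explicit[character]
        if character not in self.cache:      # first appearance: pick once
            if not self.pool:
                raise LookupError(f"no voice available for {character!r}")
            self.cache[character] = self.rng.choice(self.pool)
        return self.cache[character]         # later appearances reuse the pick
```

Creating a fresh instance per render corresponds to the cache being cleared between renders.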

Audio Settings

sample_rate: Higher values mean better quality but larger files. 22050 Hz is sufficient for speech; use 44100 Hz for music or higher fidelity.
channels: Mono uses half the storage of stereo and is perfectly fine for voice-only content.
normalize: Ensures consistent loudness throughout the output. Recommended: true.
target_level: Normalization target in dBFS. -3.0 dBFS leaves about 3 dB of headroom below full scale, which helps avoid clipping.
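
To make the target_level setting concrete, here is a minimal sketch of normalizing to a dBFS target. It assumes peak normalization on float samples in the range -1.0 to 1.0; the README does not say whether DrinkingFountain normalizes by peak or by measured loudness, and both function names are made up for this example.

```python
import math


def peak_normalize(samples: list[float], target_dbfs: float = -3.0) -> list[float]:
    """Scale samples so that the loudest one sits at target_dbfs."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    target_amplitude = 10 ** (target_dbfs / 20)  # dBFS -> linear amplitude
    gain = target_amplitude / peak
    return [s * gain for s in samples]


def peak_dbfs(samples: list[float]) -> float:
    """Peak level in dBFS, where 0 dBFS is full scale (amplitude 1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")
    return 20 * math.log10(peak)
```

A target of -3.0 dBFS corresponds to a linear peak amplitude of roughly 0.708, and -6.0 dBFS to roughly 0.501.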

Timing Settings

Fine-tune the pacing of your audio production:

pause_between_lines: Gap between consecutive dialogue lines (default: 0.3s)
pause_after_scene_heading: Silence after a scene heading before first dialogue (default: 1.0s)
pause_between_scenes: Extra pause when transitioning between scenes (default: 2.0s)

All timing values are in seconds and can be fractional (e.g., 0.25).
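
Because pauses are given in seconds, their length in samples depends on the configured sample rate. The sketch below shows the conversion; pause_samples is a hypothetical helper, and the dictionary simply repeats the defaults listed above.

```python
def pause_samples(seconds: float, sample_rate: int = 22050) -> int:
    """Number of silent samples needed for a pause of the given length."""
    if seconds < 0:
        raise ValueError("pause length must not be negative")
    return round(seconds * sample_rate)


# Defaults from the Timing Settings list above.
DEFAULT_PAUSES = {
    "pause_between_lines": 0.3,
    "pause_after_scene_heading": 1.0,
    "pause_between_scenes": 2.0,
}

silence = {name: pause_samples(secs) for name, secs in DEFAULT_PAUSES.items()}
```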

Voice Management

DrinkingFountain includes advanced voice management features that ensure consistent character voices across your entire production and simplify voice model management.

Consistent Voice Assignment

Characters keep the same voice across all scenes in a render. When a voice is assigned to a character (either via explicit mapping or auto-assignment), that choice is cached and reused throughout the entire script, so characters do not suddenly sound different when they appear in later scenes.

Narrator Voice Isolation

The narrator's voice is automatically reserved and will never be auto-assigned to any character. This ensures that if you have a NARRATOR role in your script, its voice assignment remains exclusively yours to configure. The narrator voice is completely excluded from the pool of available voices during character auto-assignment.

Voice Caching

Voice assignments are cached per character during a render. This means:

The first time a character appears, a voice is selected (either from explicit mapping or randomly from available voices)
That same voice is used for all subsequent appearances of that character
The cache is cleared between renders, so you can change assignments for the next render

This caching happens transparently and doesn't require any configuration.

Bulk Voice Download

Download multiple voice models in one step with the bulk download command.

Command: `drinkingfountain voices download-bulk`

Downloads all available voice models for a specific language and quality from the Piper catalog.

drinkingfountain voices download-bulk [OPTIONS]

Options:

-l, --language CODE: Language code (e.g., en_US, fr_FR). Required unless bulk_download_language is set in the configuration file.
-q, --quality {x-low,low,medium,high,x-high}: Quality level. Default: medium.
-w, --max-workers N: Maximum concurrent downloads. Default: 3.
--stop-on-error: Stop if any download fails (default: continue on error)
--voices-dir PATH: Directory to store voices (overrides default)

Configuration defaults: You can set default values in drinkingfountain.yaml to avoid repeating options:

voice_management:
  bulk_download_language: en_US
  bulk_download_quality: medium
  max_concurrent_downloads: 3

With these defaults, you can simply run drinkingfountain voices download-bulk without options.

Examples:

Download all English (US) voices at medium quality:

drinkingfountain voices download-bulk --language en_US --quality medium

Download French voices with 5 concurrent workers, stopping on errors:

drinkingfountain voices download-bulk -l fr_FR -w 5 --stop-on-error

Use config defaults (if set in drinkingfountain.yaml):

drinkingfountain voices download-bulk

What it does: This command queries the Piper voice catalog, filters by the specified language and quality, and downloads all matching voice models in parallel. It's useful for setting up a complete voice library for a particular language or quality tier.
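
The filter-then-download-in-parallel behaviour described above could look roughly like this. It is a sketch under assumptions: select_voices and download_bulk are hypothetical names, catalog entries are assumed to have the id, language, and quality fields shown in the JSON output of voices available, and the actual download function is passed in by the caller.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def select_voices(catalog: list[dict], language: str, quality: str) -> list[str]:
    """Keep the IDs of catalog entries matching both language and quality."""
    return [v["id"] for v in catalog
            if v["language"] == language and v["quality"] == quality]


def download_bulk(voice_ids: list[str], download: Callable[[str], None],
                  max_workers: int = 3, stop_on_error: bool = False) -> dict[str, str]:
    """Download voices in parallel; return {voice_id: "ok" or an error message}."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download, vid): vid for vid in voice_ids}
        for future in as_completed(futures):
            vid = futures[future]
            try:
                future.result()
                results[vid] = "ok"
            except Exception as exc:
                results[vid] = f"failed: {exc}"
                if stop_on_error:
                    # Cancel downloads that have not started yet, then stop.
                    for pending in futures:
                        pending.cancel()
                    break
    return results
```

A thread pool fits here because downloads are I/O-bound; max_workers plays the role of --max-workers and stop_on_error the role of --stop-on-error.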

Backward Compatibility

All new voice management features are fully backward compatible:

Existing configuration files work unchanged
Voice assignment overrides continue to function as before
The narrator isolation and caching are automatic—no configuration needed
Bulk download is an optional CLI command, not required for normal operation

You can adopt these features gradually without disrupting your existing workflow.

CLI Reference

`drinkingfountain render`

Render a Fountain script to audio.

drinkingfountain render SCRIPT [OPTIONS]

Arguments:

SCRIPT: Path to the Fountain file (required)

Options:

-o, --output PATH: Output audio file path (optional). Format determined by extension (.wav or .mp3). If omitted, audio plays through the default audio device.
--config PATH: Configuration file path
--voices-dir PATH: Directory containing voice models (overrides default)
--cache-dir PATH: TTS cache directory (caches synthesized audio to speed up re-runs)
--verbose: Enable debug logging

Examples:

Save to a WAV file:

drinkingfountain render myscript.fountain -o output.wav

Play through speakers:

drinkingfountain render myscript.fountain

Save to MP3 (requires ffmpeg), caching synthesized audio in .cache:

drinkingfountain render myscript.fountain -o output.mp3 --cache-dir .cache
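
To illustrate why --cache-dir speeds up re-runs, the following sketch caches synthesized audio under a key derived from the voice ID and the text. The key scheme, file naming, and function names are assumptions for this example; the README does not describe how the real cache in tts/cache.py is organised.

```python
from __future__ import annotations

import hashlib
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, voice: str, text: str) -> Path:
    """Derive a stable file name from the voice ID and the exact text."""
    digest = hashlib.sha256(f"{voice}\0{text}".encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.wav"


def cached_synthesize(cache_dir: Path, voice: str, text: str,
                      synthesize: Callable[[str, str], bytes]) -> bytes:
    """Return cached audio bytes if present; otherwise synthesize and store them."""
    path = cache_path(cache_dir, voice, text)
    if path.is_file():
        return path.read_bytes()
    audio = synthesize(text, voice)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

With a scheme like this, editing one line of dialogue only invalidates that line's entry, so the rest of the script is served from disk on the next render.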

`drinkingfountain voices`

Manage voice models.

`drinkingfountain voices list`

List all installed voice models.

drinkingfountain voices list [--voices-dir PATH]

Example output:

Available voices (3):
  en_US-amy-medium
  en_US-john-high
  en_US-sarah-low

`drinkingfountain voices available`

List voice models available for download from Piper (not yet installed).

drinkingfountain voices available [OPTIONS]

Options:

--format {list,json}: Output format. list shows a simple list (default). json shows detailed metadata.
--language CODE: Filter by language code (e.g., en_US, fr_FR)

Example output (list format):

Available voices for download (3):
  en_US-amy-medium
  en_US-john-medium
  fr_FR-henri-medium

Example output (JSON format):

[
  {
    "id": "en_US-amy-medium",
    "language": "en_US",
    "quality": "medium",
    "dataset": "libritts"
  }
]

`drinkingfountain voices download`

Download a voice model from HuggingFace.

drinkingfountain voices download VOICE_ID [--voices-dir PATH]

Voice ID format: {language}-{name}-{quality}

Examples:

drinkingfountain voices download en_US-amy-medium
drinkingfountain voices download en_GB-james-high
drinkingfountain voices download fr_FR-henri-medium

`drinkingfountain voices download-bulk`

Download all voice models for a specific language and quality from the Piper catalog.

drinkingfountain voices download-bulk [OPTIONS]

Options:

-l, --language CODE: Language code (e.g., en_US, fr_FR). Required unless bulk_download_language is set in the configuration file.
-q, --quality {x-low,low,medium,high,x-high}: Quality level. Default: medium.
-w, --max-workers N: Maximum concurrent downloads. Default: 3.
--stop-on-error: Stop if any download fails (default: continue on error)
--voices-dir PATH: Directory to store voices (overrides default)

Configuration defaults: Set defaults in drinkingfountain.yaml:

voice_management:
  bulk_download_language: en_US
  bulk_download_quality: medium
  max_concurrent_downloads: 3

Examples:

Download all English (US) voices at medium quality:

drinkingfountain voices download-bulk --language en_US --quality medium

Download French voices with 5 concurrent workers, stopping on errors:

drinkingfountain voices download-bulk -l fr_FR -w 5 --stop-on-error

Use config defaults (if set):

drinkingfountain voices download-bulk

`drinkingfountain voices test`

Generate sample audio with a voice.

drinkingfountain voices test VOICE_ID TEXT [--voices-dir PATH] [--output PATH]

Examples:

# Play through speakers (if simpleaudio installed)
drinkingfountain voices test en_US-amy-medium "Hello, this is a test."

# Save to file
drinkingfountain voices test en_US-amy-medium "Testing voice quality." -o test.wav

Fountain Format

DrinkingFountain supports the Fountain screenplay format—a plain-text format for writing screenplays. Fountain is human-readable, version-control friendly, and widely used in the film industry.

Basic Elements

Scene headings: INT. LOCATION - DAY or EXT. LOCATION - NIGHT
Character names: All caps on their own line
Dialogue: Lines following a character name
Parentheticals: (text) on its own line between the character name and the dialogue
Action: Any other text (descriptions, etc.)

Example Script

FADE IN:

INT. COFFEE SHOP - DAY

A cozy corner table. JOHN (30s, tired) sips his coffee.

JOHN
This is the third cup today.

SARAH (O.S.)
You have a problem.

JOHN
(looking up)
Says who?

SARAH enters, carrying a stack of books.

SARAH
Anyone with eyes.

They both laugh as the CAMERA PANS to the rain outside.

CUT TO:

EXT. STREET - NIGHT

The rain continues. Heavy.

FADE OUT.

Note: DrinkingFountain currently processes dialogue and scene headings. Action lines and transitions are included in the script structure but not spoken (they could be enabled via future configuration).
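
As a rough illustration of how the basic elements fit together, the sketch below pulls (character, line) pairs out of a script like the one above. It is a deliberately simplified reading of the Fountain rules (uppercase cue preceded by a blank line and followed by text), not the project's parser in parser/fountain.py, and extract_dialogue is a hypothetical name.

```python
from __future__ import annotations

import re

SCENE_RE = re.compile(r"^(INT|EXT)\.", re.IGNORECASE)
EXTENSION_RE = re.compile(r"\s*\([^)]*\)\s*$")  # trailing extension, e.g. " (O.S.)"


def extract_dialogue(script: str) -> list[tuple[str, str]]:
    """Return (character, line) pairs from a Fountain script (simplified)."""
    lines = script.splitlines()
    result: list[tuple[str, str]] = []
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        prev_blank = i == 0 or not lines[i - 1].strip()
        has_next = i + 1 < len(lines) and bool(lines[i + 1].strip())
        name = EXTENSION_RE.sub("", line)
        is_cue = (prev_blank and has_next and name and name == name.upper()
                  and any(c.isalpha() for c in name)
                  and not SCENE_RE.match(line) and not line.endswith("TO:"))
        if not is_cue:
            i += 1
            continue
        i += 1
        # Everything up to the next blank line belongs to this character.
        while i < len(lines) and lines[i].strip():
            text = lines[i].strip()
            if not (text.startswith("(") and text.endswith(")")):
                result.append((name, text))  # parentheticals are skipped
            i += 1
    return result
```

Action lines, transitions such as CUT TO:, and scene headings are ignored here, and a cue like SARAH (O.S.) is reduced to SARAH so it can be matched against the voices mapping.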

Voice Models

Where to Find Voices

Piper voice models are hosted on HuggingFace. The official repository is: https://huggingface.co/rhasspy/piper-voices

Browse available voices by language, speaker, and quality.

Naming Convention

Voice IDs follow the pattern:

{LANGUAGE}-{NAME}-{QUALITY}

LANGUAGE: en_US, en_GB, fr_FR, de_DE, etc. (language + region)
NAME: Speaker name (e.g., amy, john, sarah)
QUALITY: One of: x-low, low, medium, high, x-high

Examples:

en_US-amy-medium (American English, medium quality)
en_GB-james-high (British English, high quality)
fr_FR-henri-medium (French, medium quality)
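
Splitting a voice ID back into its parts takes a little care, because the x-low and x-high quality names contain a hyphen themselves. A minimal sketch, with parse_voice_id and VoiceId as hypothetical names:

```python
from __future__ import annotations

from typing import NamedTuple

# Hyphenated qualities come first so "-x-low" is not mistaken for "-low".
QUALITIES = ("x-low", "x-high", "low", "medium", "high")


class VoiceId(NamedTuple):
    language: str
    name: str
    quality: str


def parse_voice_id(voice_id: str) -> VoiceId:
    """Split '{LANGUAGE}-{NAME}-{QUALITY}' into its three parts."""
    for quality in QUALITIES:
        suffix = f"-{quality}"
        if voice_id.endswith(suffix):
            head = voice_id[: -len(suffix)]
            language, sep, name = head.partition("-")
            if sep and language and name:
                return VoiceId(language, name, quality)
    raise ValueError(f"not a valid voice ID: {voice_id!r}")
```

A plain voice_id.split("-") would break en_GB-james-x-high into four pieces, which is why the quality suffix is matched first.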

Quality Levels

x-low: Smallest file size, lowest quality (not recommended)
low: Small, decent quality
medium: Good balance of quality and size (default choice)
high: Larger, high quality
x-high: Largest, best quality

Recommendation: Start with medium quality. If you need higher fidelity and have disk space, try high.

Listing and Downloading

List installed voices:

drinkingfountain voices list

List all voices available for download from Piper:

drinkingfountain voices available

Filter by language:

drinkingfountain voices available --language en_US

Get detailed information in JSON format:

drinkingfountain voices available --format json

Download a voice:

drinkingfountain voices download en_US-amy-medium

Voices are stored in the Piper default directory:

Linux/macOS: ~/.local/share/piper-tts/voices/
Windows: %APPDATA%/local/share/piper-tts/voices/

Override with --voices-dir if you want a custom location.

Troubleshooting

"No voices available" or "Voice model not found"

Solution: Download at least one voice model:

drinkingfountain voices download en_US-amy-medium

MP3 export fails with "ffmpeg not found"

Solution: Install ffmpeg (see Prerequisites). Alternatively, export to WAV:

drinkingfountain render script.fountain -o output.wav

Long dialogue lines get cut off or produce errors

Explanation: Piper TTS has a maximum text length (typically ~500 characters). DrinkingFountain automatically chunks long dialogue into smaller pieces and concatenates the audio with short pauses.

No action needed—this is handled transparently. If you encounter issues, ensure you're using the latest version.
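
For readers curious what such chunking can look like, here is one possible approach: split at sentence ends, then fall back to word boundaries for a single over-long sentence. This is a sketch, not the code in utils/text_chunker.py; the function names and the sentence-splitting regex are assumptions, and the 500-character default only echoes the figure quoted above.

```python
import re


def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends.

    A single word longer than max_chars is kept whole.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A sentence that is too long on its own is split on word boundaries.
        pieces = [sentence] if len(sentence) <= max_chars else _split_words(sentence, max_chars)
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def _split_words(sentence: str, max_chars: int) -> list[str]:
    parts: list[str] = []
    current = ""
    for word in sentence.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            parts.append(current)
            current = word
    if current:
        parts.append(current)
    return parts
```

Breaking at sentence ends keeps the inserted pauses at natural points in the speech rather than in the middle of a phrase.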

Poor audio quality or robotic voice

Possible causes:

Voice model quality is too low (try high or x-high)
Voice model is corrupted or incomplete (re-download)
Sample rate mismatch (use 22050 Hz for most Piper voices)

Solutions:

Try a different voice: drinkingfountain voices download en_US-amy-high
Check your audio settings: sample_rate: 22050 is recommended for Piper
Verify the voice file exists: ls ~/.local/share/piper-tts/voices/en_US-amy-medium.onnx

"No dialogue found in script"

Cause: The Fountain file may not have properly formatted dialogue (character names not in ALL CAPS, missing blank lines).

Solution: Ensure your script follows Fountain conventions:

Character names on their own line, in ALL CAPS
Blank line before character name
Dialogue lines directly after character

Audio is too quiet or too loud

Solution: Adjust normalization settings in config:

audio:
  normalize: true
  target_level: -3.0  # Try -6.0 for quieter, -1.0 for louder

Or disable normalization (normalize: false) and adjust levels manually in post-production.

Known Limitations

Not Yet Implemented

Prosody from parentheticals: Parenthetical cues like (whispering) or (shouting) are parsed but not yet applied to TTS output. This is planned for a future release.
Dual dialogue: Simultaneous dialogue (two characters speaking at once using ^ notation) is not supported. Lines are processed sequentially.
Non-dialogue speech: Action lines, transitions, and other non-dialogue elements are not synthesized. Only scene headings (if configured) and dialogue are included in the audio output.
GUI: DrinkingFountain is CLI-only. No graphical interface is currently planned, but the CLI is designed to be scriptable.

Platform-Specific Notes

Windows: Voice download may require additional permissions or manual download from HuggingFace if subprocess calls fail.
ARM/Apple Silicon: Piper TTS runs natively on Apple Silicon. No Rosetta needed.
GPU acceleration: Not currently used—all synthesis runs on CPU.

Voice Model Availability

Piper voice models are limited to what's available on HuggingFace. Not all languages/speakers are supported.
Voice quality varies by language. English voices are most abundant and highest quality.

Development

Running Tests

Using uv:

uv run pytest

Using pytest directly:

pytest

Run with coverage:

pytest --cov=src/drinkingfountain

Pre-commit Hooks

Install pre-commit hooks to enforce code quality:

pre-commit install

This runs Ruff formatting and linting on staged files.

Project Structure

drinkingfountain/
├── src/drinkingfountain/
│   ├── __init__.py
│   ├── cli.py              # Command-line interface (Click)
│   ├── audio/
│   │   ├── mixer.py        # Audio mixing, pauses, normalization
│   │   └── __init__.py
│   ├── config/
│   │   ├── settings.py     # Configuration dataclasses
│   │   └── __init__.py
│   ├── parser/
│   │   ├── fountain.py     # Fountain format parser
│   │   ├── script.py       # Script data structures
│   │   └── __init__.py
│   ├── tts/
│   │   ├── base.py         # TTS backend interface
│   │   ├── piper.py        # Piper TTS implementation
│   │   ├── cache.py        # Caching wrapper
│   │   └── __init__.py
│   ├── utils/
│   │   ├── text_chunker.py # Long text splitting
│   │   └── __init__.py
│   └── voices/
│       ├── manager.py      # Voice assignment logic
│       └── __init__.py
├── tests/                  # Test suite
├── pyproject.toml          # Project metadata and dependencies
├── .pre-commit-config.yaml # Pre-commit configuration
└── README.md               # This file

Architecture Overview

CLI (cli.py): Entry point, parses arguments, orchestrates the pipeline
Parser (parser/fountain.py): Reads Fountain files into Script objects
Config (config/settings.py): Loads YAML configuration with validation
VoiceManager (voices/manager.py): Maps characters to voice IDs
TTS Backend (tts/piper.py): Generates audio via Piper, handles chunking
AudioMixer (audio/mixer.py): Combines segments, adds pauses, normalizes, exports

Adding New TTS Backends

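The TTSBackend protocol (in tts/base.py) defines the interface:

from __future__ import annotations  # keeps the AudioSegment annotation lazy
from pathlib import Path
from typing import Protocol

# AudioSegment is the project's audio type; its import is not shown here.
class TTSBackend(Protocol):
    def is_available(self) -> bool: ...
    def list_voices(self) -> list[str]: ...
    def download_voice(self, voice: str, target_dir: Path | None) -> None: ...
    def generate_audio(self, text: str, voice: str) -> AudioSegment: ...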

Implement this protocol to add support for Coqui TTS, Transformers, or cloud services.
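
As a starting point, a toy backend that satisfies the four methods above might look like this. The example is self-contained, so it repeats the protocol and defines its own AudioSegment stand-in; the real audio type used by DrinkingFountain is not shown in this README and will differ. SilentBackend and its voice name are made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, runtime_checkable


@dataclass
class AudioSegment:
    """Stand-in for the project's audio type (assumption: the real one differs)."""
    samples: list[float]
    sample_rate: int


@runtime_checkable
class TTSBackend(Protocol):
    def is_available(self) -> bool: ...
    def list_voices(self) -> list[str]: ...
    def download_voice(self, voice: str, target_dir: Path | None) -> None: ...
    def generate_audio(self, text: str, voice: str) -> AudioSegment: ...


class SilentBackend:
    """Toy backend: emits silence whose length grows with the text length."""

    def is_available(self) -> bool:
        return True

    def list_voices(self) -> list[str]:
        return ["silent-default"]

    def download_voice(self, voice: str, target_dir: Path | None) -> None:
        return None  # nothing to download for a built-in voice

    def generate_audio(self, text: str, voice: str) -> AudioSegment:
        if voice not in self.list_voices():
            raise ValueError(f"unknown voice: {voice}")
        sample_rate = 22050
        seconds = 0.05 * len(text)  # arbitrary pacing for the sketch
        return AudioSegment([0.0] * round(seconds * sample_rate), sample_rate)
```

Because TTSBackend is a Protocol, SilentBackend does not need to inherit from it; having methods with matching names and signatures is enough for a type checker, and runtime_checkable additionally allows an isinstance check on method presence.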

License

MIT License. See pyproject.toml for details.

Getting Help

Bug reports: Open an issue on the project repository
Questions: Check the Fountain spec for script formatting questions
Piper TTS: See Piper documentation

Happy scripting, and may your table reads be ever in tune!

Project details

This version

0.1.0

Mar 30, 2026

Download files

Source Distribution

drinkingfountain-0.1.0.tar.gz (144.9 kB)

Uploaded Mar 30, 2026 Source

Built Distribution

drinkingfountain-0.1.0-py3-none-any.whl (84.1 kB)

Uploaded Mar 30, 2026 Python 3

File details

Details for the file drinkingfountain-0.1.0.tar.gz.

File metadata

Download URL: drinkingfountain-0.1.0.tar.gz
Upload date: Mar 30, 2026
Size: 144.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for drinkingfountain-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`59deec483db7b7ca76cb3ddf953f6346d4d858a3b4bafb64af0a68a05f68ffa9`
MD5	`81f327799ee8c4867487e04c937684ee`
BLAKE2b-256	`e1ab768ef6c19e669ca4fe1151b63b2286a99968bf3b5e9e2b66a974849a3e40`

File details

Details for the file drinkingfountain-0.1.0-py3-none-any.whl.

File metadata

Download URL: drinkingfountain-0.1.0-py3-none-any.whl
Upload date: Mar 30, 2026
Size: 84.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for drinkingfountain-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dd46abef41647143e8a81e521f9b63ce1972c53eca7c730ec304b97a30a128c8`
MD5	`5a951f50bfab0c660454f0ae96570a95`
BLAKE2b-256	`f1f85c3c6af52b6ce0aff269bd2578c9a2accf137ab7cda98b9571a3c4f15a6b`
