A simple CLI tool for text-to-speech using OpenAI's API

Project description

Wiz TTS

A simple command-line tool for text-to-speech using OpenAI's API, featuring real-time FFT visualization.

Installation

uv tool install -U wiz-tts

# or if you prefer pip
pip install wiz-tts

Usage

After installation, you can run the tool with:

# Recommended: run with uv for best performance
uv run -- wiz-tts "Your text to convert to speech"

# Alternatively, run directly
wiz-tts "Your text to convert to speech"

Or pipe text from another command:

echo "Your text" | uv run -- wiz-tts
cat file.txt | uv run -- wiz-tts

Fun Zsh Function

genz() {
    # Require a URL as the first argument
    [[ -z "$1" ]] && echo "Error: URL is required" >&2 && return 1
    # Dump the page as plain text
    local content=$(w3m -dump "$1")
    local persona="gen-z podcaster, cringe but cool and on fire"
    # Ask the local model for a monologue, echo it to the terminal, and speak it
    ollama run llama3.2 <<PROMPT | tee /dev/tty | wiz-tts -i "$persona"
Analyze the contents of this webpage:
$content
---
Act as a ${persona}, and compose a succinct monologue.
[no intros or closing, just the meat!]
PROMPT
}

Example as a macOS Shortcuts.app Script

exec /opt/homebrew/bin/zsh -i -c "$(cat <<SHELL
INSTRUCTIONS="Voice: Warm, conversational, and authentic, with natural vocal variety and a relaxed pace that invites listeners to settle in for extended listening. Occasional subtle emphasis on key points without sounding rehearsed.

Punctuation: Thoughtful pauses that allow ideas to breathe, varied sentence lengths to maintain interest, and natural breaks that mimic genuine conversation rather than reading.

Delivery: Balanced and rhythmic with a dynamic range that prevents monotony, incorporating strategic shifts in pace and volume to highlight important information or create narrative tension.

Phrasing: Accessible and relatable, using conversational language with occasional colorful expressions or personal anecdotes to build connection. Ideas flow naturally from one to the next with smooth transitions.

Tone: Engaging and inclusive, striking a balance between informative and entertaining, with moments of gentle humor, curiosity, and genuine enthusiasm, like speaking with a knowledgeable friend.
"
pbpaste | wiz-tts --voice verse --data-dir "${HOME}/Downloads/wiz-tts-data" --instructions "\$INSTRUCTIONS"
SHELL
)"

Options

usage: wiz-tts [-h] [--voice {alloy,echo,fable,onyx,nova,shimmer,coral}] [--instructions INSTRUCTIONS]
               [--model {tts-1,tts-1-hd,gpt-4o-mini-tts}] [--data-dir DATA_DIR] [--split {period,paragraph}]
               [text]

Convert text to speech with visualization

positional arguments:
  text                  Text to convert to speech (default: reads from stdin or uses a sample text)

options:
  -h, --help            show this help message and exit
  --voice {alloy,echo,fable,onyx,nova,shimmer,coral}, -v {alloy,echo,fable,onyx,nova,shimmer,coral}
                        Voice to use for speech (default: coral)
  --instructions INSTRUCTIONS, -i INSTRUCTIONS
                        Instructions for the speech style
  --model {tts-1,tts-1-hd,gpt-4o-mini-tts}, -m {tts-1,tts-1-hd,gpt-4o-mini-tts}
                        TTS model to use (default: tts-1)
  --split {period,paragraph}
                        Split input text by periods or paragraphs and add natural pauses
  --data-dir DATA_DIR, -d DATA_DIR
                        Directory to save audio files and metadata (default: $WIZ_TTS_DATA_DIR if set)
  --bitrate BITRATE, -b BITRATE
                        Audio bitrate for saved files (default: 24k)

Examples

Basic usage:

uv run -- wiz-tts "Hello, world!"

Using stdin:

echo "Hello from stdin" | uv run -- wiz-tts

Using a different voice:

uv run -- wiz-tts --voice nova "Welcome to the future of text to speech!"

Adding speech instructions:

uv run -- wiz-tts --voice shimmer --instructions "Speak slowly and clearly" "This is important information."

Using a different model:

uv run -- wiz-tts --model tts-1-hd "This will be rendered in high definition."

Processing a text file:

cat story.txt | uv run -- wiz-tts --voice echo

Splitting text by sentences with natural pauses:

uv run -- wiz-tts --split period "First sentence. Second sentence. And a third one!"

Splitting text by paragraphs with longer pauses:

cat essay.txt | uv run -- wiz-tts --split paragraph --voice verse
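
Before sending a long file through --split paragraph, it can help to see roughly how many segments it contains. The sketch below is a hypothetical helper, not part of wiz-tts: it assumes paragraphs are separated by blank lines and prints each one on a single numbered line. The tool's own splitter may behave differently.

```shell
# Hypothetical helper: print each blank-line-separated paragraph on one numbered line.
# This only approximates paragraph splitting; wiz-tts may split differently.
preview_paragraphs() {
    # RS="" puts awk in paragraph mode, so each record is one paragraph.
    awk 'BEGIN { RS = "" } { gsub(/\n/, " "); print NR ": " $0 }'
}

# Usage: preview_paragraphs < essay.txt
```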

Saving audio to a directory (files are saved as they start generating):

uv run -- wiz-tts "Save this speech to a file" --data-dir ./saved_audio

Using environment variable for audio saving:

# Set the environment variable
export WIZ_TTS_DATA_DIR=./saved_speeches

# Run without --data-dir, audio will be saved to ./saved_speeches
uv run -- wiz-tts "This will be saved using the environment variable"

Saving with custom audio compression:

# Higher bitrate for better quality but larger file size
uv run -- wiz-tts "High quality audio" --data-dir ./saved_audio --bitrate 64k

# Lower bitrate for smaller file size
uv run -- wiz-tts "Compressed audio" --data-dir ./saved_audio --bitrate 16k

Features

- Converts text to speech using OpenAI's TTS API
- Real-time FFT (Fast Fourier Transform) visualization during playback
- Multiple voice options
- Custom speech style instructions
- Reads text from command line arguments or stdin
- Supports multiple TTS models
- Text splitting with natural pauses between segments
- Option to save generated audio as WebM files with metadata and configurable compression
- Saves metadata and audio files immediately as they're being generated (no need to wait for completion)

Requirements

- Python 3.12 or higher
- An OpenAI API key, set in the OPENAI_API_KEY environment variable
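
Since the API key is required, a wrapper script may want to fail early with a clear message instead of waiting for the request to be rejected. A minimal sketch, assuming a helper name (check_openai_key) that is not part of wiz-tts:

```shell
# Hypothetical helper (not part of wiz-tts): fail early if the API key is missing.
check_openai_key() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "Error: OPENAI_API_KEY is not set" >&2
        return 1
    fi
}

# Usage: check_openai_key && wiz-tts "Hello, world!"
```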

Environment Variables

- OPENAI_API_KEY: Your OpenAI API key (required)
- WIZ_TTS_DATA_DIR: Default directory for saving audio files (optional, can be overridden with --data-dir)
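
The precedence described above (an explicit --data-dir wins, otherwise WIZ_TTS_DATA_DIR applies) can be sketched in a few lines of shell. The function name resolve_data_dir is an assumption for illustration. It mirrors the documented behavior only and says nothing about how wiz-tts implements it internally.

```shell
# Hypothetical helper mirroring the documented precedence:
# an explicit directory wins, otherwise fall back to $WIZ_TTS_DATA_DIR.
resolve_data_dir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    else
        # Prints an empty line when neither source is set.
        printf '%s\n' "${WIZ_TTS_DATA_DIR:-}"
    fi
}

# Usage: resolve_data_dir ./saved_audio
```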

License

MIT

Project details

Release history

- 0.5.0 (May 2, 2025), this version
- 0.4.2 (Apr 21, 2025)
- 0.4.0 (Apr 21, 2025)
- 0.3.1 (Apr 21, 2025)
- 0.3.0 (Apr 19, 2025)
- 0.2.1 (Apr 19, 2025)
- 0.2.0 (Apr 18, 2025)
- 0.1.2 (Apr 18, 2025)
- 0.1.1 (Apr 17, 2025)
- 0.1.0 (Apr 17, 2025)

Download files

Source Distribution

wiz_tts-0.5.0.tar.gz (16.1 kB)

Uploaded May 2, 2025 Source

Built Distribution

wiz_tts-0.5.0-py3-none-any.whl (16.1 kB)

Uploaded May 2, 2025 Python 3

File details

Details for the file wiz_tts-0.5.0.tar.gz.

File metadata

Download URL: wiz_tts-0.5.0.tar.gz
Upload date: May 2, 2025
Size: 16.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for wiz_tts-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`0b2b5a61da7638d4c70927b0cc389caebaa01193ba32f32ea617838a5be119bd`
MD5	`64bb2fd6839373e1ebb93c2d0f979dad`
BLAKE2b-256	`d74432516d7326f0e4eeed5f41161821bfdb6facaacff0de1fff055366bcb79c`
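
A downloaded file can be checked against the SHA256 digest listed above. The sketch below uses a hypothetical helper, verify_sha256, which is not shipped with wiz-tts. It assumes either sha256sum or shasum is installed and that the file has already been downloaded.

```shell
# Hypothetical helper: compare a file's SHA256 digest with an expected value.
# Uses sha256sum where available, falling back to shasum (common on macOS).
verify_sha256() {
    file=$1
    expected=$2
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$file" | awk '{print $1}')
    else
        actual=$(shasum -a 256 "$file" | awk '{print $1}')
    fi
    # The function's exit status is the result of this comparison.
    [ "$actual" = "$expected" ]
}

# Usage, after downloading the sdist into the current directory:
# verify_sha256 wiz_tts-0.5.0.tar.gz 0b2b5a61da7638d4c70927b0cc389caebaa01193ba32f32ea617838a5be119bd && echo "OK"
```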

Provenance

The following attestation bundles were made for wiz_tts-0.5.0.tar.gz:

Publisher: python-publish.yml on ddrscott/wiz-tts

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wiz_tts-0.5.0.tar.gz
- Subject digest: 0b2b5a61da7638d4c70927b0cc389caebaa01193ba32f32ea617838a5be119bd
- Sigstore transparency entry: 206192424
- Sigstore integration time: May 2, 2025
Source repository:
- Permalink: ddrscott/wiz-tts@be054ac8613b97fc4b6192f9239813362cfda467
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/ddrscott
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@be054ac8613b97fc4b6192f9239813362cfda467
- Trigger Event: release

File details

Details for the file wiz_tts-0.5.0-py3-none-any.whl.

File metadata

Download URL: wiz_tts-0.5.0-py3-none-any.whl
Upload date: May 2, 2025
Size: 16.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for wiz_tts-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`518a2da903ba528ab0d71d6680973c0a20f57b68bd09ebcfe7bb5d2bd5b04b77`
MD5	`a46d895139a641cee35577f2545f86b6`
BLAKE2b-256	`e133a8cb6cf8754585a8e2aabb354c5326d14a95a509374eb1fa540896792949`

Provenance

The following attestation bundles were made for wiz_tts-0.5.0-py3-none-any.whl:

Publisher: python-publish.yml on ddrscott/wiz-tts

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wiz_tts-0.5.0-py3-none-any.whl
- Subject digest: 518a2da903ba528ab0d71d6680973c0a20f57b68bd09ebcfe7bb5d2bd5b04b77
- Sigstore transparency entry: 206192426
- Sigstore integration time: May 2, 2025
Source repository:
- Permalink: ddrscott/wiz-tts@be054ac8613b97fc4b6192f9239813362cfda467
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/ddrscott
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@be054ac8613b97fc4b6192f9239813362cfda467
- Trigger Event: release
