Transcribe audio, convert GIFs to MP4, and speak text - one cross-platform CLI

Media Tools

Cross-platform command-line helpers for everyday media jobs: transcribe audio and video, convert GIFs to MP4, and render text to speech with Piper.

Media Tools is a standard Python package with a single CLI, media-tools, plus small optional file-manager wrappers. It is intended to run on Windows, macOS, and Linux without depending on any one desktop environment.

What It Does

  • Transcribes audio and video to .txt and .vtt using faster-whisper.
  • Converts animated GIFs to browser-friendly .mp4 files using ffmpeg.
  • Renders text files to .wav audio using Piper TTS voices.
  • Installs into a local .venv with setup helpers for Linux, macOS, and Windows.
  • Keeps the core implementation in reusable Python modules under src/media_tools/.

Quick Install

Linux and macOS:

scripts/setup.sh --install-system-deps --download-piper-voice

Windows PowerShell:

.\scripts\setup.ps1 -InstallSystemDeps -DownloadPiperVoice

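These commands create .venv and install the Python package with its dependencies. The system-deps flag (--install-system-deps or -InstallSystemDeps) also installs ffmpeg, and the voice flag (--download-piper-voice or -DownloadPiperVoice) also downloads the default Piper voice files. Omit either flag to skip that step.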

For more install options, see docs/INSTALLATION.md.

Quick Start

media-tools transcribe path/to/file.mp4
media-tools gif-to-mp4 path/to/animation.gif
media-tools speak path/to/script.txt

Equivalent module form:

python -m media_tools --help

Commands

Transcribe Media

media-tools transcribe interview.mp4

Outputs:

  • interview.txt
  • interview.vtt

Useful options:

media-tools transcribe interview.mp4 --model small --language en --output-dir transcripts
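
As a rough illustration of what a transcription step like this involves, here is a minimal sketch that writes the .txt and .vtt pair with faster-whisper. It is not the package's actual implementation: the helper names (vtt_timestamp, render_vtt, output_paths, transcribe) are hypothetical, and the sketch assumes output files take the input's stem and land next to the source unless an output directory is given.

```python
from __future__ import annotations

from pathlib import Path


def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def render_vtt(segments) -> str:
    """Build a WebVTT document from (start, end, text) tuples."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # blank line terminates each cue
    return "\n".join(lines)


def output_paths(source: str, output_dir: str | None = None):
    """Return (.txt path, .vtt path) derived from the input file name."""
    src = Path(source)
    target = Path(output_dir) if output_dir else src.parent
    return target / f"{src.stem}.txt", target / f"{src.stem}.vtt"


def transcribe(source, model_name="small", language=None, output_dir=None):
    """Hypothetical wrapper: transcribe one file and write both outputs."""
    # Imported lazily so the pure helpers above work without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name)
    segments, _info = model.transcribe(source, language=language)
    rows = [(s.start, s.end, s.text) for s in segments]

    txt_path, vtt_path = output_paths(source, output_dir)
    txt_path.parent.mkdir(parents=True, exist_ok=True)
    plain = "\n".join(text.strip() for _, _, text in rows) + "\n"
    txt_path.write_text(plain, encoding="utf-8")
    vtt_path.write_text(render_vtt(rows), encoding="utf-8")
    return txt_path, vtt_path
```

Keeping the timestamp and cue formatting in pure functions means they can be unit tested without loading a Whisper model.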

Convert GIF to MP4

media-tools gif-to-mp4 clip.gif

Outputs:

  • clip.mp4

Useful options:

media-tools gif-to-mp4 clip.gif --output-dir converted
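
For readers curious what a browser-friendly conversion typically looks like, the sketch below builds an ffmpeg command using a widely used recipe: yuv420p pixel format, even frame dimensions, and the faststart flag. Whether Media Tools passes these exact arguments is an assumption, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def gif_to_mp4_command(source, output_dir=None):
    """Build an ffmpeg argument list for a GIF to MP4 conversion."""
    src = Path(source)
    target_dir = Path(output_dir) if output_dir else src.parent
    dest = target_dir / f"{src.stem}.mp4"
    return [
        "ffmpeg",
        "-y",                        # overwrite an existing output file
        "-i", str(src),
        "-movflags", "+faststart",   # move index data up front for streaming
        "-pix_fmt", "yuv420p",       # pixel format most browsers can decode
        # H.264 with yuv420p needs even width and height, so round down.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        str(dest),
    ]


def convert(source, output_dir=None):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    cmd = gif_to_mp4_command(source, output_dir)
    Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])
```

Separating command construction from execution lets the argument list be checked in tests without ffmpeg installed.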

Speak Text with Piper

media-tools speak narration.txt

Outputs:

  • narration.wav

Useful options:

media-tools speak narration.txt --voice en_US-lessac-high --output-dir audio
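
The following sketch shows one way a text file could be rendered to .wav by piping its contents to the Piper command-line program. It assumes a piper executable on PATH that accepts --model and --output_file and reads text from standard input; whether Media Tools calls Piper this way or through a Python API is not stated here, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def wav_path_for(text_file, output_dir=None):
    """Derive the .wav output path from the input text file name."""
    src = Path(text_file)
    target_dir = Path(output_dir) if output_dir else src.parent
    return target_dir / f"{src.stem}.wav"


def piper_command(model_path, wav_path):
    """Build the argument list for the Piper CLI (assumed flags)."""
    return ["piper", "--model", str(model_path), "--output_file", str(wav_path)]


def speak(text_file, model_path, output_dir=None):
    """Send the file's text to Piper on stdin and return the .wav path."""
    wav_path = wav_path_for(text_file, output_dir)
    wav_path.parent.mkdir(parents=True, exist_ok=True)
    text = Path(text_file).read_text(encoding="utf-8")
    subprocess.run(
        piper_command(model_path, wav_path),
        input=text,
        encoding="utf-8",
        check=True,
    )
    return wav_path
```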

More examples are in docs/USAGE.md.

Requirements

  • Python 3.10 or newer
  • Internet access during first install so pip can download Python dependencies
  • ffmpeg on PATH for GIF conversion
  • Piper voice model files for text-to-speech

Supported automatic ffmpeg installers:

  • Linux: apt-get, dnf, yum, pacman, zypper, or apk
  • macOS: Homebrew
  • Windows: winget, Chocolatey, or Scoop
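
A setup helper needs to work out which of these installers is present before it can install ffmpeg. The sketch below shows one plausible approach using shutil.which. The probe order follows the list above, but the real scripts may check in a different order, and the names CANDIDATES, detect_installer, and ffmpeg_available are hypothetical. The executable names brew and choco stand in for Homebrew and Chocolatey.

```python
import shutil
import sys

# Package manager executables to probe, per platform (order is an assumption).
CANDIDATES = {
    "linux": ["apt-get", "dnf", "yum", "pacman", "zypper", "apk"],
    "darwin": ["brew"],
    "win32": ["winget", "choco", "scoop"],
}


def detect_installer(platform=None, which=shutil.which):
    """Return the first available package manager name, or None."""
    platform = sys.platform if platform is None else platform
    key = "linux" if platform.startswith("linux") else platform
    for name in CANDIDATES.get(key, []):
        if which(name):
            return name
    return None


def ffmpeg_available(which=shutil.which):
    """Report whether an ffmpeg executable is already on PATH."""
    return which("ffmpeg") is not None
```

Passing the lookup function as a parameter keeps the detection logic testable on any machine, regardless of which package managers are installed.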

Piper Voices

By default, media-tools speak looks for en_US-lessac-high.onnx and en_US-lessac-high.onnx.json in the platform cache directory:

  • Windows: %LOCALAPPDATA%\media-tools\Cache\voices\piper
  • macOS: ~/Library/Caches/media-tools/voices/piper
  • Linux: ${XDG_CACHE_HOME:-~/.cache}/media-tools/voices/piper

You can override this with MEDIA_TOOLS_PIPER_VOICE_DIR, --voice-dir, or --model-path.
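
The lookup described above can be sketched as follows. The per-platform paths come from the list above, but the precedence shown (command-line value first, then MEDIA_TOOLS_PIPER_VOICE_DIR, then the platform default) is an assumption, as are the function names. The sketch does not cover --model-path, which presumably points at a model file directly.

```python
import os
import sys
from pathlib import Path, PurePosixPath, PureWindowsPath


def default_voice_dir(platform, env, home):
    """Return the platform cache directory for Piper voices."""
    if platform == "win32":
        # Raises KeyError if LOCALAPPDATA is unset; no fallback is assumed.
        base = PureWindowsPath(env["LOCALAPPDATA"]) / "media-tools" / "Cache"
    elif platform == "darwin":
        base = PurePosixPath(home) / "Library" / "Caches" / "media-tools"
    else:
        cache_root = env.get("XDG_CACHE_HOME") or f"{home}/.cache"
        base = PurePosixPath(cache_root) / "media-tools"
    return base / "voices" / "piper"


def resolve_voice_dir(cli_voice_dir=None, platform=None, env=None, home=None):
    """Pick the voice directory: CLI value, then env var, then default."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = str(Path.home()) if home is None else home
    if cli_voice_dir:
        return str(cli_voice_dir)
    if env.get("MEDIA_TOOLS_PIPER_VOICE_DIR"):
        return env["MEDIA_TOOLS_PIPER_VOICE_DIR"]
    return str(default_voice_dir(platform, env, home))


def voice_files(voice="en_US-lessac-high"):
    """Return the model and config file names expected for a voice."""
    return f"{voice}.onnx", f"{voice}.onnx.json"
```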

Documentation

  • docs/INSTALLATION.md: install options
  • docs/USAGE.md: more usage examples

Development

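The unit tests cover pure logic only, so they run without Whisper, Piper, or ffmpeg installed. The command below uses POSIX shell syntax; in PowerShell, set $env:PYTHONPATH = "src" first and then run the python command on its own: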

PYTHONPATH=src python -m unittest discover -s tests

Project layout:

  • src/media_tools/: core package and CLI
  • scripts/: optional setup and file-manager helper scripts
  • tests/: unit tests for pure logic
  • assets/: README and marketing assets
  • docs/: public documentation

License

MIT. See LICENSE.
