Platform-neutral media tooling for transcription, GIF conversion, and Piper text-to-speech

Media Tools

Cross-platform command-line helpers for everyday media jobs: transcribe audio and video, convert GIFs to MP4, and render text to speech with Piper.

Media Tools is a standard Python package with a single CLI, media-tools, plus small optional file-manager wrappers. It is intended to run on Windows, macOS, and Linux without depending on any one desktop environment.

What It Does

  • Transcribes audio and video to .txt and .vtt using faster-whisper.
  • Converts animated GIFs to browser-friendly .mp4 files using ffmpeg.
  • Renders text files to .wav audio using Piper TTS voices.
  • Installs into a local .venv with setup helpers for Linux, macOS, and Windows.
  • Keeps the core implementation in reusable Python modules under src/media_tools/.

Quick Install

Linux and macOS:

scripts/setup.sh --install-system-deps --download-piper-voice

Windows PowerShell:

.\scripts\setup.ps1 -InstallSystemDeps -DownloadPiperVoice

These commands create .venv and install the Python package with its dependencies. The two flags are optional: the first installs ffmpeg through a system package manager, and the second downloads the default Piper voice files.

For more install options, see docs/INSTALLATION.md.

Quick Start

media-tools transcribe path/to/file.mp4
media-tools gif-to-mp4 path/to/animation.gif
media-tools speak path/to/script.txt

Equivalent module form:

python -m media_tools --help

Commands

Transcribe Media

media-tools transcribe interview.mp4

Outputs:

  • interview.txt
  • interview.vtt

Useful options:

media-tools transcribe interview.mp4 --model small --language en --output-dir transcripts
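
To illustrate how the .txt and .vtt outputs relate, here is a minimal sketch that renders timed segments into both formats. It is not the project's actual code. The function names are hypothetical, and segments are assumed to arrive as (start, end, text) tuples; faster-whisper yields segment objects with start, end, and text attributes that could be mapped into this shape.

```python
def format_vtt_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def render_vtt(segments):
    """Render (start, end, text) tuples as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_vtt_timestamp(start)} --> {format_vtt_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # blank line terminates each cue
    return "\n".join(lines)


def render_txt(segments):
    """Render only the text of each segment, one segment per line."""
    return "\n".join(text.strip() for _, _, text in segments) + "\n"
```

The same segment list feeds both renderers, so the transcript text in the two files stays identical and only the timing information differs.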

Convert GIF to MP4

media-tools gif-to-mp4 clip.gif

Outputs:

  • clip.mp4

Useful options:

media-tools gif-to-mp4 clip.gif --output-dir converted
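
As a rough idea of what a browser-friendly conversion involves, the sketch below builds an ffmpeg argument list. The function name is hypothetical and the flag choice is an assumption about what "browser-friendly" means, not the invocation Media Tools actually uses: yuv420p pixel format, even frame dimensions, and the faststart flag are common choices for broad H.264 playback support.

```python
def build_gif_to_mp4_command(src, dest):
    """Build an ffmpeg argument list for a GIF to MP4 conversion.

    The flags are an assumed, commonly used set, not the project's own.
    """
    return [
        "ffmpeg",
        "-i", str(src),
        # Move the index to the front of the file so playback can start early.
        "-movflags", "+faststart",
        # Widely supported pixel format for H.264 in browsers.
        "-pix_fmt", "yuv420p",
        # yuv420p needs even width and height, so round both down to even.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        str(dest),
    ]
```

Passing the list to subprocess.run(cmd, check=True) would run the conversion without shell quoting issues, provided ffmpeg is on PATH.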

Speak Text with Piper

media-tools speak narration.txt

Outputs:

  • narration.wav

Useful options:

media-tools speak narration.txt --voice en_US-lessac-high --output-dir audio

More examples are in docs/USAGE.md.
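
The output names above follow one pattern: the input file's stem plus a new suffix, written next to the input unless --output-dir is given. A minimal sketch of that rule, with a hypothetical helper name and the fallback to the input's directory taken as an assumption:

```python
from pathlib import Path


def derive_output_path(input_path, suffix, output_dir=None):
    """Return <output_dir>/<input stem><suffix>.

    Falls back to the input file's own directory when output_dir is None
    (assumed default behavior).
    """
    src = Path(input_path)
    target_dir = Path(output_dir) if output_dir is not None else src.parent
    return target_dir / f"{src.stem}{suffix}"
```

For example, derive_output_path("narration.txt", ".wav", "audio") gives audio/narration.wav, matching the speak example with --output-dir audio.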

Requirements

  • Python 3.10 or newer
  • Internet access during first install so pip can download Python dependencies
  • ffmpeg on PATH for GIF conversion
  • Piper voice model files for text-to-speech

Supported automatic ffmpeg installers:

  • Linux: apt-get, dnf, yum, pacman, zypper, or apk
  • macOS: Homebrew
  • Windows: winget, Chocolatey, or Scoop
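
Picking an installer from the list above could look like the sketch below, which probes PATH for the first available package manager. The function name, the table, and the probing order are assumptions; the setup scripts may choose differently. Homebrew, Chocolatey, and Scoop are looked up by their executable names brew, choco, and scoop.

```python
import shutil

# Candidate package managers per sys.platform value, in assumed preference order.
INSTALLERS = {
    "linux": ["apt-get", "dnf", "yum", "pacman", "zypper", "apk"],
    "darwin": ["brew"],
    "win32": ["winget", "choco", "scoop"],
}


def find_installer(platform, which=shutil.which):
    """Return the first package manager found on PATH, or None.

    `which` is injectable so the lookup can be tested without touching PATH.
    """
    for name in INSTALLERS.get(platform, []):
        if which(name):
            return name
    return None
```

Calling find_installer(sys.platform) after importing sys would report which supported installer, if any, is present on the current machine.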

Piper Voices

By default, media-tools speak looks for en_US-lessac-high.onnx and en_US-lessac-high.onnx.json in the platform cache directory:

  • Windows: %LOCALAPPDATA%\media-tools\Cache\voices\piper
  • macOS: ~/Library/Caches/media-tools/voices/piper
  • Linux: ${XDG_CACHE_HOME:-~/.cache}/media-tools/voices/piper

You can override this with MEDIA_TOOLS_PIPER_VOICE_DIR, --voice-dir, or --model-path.
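
The lookup can be sketched as follows. The function names are hypothetical, and the precedence shown (command-line --voice-dir first, then MEDIA_TOOLS_PIPER_VOICE_DIR, then the platform cache directory) is an assumption; the README does not state the order. The --model-path override, which presumably points at a model file directly, is left out.

```python
from pathlib import PurePosixPath, PureWindowsPath

DEFAULT_VOICE = "en_US-lessac-high"


def default_voice_dir(platform, env, home):
    """Platform cache directory for Piper voices, per the list above."""
    if platform == "win32":
        base = PureWindowsPath(env["LOCALAPPDATA"])
        return base / "media-tools" / "Cache" / "voices" / "piper"
    if platform == "darwin":
        base = PurePosixPath(home) / "Library" / "Caches"
        return base / "media-tools" / "voices" / "piper"
    # Linux and others: an unset or empty XDG_CACHE_HOME falls back to ~/.cache.
    cache_root = env.get("XDG_CACHE_HOME") or str(PurePosixPath(home) / ".cache")
    return PurePosixPath(cache_root) / "media-tools" / "voices" / "piper"


def resolve_voice_dir(platform, env, home, cli_voice_dir=None):
    """Apply the overrides in an assumed order: CLI flag, env var, default."""
    path_cls = PureWindowsPath if platform == "win32" else PurePosixPath
    if cli_voice_dir:
        return path_cls(cli_voice_dir)
    if env.get("MEDIA_TOOLS_PIPER_VOICE_DIR"):
        return path_cls(env["MEDIA_TOOLS_PIPER_VOICE_DIR"])
    return default_voice_dir(platform, env, home)


def voice_files(voice_dir, voice=DEFAULT_VOICE):
    """Return the model and config paths expected for a voice."""
    return voice_dir / f"{voice}.onnx", voice_dir / f"{voice}.onnx.json"
```

In real use the arguments would come from sys.platform, os.environ, and the user's home directory; passing them in explicitly keeps the logic testable on any operating system.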

Development

The unit tests cover pure logic only, so they run without Whisper, Piper, or ffmpeg installed:

PYTHONPATH=src python -m unittest discover -s tests

Project layout:

  • src/media_tools/ core package and CLI
  • scripts/ optional setup and file-manager helper scripts
  • tests/ unit tests for pure logic
  • assets/ README and marketing assets
  • docs/ public documentation

License

MIT. See LICENSE.
