Transcribe audio, convert GIFs to MP4, and speak text - one cross-platform CLI

Media Tools

Cross-platform command-line helpers for everyday media jobs: transcribe audio and video, convert GIFs to MP4, and render text to speech with Piper.

Media Tools is built as a normal Python package with one CLI, media-tools, and small optional file-manager wrappers. It is intended to work on Windows, macOS, and Linux without tying the project to one desktop environment.

What It Does

  • Transcribes audio and video to .txt and .vtt using faster-whisper.
  • Converts animated GIFs to browser-friendly .mp4 files using ffmpeg.
  • Renders text files to .wav audio using Piper TTS voices.
  • Installs into a local .venv with setup helpers for Linux, macOS, and Windows.
  • Keeps the core implementation in reusable Python modules under src/media_tools/.

Quick Install

Linux and macOS:

scripts/setup.sh --install-system-deps --download-piper-voice

Windows PowerShell:

.\scripts\setup.ps1 -InstallSystemDeps -DownloadPiperVoice

These commands create .venv, install the Python package and dependencies, optionally install ffmpeg, and optionally download the default Piper voice files.

For more install options, see docs/INSTALLATION.md.

Quick Start

media-tools transcribe path/to/file.mp4
media-tools gif-to-mp4 path/to/animation.gif
media-tools speak path/to/script.txt

The CLI can also be invoked as a Python module:

python -m media_tools --help

Commands

Transcribe Media

media-tools transcribe interview.mp4

Outputs:

  • interview.txt
  • interview.vtt

Useful options:

media-tools transcribe interview.mp4 --model small --language en --output-dir transcripts
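To illustrate what the .vtt output involves, here is a minimal sketch of rendering timed segments as WebVTT. It is not the package's own code: the function names vtt_timestamp and segments_to_vtt are hypothetical, and the input is assumed to be (start, end, text) tuples such as those you could build from the segments that faster-whisper yields.

```python
from typing import Iterable, Tuple


def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as a WebVTT document (hypothetical helper)."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in segments:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # blank line terminates each cue
    return "\n".join(lines)
```

The plain .txt output would then be the same segment texts joined without the timing lines.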

Convert GIF to MP4

media-tools gif-to-mp4 clip.gif

Outputs:

  • clip.mp4

Useful options:

media-tools gif-to-mp4 clip.gif --output-dir converted
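As a rough illustration of the conversion step, the sketch below builds an ffmpeg command line for a GIF and derives the output path the same way the examples above suggest (clip.gif becomes clip.mp4, optionally under an output directory). The function name gif_to_mp4_command and the exact ffmpeg options are assumptions chosen for browser compatibility, not necessarily what media-tools runs internally.

```python
from pathlib import Path
from typing import List, Optional


def gif_to_mp4_command(src: str, output_dir: Optional[str] = None) -> List[str]:
    """Build an ffmpeg argument list for GIF to MP4 (hypothetical helper)."""
    source = Path(src)
    # Default to writing next to the source file, as in the basic example.
    target_dir = Path(output_dir) if output_dir else source.parent
    target = target_dir / (source.stem + ".mp4")
    return [
        "ffmpeg",
        "-y",                       # overwrite an existing output file
        "-i", str(source),
        "-movflags", "+faststart",  # move index to the front for web playback
        "-pix_fmt", "yuv420p",      # pixel format most browsers can decode
        # H.264 with yuv420p needs even dimensions; round each down to even.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        str(target),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) once ffmpeg is on PATH.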

Speak Text with Piper

media-tools speak narration.txt

Outputs:

  • narration.wav

Useful options:

media-tools speak narration.txt --voice en_US-lessac-high --output-dir audio

More examples are in docs/USAGE.md.

Requirements

  • Python 3.10 or newer
  • Internet access during first install so pip can download Python dependencies
  • ffmpeg on PATH for GIF conversion
  • Piper voice model files for text-to-speech

Supported automatic ffmpeg installers:

  • Linux: apt-get, dnf, yum, pacman, zypper, or apk
  • macOS: Homebrew
  • Windows: winget, Chocolatey, or Scoop
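One way a setup helper could choose among the package managers listed above is to probe PATH for each in turn. This is a sketch under assumptions: the function name pick_installer, the probing order, and the executable names (brew, choco, scoop) are illustrative and are not taken from the project's setup scripts.

```python
import shutil
from typing import Callable, Optional

# Keys match the values returned by platform.system().
CANDIDATES = {
    "Linux": ["apt-get", "dnf", "yum", "pacman", "zypper", "apk"],
    "Darwin": ["brew"],
    "Windows": ["winget", "choco", "scoop"],
}


def pick_installer(
    system: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first package manager found on PATH, or None (hypothetical helper)."""
    for name in CANDIDATES.get(system, []):
        if which(name):
            return name
    return None
```

Passing which as a parameter keeps the lookup testable without any of these tools installed.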

Piper Voices

By default, media-tools speak looks for en_US-lessac-high.onnx and en_US-lessac-high.onnx.json in the platform cache directory:

  • Windows: %LOCALAPPDATA%\media-tools\Cache\voices\piper
  • macOS: ~/Library/Caches/media-tools/voices/piper
  • Linux: ${XDG_CACHE_HOME:-~/.cache}/media-tools/voices/piper

You can override this with MEDIA_TOOLS_PIPER_VOICE_DIR, --voice-dir, or --model-path.
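The lookup described above can be sketched roughly as follows. This is an illustration, not the package's implementation: the function names default_voice_dir and voice_files are hypothetical, the fallback used when LOCALAPPDATA is unset is an assumption, and the --voice-dir and --model-path options are not modeled because the source does not state how they rank against the environment variable.

```python
import os
import platform
from pathlib import Path
from typing import Mapping, Optional, Tuple


def default_voice_dir(
    system: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Resolve the Piper voice directory (hypothetical helper)."""
    system = system or platform.system()
    env = os.environ if env is None else env
    home = home or Path.home()

    # The environment variable replaces the platform default entirely.
    override = env.get("MEDIA_TOOLS_PIPER_VOICE_DIR")
    if override:
        return Path(override)

    if system == "Windows":
        # Assumed fallback if LOCALAPPDATA is missing.
        local = env.get("LOCALAPPDATA") or home / "AppData" / "Local"
        base = Path(local) / "media-tools" / "Cache"
    elif system == "Darwin":
        base = home / "Library" / "Caches" / "media-tools"
    else:
        base = Path(env.get("XDG_CACHE_HOME") or home / ".cache") / "media-tools"
    return base / "voices" / "piper"


def voice_files(voice: str, voice_dir: Path) -> Tuple[Path, Path]:
    """Return the model and config paths expected for a voice name."""
    return voice_dir / f"{voice}.onnx", voice_dir / f"{voice}.onnx.json"
```

For the default voice, voice_files("en_US-lessac-high", default_voice_dir()) yields the two file paths that must exist before speech can be rendered.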

Development

Run the pure unit tests without requiring Whisper, Piper, or ffmpeg at test time:

PYTHONPATH=src python -m unittest discover -s tests

Project layout:

  • src/media_tools/: core package and CLI
  • scripts/: optional setup and file-manager helper scripts
  • tests/: unit tests for pure logic
  • assets/: README and marketing assets
  • docs/: public documentation

License

MIT. See LICENSE.

Download files

Source Distribution

voxreel-0.1.2.tar.gz (10.0 kB)

Uploaded Source

Built Distribution

voxreel-0.1.2-py3-none-any.whl (9.4 kB)

Uploaded Python 3

File details

Details for the file voxreel-0.1.2.tar.gz.

File metadata

  • Download URL: voxreel-0.1.2.tar.gz
  • Upload date:
  • Size: 10.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for voxreel-0.1.2.tar.gz
Algorithm Hash digest
SHA256 ff2dff13333f8a4437a782d594d702a854a0b695e00460b9391bb12551cbf07b
MD5 e71d5abf8250308fd69616283e556bb7
BLAKE2b-256 9071514f67f33265e15d481a122efed8ee01bcb4955eee3fb60034cfd95b6ee0

File details

Details for the file voxreel-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: voxreel-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 9.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for voxreel-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e96920325cfcc7d3fddac6b921d102456b036974bcd187244f0587bb973c9f39
MD5 ceda3e8da4c4ef0d882fd5df857429a6
BLAKE2b-256 579bf5761a54198028c8048bc673138454c6c8195ae8dc9b6cbd663c97d41630
