A command-line tool for transcribing audio files in a folder to a metadata.csv file, using OpenAI's Whisper.

Trainscribe

Trainscribe is a command-line tool that transcribes the audio files in a specified folder using OpenAI's Whisper and generates a metadata.csv file. The metadata file is intended for use in training or fine-tuning text-to-speech (TTS) models, and uses one of the following formats:

  • file_id|transcribed_text, or
  • file_id|transcribed_text|speaker, if a speaker label is provided.

This is similar to the LJ Speech format, but lacks the additional field containing normalized transcribed text for pronunciation. In particular, file_id|transcribed_text may be used in projects like piper-train, and file_id|transcribed_text|speaker in xtts-finetune.
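To make the two line formats concrete, here is a minimal Python sketch of a reader for them. It is not part of Trainscribe; the names MetadataRow and parse_metadata_line are hypothetical, and the sketch assumes the transcribed text itself never contains a "|" character.

```python
from typing import NamedTuple, Optional


class MetadataRow(NamedTuple):
    """One parsed line of metadata.csv (hypothetical helper type)."""
    file_id: str
    text: str
    speaker: Optional[str]  # None for the two-field format


def parse_metadata_line(line: str) -> MetadataRow:
    """Split one pipe-delimited line into its two or three fields.

    Assumption: the transcribed text contains no '|' characters.
    """
    parts = line.rstrip("\n").split("|")
    if len(parts) == 2:
        return MetadataRow(parts[0], parts[1], None)
    if len(parts) == 3:
        return MetadataRow(parts[0], parts[1], parts[2])
    raise ValueError(f"expected 2 or 3 fields, got {len(parts)}: {line!r}")
```

A line such as clip_001|Hello there. yields a row whose speaker is None, while clip_001|Hello there.|alice carries the speaker label in the third field.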

Requirements

  • Python >=3.10, <3.14
  • uv
  • ffmpeg (on Debian/Ubuntu, install with sudo apt install ffmpeg)

Usage

Run the tool with:

uvx trainscribe --folder /path/to/audio/folder [options]
Transcribe a folder of audio files to metadata.csv using Whisper.

options:
  -h, --help            show this help message and exit
  --folder, -f FOLDER   Folder with audio files
  --lang, -l LANG       Language code for transcription (e.g. 'en')
  --model, -m MODEL     Whisper model name (tiny, base, small, medium, large, turbo)
  --speaker, -s SPEAKER
                        Speaker label to add to metadata lines
  --device, -d DEVICE   Device for whisper model (cuda/cpu)
  --output, -o OUTPUT
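
When the tool is driven from a script, the options above can be assembled into an argument list. The following is a sketch only: build_command is a hypothetical helper, it uses just the long flags listed in the help text, and the --output option is left out because its meaning is not described here.

```python
import shlex
from typing import List, Optional


def build_command(
    folder: str,
    lang: Optional[str] = None,
    model: Optional[str] = None,
    speaker: Optional[str] = None,
    device: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `uvx trainscribe` from the documented flags."""
    cmd = ["uvx", "trainscribe", "--folder", folder]
    optional_flags = (
        ("--lang", lang),
        ("--model", model),
        ("--speaker", speaker),
        ("--device", device),
    )
    for flag, value in optional_flags:
        if value is not None:  # only pass flags the caller actually set
            cmd += [flag, value]
    return cmd


# "alice" is a made-up speaker label used purely for illustration.
example = build_command("dataset/wavs", lang="en", model="small",
                        speaker="alice", device="cpu")
print(shlex.join(example))
# To actually run it: subprocess.run(example, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when the folder path contains spaces.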

Example

Transcribe English audio in dataset/wavs using the medium model:

uvx trainscribe --folder dataset/wavs --lang en --model medium 

This generates dataset/wavs/metadata.csv.
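
Before passing the result to a training pipeline, it can be useful to check that every entry in metadata.csv still has a matching audio file. The sketch below is not part of Trainscribe. It assumes that metadata.csv sits in the audio folder, as in the example above, and that each file_id equals the audio file name without its extension; that second point is an assumption, so verify it against your own output. The function name missing_audio is hypothetical.

```python
from pathlib import Path
from typing import List


def missing_audio(folder: str, metadata_name: str = "metadata.csv") -> List[str]:
    """Return file_ids listed in the metadata file that have no audio file.

    Assumption: file_id is the audio file name without its extension.
    """
    root = Path(folder)
    # Collect the stems of all files in the folder except the metadata file.
    stems = {
        p.stem for p in root.iterdir()
        if p.is_file() and p.name != metadata_name
    }
    missing = []
    with open(root / metadata_name, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            file_id = line.split("|", 1)[0]  # first pipe-delimited field
            if file_id not in stems:
                missing.append(file_id)
    return missing
```

Calling missing_audio("dataset/wavs") returns an empty list when every listed file_id has a corresponding file in the folder.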

Download files

Source Distribution: trainscribe-0.1.2.tar.gz (3.3 kB)

Built Distribution: trainscribe-0.1.2-py3-none-any.whl (4.2 kB)
