A command-line tool for transcribing audio files in a folder to a metadata.csv file, using OpenAI's Whisper.

Project description

Trainscribe

Trainscribe is a command-line tool that transcribes audio files in a specified folder using OpenAI's Whisper and generates a metadata.csv file. The resulting metadata file is intended for training or fine-tuning text-to-speech (TTS) models and uses one of the following formats:

  • file_id|transcribed_text, or
  • file_id|transcribed_text|speaker, if a speaker label is provided.

This is similar to the LJ Speech format, but it lacks the additional field containing normalized transcribed text for pronunciation. In particular, file_id|transcribed_text can be used in projects such as piper-train, and file_id|transcribed_text|speaker in xtts-finetune.
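To make the two line formats concrete, here is a minimal sketch of reading and writing them in Python. The names MetadataRow, parse_metadata_line and format_metadata_line are hypothetical helpers for illustration, not part of Trainscribe, and the sketch assumes the transcribed text never contains a literal | character.

```python
from typing import NamedTuple, Optional


class MetadataRow(NamedTuple):
    """One metadata line; speaker is None for the two-field format."""
    file_id: str
    text: str
    speaker: Optional[str] = None


def parse_metadata_line(line: str) -> MetadataRow:
    """Parse a pipe-delimited line in either of the two formats.

    Assumption: the transcribed text itself contains no '|' character.
    """
    parts = line.rstrip("\n").split("|")
    if len(parts) == 2:
        return MetadataRow(parts[0], parts[1])
    if len(parts) == 3:
        return MetadataRow(parts[0], parts[1], parts[2])
    raise ValueError(f"expected 2 or 3 fields, got {len(parts)}: {line!r}")


def format_metadata_line(row: MetadataRow) -> str:
    """Serialize a row back to file_id|text or file_id|text|speaker."""
    fields = [row.file_id, row.text]
    if row.speaker is not None:
        fields.append(row.speaker)
    return "|".join(fields)
```

Round-tripping a line through both helpers should return it unchanged, which is a quick way to sanity-check a generated metadata.csv before training.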

Requirements

  • Python 3.10+
  • uv
  • ffmpeg (on Debian/Ubuntu, install with sudo apt install ffmpeg)
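The requirements above can be checked from Python before running the tool. This is a small sketch using only the standard library; missing_requirements is a hypothetical helper name, and it only checks that the executables are on PATH, not their versions.

```python
import shutil
import sys


def missing_requirements(min_python=(3, 10), tools=("uv", "ffmpeg")):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    # Compare only the (major, minor) part of the running interpreter version.
    if sys.version_info[:2] < tuple(min_python):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    # shutil.which returns None when an executable is not found on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling missing_requirements() with no arguments and printing each returned string gives a quick report of what still needs to be installed.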

Usage

Run the tool with:

uvx trainscribe --folder /path/to/audio/folder [options]
Transcribe a folder of audio files to metadata.csv using Whisper.

options:
  -h, --help            show this help message and exit
  --folder, -f FOLDER   Folder with audio files
  --lang, -l LANG       Language code for transcription (e.g. 'en')
  --model, -m MODEL     Whisper model name (tiny, base, small, medium, large, turbo)
  --speaker, -s SPEAKER
                        Speaker label to add to metadata lines
  --device, -d DEVICE   Device for whisper model (cuda/cpu)
  --output, -o OUTPUT
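For readers who want to script around the tool, the option list above maps naturally onto Python's argparse. The following is only a sketch that mirrors the help text, not Trainscribe's actual source: build_parser is a hypothetical name, marking --folder as required is an assumption based on the usage line, and no defaults are set because the help output does not state any.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options listed in the help output."""
    parser = argparse.ArgumentParser(
        prog="trainscribe",
        description="Transcribe a folder of audio files to metadata.csv using Whisper.",
    )
    # Assumption: --folder is mandatory, as the usage line always passes it.
    parser.add_argument("--folder", "-f", required=True, help="Folder with audio files")
    parser.add_argument("--lang", "-l", help="Language code for transcription (e.g. 'en')")
    parser.add_argument(
        "--model", "-m",
        help="Whisper model name (tiny, base, small, medium, large, turbo)",
    )
    parser.add_argument("--speaker", "-s", help="Speaker label to add to metadata lines")
    parser.add_argument("--device", "-d", help="Device for whisper model (cuda/cpu)")
    # The help output gives no description for --output, so none is added here.
    parser.add_argument("--output", "-o")
    return parser
```

Options that are not passed come back as None in this sketch; the real tool may apply its own defaults.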

Example

Transcribe English audio in dataset/wavs using the medium model:

uvx trainscribe --folder dataset/wavs --lang en --model medium 

This generates dataset/wavs/metadata.csv.
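The folder-to-metadata step in the example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's implementation: build_metadata, write_metadata and AUDIO_SUFFIXES are hypothetical names, the file_id is assumed to be the file name without its extension, the set of audio extensions is a guess, and the transcription itself is passed in as a function so the sketch runs without Whisper installed.

```python
from pathlib import Path
from typing import Callable, Optional

# Assumption: which extensions count as audio; the real tool may differ.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac"}


def build_metadata(
    folder: str,
    transcribe: Callable[[Path], str],
    speaker: Optional[str] = None,
) -> list[str]:
    """Return one file_id|text[|speaker] line per audio file, sorted by name."""
    lines = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # Collapse runs of whitespace so each transcript stays on one line.
        text = " ".join(transcribe(path).split())
        fields = [path.stem, text]  # assumption: file_id is the file stem
        if speaker is not None:
            fields.append(speaker)
        lines.append("|".join(fields))
    return lines


def write_metadata(
    folder: str,
    transcribe: Callable[[Path], str],
    speaker: Optional[str] = None,
    output: Optional[str] = None,
) -> Path:
    """Write the lines to `output`, or to metadata.csv inside `folder`."""
    target = Path(output) if output else Path(folder) / "metadata.csv"
    lines = build_metadata(folder, transcribe, speaker)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# With the openai-whisper package installed, a transcriber could look like:
#   import whisper
#   model = whisper.load_model("medium")
#   transcribe = lambda p: model.transcribe(str(p), language="en")["text"]
#   write_metadata("dataset/wavs", transcribe)
```

Passing the transcriber as a parameter keeps the file handling testable on its own; a stub function that returns fixed text is enough to check the output format.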

Download files

Source Distribution

trainscribe-0.1.1.tar.gz (3.3 kB view details)

Uploaded Source

Built Distribution

trainscribe-0.1.1-py3-none-any.whl (4.2 kB view details)

Uploaded Python 3

File details

Details for the file trainscribe-0.1.1.tar.gz.

File metadata

  • Download URL: trainscribe-0.1.1.tar.gz
  • Size: 3.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.8

File hashes

Hashes for trainscribe-0.1.1.tar.gz

Algorithm    Hash digest
SHA256       1faa24be9f21396b99a7483b6d8ac3acd5a5eec25c57ac107721b560424822bf
MD5          2e4f75dcaaa92c28c7c6a31bf5bd42fd
BLAKE2b-256  83bcd0d1679cbebf231682460c81587f2bb718f54648e84dbd09f7690c5932cd

File details

Details for the file trainscribe-0.1.1-py3-none-any.whl.

File hashes

Hashes for trainscribe-0.1.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       fcec3281e0a50f422ab5445a66d62f2cd6e66df1530c77e1c47f8979705f790e
MD5          77195f5fefbd3248382f1d012df038ee
BLAKE2b-256  004a2e968a9b87e2ce45a03a9855ad4498123faa939119a4d39f5a6a9ff1abc6
