A command-line tool for transcribing audio files in a folder to a metadata.csv file, using OpenAI's Whisper.

Trainscribe

Trainscribe is a command-line tool that transcribes the audio files in a specified folder using OpenAI's Whisper and generates a metadata.csv file. The resulting metadata file is intended for training or fine-tuning text-to-speech (TTS) models, and uses one of the following formats:

  • file_id|transcribed_text, or
  • file_id|transcribed_text|speaker, if a speaker label is provided.

This is similar to the LJ Speech format, but lacks the additional field that holds normalized transcription text for pronunciation. In particular, file_id|transcribed_text can be used in projects such as piper-train, and file_id|transcribed_text|speaker in xtts-finetune.
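
As a rough illustration of the two line formats, the following Python sketch builds and parses metadata lines. The names `MetadataEntry`, `format_line`, and `parse_line` are hypothetical and are not part of Trainscribe. How Trainscribe itself treats a `|` character inside transcribed text is not stated here, so this sketch simply rejects it.

```python
from typing import NamedTuple, Optional


class MetadataEntry(NamedTuple):
    """One metadata line: file_id|transcribed_text[|speaker]."""
    file_id: str
    text: str
    speaker: Optional[str] = None


def format_line(entry: MetadataEntry) -> str:
    """Join the fields with '|', omitting the speaker when it is not set."""
    fields = [entry.file_id, entry.text]
    if entry.speaker is not None:
        fields.append(entry.speaker)
    for field in fields:
        # Assumption: the separator and newlines are not allowed inside a field.
        if "|" in field or "\n" in field:
            raise ValueError(f"field contains a separator or newline: {field!r}")
    return "|".join(fields)


def parse_line(line: str) -> MetadataEntry:
    """Split a metadata line into two or three fields."""
    parts = line.rstrip("\n").split("|")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 2 or 3 fields, got {len(parts)}")
    return MetadataEntry(*parts)
```

For example, `format_line(MetadataEntry("clip_0001", "Hello there.", "alice"))` yields a three-field line, and `parse_line` reverses it.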

Requirements

  • Python 3.10+
  • uv
  • ffmpeg (on Debian/Ubuntu, install with sudo apt install ffmpeg)

Usage

Run the tool with:

uvx trainscribe --folder /path/to/audio/folder [options]
Transcribe a folder of audio files to metadata.csv using Whisper.

options:
  -h, --help            show this help message and exit
  --folder, -f FOLDER   Folder with audio files
  --lang, -l LANG       Language code for transcription (e.g. 'en')
  --model, -m MODEL     Whisper model name (tiny, base, small, medium, large, turbo)
  --speaker, -s SPEAKER
                        Speaker label to add to metadata lines
  --device, -d DEVICE   Device for whisper model (cuda/cpu)
  --output, -o OUTPUT
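
To make the options above more concrete, here is a minimal Python sketch of how a folder-to-metadata pipeline of this kind could be put together. It is not Trainscribe's actual implementation. The function names, the set of audio extensions, the use of the file stem as file_id, and the whitespace normalization are all assumptions. The only Whisper calls used are `whisper.load_model` and `model.transcribe` from the openai-whisper package.

```python
from pathlib import Path
from typing import Callable, Optional

# Assumption: which extensions count as audio; the real tool may differ.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def build_metadata(
    folder: str,
    transcribe: Callable[[Path], str],
    speaker: Optional[str] = None,
) -> list[str]:
    """Return one 'file_id|text[|speaker]' line per audio file in folder."""
    lines = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        # Collapse runs of whitespace so each entry stays on a single line.
        text = " ".join(transcribe(path).split())
        fields = [path.stem, text]  # assumption: file_id is the file stem
        if speaker:
            fields.append(speaker)
        lines.append("|".join(fields))
    return lines


def whisper_transcriber(
    model_name: str = "medium",
    lang: Optional[str] = None,
    device: Optional[str] = None,
) -> Callable[[Path], str]:
    """Build a transcribe function backed by openai-whisper."""
    import whisper  # imported lazily so build_metadata works without it

    model = whisper.load_model(model_name, device=device)

    def transcribe(path: Path) -> str:
        # language=None lets Whisper detect the language itself.
        return model.transcribe(str(path), language=lang)["text"]

    return transcribe
```

Passing the transcription step in as a function keeps the file handling separate from the model, so the line-building logic can be exercised with a stand-in transcriber before loading a large Whisper model.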

Example

Transcribe English audio in dataset/wavs using the medium model:

uvx trainscribe --folder dataset/wavs --lang en --model medium 

This generates dataset/wavs/metadata.csv.
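
Before feeding the generated file to a training project, it can be worth checking it for obviously bad rows. The sketch below reads a pipe-delimited metadata file with Python's standard `csv` module and reports rows with an unexpected number of fields or an empty transcription. The function name `find_problem_rows` is hypothetical, and UTF-8 encoding is an assumption about the output file.

```python
import csv


def find_problem_rows(metadata_path):
    """Return (line_number, reason) pairs for rows that look unusable."""
    problems = []
    # Assumption: the metadata file is UTF-8 encoded.
    with open(metadata_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: treat quote characters in transcriptions as plain text.
        reader = csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)
        for line_number, row in enumerate(reader, start=1):
            if len(row) not in (2, 3):
                problems.append((line_number, "unexpected field count"))
            elif not row[1].strip():
                problems.append((line_number, "empty transcription"))
    return problems
```

Calling `find_problem_rows("dataset/wavs/metadata.csv")` returns an empty list when every row has two or three fields and a non-empty transcription.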
