
Transcribe and/or translate all sound files in a folder using Whisper


TranscribeTools

Introduction

TranscribeTools is a collection of command-line tools for transcription and translation; it currently includes only TranscribeFolder. TranscribeFolder is a Python application that transcribes all sound files in a configurable folder using a local copy of the Whisper model. Whisper is an automatic speech recognition system created by OpenAI that can transcribe audio in multiple languages and translate it into English.

The model must be run locally to comply with the General Data Protection Regulation (GDPR). This is because, when using OpenAI’s transcription service (based on the Whisper model), OpenAI could collect user data from prompts and files uploaded by the user. These audio files may contain personal data from which people can be identified. Therefore, using OpenAI’s service without a processing agreement is not allowed within European organizations.

On the other hand, using TranscribeTools to run the Whisper model on your own device means that files containing personal data will not be collected. The program essentially downloads the model — released as open-source software in 2022 — and uses the command line to select a folder, which it then transcribes, all locally.
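As a rough illustration of what running the model locally involves, the sketch below loops over audio files and writes each transcript to a text file. It is not TranscribeFolder's actual code: the function names and the choice to write a .txt file next to each recording are assumptions made for this example. The only third-party calls are `whisper.load_model` and `model.transcribe` from the openai-whisper package, which is assumed to be installed.

```python
from pathlib import Path


def transcribe_files(paths, transcribe_fn):
    """Transcribe each file and write the text next to it as a .txt file.

    `transcribe_fn` is any callable that takes a file path (str) and returns
    a dict with a "text" key, such as the `transcribe` method of a loaded
    Whisper model. Writing "<name>.txt" beside the audio file is an
    assumption of this sketch, not documented TranscribeFolder behaviour.
    """
    written = []
    for path in map(Path, paths):
        result = transcribe_fn(str(path))
        out_path = path.with_suffix(".txt")
        out_path.write_text(result["text"].strip(), encoding="utf-8")
        written.append(out_path)
    return written


def run_with_whisper(paths, model_name="turbo"):
    """Hypothetical helper: plug a locally loaded Whisper model into the loop."""
    import whisper  # assumption: the openai-whisper package is installed

    # The weights are downloaded on first use; the audio is processed locally.
    model = whisper.load_model(model_name)
    return transcribe_files(paths, model.transcribe)
```

Calling `run_with_whisper(["interview.mp3"])` would, under these assumptions, leave an `interview.txt` next to the recording without uploading the audio anywhere.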

It works with audio files under 25 MB in the following formats: mp3, wav, mp4, mpeg, mpga, m4a, and webm. It also allows the user to choose the model size. The larger models are more accurate but slower, while the smaller models are faster but less accurate. One exception is the turbo model, which is an optimized version of the large model that is relatively quick with a minimal decrease in accuracy.
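To check in advance which files in a folder fall within these limits, a small script along the following lines could be used. This is a sketch rather than the tool's own selection logic: the function name is made up for this example, and "25 MB" is interpreted here as 25 × 1024 × 1024 bytes, which is an assumption.

```python
from pathlib import Path

# Formats and size limit as listed in the paragraph above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".mpeg", ".mpga", ".m4a", ".webm"}
MAX_SIZE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB


def find_transcribable_files(folder, max_size_bytes=MAX_SIZE_BYTES):
    """Return files directly inside `folder` with a supported extension
    and a size strictly below `max_size_bytes`."""
    candidates = []
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue  # skip sub-folders
        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            continue  # skip unsupported formats
        if path.stat().st_size >= max_size_bytes:
            continue  # skip files at or over the size limit
        candidates.append(path)
    return candidates
```

For example, `find_transcribable_files(Path.home() / "recordings")` would return the recordings in that (hypothetical) folder that are small enough and in a supported format.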

Furthermore, the application uses the terminal, a text-based interface to interact with the computer, to install and use Whisper. This might sound intimidating but is hopefully manageable when following the instructions given below. The terminal is already installed in most cases.

Details

License

This project is licensed under the Apache 2.0 License – see the LICENSE file for details.

Setup

Before installing TranscribeTools, you need to download a package manager to install dependencies—pieces of code that the application relies on. On macOS, we will use Homebrew and uv; on Windows, we will only use uv. Then, we will install TranscribeTools.

To run the commands below, copy and paste each one into the terminal and press the Enter key after each line. During setup, you may need to restart the terminal after installing Homebrew, uv, or TranscribeTools before you can continue.

Package manager

On Windows

  1. Open Windows PowerShell or the Command shell

  2. Run the command to install uv:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

On macOS

  1. Open Terminal

  2. Run the command to install Homebrew:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

  3. Run the command to install uv:

brew install uv

Install tools

  1. Install the command-line tools in this project. For now, this is only transcribefolder:
uv tool install transcribetools

Command-line usage

Getting started

To get started with transcribefolder, simply follow the instructions below. Before running transcribefolder the first time, it’s a good idea to switch to your home directory (your personal folder).

cd ~

The first time you run the tool, a configuration file is created with the current folder and model; it is used from then on. If needed, you can update the configuration later by running:

transcribefolder config create

  1. Run the command:
transcribefolder transcribe
  2. Select which folder to transcribe
  3. Enter the name of the Whisper model you'd like to use
  4. Press Enter to use the default configuration file name

Command list

Show the available commands and options:

transcribefolder --help

Create a configuration file that sets the folder to transcribe and the Whisper model to use:

transcribefolder config create

Show the default configuration file (transcribefolder.toml):

transcribefolder config show

Show a specific configuration file:

transcribefolder -c [name of the config file.toml] config show

Transcribe all sound files in the selected folder using the default configuration file (transcribefolder.toml):

transcribefolder transcribe

Transcribe a single sound file:

transcribefolder transcribe -f [path to a single sound file]

Transcribe all sound files in the selected folder using a specific configuration file:

transcribefolder -c [name of the config file.toml] transcribe
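If you want to start a transcription from a Python script instead of typing the command, the commands above can be assembled as an argument list. The helper below is a hypothetical convenience written for this example, not part of TranscribeTools. It uses only the `-c` and `-f` options shown above; passing both at once is an assumption, since the list above shows them only separately.

```python
def build_transcribe_command(config_file=None, single_file=None):
    """Build the argument list for `transcribefolder transcribe`.

    `config_file` maps to the `-c` option (placed before the subcommand, as
    in the command list above); `single_file` maps to `-f`.
    """
    argv = ["transcribefolder"]
    if config_file is not None:
        argv += ["-c", config_file]
    argv.append("transcribe")
    if single_file is not None:
        argv += ["-f", single_file]
    return argv


print(build_transcribe_command(config_file="interviews.toml"))
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, provided transcribefolder is installed and on your PATH.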

Known issues

  • The deepl_translate command is not yet working.
  • FIXED: The duration and realtime factor are not available for processed files in the formats: mp4, mpeg, mpga, m4a, and webm.

Plans

  • Support bigger files, use ffmpeg to chunk them.
  • Make it a local service, running in the background
  • Investigate options to let it run on a central computer, as a service
  • Create a Docker image
  • Add speaker partitioning (see TranscribeWhisperX)
  • Adjust models using PyTorch (more control)

Documentation about Whisper on the cloud and local
