A command-line tool for audio transcription with Whisper and Pyannote.

Audio Scribe

A Command-Line Tool for Audio Transcription and Speaker Diarization Using OpenAI Whisper and Pyannote

Overview

Audio Scribe is a command-line tool that transcribes audio files and labels who is speaking. It uses OpenAI Whisper for transcription and Pyannote Audio for speaker diarization, and writes the result as segmented text files with each speaker turn identified. Key features include:

- Progress Bar & Resource Monitoring: See real-time CPU, memory, and GPU usage with a live progress bar.
- Speaker Diarization: Automatically separates speaker turns using Pyannote’s state-of-the-art models.
- Tab-Completion for File Paths: Easily navigate your file system when prompted for the audio path.
- Secure Token Storage: Encrypts and stores your Hugging Face token for private model downloads.
- Customizable Whisper Models: Defaults to base.en; you can also specify tiny, small, medium, large, etc.
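
One plausible way to picture the flow described above (diarize, then transcribe each speaker turn and write a labelled line) is sketched below. This is an illustration only, not Audio Scribe's actual implementation: `SpeakerTurn`, `transcribe_turns`, and the `transcribe` callback are hypothetical names, and the callback stands in for a real Whisper call on the audio between `start` and `end`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class SpeakerTurn:
    """One diarized turn: who spoke and when (times in seconds)."""
    start: float
    end: float
    speaker: str


def transcribe_turns(
    turns: Iterable[SpeakerTurn],
    transcribe: Callable[[float, float], str],
) -> List[str]:
    """Build one transcript line per speaker turn, in chronological order.

    `transcribe(start, end)` is a placeholder for whatever speech-to-text
    call handles that slice of audio (an assumption of this sketch).
    """
    lines = []
    for turn in sorted(turns, key=lambda t: t.start):
        text = transcribe(turn.start, turn.end).strip()
        if not text:
            continue  # skip turns that produced no text
        lines.append(
            f"[{turn.start:.2f}s - {turn.end:.2f}s] {turn.speaker}: {text}"
        )
    return lines
```

Passing the transcriber in as a callback keeps the merging logic independent of any particular model, so it can be exercised with a stub.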

This repository is licensed under the Apache License 2.0.

Features

- Whisper Transcription: Uses OpenAI Whisper to convert speech to text in multiple languages.
- Pyannote Speaker Diarization: Identifies different speakers and segments the transcript accordingly.
- Progress Bar & Resource Usage: Displays a live progress bar with CPU, memory, and GPU stats through alive-progress, psutil, and GPUtil.
- Tab-Completion: Press Tab to autocomplete file paths on Unix-like systems (and on Windows with pyreadline3).
- Secure Token Storage: Saves your Hugging Face token via cryptography for model downloads (e.g., pyannote/speaker-diarization-3.1).
- Configurable Models: The default is base.en, but you can select any other Whisper model with --whisper-model.
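
As a rough illustration of the resource readout mentioned above, the following sketch builds a one-line summary from psutil and GPUtil. It is not the tool's own code: `gpu_load_percent` and `resource_summary` are hypothetical helpers, and both libraries are treated as optional here so the snippet still runs where they are not installed.

```python
from typing import Optional

try:
    import psutil  # CPU and memory statistics
except ImportError:  # treated as optional in this sketch
    psutil = None

try:
    import GPUtil  # NVIDIA GPU statistics
except ImportError:
    GPUtil = None


def gpu_load_percent() -> Optional[float]:
    """Return the load of the first GPU in percent, or None if unavailable."""
    if GPUtil is None:
        return None
    try:
        gpus = GPUtil.getGPUs()
    except Exception:
        return None  # e.g. the GPU driver tools could not be queried
    return gpus[0].load * 100 if gpus else None


def resource_summary() -> str:
    """Format current CPU, memory, and GPU usage as a short status string."""
    parts = []
    if psutil is not None:
        parts.append(f"CPU {psutil.cpu_percent(interval=None):.0f}%")
        parts.append(f"MEM {psutil.virtual_memory().percent:.0f}%")
    gpu = gpu_load_percent()
    parts.append(f"GPU {gpu:.0f}%" if gpu is not None else "GPU n/a")
    return " | ".join(parts)


print(resource_summary())
```

A string like this could be refreshed on every progress-bar update to give the live readout described in the feature list.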

Installation

Installing from PyPI

Audio Scribe is available on PyPI. You can install it with:

pip install audio-scribe

After installation, the audio-scribe command should be available in your terminal (depending on how your PATH is configured). To run it as a Python module instead, use the underscore form of the name, since Python module names cannot contain hyphens:

python -m audio_scribe --audio path/to/yourfile.wav

Installing from GitHub

To install the latest development version, clone the repository and install its dependencies:

git clone https://gitlab.genomicops.cloud/genomicops/audio-scribe.git
cd audio-scribe
pip install -r requirements.txt

This approach is particularly useful if you want the newest changes or plan to contribute.

Quick Start

1. Obtain a Hugging Face Token
   - Create a token at Hugging Face Settings.
   - Accept the model conditions for pyannote/segmentation-3.0 and pyannote/speaker-diarization-3.1.
2. Run the Command-Line Tool
   ```
   audio-scribe --audio path/to/audio.wav
   ```
   On the first run, you’ll be prompted for your Hugging Face token if you haven’t stored one yet.
3. Watch the Progress Bar
   - The tool displays a progress bar for each diarized speaker turn, along with real-time CPU, GPU, and memory usage.
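
The token behaviour described here and under the --token option (an explicit token overrides a saved one, and a prompt appears only when neither exists) amounts to a simple precedence rule. The sketch below shows that rule in isolation; `resolve_token` and its parameters are hypothetical names, not part of Audio Scribe's API.

```python
from typing import Callable, Optional


def resolve_token(
    cli_token: Optional[str],
    stored_token: Optional[str],
    prompt: Callable[[], str],
) -> str:
    """Pick the Hugging Face token to use, in order of precedence.

    1. A token given on the command line.
    2. A previously stored token.
    3. Whatever the user types when prompted.
    """
    if cli_token:
        return cli_token
    if stored_token:
        return stored_token
    token = prompt().strip()
    if not token:
        raise ValueError("A Hugging Face token is required.")
    return token
```

In interactive use, `prompt` could be something like `lambda: getpass.getpass("Hugging Face token: ")` from the standard library, which avoids echoing the token to the terminal.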

Usage

Below is a summary of the main command-line options:

usage: audio-scribe [options]

Audio Transcription (Audio Scribe) Pipeline using Whisper + Pyannote, with optional progress bar.

optional arguments:
  --audio PATH           Path to the audio file to transcribe.
  --token TOKEN          HuggingFace API token. Overrides any saved token.
  --output PATH          Path to the output directory for transcripts and temporary files.
  --delete-token         Delete any stored Hugging Face token and exit.
  --show-warnings        Enable user warnings (e.g., from pyannote.audio). Disabled by default.
  --whisper-model MODEL  Specify the Whisper model to use (default: 'base.en').
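
For readers who want to script around these options, the help text above maps naturally onto Python's argparse. The sketch below is reconstructed only from that help text; it is not taken from Audio Scribe's source, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options listed in the usage summary."""
    parser = argparse.ArgumentParser(
        prog="audio-scribe",
        description="Audio transcription pipeline using Whisper + Pyannote.",
    )
    parser.add_argument("--audio", metavar="PATH",
                        help="Path to the audio file to transcribe.")
    parser.add_argument("--token", metavar="TOKEN",
                        help="HuggingFace API token. Overrides any saved token.")
    parser.add_argument("--output", metavar="PATH",
                        help="Output directory for transcripts and temporary files.")
    parser.add_argument("--delete-token", action="store_true",
                        help="Delete any stored Hugging Face token and exit.")
    parser.add_argument("--show-warnings", action="store_true",
                        help="Enable user warnings. Disabled by default.")
    parser.add_argument("--whisper-model", metavar="MODEL", default="base.en",
                        help="Whisper model to use (default: 'base.en').")
    return parser


# Example: only --audio is given, so the model falls back to its default.
args = build_parser().parse_args(["--audio", "meeting.wav"])
print(args.audio, args.whisper_model)
```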

Examples:

Basic Transcription
```
audio-scribe --audio meeting.wav
```

Specify a Different Whisper Model

audio-scribe --audio webinar.mp3 --whisper-model small

Delete a Stored Token
```
audio-scribe --delete-token
```

Show Internal Warnings

audio-scribe --audio session.wav --show-warnings

Tab-Completion

audio-scribe
# When prompted for an audio file path, press Tab to autocomplete
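
Path completion of this kind is commonly built on the readline module. The following is a minimal sketch of that general technique, not Audio Scribe's own completer; `complete_path` and `enable_path_completion` are hypothetical names.

```python
import glob
import os

try:
    import readline  # on Windows this module is supplied by pyreadline3
except ImportError:
    readline = None


def complete_path(text: str, state: int):
    """readline-style completer: return the state-th path that starts with text."""
    expanded = os.path.expanduser(text)
    matches = sorted(glob.glob(expanded + "*"))
    # Append a separator to directories so the next Tab descends into them.
    matches = [m + os.sep if os.path.isdir(m) else m for m in matches]
    return matches[state] if state < len(matches) else None


def enable_path_completion() -> bool:
    """Register the completer; return False when readline is unavailable."""
    if readline is None:
        return False
    # Split words on whitespace only, so "/" and "." stay part of the path.
    readline.set_completer_delims(" \t\n")
    readline.set_completer(complete_path)
    # GNU readline syntax; libedit-based builds may need a different binding.
    readline.parse_and_bind("tab: complete")
    return True
```

After calling `enable_path_completion()`, a plain `input("Audio file: ")` prompt would offer Tab completion on platforms where readline is present.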

Dependencies

Core Libraries

- torch
- openai-whisper
- pyannote.audio
- pytorch-lightning
- cryptography
- keyring

Optional for Extended Functionality

- alive-progress – Real-time progress bar
- psutil – CPU/memory usage
- GPUtil – GPU usage
- pyreadline3 – Tab-completion on Windows

Sample `requirements.txt`

Below is a typical requirements.txt you can place in your repository:

torch>=1.9
openai-whisper
pyannote.audio
pytorch-lightning
cryptography
keyring
alive-progress
psutil
GPUtil
pyreadline3; sys_platform == "win32"

Note:

pyreadline3 is appended with a PEP 508 marker (; sys_platform == "win32") so it only installs on Windows.

For GPU support, ensure you install a compatible PyTorch version with CUDA.
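
A quick way to confirm that the installed PyTorch build can actually see a GPU is to ask it directly. The snippet below is a small sketch with a hypothetical helper, `pick_device`; the guarded import is only there so the example also runs where PyTorch is not installed.

```python
try:
    import torch
except ImportError:  # lets the sketch run even where PyTorch is absent
    torch = None


def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled PyTorch build sees a GPU, else "cpu"."""
    if torch is not None and torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(f"Running on: {pick_device()}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, the installed PyTorch wheel is probably a CPU-only build.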

Contributing

We welcome contributions to Audio Scribe!

1. Fork the repository and clone your fork.
2. Create a new branch for your feature or bugfix.
3. Implement your changes, ensuring code is well-documented and follows best practices.
4. Open a pull request, detailing the changes you’ve made.

Please read any available guidelines or templates in our repository (such as CONTRIBUTING.md or CODE_OF_CONDUCT.md) before submitting.

License

This project is licensed under the Apache License 2.0.

Copyright 2025 Gurasis Osahan

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Thank you for using Audio Scribe!
For questions or feedback, please open a GitHub issue or contact the maintainers.

Project details

Release history

- 0.1.6 – Jun 10, 2025
- 0.1.5 – Jan 20, 2025
- 0.1.4 – Jan 18, 2025
- 0.1.3 – Jan 16, 2025
- 0.1.2 – Jan 16, 2025 (this version)
- 0.1.1 – Jan 16, 2025
- 0.1.0 – Jan 15, 2025

Download files

Source Distribution

audio_scribe-0.1.2.tar.gz (19.9 kB view details)

Uploaded Jan 16, 2025 Source

Built Distribution

audio_scribe-0.1.2-py3-none-any.whl (12.9 kB view details)

Uploaded Jan 16, 2025 Python 3

File details

Details for the file audio_scribe-0.1.2.tar.gz.

File metadata

Download URL: audio_scribe-0.1.2.tar.gz
Upload date: Jan 16, 2025
Size: 19.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.10.15

File hashes

Hashes for audio_scribe-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`44ea203960362f2d93cfc44778662edb9bdd15c70e6cf9a243893d7744343075`
MD5	`160a2f12d69dc27e2a90bce3f54037e2`
BLAKE2b-256	`073bf2a9a040c9e5a2fe471bd5592868a3fb9e58caa78ab9759da49f701a2ad9`

File details

Details for the file audio_scribe-0.1.2-py3-none-any.whl.

File metadata

Download URL: audio_scribe-0.1.2-py3-none-any.whl
Upload date: Jan 16, 2025
Size: 12.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.10.15

File hashes

Hashes for audio_scribe-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a49e90b593252bc42b7abd593b7a1f8a1de58286685925320765d4d3076e38d7`
MD5	`36e764e4b3a5b8c59ac7f032a6a69143`
BLAKE2b-256	`b5910153f0a60216a65a57e320b847e2c1ba3f4cbc917b957a95cf0f7d5bf6b6`
