Whisper with speaker diarization

Whisper-Run

Whisper-Run is a pip-installable CLI tool for processing audio files using Whisper models with speaker diarization. It lets you choose a model, process an audio file, and save the results in JSON format.

It uses faster-whisper, a reimplementation of OpenAI's Whisper model built on the CTranslate2 library, together with pyannote's speaker-diarization-3.1. Check their documentation if needed.

Installation

To install Whisper-Run, run the following command:

pip install whisper-run
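
To confirm that the install succeeded, you can query the package metadata from Python. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of Whisper-Run.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("whisper-run")
    if version is None:
        print("whisper-run is not installed in this environment")
    else:
        print(f"whisper-run {version} is installed")
```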

Usage

You can call Whisper-Run from the command line using the following syntax:

whisper-run --file_path=<file_path>

Example

To process an audio file on the CPU, pass the device and the file path:

whisper-run --device=cpu --file_path=your_file_path

When you run the command, you'll be prompted to select a model for audio processing:

[?] Select a model for audio processing:
 > distil-large-v3
   distil-large-v2
   large-v3
   large-v2
   large
   medium
   small
   base
   tiny

Flags

  • --device: Specify the device to use for processing (e.g., cpu or cuda).
  • --file_path: Specify the path to the audio file you want to process.
  • --hf_auth_token: Optional. Pass the Hugging Face Auth Token or set the HF_AUTH_TOKEN environment variable.
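
If you want to launch the CLI from another Python script, the flags above can be assembled into an argument list. The sketch below assumes only the three flags documented here; the helper names build_command and run_whisper_run are hypothetical. Output is not captured, so the interactive model prompt stays attached to your terminal.

```python
import subprocess


def build_command(file_path, device="cpu", hf_auth_token=None):
    """Build the whisper-run argument list from the documented flags."""
    command = ["whisper-run", f"--device={device}", f"--file_path={file_path}"]
    # The token flag is optional; when omitted, the README says the
    # HF_AUTH_TOKEN environment variable can be set instead.
    if hf_auth_token:
        command.append(f"--hf_auth_token={hf_auth_token}")
    return command


def run_whisper_run(file_path, device="cpu", hf_auth_token=None):
    """Run the CLI and return its exit code (requires whisper-run on PATH)."""
    completed = subprocess.run(build_command(file_path, device, hf_auth_token))
    return completed.returncode


if __name__ == "__main__":
    # Only prints the command; call run_whisper_run(...) to execute it.
    print(build_command("your_file_path"))
```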

Programmatic Usage

You can also use Whisper-Run from your own Python scripts. The example below shows basic usage:

Example Script

from whisper_run import AudioProcessor

def main():
    processor = AudioProcessor(file_path="your_file_path",
                               device="cpu",
                               model_name="large-v3"
                               )
    processor.process()

if __name__ == "__main__":
    main()

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

License

This project is licensed under the Apache 2.0 License.
