A framework for fast fine-tuning and API endpoint deployment of the Whisper model, developed to accelerate Automatic Speech Recognition (ASR) for African languages.

Project description

African Whisper: ASR for African Languages

A framework for fine-tuning and deploying the Whisper model, developed to advance Automatic Speech Recognition (ASR), covering both transcription and translation, for African languages.

Features

  • 🔧 Fine-Tuning: Fine-tune the Whisper model on any audio dataset from Hugging Face, e.g., Mozilla's Common Voice datasets.

  • 📊 Metrics Monitoring: View training run metrics on Weights & Biases (Wandb).

  • 🐳 Production Deployment: Seamlessly containerize and deploy the model inference endpoint for real-world applications.

  • 🚀 Model Optimization: Utilize CTranslate2 for efficient model optimization, ensuring faster inference times.

  • 📝 Word-Level Transcriptions: Produce detailed word-level transcriptions and translations, complete with timestamps.

  • 🎙️ Multi-Speaker Diarization: Perform speaker identification and separation in multi-speaker audio using diarization techniques.

  • 🔍 Alignment Precision: Improve transcription and translation accuracy by aligning outputs with Wav2vec models.

  • 🛡️ Reduced Hallucination: Leverage Voice Activity Detection (VAD) to minimize hallucination and improve transcription clarity.
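
To illustrate the idea behind the VAD feature above, here is a minimal sketch of an energy-threshold voice activity detector in plain Python. It is not the detector this framework uses (this description does not say which one it ships). The function name `speech_segments`, the frame size, and the threshold are hypothetical choices made for illustration. The point is only that dropping silent regions before transcription leaves the model less non-speech audio on which to hallucinate text.

```python
import math


def speech_segments(samples, sample_rate, frame_ms=30, threshold=0.02):
    """Return (start_sec, end_sec) spans whose frame RMS energy exceeds threshold.

    samples: sequence of floats in [-1.0, 1.0] (mono audio).
    All parameter defaults are illustrative assumptions, not framework settings.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    segments = []
    start = None  # start time of the speech span currently being built
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        t = i / sample_rate
        if rms >= threshold and start is None:
            start = t                      # speech begins in this frame
        elif rms < threshold and start is not None:
            segments.append((start, t))    # speech ended before this frame
            start = None
    if start is not None:                  # audio ended while still in speech
        segments.append((start, len(samples) / sample_rate))
    return segments


# Synthetic example: 1 s of silence, 1 s of a 440 Hz tone, 1 s of silence.
sr = 8000
silence = [0.0] * sr
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
segments = speech_segments(silence + tone + silence, sr)
print(segments)
```

Only the audio inside the returned spans would then be passed on for transcription. Segment boundaries are accurate to one frame, so a production detector would typically add padding around each span.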


The framework implements the following papers:
  1. Robust Speech Recognition via Large-Scale Weak Supervision: speech processing systems trained to predict transcripts of audio from the internet, scaled to 680,000 hours of multilingual and multitask supervision.

  2. WhisperX: Time-Accurate Speech Transcription of Long-Form Audio, for time-accurate speech recognition with word-level timestamps.

  3. Pyannote.audio: Neural building blocks for speaker diarization, for advanced speaker diarization capabilities.

  4. Efficient and High-Quality Neural Machine Translation with OpenNMT, for efficient neural machine translation and model acceleration.

For more details, you can refer to the Whisper ASR model paper.
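
As an illustration of how word-level timestamps and diarization output can be combined, the sketch below labels each word with the speaker whose turn overlaps it most in time. It is a simplified stand-in written for this description, not code from this framework, WhisperX, or pyannote.audio. The data layout (`word`, `start`, `end`, `speaker` keys) and the function name `assign_speakers` are assumptions, and the example data is made up.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Attach to each word the speaker whose turn overlaps it the longest.

    words: list of dicts with 'word', 'start', 'end' (seconds).
    turns: list of dicts with 'speaker', 'start', 'end' (seconds).
    Words that overlap no turn get speaker None.
    """
    labelled = []
    for w in words:
        best_speaker, best_overlap = None, 0.0
        for t in turns:
            ov = overlap(w["start"], w["end"], t["start"], t["end"])
            if ov > best_overlap:
                best_speaker, best_overlap = t["speaker"], ov
        labelled.append({**w, "speaker": best_speaker})
    return labelled


# Made-up example data: a short greeting and a reply from a second speaker.
words = [
    {"word": "habari", "start": 0.10, "end": 0.60},
    {"word": "yako", "start": 0.65, "end": 1.00},
    {"word": "nzuri", "start": 1.40, "end": 1.90},
]
turns = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 1.2},
    {"speaker": "SPEAKER_01", "start": 1.2, "end": 2.0},
]
labelled = assign_speakers(words, turns)
for w in labelled:
    print(w["speaker"], w["word"])
```

Choosing the speaker by longest overlap, rather than by the word's start time alone, keeps a word that straddles a turn boundary with the speaker who was talking for most of it.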

Documentation

Refer to the Documentation to get started.

Contributing

Contributions are welcome and encouraged.

Before contributing, please take a moment to review our Contribution Guidelines for important information on how to contribute to this project.

If you're unsure about anything or need assistance, don't hesitate to reach out to us or open an issue to discuss your ideas.

We look forward to your contributions!

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For any enquiries, please reach out to me through keviinkibe@gmail.com


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

africanwhisper-0.9.19.tar.gz (59.2 kB view details)

Uploaded Source

Built Distribution

africanwhisper-0.9.19-py3-none-any.whl (63.9 kB view details)

Uploaded Python 3

File details

Details for the file africanwhisper-0.9.19.tar.gz.

File metadata

  • Download URL: africanwhisper-0.9.19.tar.gz
  • Upload date:
  • Size: 59.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for africanwhisper-0.9.19.tar.gz
Algorithm Hash digest
SHA256 3387c110b823e5b697e794bcbeee3531d5794cd1498a55f5b3c390c21c24ad20
MD5 4720d1e51f7a70d81f0818ba8bd6d201
BLAKE2b-256 ee7c0ffd8bec548d41c6c28e66d2ba4a8da3f2755fd5bad472dae34435dd3b0c

See more details on using hashes here.
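
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The following is a minimal sketch: the helper name `sha256_of` and the local file path are assumptions, and the expected digest is the one published above for the source distribution.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest listed above for africanwhisper-0.9.19.tar.gz.
EXPECTED_SHA256 = "3387c110b823e5b697e794bcbeee3531d5794cd1498a55f5b3c390c21c24ad20"

# Assumed download location; adjust to wherever the file was saved.
path = "africanwhisper-0.9.19.tar.gz"
if os.path.exists(path):
    actual = sha256_of(path)
    print("OK" if actual == EXPECTED_SHA256 else f"MISMATCH: {actual}")
```

The same helper works for the wheel file if its own SHA256 digest is substituted for `EXPECTED_SHA256`.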

File details

Details for the file africanwhisper-0.9.19-py3-none-any.whl.

File hashes

Hashes for africanwhisper-0.9.19-py3-none-any.whl
Algorithm Hash digest
SHA256 39c042a1aa74fa02968a33f343d879ff264acb1f15dd99b5839d5f662cad0916
MD5 1642b362061022cb8a60fa919b80688e
BLAKE2b-256 d548e5b80e065cbb1dbeb51e64f38cd6b4c39293b8b05d80071b81877c992e39
