wow_ai_mms

A simple Python package to easily use Meta's Massively Multilingual Speech (MMS) project.

The current MMS code uses subprocess to call another Python script, which is inconvenient and can lead to several issues. This package addresses those problems by wrapping the project in an API that is easy to integrate with other projects.

Installation

  1. You will need ffmpeg for audio processing.
  2. Install wow-ai-mms from PyPI:
pip install wow-ai-mms

or from source:

pip install git+https://github.com/wow-ai-ml/wow-ai-mms
  3. If you want to use the Alignment model:
  • You will need perl to use uroman. Check the perl website for installation instructions on different platforms.
  • You will need a nightly version of torchaudio:
pip install -U --pre torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118
  • You might need sox as well.
  4. Fairseq has not yet included the MMS project in its released PyPI version, so until the next release you will need to install fairseq from source:
pip uninstall fairseq && pip install git+https://github.com/facebookresearch/fairseq
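
Before running anything, it can help to confirm that the external tools named in the steps above are on your PATH. The following is a minimal sketch using only the Python standard library. The helper name missing_tools and the two tool lists are illustrative and not part of wow-ai-mms.

```python
import shutil

# Tools named in the installation steps. ffmpeg is needed for audio
# processing; perl (for uroman) and sox only matter for the Alignment model.
REQUIRED_TOOLS = ["ffmpeg"]
ALIGNMENT_TOOLS = ["perl", "sox"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED_TOOLS), ("alignment", ALIGNMENT_TOOLS)):
        missing = missing_tools(tools)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

This only checks that an executable with that name exists. It does not verify versions or the torchaudio and fairseq installs.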

Quickstart

Warning: there is an issue with fairseq when running the code in interactive environments such as Jupyter notebooks.
Please run it from regular Python files, or use the project's Colab notebook.

ASR

You first need to download the model weights. All the supported models can be found and downloaded from the MMS project.

from wow_ai_mms.models.asr import ASRModel

# Load the downloaded MMS ASR model weights
asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
# align=False: transcribe without aligning segments to timestamps
transcriptions = asr.transcribe(files, lang='eng', align=False)
for file, transcription in zip(files, transcriptions):
    print(f">>> file {file}")
    print(transcription)
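
If you would rather keep the results than print them, the loop above can be turned into a small helper. This is a sketch, not part of wow-ai-mms: write_transcripts is a hypothetical name, and it only assumes that transcribe returns one printable result per input file, as the example above suggests.

```python
from pathlib import Path


def write_transcripts(files, transcriptions, out_dir):
    """Write one .txt file per input media file and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for media_path, transcription in zip(files, transcriptions):
        # Name the output after the media file: "talk.wav" -> "talk.txt"
        target = out_dir / (Path(media_path).stem + ".txt")
        # str() keeps the sketch agnostic about the exact return type
        target.write_text(str(transcription), encoding="utf-8")
        written.append(target)
    return written
```

It would be called as write_transcripts(files, transcriptions, 'transcripts'). Note that two media files with the same base name would overwrite each other's output.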

ASR with Alignment

from wow_ai_mms.models.asr import ASRModel

asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
# align=True: each transcription is a sequence of timed segments
transcriptions = asr.transcribe(files, lang='eng', align=True)
for file, transcription in zip(files, transcriptions):
    print(f">>> file {file}")
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
    print("----")
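
The timed segments lend themselves to subtitle export. The sketch below renders a list of segments as SubRip (SRT) text. It assumes that start_time and end_time are numbers expressed in seconds, which the source does not state, so check the actual values before relying on it. Both function names are illustrative.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(float(seconds) * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments with 'start_time', 'end_time' and 'text' keys as SRT."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        # Assumption: times are in seconds; adjust if the package differs
        start = format_timestamp(segment["start_time"])
        end = format_timestamp(segment["end_time"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text']}\n")
    return "\n".join(blocks)
```

Each transcription from the loop above could then be passed to segments_to_srt and the result written to a .srt file.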

Alignment model only

from wow_ai_mms.models.alignment import AlignmentModel

align_model = AlignmentModel()
# Align an existing transcript, given as a list of text segments, to the audio
transcriptions = align_model.align('path/to/wav_file.wav',
                                   transcript=["segment 1", "segment 2"],
                                   lang='eng')
for transcription in transcriptions:
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
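
The transcript argument in the example above is a list of text segments. When the transcript lives in a plain text file, one simple way to build that list is to treat each non-empty line as a segment. This is only a sketch of one possible convention. load_segments is a hypothetical helper, and the package may expect segments to be split differently.

```python
def load_segments(text):
    """Split raw transcript text into a list of non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

For example, load_segments(open('transcript.txt', encoding='utf-8').read()) yields a list that can be passed as transcript.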

TTS

from wow_ai_mms.models.tts import TTSModel

tts = TTSModel('eng')
res = tts.synthesize("This is a simple example")
tts.save(res)

LID

Coming Soon

API reference

You can check the API reference documentation for more details.

License

Since the models are released under the CC-BY-NC 4.0 license, this project follows the same license.

Disclaimer & Credits

This project is not endorsed or certified by Meta AI; it only simplifies the use of the MMS project.
All credit goes to the authors and to Meta for open-sourcing the models.
Please check their paper "Scaling Speech Technology to 1,000+ Languages" and their blog post.
