EasyMMS

A simple Python package to easily use Meta's Massively Multilingual Speech (MMS) project.

PyPi version wheels Open In Colab

The current MMS code uses subprocess to call another Python script, which is inconvenient and can lead to several issues. This package was created to address those problems and to wrap the project in an API that is easy to integrate with other projects.

Installation

  1. You will need ffmpeg for audio processing.
  2. Install easymms from PyPI:
pip install easymms

or from source:

pip install git+https://github.com/abdeladim-s/easymms
  3. If you want to use the Alignment model:
  • You will need perl to use uroman. Check the perl website for installation instructions on different platforms.
  • You will need a nightly version of torchaudio:
pip install -U --pre torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118
  • You might need sox as well.
  4. Fairseq has not yet included the MMS project in its released PyPI version, so until the next release you will need to install fairseq from source:
pip uninstall fairseq && pip install git+https://github.com/facebookresearch/fairseq
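
The external tools mentioned in the steps above (ffmpeg, perl, sox) can be checked before running anything. The following is a minimal sketch using only the Python standard library; `find_missing_tools` is a hypothetical helper, not part of easymms, and it only checks that an executable with that name is on PATH.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "perl", "sox")):
    """Return the names of command-line tools that are not found on PATH.

    Hypothetical helper for illustration; easymms does not provide it.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```

Note that sox is only listed as possibly needed, so a missing sox is not necessarily an error.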

Quickstart

Warning: There is an issue with fairseq when running the code in interactive environments such as Jupyter notebooks.
Please use regular Python files or the Colab notebook provided above.

ASR

You first need to download the model weights. All the supported models can be found and downloaded from here.

from easymms.models.asr import ASRModel

asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
transcriptions = asr.transcribe(files, lang='eng', align=False)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    print(transcription)
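
To keep the results instead of only printing them, each transcription can be written to a text file named after its media file. This is a minimal sketch that assumes each item returned with align=False can be converted to text with `str()` (the example above only prints it); `save_transcriptions` is a hypothetical helper, not part of easymms.

```python
from pathlib import Path


def save_transcriptions(files, transcriptions, out_dir):
    """Write one .txt file per media file and return the written paths.

    Hypothetical helper: assumes `transcriptions` is in the same order
    as `files` and that each item is (or converts to) plain text.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for media_file, text in zip(files, transcriptions):
        # Reuse the media file's base name, e.g. clip1.wav -> clip1.txt
        target = out_dir / (Path(media_file).stem + ".txt")
        target.write_text(str(text), encoding="utf-8")
        written.append(target)
    return written
```

It could be called as `save_transcriptions(files, transcriptions, 'transcripts')` right after `asr.transcribe(...)`.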

ASR with Alignment

from easymms.models.asr import ASRModel

asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
transcriptions = asr.transcribe(files, lang='eng', align=True)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
    print("----")
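
The aligned segments map naturally onto subtitle formats. The sketch below turns one transcription (a list of segments with the `start_time`, `end_time` and `text` keys used above) into SRT text. It assumes the times are numbers expressed in seconds, which the example above does not confirm; `format_timestamp` and `segments_to_srt` are hypothetical helpers, not part of easymms.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(float(seconds) * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from segments with start_time, end_time and text.

    Assumption: start_time and end_time are given in seconds.
    """
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start_time"])
        end = format_timestamp(segment["end_time"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text']}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

If the times turn out to be in another unit, only `format_timestamp` needs to change.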

Alignment model only

from easymms.models.alignment import AlignmentModel
    
align_model = AlignmentModel()
transcriptions = align_model.align('path/to/wav_file.wav', 
                                   transcript=["segment 1", "segment 2"],
                                   lang='eng')
for transcription in transcriptions:
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
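
The `transcript` argument above is a list of text segments. When the transcript is available only as one block of text, it has to be split first. The following is a naive sketch that splits on sentence-ending punctuation; `split_into_segments` is a hypothetical helper, not part of easymms, and the rule is an assumption that will not suit every language or script.

```python
import re


def split_into_segments(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace.

    Naive illustration only: abbreviations and scripts that use other
    sentence delimiters are not handled.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]
```

The resulting list could then be passed as `transcript=split_into_segments(full_text)`.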

TTS

from easymms.models.tts import TTSModel

tts = TTSModel('eng')
res = tts.synthesize("This is a simple example")
tts.save(res)

LID

Coming Soon

API reference

You can check the API reference documentation for more details.

License

Since the models are released under the CC-BY-NC 4.0 license, this project follows the same license.

Disclaimer & Credits

This project is not endorsed or certified by Meta AI; it only simplifies the use of the MMS project.
All credit goes to the authors and to Meta for open-sourcing the models.
Please check their paper, Scaling Speech Technology to 1,000+ Languages, and their blog post.


