Multilingual Text Tooling around Indian Languages
ilmulti
This repository houses the tooling used to create the models on the leaderboard of WAT-Tasks. We provide wrappers around models trained via pytorch/fairseq to translate. Installation and usage instructions are provided below.
- Training: We use a separate fork of pytorch/fairseq at jerinphilip/fairseq-ilmt for training, to optimize for our cluster and to plug and play data easily.
- Pretrained Models and Other Resources: preon.iiit.ac.in/~jerin/bhasha
Installation
The code is tested to work with the fairseq fork, which is branched from v0.7.2, and torch version 1.0.0.
# --user is optional
python3 -m pip install -r requirements.txt --user
python3 setup.py install --user
Downloading Models: The script scripts/download-and-setup-models.sh downloads the model and dictionary files required for running examples/mm_all.py. Which models to download can be configured in the script.
A working example using the wrappers in this code can be found in this Colab Notebook.
Usage
from ilmulti.translator import from_pretrained
translator = from_pretrained(tag='mm-all')
sample = translator("The quick brown fox jumps over the lazy dog", tgt_lang='hi')
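For translating many sentences, the callable wrapper above can be reused in a small loop. The helper below is a sketch, not part of ilmulti itself; it only assumes the call signature shown above (`translator(text, tgt_lang=...)`), and the stand-in translator is purely illustrative:

```python
def translate_batch(translator, sentences, tgt_lang):
    """Translate a list of sentences with a callable translator.

    `translator` is assumed to follow the interface shown above:
    translator(text, tgt_lang=...) -> translation output.
    """
    return [translator(s, tgt_lang=tgt_lang) for s in sentences]

# Demonstration with a stand-in translator; a real run would use
# from_pretrained(tag='mm-all') from ilmulti.translator instead:
stub = lambda text, tgt_lang: f"[{tgt_lang}] {text}"
print(translate_batch(stub, ["Hello", "Good morning"], tgt_lang="hi"))
# → ['[hi] Hello', '[hi] Good morning']
```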
Project details
Release history
Download files
Source Distribution
File details
Details for the file ilmulti-0.0.1.tar.gz.
File metadata
- Download URL: ilmulti-0.0.1.tar.gz
- Upload date:
- Size: 6.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.6.9
File hashes

Algorithm | Hash digest
---|---
SHA256 | cd45566ae4804688da8f1b708fc4f42389bf36ca9aa0a420a9eb6a2c0a4da4b0
MD5 | cb357653636aa3b4ef5e60c63352bfdb
BLAKE2b-256 | cfa4b92f37d6127cbac9d64d87f657f97da328fbd018d681101eb2c36f7cf37c
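A downloaded tarball can be checked against the SHA256 digest listed above. A minimal sketch using only the standard library (the filename is assumed to match the one on this page):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# After downloading, compare against the digest from the table above:
expected = "cd45566ae4804688da8f1b708fc4f42389bf36ca9aa0a420a9eb6a2c0a4da4b0"
# assert sha256_of("ilmulti-0.0.1.tar.gz") == expected
```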