
A Python implementation of OpenNMT


OpenNMT-py: Open-Source Neural Machine Translation and (Large) Language Models


OpenNMT-py is the PyTorch version of the OpenNMT project, an open-source (MIT) neural machine translation (and beyond!) framework. It is designed to be research friendly to try out new ideas in translation, language modeling, summarization, and many other NLP tasks. Some companies have proven the code to be production ready.

We love contributions! Please look at issues marked with the contributions welcome tag.

Before raising an issue, make sure you read the requirements and the Full Documentation examples.

Unless there is a bug, please use the Forum or Gitter to ask questions.


For beginners:

There is a step-by-step, fully explained tutorial (thanks to Yasmin Moslem): Tutorial

Please try to read and/or follow it before raising beginner issues.

Otherwise, you can just have a look at the Quickstart steps.


New:

  • You will need PyTorch v2, preferably v2.1, which fixes some scaled_dot_product_attention issues.
  • LLM support with converters for: Llama (+ Mistral), OpenLlama, Redpajama, MPT-7B, Falcon.
  • Support for 8bit and 4bit quantization along with LoRA adapters, with or without checkpointing.
  • You can finetune 7B and 13B models on a single RTX 24GB with 4-bit quantization.
  • Inference can be forced in 4/8bit using the same layer quantization as in finetuning.
  • Tensor parallelism when the model does not fit in a single GPU's memory (both training and inference).
  • Once your model is finetuned you can run inference either with OpenNMT-py or faster with CTranslate2.
  • MMLU evaluation script, see results here
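
As a rough illustration of why 4-bit quantization makes 7B and 13B models fit on a 24 GB card, the sketch below estimates weight storage only. The round parameter counts (7e9 and 13e9) are assumptions, and the estimate ignores quantization constants, activations, LoRA parameters, and optimizer state, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Nominal parameter counts (assumed round numbers, not exact model sizes).
    for name, n_params in (("7B", 7e9), ("13B", 13e9)):
        fp16 = weight_memory_gb(n_params, 16)
        int4 = weight_memory_gb(n_params, 4)
        print(f"{name}: 16-bit weights ~{fp16:.1f} GB, 4-bit weights ~{int4:.1f} GB")
```

Under these assumptions, 13B weights in 16-bit already exceed 24 GB, while the 4-bit weights leave headroom for everything else.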

For all use cases, including NMT, you can now use Multiquery instead of Multihead attention (faster at training and inference) and remove biases from all Linear layers (QKV as well as FeedForward modules).
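
To give a sense of where the Multiquery savings come from, the sketch below counts key/value projection weights for one attention layer without biases. It follows the general definition of multi-query attention (all query heads share a single key/value head) and is not taken from the OpenNMT-py source; the function name and the example dimensions are illustrative assumptions.

```python
def kv_projection_params(d_model: int, n_heads: int, multiquery: bool) -> int:
    """Count K and V projection weights for one attention layer (no biases)."""
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    head_dim = d_model // n_heads
    # Multi-head: one K/V head per query head. Multi-query: one shared K/V head.
    kv_heads = 1 if multiquery else n_heads
    # Two matrices (K and V), each of shape d_model x (kv_heads * head_dim).
    return 2 * d_model * kv_heads * head_dim


if __name__ == "__main__":
    d_model, n_heads = 4096, 32  # illustrative sizes, not tied to a specific model
    mha = kv_projection_params(d_model, n_heads, multiquery=False)
    mqa = kv_projection_params(d_model, n_heads, multiquery=True)
    print(f"multi-head K/V weights:  {mha:,}")
    print(f"multi-query K/V weights: {mqa:,} ({mha // mqa}x fewer)")
```

The same ratio applies to the key/value cache kept during inference, which is one reason multi-query attention tends to decode faster.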

If you used previous versions of OpenNMT-py, you can check the Changelog or the Breaking Changes.


Tutorials:

  • How to replicate Vicuna with a 7B or 13B llama (or Open llama, MPT-7B, Redpajama) Language Model: Tuto Vicuna
  • How to finetune NLLB-200 with your dataset: Tuto Finetune NLLB-200
  • How to create a simple OpenNMT-py REST Server: Tuto REST
  • How to create a simple Web Interface: Tuto Streamlit
  • Replicate the WMT17 en-de experiment: WMT17 ENDE

Setup

OpenNMT-py requires:

  • Python >= 3.8
  • PyTorch >= 2.0, < 2.2
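
The requirements above can be checked before installing with a small script like the following. This is a minimal sketch using only the standard library; the helper names are hypothetical and the version parsing is deliberately simple (it keeps the leading numeric release components and drops local tags such as `+cu118`).

```python
import sys
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Turn '2.1.0+cu118' into (2, 1, 0), stopping at the first non-numeric part."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def torch_version_supported(version: str) -> bool:
    """True when the version satisfies PyTorch >= 2.0, < 2.2."""
    major_minor = (parse_release(version) + (0, 0))[:2]
    return (2, 0) <= major_minor < (2, 2)


def python_version_supported(version_info=sys.version_info) -> bool:
    """True when the interpreter satisfies Python >= 3.8."""
    return tuple(version_info[:2]) >= (3, 8)


if __name__ == "__main__":
    print("Python OK:", python_version_supported())
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        print("PyTorch is not installed")
    else:
        print(f"PyTorch {installed} OK:", torch_version_supported(installed))
```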

Install OpenNMT-py from pip:

pip install OpenNMT-py

or from the sources:

git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py
pip install -e .

Note: if you encounter a MemoryError during installation, try to use pip with --no-cache-dir.

(Optional) Some advanced features (e.g. working pretrained models or specific transforms) require extra packages, which you can install with:

pip install -r requirements.opt.txt

Manual installation of some dependencies

Apex is highly recommended for fast performance (especially for the legacy fusedadam optimizer and FusedRMSNorm):

git clone https://github.com/NVIDIA/apex
cd apex
pip3 install -v --no-build-isolation --config-settings --build-option="--cpp_ext --cuda_ext --deprecated_fused_adam --xentropy --fast_multihead_attn" ./
cd ..

Flash attention:

As of Oct. 2023, flash attention 1 has been upstreamed to PyTorch v2, but it is recommended to use flash attention 2 (v2.3.1) for sliding window attention support.

When using regular position_encoding=True or Rotary with max_relative_positions=-1 OpenNMT-py will try to use an optimized dot-product path.

If you want to use flash attention, you need to install it manually first:

pip install flash-attn --no-build-isolation

If flash attention 2 is not installed, OpenNMT-py falls back to F.scaled_dot_product_attention from PyTorch 2.x.

When using max_relative_positions > 0 or Alibi (max_relative_positions=-2), OpenNMT-py will use its legacy code for matrix multiplications.

Flash attention and F.scaled_dot_product_attention are a bit faster and save some GPU memory.
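
The selection rules described in this section can be summarized as a small decision function. This is only a sketch of the prose above, not OpenNMT-py's actual code: the function names and return labels are hypothetical, and any configuration the text does not cover is reported as "unspecified" rather than guessed.

```python
import importlib.util


def flash_attn_available() -> bool:
    """Check whether the flash-attn package (import name flash_attn) is importable."""
    return importlib.util.find_spec("flash_attn") is not None


def select_attention_path(max_relative_positions: int,
                          position_encoding: bool = False,
                          flash_installed=None) -> str:
    """Return a label for the attention path implied by the rules in this section."""
    if flash_installed is None:
        flash_installed = flash_attn_available()
    # Relative positions (> 0) or Alibi (-2): legacy matrix-multiplication code.
    if max_relative_positions > 0 or max_relative_positions == -2:
        return "legacy"
    # Regular position encoding or Rotary (-1): optimized dot-product path.
    if position_encoding or max_relative_positions == -1:
        return "flash_attn_2" if flash_installed else "scaled_dot_product_attention"
    # Anything else is not described in the text above.
    return "unspecified"


if __name__ == "__main__":
    for mrp, pos_enc in ((-1, False), (0, True), (-2, False), (16, False)):
        path = select_attention_path(mrp, position_encoding=pos_enc)
        print(f"max_relative_positions={mrp}, position_encoding={pos_enc} -> {path}")
```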

Documentation & FAQs

Full HTML Documentation

FAQs

Acknowledgements

OpenNMT-py is run as a collaborative open-source project. The project was incubated by Systran and Harvard NLP in 2016 in Lua and ported to PyTorch in 2017.

Current maintainers (since 2018):

François Hernandez and Vincent Nguyen (Seedfall)

Citation

If you are using OpenNMT-py for academic work, please cite the initial system demonstration paper published in ACL 2017:

@misc{klein2018opennmt,
      title={OpenNMT: Neural Machine Translation Toolkit}, 
      author={Guillaume Klein and Yoon Kim and Yuntian Deng and Vincent Nguyen and Jean Senellart and Alexander M. Rush},
      year={2018},
      eprint={1805.11462},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
