Neural machine translation and sequence learning using TensorFlow

OpenNMT-tf

OpenNMT-tf is a general-purpose sequence learning toolkit using TensorFlow 2. While neural machine translation is the main target task, it is designed to more generally support:

  • sequence to sequence mapping
  • sequence tagging
  • sequence classification
  • language modeling

The project is production-oriented and comes with backward compatibility guarantees.

Key features

Modular model architecture

Models are described with code to allow training custom architectures and overriding default behavior. For example, the following instance defines a sequence to sequence model with 2 concatenated input features, a self-attentional encoder, and an attentional RNN decoder sharing its input and output embeddings:

import opennmt
import tensorflow_addons as tfa  # provides the attention mechanism class used below

opennmt.models.SequenceToSequence(
    # Two word-level input features, concatenated along the embedding dimension.
    source_inputter=opennmt.inputters.ParallelInputter(
        [
            opennmt.inputters.WordEmbedder(embedding_size=256),
            opennmt.inputters.WordEmbedder(embedding_size=256),
        ],
        reducer=opennmt.layers.ConcatReducer(axis=-1),
    ),
    target_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
    # Self-attentional encoder.
    encoder=opennmt.encoders.SelfAttentionEncoder(num_layers=6),
    # Attentional RNN decoder.
    decoder=opennmt.decoders.AttentionalRNNDecoder(
        num_layers=4,
        num_units=512,
        attention_mechanism_class=tfa.seq2seq.LuongAttention,
    ),
    # Share the decoder input and output embeddings.
    share_embeddings=opennmt.models.EmbeddingsSharingLevel.TARGET,
)

The opennmt package exposes other building blocks that can be used to design custom architectures.

Standard models such as the Transformer are defined in a model catalog and can be used without additional configuration.

Find more information about model configuration in the documentation.

Full TensorFlow 2 integration

OpenNMT-tf is fully integrated into the TensorFlow 2 ecosystem.

Compatibility with CTranslate2

CTranslate2 is an optimized inference engine for OpenNMT models featuring fast CPU and GPU execution, model quantization, parallel translations, dynamic memory usage, interactive decoding, and more! OpenNMT-tf can automatically export models to be used in CTranslate2.

Dynamic data pipeline

OpenNMT-tf does not require compiling the data before training. Instead, it reads text files directly and preprocesses the data as the training needs it. This allows on-the-fly tokenization and data augmentation by injecting random noise.
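
To illustrate the idea of on-the-fly preprocessing, here is a minimal sketch in plain Python. It is not the OpenNMT-tf pipeline: the whitespace tokenizer, the word dropout noise, and the names tokenize, drop_words, and stream_examples are all assumptions made for this example.

```python
import random
from typing import Iterable, Iterator, List


def tokenize(line: str) -> List[str]:
    """Whitespace tokenization, standing in for a real tokenizer."""
    return line.split()


def drop_words(tokens: List[str], probability: float, rng: random.Random) -> List[str]:
    """Randomly drop tokens, but never return an empty sentence."""
    kept = [token for token in tokens if rng.random() >= probability]
    return kept if kept else tokens[:1]


def stream_examples(
    lines: Iterable[str], dropout: float = 0.1, seed: int = 0
) -> Iterator[List[str]]:
    """Lazily tokenize and perturb each line when the consumer asks for it."""
    rng = random.Random(seed)
    for line in lines:
        tokens = tokenize(line.strip())
        if tokens:  # skip empty lines
            yield drop_words(tokens, dropout, rng)


# Hypothetical in-memory corpus; an open text file can be passed the same way.
corpus = ["Hello world !", "", "How are you ?"]
clean = list(stream_examples(corpus, dropout=0.0))
noisy = list(stream_examples(corpus, dropout=0.5, seed=1))
```

Because stream_examples is a generator, nothing is tokenized or perturbed until the training loop requests the next example, so no preprocessed copy of the data has to be stored.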

Model fine-tuning

OpenNMT-tf supports model fine-tuning workflows:

  • Model weights can be transferred to new word vocabularies, e.g. to inject domain terminology before fine-tuning on in-domain data
  • Contrastive learning to reduce word omission errors
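
The vocabulary transfer mentioned above can be pictured as follows. This is a simplified sketch under stated assumptions, not the update_vocab implementation: embeddings are plain Python lists, words missing from the old vocabulary get small random vectors, and transfer_embeddings is a hypothetical helper.

```python
import random
from typing import Dict, List


def transfer_embeddings(
    old_vocab: List[str],
    old_embeddings: List[List[float]],
    new_vocab: List[str],
    seed: int = 0,
) -> List[List[float]]:
    """Build the embedding table of new_vocab, reusing trained vectors when possible."""
    rng = random.Random(seed)
    dim = len(old_embeddings[0])
    old_index: Dict[str, int] = {word: i for i, word in enumerate(old_vocab)}
    new_embeddings = []
    for word in new_vocab:
        if word in old_index:
            # Word already known: copy its trained vector to the new position.
            new_embeddings.append(list(old_embeddings[old_index[word]]))
        else:
            # New word (e.g. domain terminology): start from a small random vector.
            new_embeddings.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return new_embeddings


old_vocab = ["the", "cat", "sat"]
old_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
new_vocab = ["the", "stent", "cat"]  # "stent" is the injected domain term
new_embeddings = transfer_embeddings(old_vocab, old_embeddings, new_vocab)
```

Known words keep what the model learned about them even when their index changes, while the new terms are learned during fine-tuning.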

Source-target alignment

Sequence to sequence models can be trained with guided alignment, and alignment information is returned as part of the translation API.
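
As an illustration of what alignment information looks like, the sketch below converts word alignments into a target-by-source matrix. It assumes the alignments are written in the Pharaoh "source-target" index format, which is an assumption of this example and not stated in this README; parse_pharaoh and alignment_matrix are hypothetical helpers.

```python
from typing import List, Tuple


def parse_pharaoh(line: str) -> List[Tuple[int, int]]:
    """Parse a line such as "0-0 1-2" into (source_index, target_index) pairs."""
    pairs = []
    for item in line.split():
        source, target = item.split("-")
        pairs.append((int(source), int(target)))
    return pairs


def alignment_matrix(
    pairs: List[Tuple[int, int]], source_length: int, target_length: int
) -> List[List[int]]:
    """Return a matrix where matrix[t][s] is 1 if target word t aligns to source word s."""
    matrix = [[0] * source_length for _ in range(target_length)]
    for source, target in pairs:
        matrix[target][source] = 1
    return matrix


pairs = parse_pharaoh("0-0 1-2 2-1")
matrix = alignment_matrix(pairs, source_length=3, target_length=3)
```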


OpenNMT-tf also implements most of the techniques commonly used to train and evaluate sequence models, such as:

  • automatic evaluation during the training
  • multiple decoding strategies: greedy search, beam search, random sampling
  • N-best rescoring
  • gradient accumulation
  • scheduled sampling
  • checkpoint averaging
  • ... and more!

See the documentation to learn how to use these features.
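
To give an intuition for one of these techniques, checkpoint averaging takes the element-wise mean of the weights saved at several training steps. The sketch below shows the arithmetic on plain Python lists. It is not how OpenNMT-tf reads or writes checkpoints, and the variable names are made up for the example.

```python
from typing import Dict, List

Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Average each variable element-wise over all checkpoints."""
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")
    averaged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this variable across all checkpoints.
        columns = zip(*(checkpoint[name] for checkpoint in checkpoints))
        averaged[name] = [sum(values) / len(checkpoints) for values in columns]
    return averaged


# Two hypothetical checkpoints with the same variables.
checkpoints = [
    {"dense/kernel": [1.0, 2.0], "dense/bias": [0.0]},
    {"dense/kernel": [3.0, 4.0], "dense/bias": [1.0]},
]
averaged = average_checkpoints(checkpoints)
```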

Usage

OpenNMT-tf requires:

  • Python 3.7 or above
  • TensorFlow 2.6, 2.7, 2.8, 2.9, 2.10, 2.11, 2.12, or 2.13

We recommend installing it with pip:

pip install --upgrade pip
pip install OpenNMT-tf

See the documentation for more information.
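
If you want to check an environment against the requirements listed above before installing, a small helper along these lines can do it. This is a sketch and not part of OpenNMT-tf: is_supported and parse_major_minor are hypothetical names, and the version ranges are copied from the list above.

```python
import sys
from typing import Tuple

# TensorFlow 2.6 to 2.13, as listed in the requirements.
SUPPORTED_TENSORFLOW = {(2, minor) for minor in range(6, 14)}
MINIMUM_PYTHON = (3, 7)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as "2.13.0"."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_supported(python_version: Tuple[int, int], tensorflow_version: str) -> bool:
    """Return True if both versions fall in the supported ranges."""
    return (
        python_version >= MINIMUM_PYTHON
        and parse_major_minor(tensorflow_version) in SUPPORTED_TENSORFLOW
    )


# The running interpreter version; pass tf.__version__ as the second argument
# of is_supported when TensorFlow is installed.
current_python = sys.version_info[:2]
```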

Command line

OpenNMT-tf comes with several command line utilities to prepare data, train, and evaluate models.

For all tasks involving model execution, OpenNMT-tf uses a single entry point: onmt-main. A typical OpenNMT-tf run consists of 3 elements:

  • the model type
  • the parameters described in a YAML file
  • the run type such as train, eval, infer, export, score, average_checkpoints, or update_vocab

that are passed to the main script:

onmt-main --model_type <model> --config <config_file.yml> --auto_config <run_type> <run_options>
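
When launching runs from a script, the command above can be assembled as an argument list. The following is a sketch: build_command is a hypothetical helper, config.yml is a placeholder file name, and the --with_eval run option is assumed here by analogy with the with_eval argument of the library API.

```python
import shlex
from typing import List, Sequence


def build_command(
    model_type: str,
    config_file: str,
    run_type: str,
    run_options: Sequence[str] = (),
) -> List[str]:
    """Assemble the onmt-main arguments in the order shown above."""
    return [
        "onmt-main",
        "--model_type", model_type,
        "--config", config_file,
        "--auto_config",
        run_type,
        *run_options,
    ]


command = build_command("Transformer", "config.yml", "train", ["--with_eval"])
printable = shlex.join(command)  # shell-safe string, handy for logging
```

The list can then be passed to subprocess.run(command, check=True) to start the run.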

For more information and examples on how to use OpenNMT-tf, please visit our documentation.

Library

OpenNMT-tf also exposes well-defined and stable APIs, from high-level training utilities to low-level model layers and dataset transformations.

For example, the Runner class can be used to train and evaluate models with a few lines of code:

import opennmt

config = {
    "model_dir": "/data/wmt-ende/checkpoints/",
    "data": {
        "source_vocabulary": "/data/wmt-ende/joint-vocab.txt",
        "target_vocabulary": "/data/wmt-ende/joint-vocab.txt",
        "train_features_file": "/data/wmt-ende/train.en",
        "train_labels_file": "/data/wmt-ende/train.de",
        "eval_features_file": "/data/wmt-ende/valid.en",
        "eval_labels_file": "/data/wmt-ende/valid.de",
    }
}

model = opennmt.models.TransformerBase()
runner = opennmt.Runner(model, config, auto_config=True)
runner.train(num_devices=2, with_eval=True)
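
Since the configuration is a regular Python dictionary, it can be inspected before it is passed to the Runner. The sketch below reports data entries that point to missing files. It is an optional convenience and not an OpenNMT-tf API: missing_data_files is a hypothetical helper, and the temporary files only make the example self-contained.

```python
import os
import tempfile
from typing import Dict, List


def missing_data_files(config: Dict) -> List[str]:
    """Return the "data" entries whose path does not exist on disk."""
    return [
        f"{key}: {path}"
        for key, path in config.get("data", {}).items()
        if isinstance(path, str) and not os.path.isfile(path)
    ]


with tempfile.TemporaryDirectory() as directory:
    existing = os.path.join(directory, "train.en")
    with open(existing, "w", encoding="utf-8") as handle:
        handle.write("Hello world !\n")

    config = {
        "model_dir": directory,
        "data": {
            "train_features_file": existing,
            # This file is intentionally never created.
            "train_labels_file": os.path.join(directory, "train.de"),
        },
    }
    missing = missing_data_files(config)
```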

Here is another example using OpenNMT-tf to run efficient beam search with a self-attentional decoder:

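import opennmt
import tensorflow as tf

# `memory` (the encoder outputs), `memory_sequence_length`, and `target_embedding`
# are assumed to be defined earlier, e.g. by an encoder and an embedding layer.
decoder = opennmt.decoders.SelfAttentionDecoder(num_layers=6, vocab_size=32000)

initial_state = decoder.initial_state(
    memory=memory, memory_sequence_length=memory_sequence_length
)

# Every sequence in the batch starts decoding from the start-of-sentence token.
batch_size = tf.shape(memory)[0]
start_ids = tf.fill([batch_size], opennmt.START_OF_SENTENCE_ID)

decoding_result = decoder.dynamic_decode(
    target_embedding,
    start_ids=start_ids,
    initial_state=initial_state,
    decoding_strategy=opennmt.utils.BeamSearch(4),  # beam size of 4
)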
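
For readers unfamiliar with beam search, here is a minimal pure-Python sketch of the idea. It is unrelated to the opennmt.utils.BeamSearch implementation and leaves out batching and length normalization; the toy probability table and all names are assumptions made for the example.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[float, List[str]]  # (cumulated log-probability, tokens)
END = "</s>"

# Toy next-token distributions, keyed by the prefix decoded so far.
TOY_TABLE = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"z": 0.9, "y": 0.1},
}


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Return next-token log-probabilities; unknown prefixes end the sentence."""
    probabilities = TOY_TABLE.get(tuple(prefix), {END: 1.0})
    return {token: math.log(p) for token, p in probabilities.items()}


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    beam_size: int,
    max_length: int,
) -> List[Hypothesis]:
    """Keep the beam_size best partial hypotheses at each decoding step."""
    beams: List[Hypothesis] = [(0.0, [])]
    for _ in range(max_length):
        candidates: List[Hypothesis] = []
        for score, tokens in beams:
            if tokens and tokens[-1] == END:
                candidates.append((score, tokens))  # finished: carry over unchanged
                continue
            for token, log_prob in next_log_probs(tokens).items():
                candidates.append((score + log_prob, tokens + [token]))
        beams = sorted(candidates, key=lambda item: item[0], reverse=True)[:beam_size]
        if all(tokens[-1] == END for _, tokens in beams):
            break
    return beams


greedy = beam_search(toy_model, beam_size=1, max_length=5)  # greedy search
best = beam_search(toy_model, beam_size=2, max_length=5)
```

With a beam size of 1 the search commits to "a", the most likely first token, and ends with a sequence of probability 0.33. With a beam size of 2 it also keeps "b" and finds the sequence "b z" with probability 0.36.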

More examples using OpenNMT-tf as a library can be found online:

  • The directory examples/library contains additional examples that use OpenNMT-tf as a library
  • nmt-wizard-docker uses the high-level opennmt.Runner API to wrap OpenNMT-tf with a custom interface for training, translating, and serving

For a complete overview of the APIs, see the package documentation.

Download files

Source Distribution

OpenNMT-tf-2.32.0.tar.gz (133.0 kB view details)

Uploaded Source

Built Distribution

OpenNMT_tf-2.32.0-py3-none-any.whl (162.0 kB view details)

Uploaded Python 3

File details

Details for the file OpenNMT-tf-2.32.0.tar.gz.

File metadata

  • Download URL: OpenNMT-tf-2.32.0.tar.gz
  • Upload date:
  • Size: 133.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for OpenNMT-tf-2.32.0.tar.gz
Algorithm Hash digest
SHA256 dcc7d80046bf6e94f3d7f2e477d1c89c7c57ef6b99f7d9b92235341442833c79
MD5 1c360d336c32891c0cd6e303f5707e64
BLAKE2b-256 39a7aa003550a746f2c843718b55f21a6697a61b2c9e10587c5ba0fa4b62708b

File details

Details for the file OpenNMT_tf-2.32.0-py3-none-any.whl.

File metadata

  • Download URL: OpenNMT_tf-2.32.0-py3-none-any.whl
  • Upload date:
  • Size: 162.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for OpenNMT_tf-2.32.0-py3-none-any.whl
Algorithm Hash digest
SHA256 41b4221777fed15a84247123eebba6a1f6becef824a85af33248416a6d331cd5
MD5 994570a8a71dd5fd184a53fb2bb366d9
BLAKE2b-256 ff2db655f288685a9c6b62384d97d2b9132861980e84f8c15af0de48d8a3e5f9
