An open-source framework for zero-shot multimodal machine translation inference
ZeroMMT
Model weights
ZeroMMT-600M | ZeroMMT-1.3B | ZeroMMT-3.3B
This package performs inference with the ZeroMMT model. ZeroMMT is a zero-shot multilingual multimodal machine translation (MMT) system trained only on English text-image pairs. It starts from a pretrained NLLB model and adapts it with lightweight modules (adapters and a visual projector) while keeping the original weights frozen during training. Training combines two objectives: visually conditioned masked language modeling, and a KL divergence between the outputs of the original MT model and those of the new MMT model. ZeroMMT is available in three sizes: 600M, 1.3B and 3.3B. The largest model achieves state-of-the-art performance on CoMMuTE, a benchmark designed to evaluate how well multimodal translation systems exploit image information to disambiguate the English sentence to be translated. ZeroMMT is multilingual and available for English-to-{Arabic,Chinese,Czech,German,French,Russian}.
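To make the KL term mentioned above more concrete, here is a minimal pure-Python sketch of a KL divergence between two next-token distributions. It is an illustration only, not the package's training code: the toy logits, the helper names `softmax` and `kl_divergence`, and the direction KL(MT || MMT) are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw scores into a probability distribution (numerically stable)."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical next-token logits over a 3-word toy vocabulary.
mt_logits = [2.0, 1.0, 0.1]   # frozen text-only MT model (assumed values)
mmt_logits = [2.5, 0.5, 0.1]  # adapted multimodal model (assumed values)

# The penalty grows as the multimodal distribution drifts from the MT one.
penalty = kl_divergence(softmax(mt_logits), softmax(mmt_logits))
print(f"KL penalty: {penalty:.4f}")
```

A term of this kind is zero when the two distributions match, so it discourages the adapted model from drifting away from the translations of the original MT model.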
If you use this package or like our work, please cite:
@misc{futeral2024zeroshotmultimodalmachinetranslation,
      title={Towards Zero-Shot Multimodal Machine Translation},
      author={Matthieu Futeral and Cordelia Schmid and Benoît Sagot and Rachel Bawden},
      year={2024},
      eprint={2407.13579},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.13579},
}
Installation
pip install zerommt
Example
Without classifier-free guidance (cfg)
import requests
from PIL import Image
import torch
from zerommt import create_model

model = create_model(model_path="matthieufp/ZeroMMT-600M",
                     enable_cfg=False)
model.eval()

image = Image.open(
    requests.get(
        "http://images.cocodataset.org/val2017/000000002153.jpg", stream=True
    ).raw
)
src_text = "He's got a bat in his hands."
src_lang = "eng_Latn"
tgt_lang = "fra_Latn"

# Compute cross-entropy loss given translation
tgt_text = "Il a une batte dans ses mains."
with torch.inference_mode():
    loss = model(imgs=[image],
                 src_text=[src_text],
                 src_lang=src_lang,
                 tgt_text=[tgt_text],
                 tgt_lang=tgt_lang,
                 output_loss=True)
print(loss)

# Generate translation with beam search
beam_size = 4
image2 = Image.open(
    requests.get(
        "https://zupimages.net/up/24/29/7r3s.jpg", stream=True
    ).raw
)
with torch.inference_mode():
    generated = model.generate(imgs=[image, image2],
                               src_text=[src_text, src_text],
                               src_lang=src_lang,
                               tgt_lang=tgt_lang,
                               beam_size=beam_size)
translation = model.tokenizer.batch_decode(generated, skip_special_tokens=True)
print(translation)
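The loss computed in the example above can also be used to compare several candidate translations of an ambiguous sentence, in the spirit of the disambiguation setting that CoMMuTE targets. The sketch below is one possible way to do this, not part of the package: `rank_candidates` and `loss_fn` are hypothetical names, and the toy loss values are made-up stand-ins rather than model outputs.

```python
from typing import Callable, List, Sequence, Tuple


def rank_candidates(
    candidates: Sequence[str],
    loss_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Score each candidate with loss_fn and sort from lowest to highest loss."""
    scored = [(text, float(loss_fn(text))) for text in candidates]
    return sorted(scored, key=lambda pair: pair[1])


# Placeholder losses standing in for real model calls (invented numbers).
toy_losses = {
    "Il a une batte dans ses mains.": 1.2,
    "Il a une chauve-souris dans ses mains.": 2.7,
}

ranking = rank_candidates(list(toy_losses), toy_losses.__getitem__)
print(ranking[0][0])  # candidate with the lowest placeholder loss
```

In practice `loss_fn` could wrap the `model(..., output_loss=True)` call shown above for a fixed image and source sentence, assuming the returned loss can be converted with `float()`.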
With classifier-free guidance (cfg). WARNING: enabling cfg requires approximately twice as much memory!
import requests
from PIL import Image
import torch
from zerommt import create_model

model = create_model(model_path="matthieufp/ZeroMMT-600M",
                     enable_cfg=True)
model.eval()

image = Image.open(
    requests.get(
        "http://images.cocodataset.org/val2017/000000002153.jpg", stream=True
    ).raw
)
src_text = "He's got a bat in his hands."
src_lang = "eng_Latn"
tgt_lang = "fra_Latn"

# Compute cross-entropy loss given translation
tgt_text = "Il a une batte dans ses mains."
cfg_value = 1.25
with torch.inference_mode():
    loss = model(imgs=[image],
                 src_text=[src_text],
                 src_lang=src_lang,
                 tgt_text=[tgt_text],
                 tgt_lang=tgt_lang,
                 output_loss=True,
                 cfg_value=cfg_value)
print(loss)

# Generate translation with beam search and cfg
beam_size = 4
with torch.inference_mode():
    generated = model.generate(imgs=[image],
                               src_text=[src_text],
                               src_lang=src_lang,
                               tgt_lang=tgt_lang,
                               beam_size=beam_size,
                               cfg_value=cfg_value)
translation = model.tokenizer.batch_decode(generated)[0]
print(translation)
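For intuition about what `cfg_value` controls, the sketch below shows the usual classifier-free guidance combination of two sets of scores. This is the standard formulation and is only assumed to resemble what the package does internally; `apply_cfg` and the toy logits are hypothetical and exist for illustration only.

```python
from typing import List, Sequence


def apply_cfg(
    text_only_logits: Sequence[float],
    multimodal_logits: Sequence[float],
    cfg_value: float,
) -> List[float]:
    """Standard classifier-free guidance: move from the text-only scores
    toward (and, for cfg_value > 1, beyond) the image-conditioned scores."""
    return [
        t + cfg_value * (m - t)
        for t, m in zip(text_only_logits, multimodal_logits)
    ]


# Hypothetical scores for two candidate next tokens.
text_only = [1.0, 2.0]   # scores without the image (assumed values)
multimodal = [2.0, 1.0]  # scores with the image (assumed values)

print(apply_cfg(text_only, multimodal, 1.0))   # same as the multimodal scores
print(apply_cfg(text_only, multimodal, 1.25))  # image evidence amplified
```

Under this formulation every decoding step needs scores both with and without the image, which would be consistent with the memory warning above.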