
Repository with the code for the multilingual and multimodal benchmark MCIF

Project description

MCIF - Multimodal Crosslingual Instruction-Following

MCIF Logo

arXiv: 2507.19634 | HuggingFace Dataset: FBK-MT/MCIF

MCIF is a comprehensive benchmark for evaluating multimodal, crosslingual instruction-following systems. It covers 3 modalities (text, speech, and video), 4 languages (English, German, Italian, and Chinese), and 13 tasks organized into 4 macro-tasks.

A subset of MCIF has been used for the evaluation of the IWSLT 2025 Instruction-Following Shared Task.

📰 News

2025.10.22: 🤗 MCIF test set is released on HuggingFace
2025.10.21: ⭐️ MCIF Evaluation first release

📦 Repository Structure

The evaluation is the core component of this repository. All other components (i.e., dataset construction and baseline inference) are included to ensure full reproducibility and transparency of the evaluation results.

For details on dataset generation or baseline models, please refer to the dedicated READMEs (baselines may require specific dependencies):

  • 🧱 Dataset Construction — scripts and guidelines for creating test sets and references → dataset_build/README.md

  • 🚀 Baselines — inference scripts and outputs for baseline systems → baselines/README.md

  • 📊 Evaluation — scoring and comparison utilities for submitted outputs → README.md

⚙️ Installation

The repository can be installed by running pip install -e . from the repository root.
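For example, from a local clone of the repository (the directory name below is a placeholder):

cd MCIF
pip install -e .

Since the package is also published on PyPI as mcif_bench (see the file listing below), pip install mcif_bench should install it as well, although the README documents only the editable install.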

▶️ Usage

For the evaluation, you can simply run:

mcif_eval -t {short/long} -l {en/de/it/zh} -s model_outputs.xml

where model_outputs.xml contains your model's outputs for the selected track, i.e. context length (short or long), and target language: English (en), German (de), Italian (it), or Chinese (zh).

This will automatically download the references from the HuggingFace repository for the latest MCIF version. To use a different version, specify it with -v. To run the evaluation without internet access, first download the MCIF references and then pass them to mcif_eval with the -r parameter.
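For example (hypothetical invocations; {VERSION} stands for an MCIF release identifier and {PATH_TO_REFERENCES} for the locally downloaded references):

mcif_eval -t short -l de -s model_outputs.xml -v {VERSION}
mcif_eval -t long -l zh -s model_outputs.xml -r {PATH_TO_REFERENCES}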

The file containing the model outputs to evaluate must be structured as follows:

<?xml version='1.0' encoding='utf-8'?>
<testset name="MCIF" type="output">
  <task track="{short/long}" text_lang="{en/de/it/zh}">
    <sample id="1">{SAMPLE1_CONTENT}</sample>
    <sample id="2">{SAMPLE2_CONTENT}</sample>
    ...
  </task>
</testset>

For ease of use, we provide a helper function (mcif.io.write_output) that automatically formats model predictions into the XML structure required by the MCIF evaluation script; a usage sketch is given after the list below. The method takes as input:

  • samples: a list of mcif.io.OutputSample objects, each holding a sample id and the corresponding prediction;
  • track: the context length or track (short/long);
  • language: the target language (en/de/it/zh);
  • output_name: a human-readable name for the system output (e.g. My model);
  • output: a path or byte buffer to which the XML file containing all of the system's outputs, ready for evaluation, is written.
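A minimal sketch of how this helper might be called is shown below. The keyword arguments follow the list above; the OutputSample field names (id, prediction) and the placeholder prediction strings are assumptions, so check mcif.io for the actual attribute names.

from mcif.io import OutputSample, write_output

# Model predictions for each test sample; the field names "id" and
# "prediction" are assumptions about OutputSample's constructor.
samples = [
    OutputSample(id="1", prediction="{SAMPLE1_PREDICTION}"),
    OutputSample(id="2", prediction="{SAMPLE2_PREDICTION}"),
]

# Write the XML file expected by mcif_eval.
write_output(
    samples=samples,
    track="short",           # or "long"
    language="de",           # one of en/de/it/zh
    output_name="My model",  # human-readable name of the system
    output="model_outputs.xml",
)

The resulting model_outputs.xml can then be passed to mcif_eval with -s, as shown in the usage section above.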

📜 License

MCIF is released under the Apache 2.0 License.

🧩 Citation

If you use MCIF in your research, please cite:

@misc{mcif,
      title={MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks}, 
      author={Sara Papi and Maike Züfle and Marco Gaido and Beatrice Savoldi and Danni Liu and Ioannis Douros and Luisa Bentivogli and Jan Niehues},
      year={2025},
      eprint={2507.19634},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.19634}, 
}

Project details


Release history

This version: 1.0

Download files

Download the file for your platform.

Source Distribution

mcif_bench-1.0.tar.gz (19.3 kB)

Uploaded Source

Built Distribution


mcif_bench-1.0-py3-none-any.whl (18.4 kB)

Uploaded Python 3

File details

Details for the file mcif_bench-1.0.tar.gz.

File metadata

  • Download URL: mcif_bench-1.0.tar.gz
  • Upload date:
  • Size: 19.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for mcif_bench-1.0.tar.gz:

  • SHA256: d216e0b6dfe333af3425d4321766b13c7f16705eb3fb23bb12635cb3b7b44958
  • MD5: 2e67a2d660621ba2cca354041a231536
  • BLAKE2b-256: 52800b56bae6d6c3ef542823c77caf073b1bf7b83b8be11e4381e04d47dee8ea


File details

Details for the file mcif_bench-1.0-py3-none-any.whl.

File metadata

  • Download URL: mcif_bench-1.0-py3-none-any.whl
  • Upload date:
  • Size: 18.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for mcif_bench-1.0-py3-none-any.whl:

  • SHA256: f70fbce0e471eaa10a4f590b573f80ed94b07ad1b0a0e9b279bc8ef6c3f5f7f9
  • MD5: 9e17ea447055393dfac8d55a154a39f4
  • BLAKE2b-256: fe6bce079cd44a82c2b4b8f5fd57b7a270ed3e2ac14c9152e5e1b79c315a0b24

