Massive Text Embedding Benchmark

MTEB

Multimodal toolbox for evaluating embeddings and retrieval systems

Installation | Usage | Leaderboard | Documentation | Citing

Installation

You can install mteb using pip or uv. For more on installation, see the documentation.

pip install mteb

For faster installation, you can also use uv:

uv add mteb
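After installing, one way to confirm which version ended up in the environment is to query the package metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is hypothetical and not part of mteb.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mteb")
    print(f"mteb version: {found}" if found else "mteb is not installed")
```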

Example Usage

Below we present a simple use-case example. For more information, see the documentation.

import mteb
from sentence_transformers import SentenceTransformer

# Select model
model_name = "sentence-transformers/all-MiniLM-L6-v2"
model = mteb.get_model(model_name)  # if the model is not implemented in MTEB, this is equivalent to SentenceTransformer(model_name)

# Select tasks
tasks = mteb.get_tasks(tasks=["Banking77Classification.v2"])

# evaluate
results = mteb.evaluate(model, tasks=tasks)

You can also run it using the CLI:

mteb run \
    -m sentence-transformers/all-MiniLM-L6-v2 \
    -t "Banking77Classification.v2" \
    --output-folder results

For more on how to use the CLI check out the related documentation.
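If you want to repeat the CLI call for several models from a script, a small wrapper can assemble the command shown above. This is a minimal sketch, not part of mteb: the helper names `build_mteb_command` and `run_models` are hypothetical, only the `-m`, `-t`, and `--output-folder` flags from the example are used, and `dry_run=True` only prints the commands without executing them.

```python
import shlex
import subprocess
from typing import List


def build_mteb_command(model: str, task: str, output_folder: str = "results") -> List[str]:
    """Assemble the argument list for one `mteb run` invocation."""
    return ["mteb", "run", "-m", model, "-t", task, "--output-folder", output_folder]


def run_models(models: List[str], task: str, dry_run: bool = True) -> List[str]:
    """Print (and optionally execute) one command per model; return the command strings."""
    rendered = []
    for model in models:
        cmd = build_mteb_command(model, task)
        rendered.append(shlex.join(cmd))
        print(rendered[-1])
        if not dry_run:
            # Raises CalledProcessError if the evaluation exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return rendered


if __name__ == "__main__":
    run_models(["sentence-transformers/all-MiniLM-L6-v2"], "Banking77Classification.v2")
```

Passing the arguments as a list avoids shell quoting issues; `shlex.join` is used only to display the command.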

Overview

  • 📈 Leaderboard: The interactive leaderboard of the benchmark

Get Started
  • 🏃 Get Started: Overview of how to use mteb
  • 🤖 Defining Models: How to use existing models and define custom ones
  • 📋 Selecting tasks: How to select tasks, benchmarks, splits, etc.
  • 🏭 Running Evaluation: How to run evaluations, including cache management and speeding up evaluations
  • 📊 Loading Results: How to load and work with existing model results

Overview
  • 📋 Tasks: Overview of available tasks
  • 📐 Benchmarks: Overview of available benchmarks
  • 🤖 Models: Overview of available models

Contributing
  • 🤖 Adding a model: How to submit a model to MTEB and to the leaderboard
  • 👩‍💻 Adding a dataset: How to add a new task/dataset to MTEB
  • 👩‍💻 Adding a benchmark: How to add a new benchmark to MTEB and to the leaderboard
  • 🤝 Contributing: How to contribute to MTEB and set it up for development

Citing

MTEB was introduced in "MTEB: Massive Text Embedding Benchmark", and heavily expanded in "MMTEB: Massive Multilingual Text Embedding Benchmark". When using mteb, we recommend that you cite both articles.

BibTeX citation:
@article{muennighoff2022mteb,
  author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Loïc and Reimers, Nils},
  title = {MTEB: Massive Text Embedding Benchmark},
  publisher = {arXiv},
  journal={arXiv preprint arXiv:2210.07316},
  year = {2022},
  url = {https://arxiv.org/abs/2210.07316},
  doi = {10.48550/ARXIV.2210.07316},
}

@article{enevoldsen2025mmtebmassivemultilingualtext,
  title={MMTEB: Massive Multilingual Text Embedding Benchmark},
  author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  publisher = {arXiv},
  journal={arXiv preprint arXiv:2502.13595},
  year={2025},
  url={https://arxiv.org/abs/2502.13595},
  doi = {10.48550/arXiv.2502.13595},
}

If you use any of the specific benchmarks, we also recommend that you cite the authors of both the benchmark and its tasks:

import mteb

benchmark = mteb.get_benchmark("MTEB(eng, v2)")
benchmark.citation  # get the citation for a specific benchmark

# you can also create a table of the tasks for the appendix using:
benchmark.tasks.to_latex()

Download files

Source Distribution

mteb-2.10.10.tar.gz (3.4 MB)

Uploaded Source

Built Distribution

mteb-2.10.10-py3-none-any.whl (5.2 MB)

Uploaded Python 3

File details

Details for the file mteb-2.10.10.tar.gz.

File metadata

  • Download URL: mteb-2.10.10.tar.gz
  • Upload date:
  • Size: 3.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mteb-2.10.10.tar.gz
Algorithm Hash digest
SHA256 d3b0c8c8b594b5cbd4d1584e0eb539ffa0bcb6a4e147e830fa24e00a80ddd149
MD5 3ffb533665ad1c551e02fa2feb51b5b1
BLAKE2b-256 bb1f9c33ecf68acab584960f9be62acb64c4d9123ef044482dfcbfe470a3c095
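To check a downloaded archive against the SHA256 digest listed above, you can hash the file locally and compare. This is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches` are hypothetical, and the local path `mteb-2.10.10.tar.gz` is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for mteb-2.10.10.tar.gz
EXPECTED_SHA256 = "d3b0c8c8b594b5cbd4d1584e0eb539ffa0bcb6a4e147e830fa24e00a80ddd149"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.lower()


if __name__ == "__main__":
    archive = Path("mteb-2.10.10.tar.gz")  # assumed local download location
    if archive.exists():
        print("OK" if matches(archive, EXPECTED_SHA256) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same approach works for the wheel by substituting its file name and its listed SHA256 digest.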

Provenance

The following attestation bundles were made for mteb-2.10.10.tar.gz:

Publisher: release.yml on embeddings-benchmark/mteb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mteb-2.10.10-py3-none-any.whl.

File metadata

  • Download URL: mteb-2.10.10-py3-none-any.whl
  • Upload date:
  • Size: 5.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mteb-2.10.10-py3-none-any.whl
Algorithm Hash digest
SHA256 9198c9381adc465e97153a293b2de073ce76b032e4dda6c1dd192880304ce663
MD5 c98fdd10f6bce7fb7fc7ead17325d013
BLAKE2b-256 a45294509c0bbb44f6ab94e01211cd3cf6255afc0d4c929eca06ddfdb3c66bd8

Provenance

The following attestation bundles were made for mteb-2.10.10-py3-none-any.whl:

Publisher: release.yml on embeddings-benchmark/mteb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
