Code used to train ColPali
ColPali: Efficient Document Retrieval with Vision Language Models 👀
[Model card] [ViDoRe Benchmark] [ViDoRe Leaderboard] [Demo] [Blog Post]
[!TIP] To try the pre-trained ColPali on your own documents, use the
vidore-benchmark repository. It comes with a Python package and a CLI tool for convenient evaluation.
Associated Paper
ColPali: Efficient Document Retrieval with Vision Language Models
Manuel Faysse*, Hugues Sibille*, Tony Wu*, Bilel Omrani, Gautier Viaud, Céline Hudelot, Pierre Colombo (*Equal Contribution)
This repository contains the code used for training the vision retrievers in the paper. In particular, it contains the code for training the ColPali model, which is a vision retriever based on the ColBERT architecture.
Setup
We used Python 3.11.6 and PyTorch 2.2.2 to train and test our models, but the codebase is expected to be compatible with Python >=3.9 and recent PyTorch versions.
The evaluation codebase depends on a few Python packages, which can be installed with the following command:
pip install colpali-engine
To keep the package lightweight, only the essential dependencies are installed by default. To use the training script for ColPali, you must also install the training dependencies through the train extra:
pip install "colpali-engine[train]"
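To see which dependencies the train extra actually adds, the installed package metadata can be inspected from Python. The following is a minimal sketch using only the standard library's importlib.metadata. The helper names requirements_for_extra and installed_requirements are illustrative and not part of colpali-engine, and the sketch assumes the extra is declared with the usual `extra == "train"` environment marker.

```python
import re
from importlib import metadata


def requirements_for_extra(requirements, extra):
    """Return the requirement specifiers that are gated behind `extra`.

    `requirements` is a list of strings in the format returned by
    importlib.metadata.requires(), e.g. 'somepkg>=1.0; extra == "train"'.
    """
    quote = "[\"']"  # markers may use single or double quotes
    pattern = re.compile(r"extra\s*==\s*" + quote + re.escape(extra) + quote)
    return [r.split(";", 1)[0].strip() for r in requirements if pattern.search(r)]


def installed_requirements(dist_name="colpali-engine"):
    """Return the declared requirements, or None if the package is not installed."""
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    reqs = installed_requirements()
    if reqs is None:
        print("colpali-engine is not installed in this environment")
    else:
        for req in requirements_for_extra(reqs, "train"):
            print(req)
```

This only reports what the package declares. It does not check whether those dependencies are present in the environment.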
Usage
The scripts/ directory contains scripts to run training and inference.
Inference
While there is an inference script in this repository, it's recommended to run inference using the vidore-benchmark package.
Training
All the model configs used can be found in scripts/configs/ and rely on the configue package for straightforward configuration. They should be used with the train_colbert.py script.
Example 1: Local training
USE_LOCAL_DATASET=0 python scripts/train/train_colbert.py scripts/configs/siglip/train_siglip_model_debug.yaml
or using accelerate:
accelerate launch scripts/train/train_colbert.py scripts/configs/train_colidefics_model.yaml
Example 2: Training on a SLURM cluster
sbatch --nodes=1 --cpus-per-task=16 --mem-per-cpu=32GB --time=20:00:00 --gres=gpu:1 -p gpua100 --job-name=colidefics --output=colidefics.out --error=colidefics.err --wrap="accelerate launch scripts/train/train_colbert.py scripts/configs/train_colidefics_model.yaml"
sbatch --nodes=1 --time=5:00:00 -A cad15443 --gres=gpu:8 --constraint=MI250 --job-name=colpali --wrap="python scripts/train/train_colbert.py scripts/configs/train_colpali_model.yaml"
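Long sbatch lines like the ones above are easy to mistype, so it can help to assemble them programmatically. The sketch below rebuilds the first command from keyword arguments using the standard library's shlex module. build_sbatch_command is a hypothetical helper, not something shipped in this repository, and it uses --partition, the long form of sbatch's -p option.

```python
import shlex


def build_sbatch_command(train_command, job_name, **slurm_options):
    """Build an sbatch argument list that wraps `train_command`.

    Keyword arguments are turned into long options, e.g.
    cpus_per_task=16 becomes --cpus-per-task=16.
    """
    args = ["sbatch", f"--job-name={job_name}"]
    for key, value in slurm_options.items():
        args.append(f"--{key.replace('_', '-')}={value}")
    # --wrap takes the whole training command as a single string
    args.append(f"--wrap={shlex.join(train_command)}")
    return args


if __name__ == "__main__":
    train = [
        "accelerate", "launch",
        "scripts/train/train_colbert.py",
        "scripts/configs/train_colidefics_model.yaml",
    ]
    args = build_sbatch_command(
        train,
        "colidefics",
        nodes=1,
        cpus_per_task=16,
        mem_per_cpu="32GB",
        time="20:00:00",
        gres="gpu:1",
        partition="gpua100",
        output="colidefics.out",
        error="colidefics.err",
    )
    # Print a shell-safe command line; pass `args` to subprocess.run to submit
    print(shlex.join(args))
```

Building the command as a list and quoting it with shlex.join keeps the wrapped training command as a single argument, which is what --wrap expects.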
Paper result reproduction
To reproduce the results from the paper, check out the v0.1.1 tag or install the corresponding colpali-engine release:
pip install colpali-engine==0.1.1
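Before launching a reproduction run, it may be worth confirming that the environment really has the pinned release. A small sketch, assuming the distribution name colpali-engine from the command above; the function names are illustrative only:

```python
from importlib import metadata

PAPER_VERSION = "0.1.1"  # release used for the paper results


def installed_version(dist_name="colpali-engine"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_paper_release(version, expected=PAPER_VERSION):
    """True only when the installed version is exactly the paper release."""
    return version == expected


if __name__ == "__main__":
    version = installed_version()
    if matches_paper_release(version):
        print(f"colpali-engine {version} matches the paper release")
    else:
        print(f"expected colpali-engine {PAPER_VERSION}, found {version}")
```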
Citation
ColPali: Efficient Document Retrieval with Vision Language Models
- First authors: Manuel Faysse*, Hugues Sibille*, Tony Wu* (*Equal Contribution)
- Contributors: Bilel Omrani, Gautier Viaud, Céline Hudelot, Pierre Colombo
@misc{faysse2024colpaliefficientdocumentretrieval,
title={ColPali: Efficient Document Retrieval with Vision Language Models},
author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
year={2024},
eprint={2407.01449},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2407.01449},
}
Download files
File details
Details for the file colpali_engine-0.1.1.tar.gz.
File metadata
- Download URL: colpali_engine-0.1.1.tar.gz
- Upload date:
- Size: 30.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5f44829727ff6e3c24b32702733a357f7d318dc87c80ef0d987e25bdd85eacdf |
| MD5 | b76642c06e24844ba27685627b2db777 |
| BLAKE2b-256 | 9daa520fbd4681d8d3322fa36be408f7c10a864bffe6330205bc2c7d76129a17 |
File details
Details for the file colpali_engine-0.1.1-py3-none-any.whl.
File metadata
- Download URL: colpali_engine-0.1.1-py3-none-any.whl
- Upload date:
- Size: 34.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4006bb709791d8b63e8152e61bdba85cb52b7bc353437806508ed6f77b5d0e9b |
| MD5 | f72bbfd324c817ef4d3ef2ffcd290009 |
| BLAKE2b-256 | df81739907cddf582335bf5b46c6b91c9d8b60c87e66951a1bff99bf5fc0a5b0 |
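The SHA256 digests listed above can be compared against a manually downloaded file. The sketch below uses only the standard library's hashlib. The names sha256_of and verify_download are illustrative, and the script assumes the downloaded files keep their original file names.

```python
import hashlib
import sys
from pathlib import Path

# SHA256 digests copied from the file details above
EXPECTED_SHA256 = {
    "colpali_engine-0.1.1.tar.gz":
        "5f44829727ff6e3c24b32702733a357f7d318dc87c80ef0d987e25bdd85eacdf",
    "colpali_engine-0.1.1-py3-none-any.whl":
        "4006bb709791d8b63e8152e61bdba85cb52b7bc353437806508ed6f77b5d0e9b",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's SHA256 matches the digest listed for its name."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no known digest for {path.name}")
    return sha256_of(path) == expected


if __name__ == "__main__":
    # Usage: python verify.py colpali_engine-0.1.1.tar.gz
    for arg in sys.argv[1:]:
        try:
            status = "OK" if verify_download(arg) else "MISMATCH"
        except KeyError:
            status = "UNKNOWN FILE"
        print(f"{arg}: {status}")
```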