Image-text alignment metric trained on cycle consistency preferences

Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Paper | Project Page | Dataset (I2T) | Dataset (T2I) | Dataset Viewer

Hyojin Bahng*, Caroline Chan*, Fredo Durand, Phillip Isola.
(*Equal contribution, alphabetical order.)
MIT CSAIL.

CycleReward is a reward model trained on preferences derived from cycle consistency. Given a forward mapping $$F: X \rightarrow Y$$ and a backward mapping $$G: Y \rightarrow X$$, we define the cycle consistency score as the similarity between the original input $$x$$ and its reconstruction $$G(F(x))$$. This score serves as a proxy for preference: higher cycle consistency indicates a preferred output. It provides a cheaper and more scalable signal for learning image-text alignment than human supervision. We construct CyclePrefDB, a large-scale preference dataset comprising 866K comparison pairs spanning image-to-text and text-to-image generation, with an emphasis on dense captions and prompts. Trained on this dataset, CycleReward matches or surpasses models trained on human or AI feedback.
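
To make the definition concrete, here is a minimal sketch of how a cycle consistency score could turn candidate outputs into a preference pair. It is a toy under stated assumptions, not the authors' pipeline: `cosine_similarity`, `cycle_consistency_score`, and `preference_pair` are hypothetical names, plain numeric vectors stand in for real images and captions, and cosine similarity is an assumed choice of similarity measure.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cycle_consistency_score(
    x: Vector,
    y: Vector,
    backward: Callable[[Vector], Vector],
    similarity: Callable[[Vector, Vector], float] = cosine_similarity,
) -> float:
    """Similarity between the input x and its reconstruction G(y), where y = F(x)."""
    return similarity(x, backward(y))


def preference_pair(
    x: Vector,
    candidates: List[Vector],
    backward: Callable[[Vector], Vector],
    similarity: Callable[[Vector, Vector], float] = cosine_similarity,
) -> Tuple[Vector, Vector]:
    """Return (preferred, rejected): the candidates with the highest and lowest score."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a preference pair")
    ranked = sorted(
        candidates,
        key=lambda y: cycle_consistency_score(x, y, backward, similarity),
        reverse=True,
    )
    return ranked[0], ranked[-1]


# Toy demo (assumption): two hypothetical forward models produced these outputs
# for x, and the backward mapping simply copies y back into the input space.
x = [1.0, 2.0, 3.0]
faithful = [1.0, 2.0, 3.0]  # keeps all the information in x
lossy = [1.0, 2.0, 0.0]     # drops the last component
preferred, rejected = preference_pair(x, [lossy, faithful], backward=list)
print("preferred:", preferred, "rejected:", rejected)
```

In this toy setting the output that preserves more of the input reconstructs it more closely, so it ends up as the preferred element of the pair.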

Quick Start

Run pip install cyclereward. The following Python code is all you need.

The basic use case is to measure the alignment between an image and a caption. A higher score indicates better alignment; a lower score indicates worse alignment. We release three model variants: CycleReward-Combo, CycleReward-I2T, and CycleReward-T2I.

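from cyclereward import cyclereward
from PIL import Image
import torch

# Use the GPU when available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
# Load a model variant together with its image preprocessing function.
model, preprocess = cyclereward(device=device, model_type="CycleReward-Combo")

caption = "a photo of a cat"
# Replace "image_path" with the path to your image; unsqueeze(0) adds a batch dimension.
image = preprocess(Image.open("image_path")).unsqueeze(0).to(device)
# Score the image-caption pair (higher means better aligned).
score = model.score(image, caption)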
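
When several candidate captions describe the same image, the per-pair score can be wrapped in a small ranking helper. The sketch below is an illustration rather than part of the cyclereward package: `rank_captions` is a hypothetical helper, and it assumes that whatever the scoring callable returns (for example the result of `model.score`) can be converted with `float()`.

```python
from typing import Any, Callable, List, Sequence, Tuple


def rank_captions(
    score_fn: Callable[[Any, str], Any],
    image: Any,
    captions: Sequence[str],
) -> List[Tuple[str, float]]:
    """Score every caption against one image and sort from best to worst aligned.

    Assumption: float() works on the value returned by score_fn, as it does for
    plain numbers and single-element tensors.
    """
    scored = [(caption, float(score_fn(image, caption))) for caption in captions]
    # Higher score means better alignment, so sort in descending order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the model and preprocessed image from the snippet above, `rank_captions(model.score, image, ["a photo of a cat", "a photo of a dog"])` would return the captions ordered by score, provided the `float()` assumption holds for the installed version.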

CyclePrefDB Dataset

CycleReward is trained on CyclePrefDB, a large-scale preference dataset based on cycle consistency. We provide comparison pairs for both image-to-text (I2T) and text-to-image (T2I) generation, with a focus on dense captions and prompts.

Dataset            Number of Pairs
CyclePrefDB-I2T    398K
CyclePrefDB-T2I    468K

You can use the Hugging Face Datasets library to load the datasets:

from datasets import load_dataset

# Load dataset
dataset = load_dataset("carolineec/CyclePrefDB-I2T", split='train')
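
The column layout of each split is not listed here, so it can help to inspect a loaded split before writing training code. The following is a small sketch of that. `describe_split` is a hypothetical helper; it relies only on `len()`, the `column_names` attribute, and integer indexing, all of which Hugging Face `Dataset` objects provide.

```python
from typing import Any, Dict


def describe_split(ds: Any) -> Dict[str, Any]:
    """Summarize a dataset split: row count, column names, and first-row field types."""
    num_rows = len(ds)
    # Integer indexing returns one record as a dict mapping column name to value.
    first_row = ds[0] if num_rows > 0 else {}
    return {
        "num_rows": num_rows,
        "columns": list(ds.column_names),
        "first_row_types": {key: type(value).__name__ for key, value in first_row.items()},
    }
```

Calling `describe_split(dataset)` on the split loaded above returns the number of rows, the column names, and the Python type of each field in the first record, without assuming any particular schema.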

Citation

If you find our work or any of our materials useful, please cite our paper:

@article{bahng2025cycle,
    title={Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences},
    author={Bahng, Hyojin and Chan, Caroline and Durand, Fredo and Isola, Phillip},
    journal={arXiv preprint arXiv:2506.02095},
    year={2025}
}
