Relational Visual Similarity - A new visual similarity notion that captures the internal relational logic of a scene

relsim


We introduce a new visual similarity notion — relational visual similarity (relsim) — which captures the internal relational logic of a scene rather than its surface appearance.

Relational Visual Similarity (arXiv 2025)
Thao Nguyen¹, Sicheng Mo³, Krishna Kumar Singh², Yilin Wang², Jing Shi², Nicholas Kolkin², Eli Shechtman², Yong Jae Lee^{1,2, ★}, Yuheng Li^{1, ★}
(★ Equal advising)
1- University of Wisconsin–Madison | 2- Adobe Research | 3- UCLA

TL;DR: We introduce a new visual similarity notion: relational visual similarity, which complements traditional attribute-based perceptual similarity (e.g., LPIPS, CLIP, DINO).

📦 Installation

Option 1: Install from GitHub (recommended)

pip install git+https://github.com/thaoshibe/relsim.git

Option 2: Install from local directory

git clone https://github.com/thaoshibe/relsim.git
cd relsim
pip install -e .  # Install in editable mode

Option 3: Install from PyPI (when published)

pip install relsim

🛠️ Usage

Given two images, you can compute their relational visual similarity like this:

from relsim.relsim_score import relsim
from PIL import Image

# Load model
model, preprocess = relsim(pretrained=True, checkpoint_dir="thaoshibe/relsim-qwenvl25-lora")

img1 = preprocess(Image.open("./anonymous_caption/bo.jpg"))
img2 = preprocess(Image.open("./anonymous_caption/mam.jpg"))
similarity = model(img1, img2)  # Returns similarity score (higher = more similar)
print(f"✅ Similarity score: {similarity:.4f}")

For example, comparing a reference image with six test images should give scores like these:

reference image	test image 1	test image 2	test image 3	test image 4	test image 5	test image 6
(to itself: 1.000)	0.981	0.830	0.808	0.767	0.465	0.223
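
The scores above suggest a natural use: ranking a set of candidate images by their relational similarity to one reference. Below is a minimal sketch that assumes only the `model(img1, img2)` call shown earlier. `rank_by_similarity` is a hypothetical helper, not part of the relsim package; it takes the scoring function as an argument so it can be tried with a stand-in scorer.

```python
from typing import Any, Callable, List, Sequence, Tuple


def rank_by_similarity(
    score_fn: Callable[[Any, Any], float],
    reference: Any,
    candidates: Sequence[Any],
) -> List[Tuple[int, float]]:
    """Return (candidate_index, score) pairs sorted from most to least similar."""
    scores = [(i, float(score_fn(reference, c))) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Stand-in scorer for demonstration only (hypothetical). With relsim
    # installed, you would pass the loaded `model` and preprocessed images.
    def toy_score(a: float, b: float) -> float:
        return 1.0 - abs(a - b)

    print(rank_by_similarity(toy_score, 0.5, [0.1, 0.5, 0.9, 0.4]))
```

With the real model, this would presumably be called as `rank_by_similarity(model, ref_img, [img_a, img_b, ...])`, where every image has already been passed through `preprocess`.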

🤗 You're welcome to improve the current relsim model! The training code is provided in the ./relsim/ folder. To jump straight to the training script (reminder: you need to download the data first, as described in the Data section, to run this code successfully):

cd relsim
# pip install -r requirements_train.txt
bash train.sh  # this assumes you already have the dataset

# You might want to export WANDB_API_KEY and HF_TOKEN
# export WANDB_API_KEY='your_wandb_api_key'
# export HF_TOKEN='your_hf_token'

If you use wandb to log the results, your wandb dashboard should look like this:

🫥 Anonymous Caption Model

Anonymous captions are image captions that do not refer to specific visible objects but instead capture the relational logic conveyed by the image.

The pretrained anonymous caption model (Qwen-VL-2.5 7B) is provided in ./anonymous_caption. This model is trained on a limited number of seed groups and their corresponding generated captions (you can see the training data here).

# run on default test image (mam.jpg)
python anonymous_caption/anonymous_caption.py

# run on your own images
python anonymous_caption/anonymous_caption.py --image_path $PATH_TO_IMAGE_OR_IMAGE_FOLDER

# if you need to see all arguments (e.g., batch size)
python anonymous_caption/anonymous_caption.py --help

Here are examples of the captions generated across different runs.

Input image	Generated captions (different runs)

Example: python anonymous_caption/anonymous_caption.py --image_path anonymous_caption/mam.jpg
Run 1: "Curious {Animal} peering out from behind a {Object}."
Run 2: "Curious {Animal} peeking out from behind the {Object} in an unexpected and playful way."
Run 3: "Curious {Cat} looking through a {Doorway} into the {Room}."
Run 4: "A curious {Animal} peeking from behind a {Barrier}."
Run 5: "A {Cat} peeking out from behind a {Door} with curious eyes."
...

Example: python anonymous_caption/anonymous_caption.py --image_path anonymous_caption/bo.jpg
Run 1: "Animals with {Leaf} artfully placed on their {Head}."
Run 2: "A {Dog} with a {Leaf} delicately placed on its head."
Run 3: "A {Dog} with a {Leaf} artfully placed on its head."
Run 4: "A {Dog} with a {Leaf} delicately placed on their head, representing the beauty of {Season}."
Run 5: "Animals adorned with {Leaf} in a {Seasonal} setting."
...
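
In the examples above, the curly-brace tokens (such as `{Animal}` or `{Object}`) stand in for concrete objects. Below is a minimal sketch of pulling those placeholders out of a caption with a regular expression. `extract_placeholders` is a hypothetical helper for illustration and is not part of the repository.

```python
import re
from typing import List

# Matches one {Placeholder} token and captures the name inside the braces.
PLACEHOLDER_RE = re.compile(r"\{([^{}]+)\}")


def extract_placeholders(caption: str) -> List[str]:
    """Return the placeholder names in order of appearance."""
    return PLACEHOLDER_RE.findall(caption)


if __name__ == "__main__":
    caption = "Curious {Cat} looking through a {Doorway} into the {Room}."
    print(extract_placeholders(caption))
```

A check like this could help flag captions that are not "anonymous enough", for instance ones that contain no placeholders at all.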

You are more than welcome to help improve the anonymous caption model! The current model may hallucinate or produce incorrect results, and sometimes it may generate captions that are not "anonymous enough"...

The training script for the anonymous caption model is shown below. Please check config.yaml for config details.

#########################################
#
#     train anonymous caption model 
#
#########################################

# (optional) install git lfs if you don't have it
sudo apt update
sudo apt install git-lfs
git lfs install

# clone the repo if you haven't done so already
git clone https://github.com/thaoshibe/relsim.git
cd relsim

# download the training data
cd anonymous_caption
git clone https://huggingface.co/datasets/thaoshibe/seed-groups
pip install -r requirements.txt
# run train
python anonymous_caption_train.py

*If you choose to log to wandb, your wandb dashboard should look like the image below. Checkpoints will be saved in `./anonymous_caption/ckpt`.*

And your console should look like this:

📁 Data

🔍 You can see the snapshot of the data on this live website: 🔍🔍🔍 relsim: data viewer

Dataset name	Short description	JSON file	🔍 Data viewer
seed-groups	Used to train the anonymous captioning model	seed_group.json	See Seed Groups Dataset
anonymous-captions-114k	Used to train the relational similarity model	anonymous_captions_train.jsonl, anonymous_captions_test.jsonl	See Anonymous Captions Dataset

Each image is referenced by its image URL. Please see the JSON files in ./data.

(Optional) Depending on your internet speed, downloading all images should take under 0.5 hours with the default MAX_WORKER = 64. You can raise MAX_WORKER to speed up the download, or lower it to suit your machine (see data/download_data.sh).

To download the data, run data/download_data.sh:

#########################################
#
#            download data
#
#########################################

git clone https://github.com/thaoshibe/relsim.git
cd relsim
bash data/download_data.sh # this script will download all dataset
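
The MAX_WORKER setting mentioned above controls how many downloads run in parallel. The following is a sketch of that general pattern using a thread pool. It is not the contents of data/download_data.sh. `download_all` and the `fetch` callable are hypothetical names, and the fetcher is passed in so the pattern can be exercised without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    max_workers: int = 64,
) -> Dict[str, Optional[bytes]]:
    """Fetch every URL concurrently; a failed URL maps to None instead of raising."""
    results: Dict[str, Optional[bytes]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the URL it is fetching.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # A URL may be unreachable; record the failure and move on.
                results[url] = None
    return results
```

A real fetcher could be as simple as `lambda url: urllib.request.urlopen(url, timeout=10).read()`. Downloads are I/O-bound, so threads are a reasonable fit here, and a lower `max_workers` reduces the load on a slower machine or connection.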

Disclaimer

All images are extracted from the LAION dataset. We do NOT own any of the images and we acknowledge the rights and contributions of the original creators. Please respect the authors of all images. These images are used for research purposes only.

BibTeX

@article{nguyen2025relsim,
  title={Relational Visual Similarity},
  author={Nguyen, Thao and Mo, Sicheng and Singh, Krishna Kumar and Wang, Yilin and Shi, Jing and Kolkin, Nicholas and Shechtman, Eli and Lee, Yong Jae and Li, Yuheng},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2025}
}

---
The end; Thank you!

Version 0.1.0, released Dec 1, 2025

