Package for calculating CLIP-Score using PyTorch

These details have not been verified by PyPI

Project links

Homepage

Project description

CLIP Score for PyTorch

This repository provides a batch-wise quick processing for calculating CLIP scores. It uses the pretrained CLIP model to measure the cosine similarity between two modalities. The project structure is adapted from pytorch-fid and CLIP.

🔄 Update Notes (April 2025)

The latest version introduces several major improvements:

✅ Hugging Face integration: The CLIP model is now loaded from the Hugging Face Hub, using the implementation provided by the transformers library. This makes installation and model loading significantly more convenient and consistent across environments.

✅ Default model: The default model is now set to openai/clip-vit-base-patch32.

✅ Flexible text input: The text input is no longer restricted to a folder. You can now provide a single sentence as the prompt, enabling computation of the CLIP score between a batch of images and a fixed textual description.

✅ Normalized CLIP score: The previous 100x scaling factor for the CLIP score has been removed. The current version outputs cosine similarity values directly, ensuring that the resulting CLIP scores remain strictly less than 1.

🛠️ TODO List

Add support for additional vision-language models such as DINO, BLIP, and their variants.

Benchmark and compare multiple model types on common datasets.

Extend evaluation scripts to support batch text prompts and multiple-language inputs.

Add visualization tools for score distributions and attention maps.

Installation

Requirements:

Install PyTorch:

pip install torch  # Choose a version that suits your GPU

Install clip-score from PyPI:
```
pip install clip-score
```
[Optional] If you want to modify this project on your own, you can install it with:
```
git clone https://github.com/Taited/clip-score && cd clip-score
pip install -e .
```

Data Input Specifications

This project is designed to process paired images and text files, and therefore requires two directories: one for images and one for text files.

Image Files

All images should be stored in a single directory. The image files can be in either .png or .jpg format.

Text Files

All text data should be contained in plain text files in a separate directory. These text files should have the extension .txt.

File Number and Naming

The number of files in the image directory should be exactly equal to the number of files in the text directory. Additionally, the files in the image directory and text directory should be paired by file name. For instance, if there is a cat.png in the image directory, there should be a corresponding cat.txt in the text directory.

Directory Structure Example

Below is an example of the expected directory structure:

├── path/to/image
│   ├── cat.png
│   ├── dog.png
│   └── bird.jpg
└── path/to/text
    ├── cat.txt
    ├── dog.txt
    └── bird.txt

In this example, cat.png is paired with cat.txt, dog.png is paired with dog.txt, and bird.jpg is paired with bird.txt.

Please adhere to the specified structure to ensure correct operation of the program. If there are any questions or issues, feel free to raise an issue here on GitHub.

Usage

To compute the CLIP score between images and texts, make sure that the image and text data are contained in two separate folders, and each sample has the same name in both modalities. Run the following command:

python -m clip_score path/to/image path/to/text

Or, if you want to calculate the CLIP score across all images using the same prompt, you can run:

python -m clip_score path/to/image "your prompt here"

If GPU is available, the project is set to run automatically on a GPU by default. If you want to specify a particular GPU, you can use the --device cuda:N flag when running the script, where N is the index of the GPU you wish to use. In case you want to run the program on a CPU instead, you can specify this by using the --device cpu flag.

Computing CLIP Score within the Same Modality

If you want to calculate the CLIP score within the same modality (e.g., image-image or text-text), follow the same folder structure as mentioned above. Additionally, specify the preferred modalities using the --real_flag and --fake_flag options. By default, --real_flag=img and --fake_flag=txt. Examples:

python -m clip_score path/to/imageA path/to/imageB --real_flag img --fake_flag img
python -m clip_score path/to/textA path/to/textB --real_flag txt --fake_flag txt

Citing

If you use this repository in your research, consider citing it using the following Bibtex entry:

@misc{taited2023CLIPScore,
  author={SUN Zhengwentai},
  title={{clip-score: CLIP Score for PyTorch}},
  month={March},
  year={2023},
  note={Version 0.2.1},
  howpublished={\url{https://github.com/taited/clip-score}},
}

License

This implementation is licensed under the Apache License 2.0.

The project structure is adapted from mseitzer's pytorch-fid project. The CLIP model is adapted from OpenAI's CLIP.

The CLIP Score was introduced in OpenAI's Learning Transferable Visual Models From Natural Language Supervision.

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

This version

0.2.1

Apr 14, 2025

0.1.1

Jul 20, 2023

0.1.0

Mar 9, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

clip_score-0.2.1.tar.gz (12.8 kB view details)

Uploaded Apr 14, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

clip_score-0.2.1-py2.py3-none-any.whl (11.4 kB view details)

Uploaded Apr 14, 2025 Python 2Python 3

File details

Details for the file clip_score-0.2.1.tar.gz.

File metadata

Download URL: clip_score-0.2.1.tar.gz
Upload date: Apr 14, 2025
Size: 12.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.14

File hashes

Hashes for clip_score-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`86838f9739e337be1a9741c3b9f49acd6873a25c617719dd99ace49d0e12a2a3`
MD5	`8af6b0f911e45628b5223a15b437842a`
BLAKE2b-256	`9fb089ae7061bd15e31d8a904194c20e97227852f8ec1cfa1826d1c13647fa93`

See more details on using hashes here.

File details

Details for the file clip_score-0.2.1-py2.py3-none-any.whl.

File metadata

Download URL: clip_score-0.2.1-py2.py3-none-any.whl
Upload date: Apr 14, 2025
Size: 11.4 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.14

File hashes

Hashes for clip_score-0.2.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`99ca8b598ab917d33a2848a919cb530fcba4a998b2485722fbb958f669089683`
MD5	`d3d2f4200186aa8caf1caeea14a66cfe`
BLAKE2b-256	`104bba5a1f97e31ca9fbd0bf6154db468205ed359ef7997a3c5bdfc075572a39`

See more details on using hashes here.

clip-score 0.2.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

CLIP Score for PyTorch

🔄 Update Notes (April 2025)

🛠️ TODO List

Installation

Data Input Specifications

Image Files

Text Files

File Number and Naming

Directory Structure Example

Usage

Computing CLIP Score within the Same Modality

Citing

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes