Compare image prompts using CLIP

clip-gaze

An art analysis tool powered by CLIP.

Motivation

Diffusion models (such as Stable Diffusion) use OpenAI's CLIP to perform textual analysis of their training data. Precisely what these machine learning systems learned from that data is opaque. This tool helps us understand how CLIP, and therefore the models that use CLIP, see images.

What does it do?

Given an image and a series of (text) phrases, it calculates the relative likelihood that each phrase is a good description of the image. Note that this is not the same thing as "given a text phrase, calculate the accuracy of that phrase": the scores only compare the supplied phrases with one another.
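
To make "relative likelihood" concrete, here is a minimal sketch of a softmax over similarity scores, which is how CLIP-style pipelines commonly turn raw scores into percentages. Whether clip_gaze computes its percentages exactly this way is an assumption, and the function name and scores below are invented for illustration.

```python
import math

def relative_likelihoods(scores):
    """Softmax: turn raw image-text similarity scores into shares that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {phrase: math.exp(s - peak) for phrase, s in scores.items()}
    total = sum(exps.values())
    return {phrase: e / total for phrase, e in exps.items()}

# Hypothetical scores, invented for illustration only (not real CLIP output)
scores = {"on canvas": 4.0, "on paper": 2.0, "on wood": 1.0}
three_way = relative_likelihoods(scores)

# The same phrase scored on its own, with no competitors
alone = relative_likelihoods({"on wood": 1.0})
```

In this sketch "on canvas" gets roughly 84% of the three-way split, while "on wood" gets 100% when it is the only phrase supplied. A percentage therefore says how a phrase fares against the rest of the list, not how accurate it is in isolation.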

An example

Let's show it the painting "Brücke über die Marne bei Creteil" by Cézanne. If we download the 2,175 × 1,713 pixel version of the painting and open it as image (e.g. using PIL.Image.open from the pillow package), we can then pass it to the gaze function.

# Assuming you have already saved the image (the filename below is a placeholder)
from PIL import Image
import clip_gaze

image = Image.open("painting.jpg")
clip_gaze.gaze(image, clip_gaze.MOVEMENTS)

Just looking at the highest probability outputs for clip_gaze.ARTISTS_BY_TRAINING_PREVALENCE, clip_gaze.MOVEMENTS, and clip_gaze.SURFACES, we see:

{'artist': ['by paul cézanne (82%)'], 'movement': ['tonalism movement (16%)'], 'surface': ['on canvas (86%)']}

So the terms "by paul cézanne", "tonalism movement", and "on canvas" are the most likely to describe the input image.

For more examples and fuller documentation, please see the project's page on GitHub.

How does it work?

CLIP is a tool provided by OpenAI that calculates the similarity between an image and some text. This is a machine learning system trained on an enormous amount of data, and that data will contain biases (intentional and unintentional). It is not a source of truth, but a useful tool to give you ideas about where to search next.
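
As a rough illustration of what "similarity between an image and some text" means: CLIP maps both the image and each phrase to vectors in a shared space and compares them with cosine similarity. The sketch below uses tiny made-up vectors in place of real CLIP embeddings (which have hundreds of dimensions); the vectors and phrases are assumptions for illustration, not output from the model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional stand-ins for an image embedding and two text embeddings
image_vec = [0.9, 0.1, 0.3]
text_vecs = {
    "an oil painting of a bridge": [0.8, 0.2, 0.4],
    "a photo of a cat": [0.1, 0.9, 0.2],
}

# Rank the phrases from most to least similar to the image
ranked = sorted(
    text_vecs,
    key=lambda phrase: cosine_similarity(image_vec, text_vecs[phrase]),
    reverse=True,
)
```

With these toy vectors the bridge phrase ranks first because its vector points in nearly the same direction as the image vector.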

This tool works by downloading CLIP onto your computer and running it locally. This can be demanding, especially on older computers. See the "Arguments for gaze" section of the project's README for a way to reduce the memory load.

Biases

This software is built on a machine learning system, and the biases in this tool come in two parts:

  1. CLIP itself comes with its own biases, and we refer the user to OpenAI's own work on explaining and mitigating those biases.
  2. The lists of chosen phrases.

The lists used in this software are primarily from Wikipedia and from the training data that CLIP used. Neither of these sources is perfect, and care should be taken when using this software to account for these biases where possible. Although the lists are long (e.g. the list of 6000 artists), we make no claims of completeness or relative importance.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

clip-gaze-0.3.0.tar.gz (117.2 kB)

Uploaded Source

Built Distribution

clip_gaze-0.3.0-py3-none-any.whl (138.7 kB)

Uploaded Python 3

File details

Details for the file clip-gaze-0.3.0.tar.gz.

File metadata

  • Download URL: clip-gaze-0.3.0.tar.gz
  • Upload date:
  • Size: 117.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for clip-gaze-0.3.0.tar.gz
Algorithm Hash digest
SHA256 07589aad151b01e85d91e7e964a51b46f772e0806bb578aae427c9b0ee5b6ff8
MD5 6f109bf0a8e20da8b7cb132a96449760
BLAKE2b-256 2b1a9f8fb828c0db139723bb0895975c2adac84af1d9dea367af9fc71e39c233

See more details on using hashes here.

File details

Details for the file clip_gaze-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: clip_gaze-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 138.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for clip_gaze-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5fa0ae0f65fff025f24e57cf6fc0afb87acb33dbc7a23b346a9e53531069ce24
MD5 ee24a38eb4318c5bd1c7f5672cc5c483
BLAKE2b-256 1227ad200aaa383b8ddbb59d0d71ab7ce7d5d141258117996bde036a1bee1b78

