
clip-retrieval

Easily computing clip embeddings and building a clip retrieval system with them.

  • clip inference allows you to quickly (1500 samples/s on a 3080) compute image and text embeddings
  • clip index builds efficient indices out of the embeddings
  • clip filter allows you to filter the data using the clip index
  • clip back hosts the indices with a simple flask service
  • clip service is a simple UI for querying the back

End to end, this makes it possible to build a simple semantic search system. Interested in learning about semantic search in general? You can read my medium post on the topic.

Install

pip install clip-retrieval

clip inference

Get some images in an example_folder, for example by doing:

pip install img2dataset
echo 'https://placekitten.com/200/305' >> myimglist.txt
echo 'https://placekitten.com/200/304' >> myimglist.txt
echo 'https://placekitten.com/200/303' >> myimglist.txt
img2dataset --url_list=myimglist.txt --output_folder=image_folder --thread_count=64 --image_size=256

You can also put text files with the same names as the images in that folder, to get the text embeddings.
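
For example, a caption file can be written next to each image before running inference. A minimal sketch (assumes the downloaded images are .jpg files at the top level of image_folder; adjust the glob to your layout):

from pathlib import Path

# Write one caption per image, using the same base name with a .txt extension
for img in Path("image_folder").glob("*.jpg"):
    img.with_suffix(".txt").write_text("a photo of a kitten")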

Then run:

clip-retrieval inference --input_dataset image_folder --output_folder embeddings_folder

The output folder will contain (these files can be loaded directly, as shown below):

  • img_emb/
    • img_emb_0.npy containing the image embeddings as numpy arrays
  • text_emb/
    • text_emb_0.npy containing the text embeddings as numpy arrays
  • metadata/
    • metadata_0.parquet containing the image paths, captions and metadata
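
A minimal loading sketch (assumes numpy and pandas with a parquet engine such as pyarrow are installed; file names follow the listing above):

import numpy as np
import pandas as pd

# Image embeddings produced by clip inference: one row per image
img_emb = np.load("embeddings_folder/img_emb/img_emb_0.npy")
print(img_emb.shape)  # (num_images, embedding_dim)

# Matching metadata: image paths, captions, ...
metadata = pd.read_parquet("embeddings_folder/metadata/metadata_0.parquet")
print(metadata.head())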

API

clip_inference turns a set of texts and images into clip embeddings

  • input_dataset Path to input dataset. Folder if input_format is files. Bash brace pattern such as "{000..150}.tar" (see https://pypi.org/project/braceexpand/) if webdataset (required)
  • output_folder Folder where the clip embeddings will be saved, as well as metadata (required)
  • input_format files or webdataset (default files)
  • cache_path cache path for webdataset (default None)
  • batch_size Number of items to do the inference on at once (default 256)
  • num_prepro_workers Number of processes to do the preprocessing (default 8)
  • enable_text Enable text processing (default True)
  • enable_image Enable image processing (default True)
  • enable_metadata Enable metadata processing (default False)
  • write_batch_size Write batch size (default 10**6)
  • subset_size Only process a subset of this size (default None)
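
The same inference can be driven from Python. A minimal sketch, assuming the package exposes clip_inference as a function mirroring the CLI parameters above (this mapping is an assumption):

from clip_retrieval import clip_inference  # assumed Python entry point

clip_inference(
    input_dataset="image_folder",       # folder of images (input_format="files")
    output_folder="embeddings_folder",  # where embeddings and metadata land
    input_format="files",               # or "webdataset"
    batch_size=256,
    num_prepro_workers=8,
    enable_text=True,
    enable_image=True,
    enable_metadata=False,
)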

Clip index

Clip index takes the output of clip inference as input and builds an index from it using autofaiss

clip-retrieval index --input_folder embeddings_folder --output_folder index_folder

The output is a folder containing:

  • image.index containing a brute force faiss index for images
  • text.index containing a brute force faiss index for texts
  • metadata.arrow containing the metadata in a format that is easy to memory map
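
The index can also be queried directly with faiss before putting clip back in front of it. A minimal sketch (assumes faiss is installed and reuses an embedding from the inference step as the query):

import faiss
import numpy as np

# Load the brute force image index built by clip index
index = faiss.read_index("index_folder/image.index")

# Use one precomputed image embedding as the query (faiss expects 2D float32)
query = np.load("embeddings_folder/img_emb/img_emb_0.npy")[:1].astype(np.float32)

# Retrieve the 4 nearest neighbors; ids index into the metadata rows
distances, ids = index.search(query, 4)
print(distances, ids)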

Clip filter

Once the embeddings are computed, you may want to filter the data by a specific query. For that you can run:

clip-retrieval filter --query "cat" --output_folder "cat/" --indice_folder "indice_folder"

It will copy the 100 best images for this query into the output folder. Using --num_results or --threshold may be helpful to refine the filter.

Clip back

Then run (output_folder is the output of clip index):

echo '{"example_index": "output_folder"}' > indices_paths.json
clip-retrieval back --port 1234 --indices-paths indices_paths.json

At this point you have a simple flask server running on port 1234 that can answer these queries:

  • /indices-list -> returns a list of indices
  • /knn-service that takes as input:
{
    "text": "a text query",
    "image": "a base64 image",
    "modality": "image", // image or text index to use
    "num_images": 4, // number of output images
    "indice_name": "example_index"
}

and returns:

[
    {
        "image": "base 64 of an image",
        "text": "some result text"
    },
    {
        "image": "base 64 of an image",
        "text": "some result text"
    }
]
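
From Python, the service can be queried with requests. A minimal sketch, assuming /knn-service accepts the JSON body above as a POST request (the HTTP method is an assumption) and that the server from clip back is running locally:

import requests

payload = {
    "text": "cat",
    "modality": "image",            # search the image index
    "num_images": 4,
    "indice_name": "example_index",
}
results = requests.post("http://localhost:1234/knn-service", json=payload).json()
for result in results:
    print(result.get("text"))  # each result also carries a base64 "image" field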

For development

Either locally, or in gitpod (do export PIP_USER=false there)

Set up a virtualenv:

python3 -m venv .env
source .env/bin/activate
pip install -U pip
pip install -e .

to run tests:

pip install -r requirements-test.txt

then

python -m pytest -v tests -s
