CLIP inference with no big dependencies such as PyTorch, TensorFlow, or NumPy

Project description

Python bindings for clip.cpp

This package provides basic Python bindings for clip.cpp.

It requires no third-party libraries and no big dependencies such as PyTorch, TensorFlow, NumPy, or ONNX.

Install

If you are on an x64 Linux distribution, you can simply install it with pip:

pip install clip_cpp

If you are on another operating system or architecture, or if you want to use instruction sets other than AVX2 (e.g., AVX512), you can build it from source. See clip.cpp for more info.

All you need to do is to compile with the -DBUILD_SHARED_LIBS=ON option and copy libclip.so to examples/python_bindings/clip_cpp.

Usage

Clip Class

The Clip class provides a Python interface to clip.cpp, allowing you to perform various tasks such as text and image encoding, similarity scoring, and text-image comparison. Below are the constructor and public methods of the Clip class:

Constructor

def __init__(self, model_file: str, verbosity: int = 0):

Description: Initializes a Clip instance with the specified CLIP model file and optional verbosity level.
model_file (str): The path to the CLIP model file.
verbosity (int, optional): An integer specifying the verbosity level (default is 0).
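As a minimal sketch of the constructor in use, the helper below checks that the model file exists before building the instance. The helper name `load_clip` is hypothetical, and the import path `from clip_cpp import Clip` is an assumption about how the package exposes the class.

```python
from pathlib import Path


def load_clip(model_file: str, verbosity: int = 0):
    """Build a Clip instance after checking that the model file exists."""
    path = Path(model_file)
    if not path.is_file():
        raise FileNotFoundError(f"CLIP model file not found: {path}")
    # Imported lazily so this helper can be defined without the package installed.
    # Assumption: the Clip class is importable from the top-level clip_cpp package.
    from clip_cpp import Clip

    return Clip(model_file=str(path), verbosity=verbosity)
```

Failing early with a clear Python exception is a design choice of this sketch; how the library itself reports a missing file is not documented here.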

Public Methods

1. `vision_config`

@property
def vision_config(self) -> Dict[str, Any]:

Description: Retrieves the configuration parameters related to the vision component of the CLIP model.

2. `text_config`

@property
def text_config(self) -> Dict[str, Any]:

Description: Retrieves the configuration parameters related to the text component of the CLIP model.
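Both properties return plain dictionaries, so they can be inspected with ordinary Python. The sketch below formats them for logging; the helper names are hypothetical, and no particular keys are assumed because the keys are not listed in this description.

```python
from typing import Any, Dict


def format_config(name: str, config: Dict[str, Any]) -> str:
    """Render a configuration dictionary as aligned 'key : value' lines."""
    if not config:
        return f"{name}: (empty)"
    width = max(len(key) for key in config)
    lines = [f"{name}:"]
    for key in sorted(config):
        lines.append(f"  {key.ljust(width)} : {config[key]}")
    return "\n".join(lines)


def describe_model(clip) -> str:
    """Summarize both configs of an object exposing vision_config and text_config."""
    return "\n".join(
        [
            format_config("vision_config", clip.vision_config),
            format_config("text_config", clip.text_config),
        ]
    )
```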

3. `tokenize`

def tokenize(self, text: str) -> List[int]:

Description: Tokenizes a text input into a list of token IDs.
text (str): The input text to be tokenized.

4. `encode_text`

def encode_text(
    self, tokens: List[int], n_threads: int = os.cpu_count()
) -> List[float]:

Description: Encodes a list of token IDs into a text embedding.
tokens (List[int]): A list of token IDs obtained through tokenization.
n_threads (int, optional): The number of CPU threads to use for encoding (default is the number of CPU cores).
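Since `encode_text` takes token IDs rather than a string, the two calls are usually chained. The following is a small sketch of that chain; `embed_text` is a hypothetical helper, and `clip` is assumed to be any object exposing the `tokenize` and `encode_text` methods described above.

```python
import os
from typing import List, Optional


def embed_text(clip, text: str, n_threads: Optional[int] = None) -> List[float]:
    """Tokenize `text` and encode the tokens into a text embedding."""
    if n_threads is None:
        # os.cpu_count() can return None; fall back to a single thread.
        n_threads = os.cpu_count() or 1
    tokens = clip.tokenize(text)
    return clip.encode_text(tokens, n_threads=n_threads)
```

Resolving the thread count inside the function, rather than in the signature, is a choice made for this sketch so that the value is looked up at call time.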

5. `load_preprocess_encode_image`

def load_preprocess_encode_image(
    self, image_path: str, n_threads: int = os.cpu_count()
) -> List[float]:

Description: Loads an image, preprocesses it, and encodes it into an image embedding.
image_path (str): The path to the image file to be encoded.
n_threads (int, optional): The number of CPU threads to use for encoding (default is the number of CPU cores).
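When the same images are compared against many texts, it may be worth encoding each image once and keeping the result. The sketch below does that with a plain dictionary; `embed_images` is a hypothetical helper, and `clip` is assumed to expose `load_preprocess_encode_image` as described above.

```python
import os
from typing import Dict, Iterable, List, Optional


def embed_images(
    clip, image_paths: Iterable[str], n_threads: Optional[int] = None
) -> Dict[str, List[float]]:
    """Encode each image once and return a path -> embedding mapping."""
    if n_threads is None:
        n_threads = os.cpu_count() or 1
    embeddings: Dict[str, List[float]] = {}
    for path in image_paths:
        if path in embeddings:
            continue  # skip duplicate paths so each file is encoded only once
        embeddings[path] = clip.load_preprocess_encode_image(
            path, n_threads=n_threads
        )
    return embeddings
```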

6. `calculate_similarity`

def calculate_similarity(
    self, text_embedding: List[float], image_embedding: List[float]
) -> float:

Description: Calculates the similarity score between a text embedding and an image embedding.
text_embedding (List[float]): The text embedding obtained from encode_text.
image_embedding (List[float]): The image embedding obtained from load_preprocess_encode_image.
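One possible use of `calculate_similarity` is ranking several precomputed image embeddings against a single text embedding. This is a sketch only: `rank_images` is a hypothetical helper, and it assumes that a higher score means a closer match.

```python
from typing import Dict, List, Tuple


def rank_images(
    clip, text_embedding: List[float], image_embeddings: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (image_path, score) pairs sorted from most to least similar."""
    scored = [
        (path, clip.calculate_similarity(text_embedding, embedding))
        for path, embedding in image_embeddings.items()
    ]
    # Assumption: larger scores indicate greater similarity.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```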

7. `compare_text_and_image`

def compare_text_and_image(
    self, text: str, image_path: str, n_threads: int = os.cpu_count()
) -> float:

Description: Compares a text input and an image file, returning a similarity score.
text (str): The input text.
image_path (str): The path to the image file for comparison.
n_threads (int, optional): The number of CPU threads to use for encoding (default is the number of CPU cores).
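Because `compare_text_and_image` goes from raw text and an image path to a score in one call, it lends itself to picking the best of a few candidate labels for an image. The sketch below illustrates this; `best_label` is a hypothetical helper, and it assumes that a higher score means a better match.

```python
import os
from typing import Iterable, Optional, Tuple


def best_label(
    clip, image_path: str, labels: Iterable[str], n_threads: Optional[int] = None
) -> Tuple[str, float]:
    """Return the label whose text scores highest against the image."""
    if n_threads is None:
        n_threads = os.cpu_count() or 1
    scores = {
        label: clip.compare_text_and_image(label, image_path, n_threads=n_threads)
        for label in labels
    }
    if not scores:
        raise ValueError("labels must contain at least one entry")
    # Assumption: larger scores indicate a closer text-image match.
    winner = max(scores, key=scores.get)
    return winner, scores[winner]
```

This calls `compare_text_and_image` once per label, so the image is presumably processed again on every call. For many labels, encoding the image once with `load_preprocess_encode_image` and scoring with `calculate_similarity` is likely cheaper.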

8. `__del__`

def __del__(self):

Description: Destructor that frees resources associated with the Clip instance.

With the Clip class, you can easily work with the CLIP model for various natural language understanding and computer vision tasks.

Example

A basic example can be found in the clip.cpp examples.

python example_main.py --help
usage: clip [-h] -m MODEL [-v VERBOSITY] -t TEXT -i IMAGE

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL, --model MODEL
                        path to GGML file
  -v VERBOSITY, --verbosity VERBOSITY
                        Level of verbosity. 0 = minimum, 2 = maximum
  -t TEXT, --text TEXT  text to encode
  -i IMAGE, --image IMAGE
                        path to an image file
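The help output above implies a small argparse setup. The sketch below reconstructs a parser that accepts the same options; it is not the actual contents of example_main.py, and the `int` type and default of 0 for the verbosity option are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the options listed in the help output."""
    parser = argparse.ArgumentParser(prog="clip")
    parser.add_argument("-m", "--model", required=True, help="path to GGML file")
    parser.add_argument(
        "-v",
        "--verbosity",
        type=int,  # assumption: verbosity is parsed as an integer
        default=0,  # assumption: matches the constructor default
        help="Level of verbosity. 0 = minimum, 2 = maximum",
    )
    parser.add_argument("-t", "--text", required=True, help="text to encode")
    parser.add_argument(
        "-i", "--image", required=True, help="path to an image file"
    )
    return parser
```

The parsed values map directly onto the Clip API: `args.model` and `args.verbosity` go to the constructor, while `args.text` and `args.image` go to `compare_text_and_image`.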

Bindings to the DLL are implemented in clip_cpp/clip.py.

Release history

0.5.0: Sep 27, 2023
0.4.2: Sep 20, 2023
0.4.1: Sep 18, 2023
0.4.0: Sep 16, 2023
0.3.6: Sep 14, 2023
0.3.5: Sep 13, 2023
0.3.4: Sep 13, 2023
0.3.3: Sep 13, 2023
0.3.2: Sep 13, 2023
0.3.1: Sep 12, 2023 (this version)
0.3.0: Sep 12, 2023
0.2.0: Sep 10, 2023
0.1.0: Sep 10, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

clip_cpp-0.3.1.tar.gz (361.9 kB)

Uploaded Sep 12, 2023 Source

Built Distribution

clip_cpp-0.3.1-py3-none-any.whl (362.0 kB)

Uploaded Sep 12, 2023 Python 3

Hashes for clip_cpp-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`b890170ed40d5389aecba4b1653ce5b452c1d5348c6bc0b58fd1b70896257a22`
MD5	`e3cefa41df5cc2a9288547bd7b5def56`
BLAKE2b-256	`fabb0b795383fca62ea7fd8c114d276709889ec1a3d615a25c73b49c36c529a8`

Hashes for clip_cpp-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`24fa20ad850904ae5ffa93fadb25a129a16f1ad3e01301465f3c4b964f925e8c`
MD5	`3fc59532314da76eaa3018ee2ee52aa8`
BLAKE2b-256	`561f9fe301b2f41c0675c92d19ab57510e53580c59f5b52d258e33881ce95444`
