

embcli-clip


CLIP plugin for embcli, a command-line interface for embeddings.

Reference

Installation

pip install embcli-clip

Quick Start

Try out the Multimodal Embedding Models

desc.txt

A ginger cat with bright green eyes, lazily stretching out on a sun-drenched windowsill.

gingercat.jpeg

# show general usage of emb command.
emb --help

# list all available models.
emb models
CLIPModel
    Vendor: clip
    Models:
    * clip (aliases: )
    See https://huggingface.co/openai?search_models=clip for available local models.
    Model Options:

# get an embedding for an input text using the original CLIP model (openai/clip-vit-base-patch32).
# the first run takes a while because the model is downloaded from Hugging Face Hub.
emb embed -m clip -f desc.txt

# get an embedding for an input image using the original CLIP model.
emb embed -m clip --image gingercat.jpeg

# get an embedding for an input image using a community model.
emb embed -m clip/laion/CLIP-ViT-H-14-laion2B-s32B-b79K --image gingercat.jpeg

# calculate the similarity score between a text and an image. the default metric is cosine similarity.
emb simscore -m clip -f1 desc.txt --image2 gingercat.jpeg
0.33982698978267567
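For readers unfamiliar with the metric, here is a minimal sketch of cosine similarity in plain Python. It is not embcli's implementation, which this README does not show. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions; real CLIP embeddings have hundreds of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for a text embedding and an image embedding (assumed values).
text_vec = [1.0, 2.0, 3.0]
image_vec = [2.0, 4.0, 6.0]

# The two vectors point in the same direction, so the score is close to 1.0.
score = cosine_similarity(text_vec, image_vec)
print(score)
```

The score ranges from -1 (opposite directions) to 1 (same direction) and ignores vector length, which is why it is a common choice for comparing embeddings.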

Document Indexing and Multimodal Search

You can use the emb command to index documents and search them with an image or a text query. emb uses Chroma as its default vector database.

# index example documents in the current directory.
emb ingest-sample -m clip -c catcafe --corpus cat-names-en

# or, give the path to your own documents.
# the documents should be in a comma-separated CSV file with two columns: id and text.
emb ingest -m clip -c <collection-name> -f <path-to-your-documents>
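One way to produce such a file is with Python's standard `csv` module, sketched below. The file name `my-docs.csv`, the helper `write_corpus`, and the sample rows are hypothetical. The header row named `id,text` is also an assumption: the note above names the two columns but does not say whether a header line is required.

```python
import csv
from pathlib import Path

# Hypothetical documents; replace with your own ids and texts.
documents = [
    (1, "Milo is a curious orange tabby."),
    (2, "Sammy is a laid-back ginger cat, happy to lounge in the sun."),
]


def write_corpus(path: Path, rows: list[tuple[int, str]]) -> None:
    """Write rows to a comma-separated file with an id,text header (assumed layout)."""
    with path.open("w", newline="", encoding="utf-8") as f:
        # csv.writer uses a comma delimiter by default and quotes any
        # field that itself contains a comma, so the columns stay intact.
        writer = csv.writer(f)
        writer.writerow(["id", "text"])
        writer.writerows(rows)


write_corpus(Path("my-docs.csv"), documents)
```

The resulting file could then be passed to the ingest command above as `-f my-docs.csv`.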

# search the indexed documents with an image query.
emb search -m clip -c catcafe --image gingercat.jpeg
Found 5 results:
Score: 0.008130492317462625, Document ID: 14, Text: Milo: Milo is an endlessly curious and adventurous orange tabby, always the first to investigate new sounds or objects. He is incredibly friendly, greeting everyone with enthusiastic meows and leg-rubs. Milo loves interactive toys and will happily follow his humans around, eager to be involved in every household activity.
Score: 0.00806729872159855, Document ID: 54, Text: Jasper (II): Jasper the Second, distinct from his predecessor, is a playful and highly energetic ginger tom. He loves to chase, tumble, and explore every nook and cranny with boundless enthusiasm. Jasper is also incredibly affectionate, always ready for a cuddle after a vigorous play session, a bundle of orange joy.
Score: 0.007995471315075445, Document ID: 8, Text: Oliver (Ollie): Ollie is a charmingly goofy orange tabby, full of curious energy and playful pounces. He’s incredibly friendly, often greeting visitors with a cheerful chirp and a head-butt. He loves food, interactive toys, and will happily follow his humans around, always eager to be part of the action.
Score: 0.007992460725066777, Document ID: 71, Text: Archie: Archie is a friendly and slightly goofy ginger cat, always up for a bit of fun and a good meal. He is very sociable and loves attention from anyone willing to give it. Archie enjoys playful wrestling and will often follow his humans around, offering cheerful chirps and affectionate head-bumps.
Score: 0.007982146864511108, Document ID: 42, Text: Sammy: Sammy is a laid-back and friendly ginger cat, always happy to see you. He enjoys lounging in comfortable spots but is also up for a gentle play session. Sammy is a great companion for a relaxed household, offering quiet affection and a warm, purring presence without demanding constant attention.

# or, search with a text query.
emb search -m clip -c catcafe -q "A lazy ginger cat stretching in the sun"
Found 5 results:
Score: 0.02344505173322954, Document ID: 42, Text: Sammy: Sammy is a laid-back and friendly ginger cat, always happy to see you. He enjoys lounging in comfortable spots but is also up for a gentle play session. Sammy is a great companion for a relaxed household, offering quiet affection and a warm, purring presence without demanding constant attention.
Score: 0.023361588000522227, Document ID: 15, Text: Finn: Finn is a spirited and agile ginger cat, always ready for an adventure. He excels at climbing and exploring high places, often surprising his humans with his acrobatic feats. Playful and energetic, Finn loves interactive games and will keep you entertained with his boundless enthusiasm and charming persistence for play.
Score: 0.0229521169270549, Document ID: 8, Text: Oliver (Ollie): Ollie is a charmingly goofy orange tabby, full of curious energy and playful pounces. He’s incredibly friendly, often greeting visitors with a cheerful chirp and a head-butt. He loves food, interactive toys, and will happily follow his humans around, always eager to be part of the action.
Score: 0.022435629708038293, Document ID: 14, Text: Milo: Milo is an endlessly curious and adventurous orange tabby, always the first to investigate new sounds or objects. He is incredibly friendly, greeting everyone with enthusiastic meows and leg-rubs. Milo loves interactive toys and will happily follow his humans around, eager to be involved in every household activity.
Score: 0.022339791099637154, Document ID: 54, Text: Jasper (II): Jasper the Second, distinct from his predecessor, is a playful and highly energetic ginger tom. He loves to chase, tumble, and explore every nook and cranny with boundless enthusiasm. Jasper is also incredibly affectionate, always ready for a cuddle after a vigorous play session, a bundle of orange joy.
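Conceptually, a search like the ones above embeds the query, compares it against every stored document embedding, and returns the closest matches. The sketch below illustrates that idea with a brute-force ranking in plain Python. It is an assumption-laden toy, not how emb or Chroma works internally: the helper `top_k`, the document ids, and the 3-dimensional vectors are all made up, and this README does not document how emb search computes its scores, so the numbers it prints need not match raw cosine values.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    corpus: dict[str, Sequence[float]],
    k: int = 5,
) -> list[tuple[str, float]]:
    """Rank every document by similarity to the query and keep the best k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up document embeddings keyed by document id.
corpus = {
    "14": [0.9, 0.1, 0.0],
    "42": [0.7, 0.7, 0.1],
    "8": [0.0, 0.2, 0.9],
}
# Made-up query embedding (could come from a text or an image).
query = [1.0, 0.0, 0.0]

results = top_k(query, corpus, k=2)
for doc_id, score in results:
    print(f"Score: {score:.4f}, Document ID: {doc_id}")
```

Because CLIP maps text and images into the same vector space, the same ranking step applies whether the query embedding came from an image or from a sentence.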

Development

See the main README for general development instructions.

Run Tests

RUN_CLIP_TESTS=1 uv run --package embcli-clip pytest packages/embcli-clip/tests/

Run Linter and Formatter

uv run ruff check --fix packages/embcli-clip
uv run ruff format packages/embcli-clip

Run Type Checker

uv run --package embcli-clip pyright packages/embcli-clip

Build

uv build --package embcli-clip

License

Apache License 2.0
