Generate captions for images with Salesforce BLIP
blip-caption
A CLI tool for generating captions for images using Salesforce BLIP.
Installation
Install this tool using pip or pipx:
pipx install blip-caption
The first time you use the tool it will download the model from the Hugging Face model hub and store it in ~/.cache/huggingface/hub/. The small model is 945MB; the large model is 1.8GB.
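The README doesn't show the tool's internals, but a minimal sketch of what a BLIP captioner like this presumably does with the Hugging Face transformers library looks like the following. The model IDs, the `caption()` helper, and the assumption that these are the exact checkpoints the tool uses are mine, not from the source:

```python
from pathlib import Path

# Assumed checkpoints; the tool's actual model choices may differ.
SMALL_MODEL = "Salesforce/blip-image-captioning-base"   # ~945MB
LARGE_MODEL = "Salesforce/blip-image-captioning-large"  # ~1.8GB


def cache_dir() -> Path:
    # Default location where Hugging Face stores downloaded checkpoints
    return Path.home() / ".cache" / "huggingface" / "hub"


def caption(image_path: str, large: bool = False) -> str:
    # Heavy imports are deferred so this module loads even without
    # transformers/Pillow installed.
    from transformers import BlipProcessor, BlipForConditionalGeneration
    from PIL import Image

    model_id = LARGE_MODEL if large else SMALL_MODEL
    processor = BlipProcessor.from_pretrained(model_id)  # downloads on first use
    model = BlipForConditionalGeneration.from_pretrained(model_id)
    inputs = processor(Image.open(image_path), return_tensors="pt")
    out = model.generate(**inputs)
    return processor.decode(out[0], skip_special_tokens=True)
```

Deferring the model load until `caption()` is called means simply importing or invoking `--help` doesn't trigger the multi-hundred-megabyte download.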
Usage
To generate captions for an image using the small model, run:
blip-caption IMG_5825.jpeg
Example output:
a lizard is sitting on a branch in the woods
To use the larger model, add --large:
blip-caption IMG_5825.jpeg --large
Example output:
there is a chamelon sitting on a branch in the woods
Here's the image I used:
If you pass multiple files the path to each file will be output before its caption:
blip-caption /tmp/photos/*.jpeg
/tmp/photos/IMG_2146.jpeg
a man holding a bowl of salad and laughing
/tmp/photos/IMG_0151.jpeg
a cat laying on a red blanket
JSON output
The --json flag changes the output to look like this:
blip-caption /tmp/photos/*.* --json
[{"path": "/tmp/photos/IMG_2146.jpeg", "caption": "a man holding a bowl of salad and laughing"},
{"path": "/tmp/photos/IMG_0151.jpeg", "caption": "a cat laying on a red blanket"},
{"path": "/tmp/photos/IMG_3099.MOV", "error": "cannot identify image file '/tmp/photos/IMG_3099.MOV'"}]
Any errors are returned as a {"path": "...", "error": "error message"} object.
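The per-file error handling above can be sketched in a few lines. This is an illustration of the output shape, not the tool's actual code; the `caption_fn` callable stands in for whatever captioning function the tool uses:

```python
import json


def caption_paths(paths, caption_fn):
    # Mirrors the --json output: a caption object on success,
    # an error object when a file cannot be processed.
    results = []
    for path in paths:
        try:
            results.append({"path": path, "caption": caption_fn(path)})
        except Exception as ex:
            results.append({"path": path, "error": str(ex)})
    return json.dumps(results)
```

Catching the exception per file means one unreadable file (like the .MOV above) doesn't abort captioning of the rest of the batch.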
Development
To set up this tool locally, first check out the code. Then create a new virtual environment:
cd blip-caption
python3 -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest