Python package that simplifies using the BioCLIP foundation model.
Project description
pybioclip
Command line tool and python package to simplify using BioCLIP, including for taxonomic or other label prediction on (and thus annotation or labeling of) images, as well as for generating semantic embeddings for images. No particular understanding of ML or computer vision is required to use it. It also implements a number of performance optimizations for batches of images or custom class lists, which should be particularly useful for integration into computational workflows.
Table of Contents
Requirements
- Python compatible with PyTorch
Installation
pip install pybioclip
If you have any issues with installation, please first upgrade pip by running pip install --upgrade pip
.
Python Package Usage
Example Notebooks
- Predict species for images - examples/PredictImages.ipynb
- Predict species for iNaturalist images - examples/iNaturalistPredict.ipynb
Predict species classification
from bioclip import TreeOfLifeClassifier, Rank
classifier = TreeOfLifeClassifier()
predictions = classifier.predict("Ursus-arctos.jpeg", Rank.SPECIES)
for prediction in predictions:
print(prediction["species"], "-", prediction["score"])
Output:
Ursus arctos - 0.9356034994125366
Ursus arctos syriacus - 0.05616999790072441
Ursus arctos bruinosus - 0.004126196261495352
Ursus arctus - 0.0024959812872111797
Ursus americanus - 0.0005009894957765937
Output from the predict()
method showing the dictionary structure:
[{
'kingdom': 'Animalia',
'phylum': 'Chordata',
'class': 'Mammalia',
'order': 'Carnivora',
'family': 'Ursidae',
'genus': 'Ursus',
'species_epithet': 'arctos',
'species': 'Ursus arctos',
'common_name': 'Kodiak bear'
'score': 0.9356034994125366
}]
The output from the predict function can be converted into a pandas DataFrame like so:
import pandas as pd
from bioclip import TreeOfLifeClassifier, Rank
classifier = TreeOfLifeClassifier()
predictions = classifier.predict("Ursus-arctos.jpeg", Rank.SPECIES)
df = pd.DataFrame(predictions)
The first argument of the predict()
method supports both a single path or a list of paths.
Predict from a list of classes
from bioclip import CustomLabelsClassifier
classifier = CustomLabelsClassifier(["duck","fish","bear"])
predictions = classifier.predict("Ursus-arctos.jpeg")
for prediction in predictions:
print(prediction["classification"], prediction["score"])
Output:
duck 1.0306726583309e-09
fish 2.932403668845507e-12
bear 1.0
Command Line Usage
bioclip predict [-h] [--format {table,csv}] [--output OUTPUT] [--rank {kingdom,phylum,class,order,family,genus,species}] [--k K] [--cls CLS] [--device DEVICE] image_file [image_file ...]
bioclip embed [-h] [--device=DEVICE] [--output=OUTPUT] [IMAGE_FILE...]
Commands:
predict Use BioCLIP to generate predictions for image files.
embed Use BioCLIP to generate embeddings for image files.
Arguments:
IMAGE_FILE input image file
Options:
-h --help
--format=FORMAT format of the output (table or csv) for predict mode [default: csv]
--rank=RANK rank of the classification (kingdom, phylum, class, order, family, genus, species) [default: species]
--k=K number of top predictions to show [default: 5]
--cls=CLS classes to predict: either a comma separated list or a path to a text file of classes (one per line), when specified the --rank argument is not allowed.
--device=DEVICE device to use matrix math (cpu or cuda or mps) [default: cpu]
--output=OUTFILE print output to file OUTFILE [default: stdout]
Predict classification
Predict species for an image
The example images used below are Ursus-arctos.jpeg
and Felis-catus.jpeg
both from the bioclip-demo.
Predict species for an Ursus-arctos.jpeg
file:
bioclip predict Ursus-arctos.jpeg
Output:
bioclip predict Ursus-arctos.jpeg
file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos bruinosus,Ursus arctos bruinosus,,0.004126196261495352
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctus,Ursus arctus,,0.0024959812872111797
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,americanus,Ursus americanus,Louisiana black bear,0.0005009894957765937
Predict species for multiple images saving to a file
To make predictions for files Ursus-arctos.jpeg
and Felis-catus.jpeg
saving the output to a file named predictions.csv
:
bioclip predict --output predictions.csv Ursus-arctos.jpeg Felis-catus.jpeg
The contents of predictions.csv
will look like this:
file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos bruinosus,Ursus arctos bruinosus,,0.004126196261495352
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctus,Ursus arctus,,0.0024959812872111797
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,americanus,Ursus americanus,Louisiana black bear,0.0005009894957765937
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,silvestris,Felis silvestris,European Wildcat,0.7221033573150635
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,catus,Felis catus,Domestic Cat,0.19810837507247925
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,margarita,Felis margarita,Sand Cat,0.02798456884920597
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Lynx,felis,Lynx felis,,0.021829601377248764
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,bieti,Felis bieti,Chinese desert cat,0.010979168117046356
Predict top 3 genera for an image and display output as a table
bioclip predict --format table --k 3 --rank=genus Ursus-arctos.jpeg
Output:
+-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
| file_name | kingdom | phylum | class | order | family | genus | score |
+-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Carnivora | Ursidae | Ursus | 0.9994320273399353 |
| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Artiodactyla | Cervidae | Cervus | 0.00032594642834737897 |
| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Artiodactyla | Cervidae | Alces | 7.803700282238424e-05 |
+-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
Predict from a list of classes
Create predictions for 3 classes (cat, bird, and bear) for image Ursus-arctos.jpeg
:
bioclip predict --cls cat,bird,bear Ursus-arctos.jpeg
Output:
file_name,classification,score
Ursus-arctos.jpeg,cat,4.581644930112816e-08
Ursus-arctos.jpeg,bird,3.051998476166773e-08
Ursus-arctos.jpeg,bear,0.9999998807907104
Create embeddings
Create embedding for an image
bioclip embed Ursus-arctos.jpeg
Output:
{
"model": "hf-hub:imageomics/bioclip",
"embeddings": {
"Ursus-arctos.jpeg": [
-0.23633578419685364,
-0.28467196226119995,
-0.4394485652446747,
...
]
}
}
View command line help
bioclip --help
Additional Documentation
See pybioclip wiki documentation for additional documentation.
- Using the pybioclip docker container
- Using the pybioclip apptainer/singularity container
- Using a custom model
License
pybioclip
is distributed under the terms of the MIT license.
Acknowledgments
The prediction code in this repo is based on work by @samuelstevens in bioclip-demo.
Citation
Our code (this repository):
@software{Bradley_pybioclip_2024,
author = {Bradley, John and Lapp, Hilmar and Campolongo, Elizabeth G.},
doi = {10.5281/zenodo.13151194},
month = jul,
title = {{pybioclip}},
version = {1.0.0},
year = {2024}
}
BioCLIP paper:
@inproceedings{stevens2024bioclip,
title = {{B}io{CLIP}: A Vision Foundation Model for the Tree of Life},
author = {Samuel Stevens and Jiaman Wu and Matthew J Thompson and Elizabeth G Campolongo and Chan Hee Song and David Edward Carlyn and Li Dong and Wasila M Dahdul and Charles Stewart and Tanya Berger-Wolf and Wei-Lun Chao and Yu Su},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2024}
}
Also consider citing the BioCLIP code:
@software{bioclip2023code,
author = {Samuel Stevens and Jiaman Wu and Matthew J. Thompson and Elizabeth G. Campolongo and Chan Hee Song and David Edward Carlyn},
doi = {10.5281/zenodo.10895871},
title = {BioCLIP},
version = {v1.0.0},
year = {2024}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pybioclip-1.1.0.tar.gz
.
File metadata
- Download URL: pybioclip-1.1.0.tar.gz
- Upload date:
- Size: 2.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | ab92a9c46f2fb04a7cfb36fcfb9e529ff662ddb5146a3a79afef8eac68d75dc5 |
|
MD5 | f8f091858fbf5573ea3af4b28f0262fc |
|
BLAKE2b-256 | 6a5a7a74aaf8c2ff5c56c635b9a3b05acb3193e7a8e1982c219cd9ad916cc7fc |
File details
Details for the file pybioclip-1.1.0-py3-none-any.whl
.
File metadata
- Download URL: pybioclip-1.1.0-py3-none-any.whl
- Upload date:
- Size: 12.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | a7c281920b4afc649a14df51655aa0964110a065f6710d292d6de2e257d1cbd4 |
|
MD5 | 2626aa57f8d367c8d7704713ab560578 |
|
BLAKE2b-256 | 5f3d080c6ba31bafa052da0e9afd6f5f37ca27aaad5eb15fee500003ba837951 |