Few-shot classifier for detecting eye imaging datasets
envision-classifier
SetFit few-shot classifier for identifying eye imaging datasets from scientific metadata.
Part of the EyeACT project by the FAIR Data Innovations Hub.
Installation
pip install envision-classifier
Python API
from envision_classifier import EyeImagingClassifier
# Downloads model from HuggingFace on first use
clf = EyeImagingClassifier()
# Classify a single record
result = clf.classify("Retinal OCT dataset for diabetic retinopathy")
print(result)
# {'label': 'EYE_IMAGING', 'confidence': 0.999, 'probabilities': {...}}
# Classify a batch
results = clf.classify_batch([
"Retinal fundus photography dataset for glaucoma screening",
"COVID-19 genome sequencing data",
{"title": "OCT images", "description": "Macular degeneration scans"},
])
# Use a local model instead of downloading
clf = EyeImagingClassifier(model_path="./my_model")
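The result dictionaries can be post-processed with plain Python. The sketch below assumes that `classify_batch` returns one dictionary per input, in input order, with the same `label` and `confidence` keys shown for `classify`. The helper name `select_eye_imaging` and the 0.9 threshold are illustrative and not part of the package.

```python
from typing import Any


def select_eye_imaging(
    results: list[dict[str, Any]], min_confidence: float = 0.9
) -> list[int]:
    """Return indices of results labeled EYE_IMAGING at or above min_confidence."""
    return [
        i
        for i, r in enumerate(results)
        if r["label"] == "EYE_IMAGING" and r["confidence"] >= min_confidence
    ]


# Hand-written stand-ins for classifier output (not real model predictions).
sample_results = [
    {"label": "EYE_IMAGING", "confidence": 0.98},
    {"label": "NEGATIVE", "confidence": 0.99},
    {"label": "EYE_IMAGING", "confidence": 0.62},
]
selected = select_eye_imaging(sample_results)
print(selected)
```

Returning indices rather than the dictionaries keeps it easy to map each hit back to the original input record.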
CLI
After installing, the envision-classifier command is available:
# Classify a text string
envision-classifier classify --text "Retinal OCT dataset for diabetic retinopathy"
# Classify from a JSON file
envision-classifier classify records.json
# Pipe JSON via stdin
echo '{"title": "Fundus images", "description": "DR screening"}' | envision-classifier classify
# Train a new model from built-in training data
envision-classifier train --output ./my_model
# Show model info and training data counts
envision-classifier info
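The input file for `classify` can be generated from Python. The layout of `records.json` is not documented here, so this sketch assumes the file holds a JSON array of objects with the `title` and `description` keys used in the stdin example. Adjust it if the CLI expects a different shape.

```python
import json
from pathlib import Path

# Assumed record shape: the same keys as the stdin example above.
records = [
    {"title": "Fundus images", "description": "DR screening"},
    {"title": "OCT images", "description": "Macular degeneration scans"},
]

# Write the array to records.json for `envision-classifier classify records.json`.
path = Path("records.json")
path.write_text(json.dumps(records, indent=2), encoding="utf-8")
```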
Classification Labels
| Label | Description |
|---|---|
| EYE_IMAGING | Actual eye imaging datasets (fundus, OCT, OCTA, cornea) |
| EYE_SOFTWARE | Code, tools, models for eye imaging (no actual data) |
| EDGE_CASE | Eye research papers, reviews, non-imaging data |
| NEGATIVE | Not eye-related |
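When triaging many records, it can help to bucket them by the four labels above. This is a minimal sketch, assuming each result carries a `label` key as in the Python API example. `group_by_label` is a hypothetical helper, and the sample results are hand-written rather than real predictions.

```python
LABELS = ("EYE_IMAGING", "EYE_SOFTWARE", "EDGE_CASE", "NEGATIVE")


def group_by_label(texts, results):
    """Map each label to the texts that received it, keeping input order."""
    groups = {label: [] for label in LABELS}
    for text, result in zip(texts, results):
        # setdefault tolerates a label outside the four listed above.
        groups.setdefault(result["label"], []).append(text)
    return groups


texts = [
    "Retinal fundus photography dataset for glaucoma screening",
    "COVID-19 genome sequencing data",
]
results = [{"label": "EYE_IMAGING"}, {"label": "NEGATIVE"}]
groups = group_by_label(texts, results)
```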
Model
- Base model: sentence-transformers/all-mpnet-base-v2 (768-dim)
- Training data: 474 curated examples (77 EYE_IMAGING, 48 EYE_SOFTWARE, 79 EDGE_CASE, 270 NEGATIVE)
- Test accuracy: 0.937, macro F1: 0.902
- Spot-check: 29/33 (87.9%)
- Model weights: fairdataihub/envision-eye-imaging-classifier
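The training set is imbalanced, which is presumably why macro F1 is reported next to accuracy: macro F1 averages the per-class F1 scores with equal weight, so the large NEGATIVE class cannot dominate it. The arithmetic below uses only the counts listed above.

```python
# Training example counts per label, as listed in the Model section.
counts = {"EYE_IMAGING": 77, "EYE_SOFTWARE": 48, "EDGE_CASE": 79, "NEGATIVE": 270}

total = sum(counts.values())
# Share of each class in the training set, in percent.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
# Spot-check accuracy: 29 correct out of 33.
spot_check = round(100 * 29 / 33, 1)

print(total)       # 474
print(shares)      # NEGATIVE makes up 57.0% of the examples
print(spot_check)  # 87.9
```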
Related
- envision-discovery -- Full pipeline (scraping + classification + export)
- Model on HuggingFace
License
MIT