Few-shot classifier for detecting eye imaging datasets
envision-classifier
SetFit few-shot classifier for identifying eye imaging datasets from scientific metadata.
Part of the EyeACT project by the FAIR Data Innovations Hub.
Installation
pip install git+https://github.com/EyeACT/envision-classifier.git
Python API
from envision_classifier import EyeImagingClassifier
# Downloads model from HuggingFace on first use
clf = EyeImagingClassifier()
# Classify a single record
result = clf.classify("Retinal OCT dataset for diabetic retinopathy")
print(result)
# {'label': 'EYE_IMAGING', 'confidence': 0.999, 'probabilities': {...}}
# Classify a batch
results = clf.classify_batch([
    "Retinal fundus photography dataset for glaucoma screening",
    "COVID-19 genome sequencing data",
    {"title": "OCT images", "description": "Macular degeneration scans"},
])
# Use a local model instead of downloading
clf = EyeImagingClassifier(model_path="./my_model")
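Because each result carries a label and a confidence, a batch can be filtered before further processing. The following is a minimal sketch, assuming that `classify_batch` returns a list of dictionaries shaped like the single-record output shown above. The helper `keep_eye_imaging` and the 0.9 threshold are hypothetical and not part of the package. The results are hard-coded stand-ins so the snippet runs without downloading the model.

```python
from typing import Any


def keep_eye_imaging(
    results: list[dict[str, Any]], min_confidence: float = 0.9
) -> list[dict[str, Any]]:
    """Return only results labelled EYE_IMAGING at or above min_confidence."""
    return [
        r
        for r in results
        if r["label"] == "EYE_IMAGING" and r["confidence"] >= min_confidence
    ]


# Stand-in results with illustrative values (not real model output)
sample_results = [
    {"label": "EYE_IMAGING", "confidence": 0.99},
    {"label": "NEGATIVE", "confidence": 0.98},
    {"label": "EYE_IMAGING", "confidence": 0.62},
]

confident = keep_eye_imaging(sample_results)
print(len(confident), "record(s) kept")
```

In real use, `sample_results` would be replaced by the output of `clf.classify_batch(...)`, and the threshold tuned to how many false positives are acceptable.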
CLI
After installing, the envision-classifier command is available:
# Classify a text string
envision-classifier classify --text "Retinal OCT dataset for diabetic retinopathy"
# Classify from a JSON file
envision-classifier classify records.json
# Pipe JSON via stdin
echo '{"title": "Fundus images", "description": "DR screening"}' | envision-classifier classify
# Train a new model from built-in training data
envision-classifier train --output ./my_model
# Show model info and training data counts
envision-classifier info
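The layout of `records.json` is not documented here. The sketch below assumes it is a JSON array of objects with the same `title` and `description` keys used in the stdin example. If the CLI expects a different layout, adjust accordingly. The helper name `write_records` is hypothetical.

```python
import json
from pathlib import Path


def write_records(path: Path, records: list[dict[str, str]]) -> None:
    """Write metadata records to a JSON file (assumed layout: a JSON array)."""
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")


records = [
    {"title": "Fundus images", "description": "DR screening"},
    {"title": "OCT images", "description": "Macular degeneration scans"},
]

records_path = Path("records.json")
write_records(records_path, records)
```

The resulting file can then be passed to `envision-classifier classify records.json`.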
Classification Labels
| Label | Description |
|---|---|
| EYE_IMAGING | Actual eye imaging datasets (fundus, OCT, OCTA, cornea) |
| EYE_SOFTWARE | Code, tools, models for eye imaging (no actual data) |
| EDGE_CASE | Eye research papers, reviews, non-imaging data |
| NEGATIVE | Not eye-related |
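When consuming classifier output in other code, it can help to pin the four labels down as an enumeration, so that an unexpected label string fails loudly. This is a sketch only: the `Label` enum and `tally_labels` helper are hypothetical and not part of the package, and the result dictionaries are assumed to carry a `label` key as in the Python API output.

```python
from collections import Counter
from enum import Enum


class Label(str, Enum):
    """The four labels from the table above."""

    EYE_IMAGING = "EYE_IMAGING"
    EYE_SOFTWARE = "EYE_SOFTWARE"
    EDGE_CASE = "EDGE_CASE"
    NEGATIVE = "NEGATIVE"


def tally_labels(results: list[dict]) -> Counter:
    """Count results per label; Label(...) raises ValueError on unknown labels."""
    return Counter(Label(r["label"]) for r in results)


# Stand-in results (illustrative, not real model output)
tally = tally_labels(
    [{"label": "EYE_IMAGING"}, {"label": "NEGATIVE"}, {"label": "NEGATIVE"}]
)
print(tally[Label.NEGATIVE])
```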
Model
- Base model: sentence-transformers/all-mpnet-base-v2 (768-dim)
- Training data: 474 curated examples (77 EYE_IMAGING, 48 EYE_SOFTWARE, 79 EDGE_CASE, 270 NEGATIVE)
- Test accuracy: 0.937, macro F1: 0.902
- Spot-check: 29/33 (87.9%)
- Model weights: fairdataihub/envision-eye-imaging-classifier
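The figures above can be checked with a little arithmetic. The sketch below sums the per-class training counts, prints each class's share, and recomputes the spot-check rate. The counts are taken from the list above and nothing else is assumed. NEGATIVE makes up more than half of the examples, which is one reason to read macro F1 (an unweighted mean over classes) alongside accuracy.

```python
# Per-class training counts as listed above
counts = {
    "EYE_IMAGING": 77,
    "EYE_SOFTWARE": 48,
    "EDGE_CASE": 79,
    "NEGATIVE": 270,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

print(f"total examples: {total}")
for label, share in shares.items():
    print(f"{label:<13}{share:6.1%}")

# Spot-check rate: 29 correct out of 33
spot_check = 29 / 33
print(f"spot-check: {spot_check:.1%}")
```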
Related
- envision-discovery -- Full pipeline (scraping + classification + export)
- Model on HuggingFace
License
MIT
Download files
Source Distribution
envision_classifier-0.1.0.tar.gz (19.2 kB)
Built Distribution
envision_classifier-0.1.0-py3-none-any.whl (20.0 kB)
File details
Details for the file envision_classifier-0.1.0.tar.gz.
File metadata
- Download URL: envision_classifier-0.1.0.tar.gz
- Upload date:
- Size: 19.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.12.12 Linux/6.14.0-1017-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ede0d9e4067e894027dd445f7ca869c27bac214f38a61ed3adb9291b34ae96ab |
| MD5 | 167d7b55237076cfe9591cebd0f9fe94 |
| BLAKE2b-256 | ed26da792a743f0e4151e9c84f3531d13991d19fdc11ec52e6fa08b0b1fd020f |
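A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch: the helper `sha256_of` is hypothetical, and the snippet assumes the source distribution has been saved under its original file name in the current directory. If the file is not there, nothing is printed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest of the source distribution, as listed above
EXPECTED_SHA256 = "ede0d9e4067e894027dd445f7ca869c27bac214f38a61ed3adb9291b34ae96ab"

archive = Path("envision_classifier-0.1.0.tar.gz")
if archive.exists():
    status = "match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH"
    print(f"{archive.name}: {status}")
```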
File details
Details for the file envision_classifier-0.1.0-py3-none-any.whl.
File metadata
- Download URL: envision_classifier-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.12.12 Linux/6.14.0-1017-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d5ba0776e56ab9b2e73080b77330ece8a472556f59bad3c68eba414db4d2ec0 |
| MD5 | b32bcf6b71266a39963ab39b23a99bb1 |
| BLAKE2b-256 | bd4f53cf5a709f4e5f1f57748531193a2927c1fd58635f790597326b3a8514d3 |