Few-shot CLIP classification with conformal prediction and calibrated probabilities

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

fmegahed

These details have not been verified by PyPI

Project links

Documentation

Project description

conformal_clip

Few-shot vision classification with conformal prediction and optional probability calibration.

This package runs CLIP-like models via open-clip-torch (any model in Hugging Face OpenCLIP) and also supports timm vision-only models for few-shot conformal prediction (Hugging Face timm models).

Features

Backends: OpenCLIP (CLIP-like, vision+text) and timm (vision-only)
Few-shot classification using exemplar image banks
Conformal prediction (global and Mondrian) with finite-sample coverage
Optional calibration (isotonic or sigmoid/Platt)
Zero-shot baseline for CLIP-like models
Benchmark utility to compare backends and settings across the same splits

Backends overview

Backends are provided in two broad categories: CLIP-like vision–language encoders (via open-clip-torch) and image-only encoders (via timm). All of them can be loaded with load_backend(backend, backend_model_id, device); the recommendations below are approximate and assume small batch sizes and fp16 where possible.

CLIP-like (vision–language) encoders

Family	Backend key	OpenCLIP model id	Recommended environment	Notes
ViT (small)	`clip_b32`	`ViT-B-32-quickgelu`	Low–medium (4–8 GB GPU or CPU)	Lightest ViT CLIP; good starting point when resources are tight.
ViT (small)	`clip_b16`	`ViT-B-16-quickgelu`	Medium (≥8 GB GPU)	More accurate than B/32 at modest extra cost.
ViT (base)	`siglip2`	`ViT-B-16-SigLIP2`	Medium (≥8 GB GPU)	ViT-B model with SigLIP2 loss; strong trade-off between accuracy and cost.
ViT (mobile)	`mobileclip2`	`MobileCLIP2-S4`	Low (CPU, 4–8 GB GPU, edge)	Mobile-optimized CLIP; preferred for low-power or edge deployments.
ViT (large)	`openai`	`ViT-L-14-quickgelu`	Medium–high (≥8–12 GB GPU)	Classic CLIP baseline; strong general performance.
ViT (xlarge)	`openclipbase`	`ViT-H-14-quickgelu`	High (≥16 GB GPU)	Larger ViT-H encoder; use when memory is ample.
ViT (giant)	`vitg`	`ViT-bigG-14`	Very high (≥24 GB GPU)	Extremely large model; for offline or benchmark use only.
ResNet CLIP	`resnet50`	`RN50x64-quickgelu`	Medium (≥8 GB GPU or strong CPU)	Deep CNN CLIP; useful as a non-ViT baseline.
EVA family	`eva02`	`EVA02-E-14-plus`	High (≥16–24 GB GPU)	High-capacity ViT-style model; heavy but strong.
ConvNeXt	`convnext`	`convnext_xxlarge`	Very high (≥24 GB GPU)	Very large ConvNeXt CLIP; avoid on small GPUs.
CoCa	`coca`	`coca_ViT-L-14`	High (≥12–16 GB GPU)	Captioning-oriented CLIP variant; strong but memory-hungry.

Image-only (vision encoders via timm)

Family	Backend key	timm model id	Recommended environment	Notes
Lightweight	`mobilenetv4`	`mobilenetv4_conv_aa_large.e230_r448_in12k_ft_in1k`	Low (CPU, 4–8 GB GPU)	Very efficient mobile-style CNN; best when resources are tight.
Lightweight	`resnet18`	`resnet18.a1_in1k`	Low (CPU, 4–8 GB GPU)	Classic small ResNet; easy to run and debug.
Lightweight	`efficientnet_b0`	`efficientnet_b0.ra_in1k`	Low–medium (CPU, 4–8 GB GPU)	Strong accuracy/efficiency balance among small CNNs.
ViT-L	`dinov3`	`vit_large_patch16_dinov3.lvd1689m`	Medium–high (≥8–16 GB GPU)	Self-supervised ViT-Large; heavy but strong general-purpose features.

Install

Core package:

pip install conformal-clip

With example dataset (textile images):

pip install "conformal-clip[data]"

Notes:

Set HF_TOKEN in your environment (or a .env file) if you need access to gated models (e.g., DINOv3). The loader forwards it to HUGGINGFACE_HUB_TOKEN.
PyTorch with CUDA is recommended for speed but not required.

Environment Setup

Hugging Face token (for gated repos like some DINOv3 builds):
- In shell: export HF_TOKEN=hf_... (Linux/macOS) or set HF_TOKEN=hf_... (Windows)
- Or create a .env file next to your script with HF_TOKEN=hf_....
- The loader maps HF_TOKEN to HUGGINGFACE_HUB_TOKEN automatically.
CUDA (optional): Install a CUDA-enabled PyTorch build from PyTorch Get Started (Locally) then use device = torch.device("cuda").

Quickstart

Load a backend and run few-shot + conformal:

import torch
from conformal_clip import load_backend, few_shot_fault_classification_conformal

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, preprocess_fn, _ = load_backend("openclipbase", None, device)

# Prepare PIL -> tensor using preprocess_fn, then call few_shot_fault_classification_conformal
# nominal_images, defective_images, calib_images, test_images: lists of preprocessed tensors
# calib_labels: list[str] of ground-truth for calibration
results = few_shot_fault_classification_conformal(
    model=model,
    test_images=test_images,
    test_image_filenames=test_filenames,
    nominal_images=nominal_images,
    nominal_descriptions=["..."] * len(nominal_images),
    defective_images=defective_images,
    defective_descriptions=["..."] * len(defective_images),
    calib_images=calib_images,
    calib_labels=calib_labels,
    alpha=0.1,
    mondrian=True,
    prob_calibration="isotonic",  # or "sigmoid" or None
)

Zero-shot (CLIP-like backends only):

from conformal_clip import evaluate_zero_shot_predictions

model, preprocess_fn, tokenize_fn = load_backend("openai", None, device)
metrics_df, results_df = evaluate_zero_shot_predictions(
    labels=my_labels,
    label_counts=my_label_counts,
    test_images=[preprocess_fn(pil) for pil in my_pil_images],
    test_image_filenames=my_filenames,
    model=model,
    device=device,
    tokenize_fn=tokenize_fn,
    save_confusion_matrix=True,
)

Benchmark across backends, calibration and conformal settings:

from conformal_clip import benchmark_models

cls_df, cp_df, cls_style, cp_style = benchmark_models(
    train_nominal_images=pil_nom_bank,
    train_defective_images=pil_def_bank,
    calib_images=pil_calib,
    calib_labels=calib_labels,
    test_images=pil_test,
    test_labels=test_labels,
    device=device,
    seed=2025,
    # resource_tier controls which backends are run when backends is None.
    # Defaults to "low" (small models suitable for low-resource environments).
    resource_tier="low",  # or "medium" or "high"
    calibration_methods=(None, "isotonic", "sigmoid"),
    conformal_modes=(None, "global", "mondrian"),
    alpha_list=(0.1,),
)

Resource warning: many CLIP/timm backbones are large (hundreds of MB to multiple GB per model). Running benchmark_models over many backends can require tens of gigabytes of RAM and substantial disk cache space. We do not recommend running the full benchmark configuration on machines with limited memory or storage; instead, restrict the backends argument to a small subset of models.

Prepare Images

Using the example textile dataset (installed via conformal-clip[data]):

import os
from PIL import Image
from conformal_clip_data import nominal_dir, local_dir, global_dir

def list_paths(d):
    exts = {".jpg", ".jpeg", ".png"}
    return [os.path.join(d, f) for f in os.listdir(d) if os.path.splitext(f)[1].lower() in exts]

nominal_paths = list_paths(nominal_dir())
local_paths = list_paths(local_dir())
global_paths = list_paths(global_dir())

# Few-shot banks (exemplars)
pil_nom_bank = [Image.open(p).convert("RGB") for p in nominal_paths[:50]]
pil_def_bank = [Image.open(p).convert("RGB") for p in (local_paths[:25] + global_paths[:25])]

# Calibration and test (example split)
pil_calib = [Image.open(p).convert("RGB") for p in (nominal_paths[50:100] + local_paths[25:50] + global_paths[25:50])]
calib_labels = ["Nominal"] * 50 + ["Defective"] * 50
pil_test = [Image.open(p).convert("RGB") for p in (nominal_paths[100:150] + local_paths[50:75] + global_paths[50:75])]
test_labels = ["Nominal"] * 50 + ["Defective"] * 50

# Preprocess to tensors for a given backend
# model, preprocess_fn, _ = load_backend("openclipbase", None, device)
nominal_images = [preprocess_fn(img) for img in pil_nom_bank]
defective_images = [preprocess_fn(img) for img in pil_def_bank]
calib_images = [preprocess_fn(img) for img in pil_calib]
test_images = [preprocess_fn(img) for img in pil_test]

Using your own local folders (commented template):

# import os
# from PIL import Image
#
# base_dir = "./data/textile_images/simulated"
# nominal_dir = os.path.join(base_dir, "nominal")
# defective_dir = os.path.join(base_dir, "defective")
#
# def list_paths(d):
#     exts = {".jpg", ".jpeg", ".png"}
#     return [os.path.join(d, f) for f in os.listdir(d) if os.path.splitext(f)[1].lower() in exts]
#
# nominal_paths = list_paths(nominal_dir)
# defective_paths = list_paths(defective_dir)
#
# pil_nom_bank = [Image.open(p).convert("RGB") for p in nominal_paths[:50]]
# pil_def_bank = [Image.open(p).convert("RGB") for p in defective_paths[:50]]
# pil_calib = [Image.open(p).convert("RGB") for p in (nominal_paths[50:100] + defective_paths[50:100])]
# calib_labels = ["Nominal"] * 50 + ["Defective"] * 50
# pil_test = [Image.open(p).convert("RGB") for p in (nominal_paths[100:150] + defective_paths[100:150])]
# test_labels = ["Nominal"] * 50 + ["Defective"] * 50
#
# model, preprocess_fn, _ = load_backend("openclipbase", None, device)
# nominal_images = [preprocess_fn(img) for img in pil_nom_bank]
# defective_images = [preprocess_fn(img) for img in pil_def_bank]
# calib_images = [preprocess_fn(img) for img in pil_calib]
# test_images = [preprocess_fn(img) for img in pil_test]

Examples

See the examples/ folder:

textile_*.py: Run each backend on the textile dataset (installed via conformal-clip[data]). Each script also shows how to adapt to your own data (commented snippet).
benchmark_textile.py: End-to-end benchmark and HTML output with highlighted best metrics.

Discover and Use Custom OpenCLIP Models

You can list available model names (and their pretrained tags) directly from open_clip. These names can be used with the custom-clip backend (and you may optionally provide a specific tag via the model@tag format). For Hugging Face repo IDs in the OpenCLIP ecosystem, use custom-clip-hf.

Browse OpenCLIP models on Hugging Face: Hugging Face OpenCLIP models

List available built-in model names and their tags:

import open_clip

# Dict[str, Set[str]] mapping model_name -> available pretrained tags
by_model = open_clip.list_pretrained_tags_by_model()

print("Some available model names (built-ins):")
for name in sorted(by_model.keys())[:20]:
    print(" ", name)

# Inspect tags for a specific model
model_name = "ViT-L-14-quickgelu"
print("Available pretrained tags for", model_name, ":", sorted(by_model.get(model_name, [])))

Load a custom built-in model (optionally specify a tag):

from conformal_clip import load_backend
device = "cuda"  # or "cpu"

# Using just the model name (the loader will pick a sensible pretrained tag)
model, preprocess_fn, tokenize_fn = load_backend(
    backend="custom-clip", backend_model_id="ViT-L-14-quickgelu", device=device
)

# Or explicitly specify a tag using the "model@tag" form
model, preprocess_fn, tokenize_fn = load_backend(
    backend="custom-clip", backend_model_id="ViT-L-14-quickgelu@openai", device=device
)

Load from Hugging Face (OpenCLIP-compatible weights) using custom-clip-hf:

# Example: an hf-hub repo id (ensure you have access if it’s gated)
model, preprocess_fn, tokenize_fn = load_backend(
    backend="custom-clip-hf", backend_model_id="hf-hub:org-or-user/repo-id", device=device
)

Browse timm models on Hugging Face: Hugging Face timm models

List timm model names programmatically (vision-only backends):

import timm

# All models that have pretrained weights available
names = timm.list_models(pretrained=True)
print(f"Found {len(names)} pretrained timm models")
print("First 20:", names[:20])

# Filter by family/pattern (e.g., mobilenet*, convnext*)
print("mobilenet*:", timm.list_models("mobilenet*", pretrained=True)[:10])
print("convnext*:", timm.list_models("convnext*", pretrained=True)[:10])

Project Structure

conformal_clip/
├── conformal_clip/          # Main package
│   ├── __init__.py          # Package initialization and public API
│   ├── backends.py          # Model loading for OpenCLIP and timm backends
│   ├── wrappers.py          # CLIPWrapper: sklearn-compatible classifier
│   ├── conformal.py         # Conformal prediction (global and Mondrian)
│   ├── zero_shot.py         # Zero-shot evaluation for CLIP-like models
│   ├── metrics.py           # Classification and conformal metrics
│   ├── benchmark.py         # Systematic benchmarking across backends
│   └── viz.py               # Visualization utilities (confusion matrices)
├── examples/                # Example scripts
│   ├── textile_mobileclip2.py      # Few-shot with MobileCLIP2
│   ├── textile_mobilenetv4.py      # Few-shot with MobileNetV4 (timm)
│   ├── zero_shot_openclip.py       # Zero-shot classification example
│   ├── benchmark_textile.py        # Full benchmarking suite
│   ├── custom_openclip_example.py  # Using custom OpenCLIP models
│   ├── list_models_openclip_timm.py # Discover available models
│   └── _shared_textile.py          # Shared data loading utilities
└── tests/                   # Unit and integration tests

Citation

If you use this package in your research, please cite:

@misc{megahed2025adaptingopenaisclipmodel,
  title={Adapting OpenAI's CLIP Model for Few-Shot Image Inspection in Manufacturing Quality Control: An Expository Case Study with Multiple Application Examples},
  author={Fadel M. Megahed and Ying-Ju Chen and Bianca Maria Colosimo and Marco Luigi Giuseppe Grasso and L. Allison Jones-Farmer and Sven Knoth and Hongyue Sun and Inez Zwetsloot},
  year={2025},
  eprint={2501.12596},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2501.12596}
}

License

MIT License (see LICENSE).

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

fmegahed

These details have not been verified by PyPI

Project links

Documentation

Release history Release notifications | RSS feed

This version

0.2.1

Nov 22, 2025

0.2.0

Nov 16, 2025

0.1.1

Nov 7, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

conformal_clip-0.2.1.tar.gz (37.0 kB view details)

Uploaded Nov 22, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

conformal_clip-0.2.1-py3-none-any.whl (34.2 kB view details)

Uploaded Nov 22, 2025 Python 3

File details

Details for the file conformal_clip-0.2.1.tar.gz.

File metadata

Download URL: conformal_clip-0.2.1.tar.gz
Upload date: Nov 22, 2025
Size: 37.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for conformal_clip-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`fd0db44a859478959fd9ab6d2b25778f1713341a85a89900b922a49904f22b8c`
MD5	`4d43a62a603ae4a70715c96945bb413e`
BLAKE2b-256	`339ecdf10048648e9fc89ce06ae9fe31d0548489f4e95e9a0d2e9bf6bf4adff2`

See more details on using hashes here.

Provenance

The following attestation bundles were made for conformal_clip-0.2.1.tar.gz:

Publisher: publish-to-pypi.yml on fmegahed/conformal_clip

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: conformal_clip-0.2.1.tar.gz
- Subject digest: fd0db44a859478959fd9ab6d2b25778f1713341a85a89900b922a49904f22b8c
- Sigstore transparency entry: 715751826
- Sigstore integration time: Nov 22, 2025
Source repository:
- Permalink: fmegahed/conformal_clip@e953cff9d481d33d9f1a2146af2d4fd958f85355
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/fmegahed
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@e953cff9d481d33d9f1a2146af2d4fd958f85355
- Trigger Event: release

File details

Details for the file conformal_clip-0.2.1-py3-none-any.whl.

File metadata

Download URL: conformal_clip-0.2.1-py3-none-any.whl
Upload date: Nov 22, 2025
Size: 34.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for conformal_clip-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`12d15b4d3c4041924cd2e69f3397e19d20a3470a0fc78b3b04833a4a06f3c6c1`
MD5	`0f4990254c23642f6744d0ca4be91868`
BLAKE2b-256	`e98988f3fe715df2a3376b13c4a2a955ec28a6c658e19040d57f022c4d1909d0`

See more details on using hashes here.

Provenance

The following attestation bundles were made for conformal_clip-0.2.1-py3-none-any.whl:

Publisher: publish-to-pypi.yml on fmegahed/conformal_clip

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: conformal_clip-0.2.1-py3-none-any.whl
- Subject digest: 12d15b4d3c4041924cd2e69f3397e19d20a3470a0fc78b3b04833a4a06f3c6c1
- Sigstore transparency entry: 715751827
- Sigstore integration time: Nov 22, 2025
Source repository:
- Permalink: fmegahed/conformal_clip@e953cff9d481d33d9f1a2146af2d4fd958f85355
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/fmegahed
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@e953cff9d481d33d9f1a2146af2d4fd958f85355
- Trigger Event: release

conformal-clip 0.2.1

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

conformal_clip

Features

Backends overview

Install

Environment Setup

Quickstart

Prepare Images

Examples

Discover and Use Custom OpenCLIP Models

Project Structure

Citation

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance