
Data bundle for conformal-clip examples and tests


conformal-clip-data

A companion data package providing benchmark datasets for the clip-conformal package.

Overview

This package bundles the simulated textile image dataset used in Megahed et al., 2025 for demonstrating conformal prediction with CLIP-based few-shot image classification in manufacturing quality control applications.

This is a data-only package designed to work seamlessly with the clip-conformal package (coming soon to PyPI), which provides the core implementation of conformal prediction methods for CLIP models. By separating data from implementation, we keep the main package lightweight while providing easy access to reproducible benchmark datasets.

Installation

Install directly from PyPI:

pip install conformal-clip-data

Or install from source:

git clone https://github.com/fmegahed/conformal-clip-data.git
cd conformal-clip-data
pip install -e .

Quick Start

from conformal_clip_data import get_data_path

# Access the textile dataset
data_path = get_data_path("textile")
print(f"Dataset location: {data_path}")
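Once the dataset location is known, a quick sanity check is to count the image files in each folder. The helper below is a minimal sketch, not part of conformal-clip-data: `count_images_by_folder` and `IMAGE_SUFFIXES` are hypothetical names, and the code assumes only that the dataset is a directory tree of image files. The folder layout and file extensions of the bundled data are not stated here, so adjust the suffix set as needed.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; the bundled dataset may use a different format.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_by_folder(root):
    """Count image files under `root`, keyed by their folder relative to `root`."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        # Match extensions case-insensitively so "x.PNG" is counted too.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            folder = path.parent.relative_to(root).as_posix()
            counts[folder] += 1
    return dict(counts)
```

With the package installed, and assuming `get_data_path("textile")` returns a path to a directory, `count_images_by_folder(get_data_path("textile"))` would give per-folder totals to compare against the class counts in the Dataset Summary.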

Dataset Provenance

These images were originally generated using the R script below and were previously released under an MIT License in our repository:

Dataset Summary

To systematically evaluate CLIP's performance on stochastic textured surface (STS) image classification, we used the spc4sts R package to create a controlled dataset of simulated textile fabric textures. This approach allowed us to precisely model both nominal and defective weave structures and to control defect type and severity.

Our dataset contains:

| Class | Description | Count |
|---|---|---|
| Nominal | Standard textile weave patterns | 1,000 |
| Local defects | Localized disruptions in the weave | 500 |
| Global defects | Systematic shifts in weave parameters | 500 |
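Few-shot classification with conformal prediction typically needs three disjoint subsets per class: a handful of exemplars, a calibration set, and a test set. The sketch below shows one way to draw such a split from the class sizes listed above. It is illustrative only: `stratified_split`, the label strings, the placeholder item names, and the split sizes (10 exemplars and 100 calibration images per class) are assumptions, not the protocol used in the paper or in clip-conformal.

```python
import random

# Class sizes from the dataset summary; the label strings are stand-ins.
CLASS_COUNTS = {"nominal": 1000, "local": 500, "global": 500}


def stratified_split(items_by_class, n_shot, n_calib, seed=0):
    """Split each class into few-shot, calibration, and test subsets."""
    rng = random.Random(seed)
    splits = {"few_shot": [], "calibration": [], "test": []}
    for label, items in items_by_class.items():
        if n_shot + n_calib > len(items):
            raise ValueError(f"not enough items in class {label!r}")
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = n_shot + n_calib
        splits["few_shot"] += [(label, x) for x in shuffled[:n_shot]]
        splits["calibration"] += [(label, x) for x in shuffled[n_shot:cut]]
        splits["test"] += [(label, x) for x in shuffled[cut:]]
    return splits


# Placeholder identifiers standing in for image paths.
items = {
    label: [f"{label}_{k:04d}" for k in range(n)]
    for label, n in CLASS_COUNTS.items()
}
splits = stratified_split(items, n_shot=10, n_calib=100)
print({name: len(subset) for name, subset in splits.items()})
```

Fixing the seed keeps the split reproducible across runs, which matters when reporting coverage on the test subset.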

Each image is 250 × 250 px and was generated using the parameters recommended by spc4sts:

  • Nominal images:
    Spatial autoregressive parameters ϕ₁ = 0.6, ϕ₂ = 0.35

  • Global defects:
    Both parameters reduced by 5%

  • Local defects:
    Generated using the package's defect-insertion functions
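To make the parameter settings concrete, the sketch below computes the global-defect values (each nominal parameter reduced by 5%) and simulates a small texture. The simulator is an assumption for illustration: it uses a first-order causal spatial autoregressive form, x[i][j] = ϕ₁·x[i−1][j] + ϕ₂·x[i][j−1] + noise, with zero boundary values and no burn-in. The spc4sts generator may differ in model form, noise scale, and boundary handling, and `simulate_sar` and `reduce_by` are hypothetical names.

```python
import random

# Nominal parameters quoted in the list above.
PHI1_NOMINAL, PHI2_NOMINAL = 0.6, 0.35


def reduce_by(value, fraction):
    """Return `value` reduced by `fraction` (0.05 means 5%)."""
    return value * (1.0 - fraction)


# Global defects: both parameters reduced by 5% (about 0.57 and 0.3325).
PHI1_GLOBAL = reduce_by(PHI1_NOMINAL, 0.05)
PHI2_GLOBAL = reduce_by(PHI2_NOMINAL, 0.05)


def simulate_sar(phi1, phi2, rows=250, cols=250, sigma=1.0, seed=0):
    """Simulate an assumed causal SAR field as a list of rows of floats."""
    rng = random.Random(seed)
    x = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            up = x[i - 1][j] if i > 0 else 0.0    # neighbour above
            left = x[i][j - 1] if j > 0 else 0.0  # neighbour to the left
            x[i][j] = phi1 * up + phi2 * left + rng.gauss(0.0, sigma)
    return x


nominal = simulate_sar(PHI1_NOMINAL, PHI2_NOMINAL, rows=8, cols=8)
shifted = simulate_sar(PHI1_GLOBAL, PHI2_GLOBAL, rows=8, cols=8)
```

Using the same seed for both calls means the two fields share their noise draws and differ only through the parameters, which isolates the effect of the 5% shift.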

Relationship with clip-conformal

This data package is designed as a companion to the clip-conformal package, which will be released to PyPI shortly. The separation of concerns provides several benefits:

  • Lightweight installation: The clip-conformal package remains small and fast to install
  • Reproducibility: Benchmark datasets are versioned and distributed consistently
  • Extensibility: Additional datasets can be added without modifying the core package
  • Optional usage: Users can work with clip-conformal using their own data without downloading benchmark datasets

For the full implementation of conformal prediction methods for CLIP models and complete examples using this dataset, please install the clip-conformal package (coming soon).

Citation

If you use this dataset in your research, please cite:

@misc{megahed2025adaptingopenaisclipmodel,
      title={Adapting OpenAI's CLIP Model for Few-Shot Image Inspection in Manufacturing Quality Control: An Expository Case Study with Multiple Application Examples},
      author={Fadel M. Megahed and Ying-Ju Chen and Bianca Maria Colosimo and Marco Luigi Giuseppe Grasso and L. Allison Jones-Farmer and Sven Knoth and Hongyue Sun and Inez Zwetsloot},
      year={2025},
      eprint={2501.12596},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2501.12596},
}

And the original spc4sts package used to generate the textile images:

@article{bui2020spc4sts,
  title={spc4sts: Statistical process control for stochastic textured surfaces in R},
  author={Bui, Anh Tuan and Apley, Daniel W},
  journal={Journal of Quality Technology},
  volume={53},
  number={3},
  pages={219--242},
  year={2020},
  doi={10.1080/00224065.2019.1707730}
}

License

MIT License. The images were generated by the authors and are released under the same license. See LICENSE for details.
