
imgshape v4.0.0 (Atlas) - Dataset intelligence layer: deterministic fingerprinting and decision-making for ML pipelines.

Project description

📊 imgshape

Dataset Intelligence Layer for Computer Vision

v4.0.0 Atlas Edition

PyPI Version Python 3.8+ License Code Style FastAPI React




Deterministic Dataset Fingerprinting & Intelligent Decision Making
Fingerprinting • Rule-Based Decisions • Explainable AI • Deployable Artifacts • Production Ready

🌐 Live Demo • Documentation • v4 Guide • Report Bug • Request Feature


🚀 imgshape v4.0.0 (Atlas)

Atlas is a complete architectural redesign of imgshape, shifting from heuristic recommendations to deterministic dataset intelligence.

Core Capabilities

| Feature | Description |
| --- | --- |
| 🔬 Deterministic Fingerprinting | Stable, canonical dataset identities across runs and deployments |
| 🎯 Rule-Based Decisions | Explainable, traceable decisions with full reasoning |
| 📐 Five-Profile System | Spatial, Signal, Distribution, Quality, Semantic analysis |
| 📦 Deployable Artifacts | CI-safe, version-controlled outputs for production |
| 🔓 No Hidden Logic | Every decision includes complete rationale and confidence |
| ⚙️ Framework Agnostic | Works with PyTorch, TensorFlow, JAX, or plain NumPy |
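
The idea of a stable identity "across runs and deployments" can be illustrated with a content hash that ignores traversal order and absolute location. This is a minimal sketch, not imgshape's actual fingerprinting algorithm: `dataset_digest` is a hypothetical helper, and hashing relative paths plus raw file bytes is an assumption made purely for illustration.

```python
import hashlib
from pathlib import Path


def dataset_digest(root: str) -> str:
    """Return a SHA-256 digest that depends only on relative paths and file bytes.

    Hypothetical helper for illustration; imgshape's own fingerprint
    may be computed differently.
    """
    root_path = Path(root)
    digest = hashlib.sha256()
    # Sort by POSIX-style relative path so traversal order never affects the result.
    files = sorted(
        (p for p in root_path.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root_path).as_posix(),
    )
    for path in files:
        rel = path.relative_to(root_path).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep (name, content) boundaries unambiguous.
        digest.update(len(rel).to_bytes(8, "big"))
        digest.update(rel)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return "sha256:" + digest.hexdigest()
```

Two copies of the same files in different directories produce the same digest, while changing a single byte in any file changes it.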

Why Atlas?

Before (v3): "This dataset looks good for ResNet50."
Now (v4 Atlas): "This dataset's fingerprint is imgshape://vision/photographic/high-entropy. For task=classification with priority=speed, we recommend MobileNetV3 because: [8 explicit reasons with metrics]."
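
A fingerprint URI such as the one quoted above can be split into its segments with the standard library. The sketch below assumes the three path segments mean "domain", "content type", and "distribution class"; those labels are guesses made for illustration, since only the example URI is given here and not a formal grammar. `parse_fingerprint_uri` is a hypothetical helper, not part of imgshape.

```python
from urllib.parse import urlparse


def parse_fingerprint_uri(uri: str) -> dict:
    """Split an imgshape:// URI into its slash-separated segments.

    The segment names are assumptions made for this sketch.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "imgshape":
        raise ValueError(f"not an imgshape URI: {uri!r}")
    # urlparse puts the first segment in netloc and the rest in path.
    segments = [parsed.netloc] + [s for s in parsed.path.split("/") if s]
    names = ["domain", "content_type", "distribution_class"]  # assumed labels
    return dict(zip(names, segments))


print(parse_fingerprint_uri("imgshape://vision/photographic/high-entropy"))
```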


⚡ Quick Start

Installation

# Core package
pip install imgshape

# With web UI and full features
pip install "imgshape[full]"

Python API (v4)

from imgshape import Atlas

# Initialize the analyzer
atlas = Atlas()

# Analyze a dataset
result = atlas.analyze(
    dataset_path="path/to/images",
    task="classification",
    deployment="edge",
    priority="speed"
)

# Inspect results
print(f"Fingerprint: {result.fingerprint.dataset_uri}")
# Fingerprint: imgshape://vision/photographic/high-entropy

print(f"Recommended Model: {result.decisions['model_family'].selected}")
# Recommended Model: MobileNetV3

print(f"Reasoning: {result.decisions['model_family'].why}")
# Reasoning: [8 evidence points with metrics]

# Export for CI/CD
artifact = result.to_artifact()
artifact.save("dataset_analysis.json")

Command Line (v4)

# Generate fingerprint
imgshape --fingerprint path/to/images --format json

# Run full analysis
imgshape --atlas path/to/images --task classification --output analysis.json

# View decisions
imgshape --decisions path/to/images --priority speed --deployment edge

# Interactive web UI
imgshape --web
# Opens http://localhost:8080 with modern React interface

Web Interface

The imgshape web UI provides an interactive, modern interface for dataset analysis:

Live Demo: 🌐 imgshape.vercel.app

imgshape --web

Features:

  • 📊 Real-time fingerprint generation and visualization
  • 🎯 Interactive decision explorer with full reasoning
  • 📈 Dataset statistics dashboard
  • 💾 Export analysis results (JSON, YAML, PDF)
  • 🚀 Deploy artifacts directly from the UI



🏗️ Architecture

Core Components

┌──────────────────────────────────────────────────┐
│         Atlas Orchestrator                       │
│  (Main coordination & result aggregation)        │
└────────────┬─────────────────────────────────────┘
             │
    ┌────────┼────────┐
    │        │        │
    ▼        ▼        ▼
┌────────┐ ┌──────┐ ┌─────────┐
│Finger- │ │Rules │ │Artifact │
│print   │ │Based │ │Generator│
│Engine  │ │Decis-│ │         │
│        │ │ion   │ │         │
└────────┘ └──────┘ └─────────┘
    │        │         │
    └────────┼─────────┘
             │
             ▼
    ┌────────────────┐
    │Result Bundle   │
    │ - Fingerprint  │
    │ - Decisions    │
    │ - Artifacts    │
    │ - Confidence   │
    └────────────────┘

Fingerprint Profiles

Every dataset receives a fingerprint made up of five profiles:

  1. Spatial Profile - Image dimensions, aspect ratios, scale distribution
  2. Signal Profile - Channel count, bit depth, dynamic range
  3. Distribution Profile - Entropy, skewness, color uniformity
  4. Quality Profile - Corruption rate, blur detection, noise estimation
  5. Semantic Profile - Inferred content type (faces, objects, aerial, medical, etc.)
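
As one concrete example of a Distribution Profile metric, Shannon entropy over 8-bit pixel values is H = Σ p·log2(1/p), which ranges from 0 bits (a single grey level) to 8 bits (all 256 levels equally likely). The sketch below is an illustration under that assumption; how imgshape itself computes its entropy value is not stated here, and `shannon_entropy` is a hypothetical helper.

```python
import math
from collections import Counter


def shannon_entropy(pixels: bytes) -> float:
    """Shannon entropy, in bits per pixel, of a sequence of 8-bit values."""
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    # p * log2(1/p) with p = n / total, summed over the observed levels.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = bytes([128] * 1024)       # a single grey level
uniform = bytes(range(256)) * 4  # every level equally likely
print(shannon_entropy(flat), shannon_entropy(uniform))
```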

🎯 Decision Domains

Atlas makes deterministic decisions across 8 domains:

| Domain | Examples |
| --- | --- |
| Model Family | ResNet, MobileNet, ViT, EfficientNet, etc. |
| Input Dimensions | 224x224, 512x512, or custom based on content |
| Preprocessing | Normalization parameters, augmentation strategy |
| Batch Size | Based on memory constraints and convergence |
| Optimizer | Adam, SGD, AdamW based on dataset characteristics |
| Augmentation | RandAugment, MixUp, CutMix, intensity levels |
| Deployment Target | CPU, GPU, Edge (TensorRT, ONNX), Mobile |
| Training Duration | Early stopping patience, epoch count, callbacks |
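
To make "deterministic, rule-based decision" concrete, here is a minimal sketch of a first-match rule table that returns both a choice and the reason for it. The rules, the `Decision` class, and `decide_model` are all invented for illustration and are not imgshape's actual rule set or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Decision:
    selected: str
    why: List[str] = field(default_factory=list)


# Each rule is (predicate over the context, choice, reason). First match wins.
Rule = Tuple[Callable[[Dict], bool], str, str]

MODEL_RULES: List[Rule] = [
    (
        lambda c: c["deployment"] == "edge" and c["priority"] == "speed",
        "MobileNetV3",
        "edge deployment with priority=speed favors a lightweight model",
    ),
    (
        lambda c: c["priority"] == "accuracy",
        "ResNet",
        "priority=accuracy allows a heavier backbone",
    ),
]


def decide_model(context: Dict, default: str = "EfficientNet") -> Decision:
    """Return the first matching rule's choice together with its reason."""
    for predicate, choice, reason in MODEL_RULES:
        if predicate(context):
            return Decision(choice, [reason])
    return Decision(default, ["no rule matched; using the default"])


print(decide_model({"deployment": "edge", "priority": "speed"}))
```

Because the rules are plain data evaluated in a fixed order, the same context always yields the same decision and the same explanation.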

📊 Example Analysis Output

{
  "fingerprint": {
    "dataset_uri": "imgshape://vision/photographic/high-entropy",
    "dataset_id": "sha256:abc123...",
    "sample_count": 50000,
    "spatial": {
      "resolution_class": "high",
      "aspect_ratio_variance": 0.23,
      "mean_dimensions": [1920, 1080]
    },
    "signal": {
      "channel_count": 3,
      "bit_depth": 8
    },
    "distribution": {
      "entropy": 7.84,
      "color_uniformity": 0.42
    },
    "quality": {
      "corruption_rate": 0.0,
      "blur_percentage": 3.2,
      "noise_estimate": "gaussian"
    },
    "semantic": {
      "inferred_type": "photographic",
      "confidence": 0.92
    }
  },
  "decisions": {
    "model_family": {
      "selected": "MobileNetV3",
      "confidence": 0.87,
      "why": [
        "Dataset has 50k images (suitable for efficient models)",
        "Spatial resolution is high (1920x1080 average)",
        "Photographic content with 0.23 aspect ratio variance",
        "Edge deployment prioritizes inference speed over accuracy",
        "MobileNetV3 offers 2.8x faster inference than ResNet50",
        "Maintains 91% of ResNet50 accuracy on ImageNet",
        "Works on CPU and mobile devices",
        "Recent architecture (2019) with good operator support"
      ],
      "alternatives": ["EfficientNetB1", "ResNet34"]
    },
    "input_dimensions": {
      "selected": [224, 224],
      "confidence": 0.95,
      "why": ["MobileNetV3 default", "High entropy favors standard sizes"]
    }
  },
  "artifacts": {
    "fingerprint_stable": true,
    "fingerprint_format": "v4",
    "export_formats": ["json", "yaml", "protobuf"]
  }
}
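
Since the output is plain JSON, it can be consumed without imgshape installed. The following sketch walks the `decisions` object and prints each choice with its reasons; it uses a trimmed copy of the example above, and `summarize` is a hypothetical helper rather than part of the library.

```python
import json

# A trimmed copy of the example output above; only the keys used below are kept.
ANALYSIS_JSON = """
{
  "fingerprint": {"dataset_uri": "imgshape://vision/photographic/high-entropy"},
  "decisions": {
    "model_family": {
      "selected": "MobileNetV3",
      "confidence": 0.87,
      "why": ["Edge deployment prioritizes inference speed over accuracy"],
      "alternatives": ["EfficientNetB1", "ResNet34"]
    },
    "input_dimensions": {
      "selected": [224, 224],
      "confidence": 0.95,
      "why": ["MobileNetV3 default", "High entropy favors standard sizes"]
    }
  }
}
"""


def summarize(analysis: dict) -> list:
    """Return one 'domain: choice (confidence)' line per decision, then its reasons."""
    lines = []
    for domain, decision in analysis["decisions"].items():
        lines.append(f"{domain}: {decision['selected']} ({decision['confidence']:.0%})")
        lines.extend(f"  - {reason}" for reason in decision["why"])
    return lines


print("\n".join(summarize(json.loads(ANALYSIS_JSON))))
```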

💻 Usage Patterns

1. CI/CD Integration

#!/bin/bash
# ci_check.sh - Ensure dataset integrity in your pipeline
set -euo pipefail

imgshape --fingerprint data/train \
  --output fingerprint.json \
  --format json

# Compare with the expected fingerprint
CURRENT=$(jq -r '.dataset_id' fingerprint.json)
EXPECTED=$(cat .fingerprint_lock)

if [ "$CURRENT" != "$EXPECTED" ]; then
  echo "❌ Dataset changed! Update .fingerprint_lock"
  exit 1
fi

echo "✅ Dataset verified"
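
The same comparison can be done in Python where `jq` is not available. This sketch assumes, as the shell script does, that `fingerprint.json` has a top-level `dataset_id` key and that `.fingerprint_lock` holds the expected value on a single line; `fingerprint_matches` is a hypothetical helper.

```python
import json
from pathlib import Path


def fingerprint_matches(fingerprint_file: str, lock_file: str) -> bool:
    """Compare the dataset_id in a fingerprint JSON file with a lock file."""
    document = json.loads(Path(fingerprint_file).read_text(encoding="utf-8"))
    current = document["dataset_id"]  # assumed top-level key, as in the script above
    # Strip the trailing newline most editors add to the lock file.
    expected = Path(lock_file).read_text(encoding="utf-8").strip()
    return current == expected
```

In a CI job, call `fingerprint_matches("fingerprint.json", ".fingerprint_lock")` and exit with a non-zero status when it returns `False`.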

2. Training Script Integration

from imgshape import Atlas

# create_model and get_augmentation_pipeline are placeholders for helpers
# defined in your own project.

# In your training pipeline
atlas = Atlas()
analysis = atlas.analyze("data/train", task="classification")

# Use recommendations
model = create_model(
    architecture=analysis.decisions['model_family'].selected,
    input_size=analysis.decisions['input_dimensions'].selected
)

augmentation = get_augmentation_pipeline(
    analysis.decisions['augmentation'].selected
)

print(f"Fingerprint: {analysis.fingerprint.dataset_uri}")
print(f"Model: {model.__class__.__name__}")

3. Manual Inspection

# Generate comprehensive report
imgshape --atlas data/train \
  --task detection \
  --deployment gpu \
  --priority accuracy \
  --report analysis_report.md

# View decisions
imgshape --decisions data/train \
  --output decisions.json \
  --verbose

🔌 Plugin System

Extend imgshape with custom fingerprint extractors and decision rules.

# plugins/medical_profiler.py
from imgshape.plugins import FingerprintPlugin

class MedicalProfiler(FingerprintPlugin):
    """Extract DICOM-specific attributes"""
    
    NAME = "medical_profiler_v1"
    
    def extract(self, dataset_path):
        # Custom logic for medical imaging
        return {
            "modality": "CT",
            "bit_depth": 16,
            "is_3d": True
        }

Register and use:

imgshape --plugin-add plugins/medical_profiler.py
imgshape --fingerprint medical_data/ --plugin medical_profiler_v1
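
The plugin above returns fixed values; a plugin that actually inspects the dataset might look like the following sketch. It assumes the interface shown above (a `NAME` attribute and an `extract(self, dataset_path)` method returning a dict). The `ExtensionProfiler` class and its output key are hypothetical, and the fallback base class exists only so the sketch runs without imgshape installed.

```python
from collections import Counter
from pathlib import Path

try:
    from imgshape.plugins import FingerprintPlugin
except ImportError:
    class FingerprintPlugin:  # stand-in so the sketch runs without imgshape
        NAME = "base"

        def extract(self, dataset_path):
            raise NotImplementedError


class ExtensionProfiler(FingerprintPlugin):
    """Report how many files of each extension the dataset contains."""

    NAME = "extension_profiler_v1"  # hypothetical plugin name

    def extract(self, dataset_path):
        counts = Counter(
            p.suffix.lower() for p in Path(dataset_path).rglob("*") if p.is_file()
        )
        # Sort keys so the returned mapping is identical across runs.
        return {"extension_counts": dict(sorted(counts.items()))}
```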

📦 Installation Options

# Core (minimal dependencies)
pip install imgshape

# With PyTorch support
pip install "imgshape[torch]"

# With web UI (FastAPI + React)
pip install "imgshape[web]"

# With all features
pip install "imgshape[full]"

# Development (with testing tools)
pip install "imgshape[dev]"

🧪 Testing

Run the comprehensive test suite:

# Install dev dependencies
pip install "imgshape[dev]"

# Run all tests
pytest tests/ -v

# Run v4 specific tests
pytest tests/test_fingerprint.py tests/test_decision_engine.py -v

# Coverage
pytest --cov=imgshape tests/

Expected result: 26 of 33 tests passing. The remaining 7 are optional artifact tests.


🌐 Web Service

Deploy imgshape as a REST API:

# Start the service
imgshape --web

# API Endpoints (v4)
# POST /v4/fingerprint    - Get dataset fingerprint
# POST /v4/decisions      - Get decisions for a dataset
# POST /v4/analyze        - Full analysis
# GET  /health            - Service health

# Legacy Endpoints (v3)
# POST /analyze           - v3 analyze
# POST /recommend         - v3 recommendations
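
A client for these endpoints could be sketched with the standard library as follows. The request body schema is not documented here, so the `dataset_path` payload key is an assumption, as is the base URL (taken from the `localhost:8080` address mentioned for `imgshape --web`); `build_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_request(base_url: str, endpoint: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an imgshape endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The payload key "dataset_path" is an assumption; check the service's schema.
request = build_request(
    "http://localhost:8080", "/v4/fingerprint", {"dataset_path": "path/to/images"}
)

# To send it against a running service:
#   with urllib.request.urlopen(request, timeout=30) as response:
#       print(json.load(response))
```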

Docker Deployment

# Build
docker build -t imgshape:4.0.0 .

# Run
docker run -p 8080:8080 imgshape:4.0.0

# Cloud Run
gcloud run deploy imgshape --image gcr.io/your-project/imgshape:4.0.0

🤝 Contributing

We welcome contributions! Please:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run pytest tests/ to verify
  5. Commit: git commit -m 'Add amazing feature'
  6. Push: git push origin feature/amazing-feature
  7. Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.



📄 License

imgshape is released under the MIT License. See LICENSE file for details.


Built with 💜 by Stifler

Making dataset intelligence accessible to everyone.

If you find imgshape useful, please consider:

  • ⭐ Starring this repository
  • 📢 Sharing with your colleagues
  • 🐛 Reporting issues and suggesting features
  • 🤝 Contributing code or documentation
