
Fast, memory-efficient image tiling and reconstruction for deep learning and scientific computing


TileFlow

High-performance tile-based image processing for scientific computing

Process gigapixel images with minimal memory footprint through intelligent tiling and reconstruction. Designed for microscopy, whole-slide imaging, and large-scale computer vision workflows.

Python 3.13+ · NumPy · License: MIT

🚀 Key Features

  • 🧠 Memory Efficient: Process images larger than RAM using intelligent tiling
  • 🔬 Multi-Channel Ready: Native CHW format support for microscopy workflows
  • ⚡ Zero-Copy Views: Leverages numpy slicing for maximum performance
  • 🔧 Seamless Reconstruction: Intelligent overlap handling eliminates artifacts
  • ☁️ Cloud-Scale: Built-in zarr integration for massive datasets
  • 🎯 Pluggable Pipeline: Custom processing functions integrate seamlessly
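
The zero-copy point above relies on standard NumPy behavior: basic slicing returns a view onto the parent buffer rather than a copy. The sketch below uses only NumPy, not TileFlow itself, and the array shape and slice bounds are arbitrary choices for illustration.

```python
import numpy as np

# A CHW image; shape and dtype are illustrative only.
image = np.zeros((3, 1024, 1024), dtype=np.float32)

# Basic slicing returns a view onto the same buffer, not a copy.
tile = image[:, 256:512, 256:512]
shares_buffer = np.shares_memory(image, tile)

# Writing through the view is visible in the parent array.
tile[0, 0, 0] = 1.0
parent_value = float(image[0, 256, 256])

# An explicit copy is independent of the parent.
tile_copy = tile.copy()
copy_shares_buffer = np.shares_memory(image, tile_copy)

print(shares_buffer, parent_value, copy_shares_buffer)
```

Because a view aliases its parent, a processing function that modifies its input tile in place would also modify the source image; returning a new array avoids that.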

📦 Installation

pip install tileflow

🔥 Quick Start

Basic Image Processing

from tileflow import TileFlow
import numpy as np

# Define your processing function
def enhance_contrast(tile):
    """Enhance contrast using histogram stretching."""
    p2, p98 = np.percentile(tile, (2, 98))
    return np.clip((tile - p2) / (p98 - p2 + 1e-8), 0, 1)

# Placeholder input; substitute your own 2D image array
large_image = np.random.rand(4096, 4096)

# Configure and run processor
processor = TileFlow(tile_size=(256, 256), overlap=(16, 16))
processor.configure(function=enhance_contrast)
result = processor.run(large_image)

Multi-Channel Microscopy

import numpy as np
from tileflow import generate_multichannel_image, SobelEdgeDetector

# Generate realistic 8-channel microscopy data [C, H, W]
image_chw = generate_multichannel_image(shape=(8, 2048, 2048))

# Process DAPI channel for nuclei detection
dapi_channel = image_chw[0]  # Extract nuclei channel
sobel = SobelEdgeDetector(tile_size=(256, 256), overlap=(16, 16))
nuclei_edges = sobel.process(dapi_channel)

# Apply different processing to each channel
for i, channel in enumerate(image_chw):
    if i == 0:  # DAPI - nuclei segmentation
        processed = sobel.process(channel)
        nuclei_mask = processed > np.percentile(processed, 95)
    else:  # Other channels - same Sobel edge detection, no thresholding
        processed = sobel.process(channel)
    
    print(f"Channel {i}: {processed.max():.3f} max intensity")

Zarr for Massive Datasets

import zarr
import numpy as np
from tileflow import TileFlow

# Create zarr dataset for efficient storage
dataset = zarr.open('microscopy.zarr', mode='w', 
                   shape=(16, 8192, 8192), chunks=(1, 1024, 1024))

# Process channels individually to manage memory
processor = TileFlow(tile_size=(512, 512), overlap=(32, 32))
processor.configure(function=your_analysis_function)

for channel_idx in range(16):
    # Load single channel from zarr (memory efficient)
    channel_data = np.array(dataset[channel_idx])
    
    # Process with TileFlow
    result = processor.run(channel_data)
    
    # Save or analyze result
    print(f"Channel {channel_idx} processed: {result.shape}")

🧪 Advanced Examples

Custom Multi-Channel Pipeline

import numpy as np
from tileflow import TileFlow

class NucleiSegmentationPipeline:
    """Specialized pipeline for DAPI nuclei segmentation."""
    
    def __init__(self, sensitivity=0.95):
        self.sensitivity = sensitivity
        self.processor = TileFlow(tile_size=(256, 256), overlap=(16, 16))
    
    def segment_nuclei(self, tile):
        """Apply gradient-magnitude edge detection + thresholding for nuclei detection."""
        # Gradient magnitude from np.gradient (central differences, not a Sobel kernel)
        gx = np.gradient(tile, axis=1)
        gy = np.gradient(tile, axis=0)
        edges = np.sqrt(gx*gx + gy*gy)
        
        # Percentile threshold, computed independently for each tile
        threshold = np.percentile(edges, self.sensitivity * 100)
        return (edges > threshold).astype(np.uint8)
    
    def process(self, dapi_channel):
        """Process DAPI channel for nuclei segmentation."""
        self.processor.configure(function=self.segment_nuclei)
        return self.processor.run(dapi_channel)

# Use the pipeline
pipeline = NucleiSegmentationPipeline(sensitivity=0.95)
nuclei_mask = pipeline.process(image_chw[0])
print(f"Flagged {nuclei_mask.sum()} nuclei edge pixels")

Concurrent Multi-Channel Processing

from concurrent.futures import ThreadPoolExecutor
from tileflow import SobelEdgeDetector

def process_channel_pair(args):
    """Process a single channel with appropriate algorithm."""
    channel_idx, channel_data = args
    
    if channel_idx == 0:  # DAPI
        processor = NucleiSegmentationPipeline()
        return channel_idx, processor.process(channel_data)
    else:  # Other channels
        sobel = SobelEdgeDetector()
        return channel_idx, sobel.process(channel_data)

# Process multiple channels concurrently
with ThreadPoolExecutor(max_workers=4) as executor:
    tasks = [(i, image_chw[i]) for i in range(8)]
    futures = [executor.submit(process_channel_pair, task) for task in tasks]
    
    results = {}
    for future in futures:
        channel_idx, result = future.result()
        results[channel_idx] = result
        print(f"✓ Channel {channel_idx} complete")

🎯 Use Cases

| Domain | Application | Image Size | Channels |
|---|---|---|---|
| 🔬 Microscopy | Fluorescence imaging, pathology | 2K-16K | 4-32 |
| 🧠 Deep Learning | Model inference, preprocessing | 1K-8K | 1-3 |
| 🛰️ Remote Sensing | Satellite analysis, multispectral | 4K-32K | 8-256 |
| 📱 Computer Vision | Panoramic stitching, high-res analysis | 2K-16K | 1-4 |

📊 Performance Benchmarks

| Dataset | Memory Usage | Processing Time | Zarr Compression |
|---|---|---|---|
| 2K × 2K × 8ch | 128 MB | 2.1s | 77% reduction |
| 4K × 4K × 16ch | 256 MB | 8.4s | 75% reduction |
| 8K × 8K × 32ch | 512 MB | 33.2s | 76% reduction |
| 16K × 16K × 8ch | 1.2 GB | 45.6s | 78% reduction |

Measured on consumer hardware (16 GB RAM, 8-core CPU) using Sobel edge detection.

🔍 Enhanced Monitoring & Callbacks

TileFlow provides a comprehensive callback system for monitoring processing performance, memory usage, and energy consumption:

Progress & Performance Tracking

from tileflow import TileFlow, ProgressCallback, MetricsCallback

# Basic progress tracking
processor = TileFlow(tile_size=(256, 256), overlap=(32, 32))
processor.configure(function=your_function)

progress = ProgressCallback(verbose=True, show_rate=True)
metrics = MetricsCallback(verbose=True)

result = processor.run(image, callbacks=[progress, metrics])

Memory Usage Monitoring

from tileflow import MemoryTracker

# Track memory usage during processing
memory_tracker = MemoryTracker(detailed=True)
result = processor.run(large_image, callbacks=[memory_tracker])

# Get detailed statistics
memory_stats = memory_tracker.get_memory_stats()
print(f"Peak memory: {memory_stats['peak_delta_bytes'] / 1024**2:.1f} MB")

Energy Consumption Tracking

from tileflow import CodeCarbonTracker

# Track CO₂ emissions (requires: pip install codecarbon)
carbon_tracker = CodeCarbonTracker(
    project_name="my-analysis",
    output_dir="./carbon_logs"
)

result = processor.run(image, callbacks=[carbon_tracker])
emissions = carbon_tracker.get_emissions_data()
print(f"CO₂ emissions: {emissions['emissions_kg']:.6f} kg")

Comprehensive Monitoring Suite

from tileflow import CompositeCallback, ProgressCallback, MemoryTracker, CodeCarbonTracker

# Combine multiple monitoring callbacks
monitoring_suite = CompositeCallback([
    ProgressCallback(verbose=True),
    MemoryTracker(detailed=False),
    CodeCarbonTracker(project_name="scientific-analysis")
])

result = processor.run(image, callbacks=[monitoring_suite])
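
As a rough illustration of how a composite callback can work, the sketch below forwards each hook to a list of child callbacks. It is not TileFlow's implementation: `SketchCallback`, `SketchComposite`, `TileCounter`, and `run_with_callbacks` are hypothetical names, and only the hook names `on_tile_end` and `on_processing_end` are borrowed from the examples in this README.

```python
class SketchCallback:
    """Hypothetical base class: every hook is a no-op by default."""

    def on_tile_end(self, tile, tile_index, total_tiles):
        pass

    def on_processing_end(self, stats):
        pass


class SketchComposite(SketchCallback):
    """Forwards every hook to each child callback, in order."""

    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def on_tile_end(self, tile, tile_index, total_tiles):
        for cb in self.callbacks:
            cb.on_tile_end(tile, tile_index, total_tiles)

    def on_processing_end(self, stats):
        for cb in self.callbacks:
            cb.on_processing_end(stats)


class TileCounter(SketchCallback):
    """Counts finished tiles and remembers the final stats dict."""

    def __init__(self):
        self.seen = 0
        self.final_stats = None

    def on_tile_end(self, tile, tile_index, total_tiles):
        self.seen += 1

    def on_processing_end(self, stats):
        self.final_stats = stats


def run_with_callbacks(tiles, callbacks):
    """Hypothetical driver loop: fires the hooks around a trivial tile pass."""
    tiles = list(tiles)
    for index, tile in enumerate(tiles):
        for cb in callbacks:
            cb.on_tile_end(tile, index, len(tiles))
    stats = {"tiles": len(tiles)}
    for cb in callbacks:
        cb.on_processing_end(stats)
    return stats


first, second = TileCounter(), TileCounter()
run_with_callbacks(["a", "b", "c"], [SketchComposite([first, second])])
print(first.seen, second.seen, first.final_stats)
```

Grouping callbacks this way lets the processing loop treat one callback and many callbacks identically.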

Custom Scientific Callbacks

import numpy as np
from tileflow import TileFlowCallback

class ImageQualityCallback(TileFlowCallback):
    """Custom callback for scientific image analysis."""
    
    def __init__(self):
        super().__init__()
        self.quality_metrics = []  # per-tile SNR values
    
    def on_tile_end(self, tile, tile_index, total_tiles):
        # Analyze each processed tile
        data = tile.image_data[0]
        snr = np.mean(data) / (np.std(data) + 1e-8)  # guard against flat tiles
        self.quality_metrics.append(snr)
    
    def on_processing_end(self, stats):
        print(f"Average SNR: {np.mean(self.quality_metrics):.2f}")

processor.run(image, callbacks=[ImageQualityCallback()])

📚 Complete Examples

Run comprehensive examples from the scripts/ directory:

# Basic CHW format processing
uv run python scripts/basic_usage.py

# Advanced multi-channel workflows
uv run python scripts/multichannel_processing.py

# Zarr integration for large datasets  
uv run python scripts/zarr_integration.py


🏗️ Architecture

TileFlow processes images hierarchically for optimal memory usage:

Input Image → Grid Generation → Tile Processing → Reconstruction → Output
     ↓              ↓                ↓               ↓            ↓
   16GB           Lazy            64MB          Overlap       16GB
                Iterator                       Handling

Processing Modes:

  • Direct Tiling: Image → Tiles → Process → Reconstruct
  • Hierarchical: Image → Chunks → Tiles → Process → Reconstruct
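
The hierarchical mode can be pictured as two nested lazy iterators: an outer one over chunks and an inner one over the tiles of each chunk. The sketch below is plain Python under that assumption and is not TileFlow's grid code; `iter_boxes` and `iter_hierarchical` are hypothetical names, and overlap is left out to keep the idea visible.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (y0, y1, x0, x1), end-exclusive


def iter_boxes(y0: int, y1: int, x0: int, x1: int,
               size: Tuple[int, int]) -> Iterator[Box]:
    """Lazily yield boxes of at most size=(h, w) that cover [y0:y1, x0:x1]."""
    h, w = size
    for ys in range(y0, y1, h):
        for xs in range(x0, x1, w):
            yield ys, min(ys + h, y1), xs, min(xs + w, x1)


def iter_hierarchical(shape: Tuple[int, int],
                      chunk_size: Tuple[int, int],
                      tile_size: Tuple[int, int]) -> Iterator[Tuple[Box, Box]]:
    """Yield (chunk, tile) pairs: chunks first, then the tiles inside each chunk."""
    height, width = shape
    for chunk in iter_boxes(0, height, 0, width, chunk_size):
        for tile in iter_boxes(*chunk, tile_size):
            yield chunk, tile


pairs = list(iter_hierarchical((5000, 3000), (2048, 2048), (512, 512)))
n_chunks = len({chunk for chunk, _ in pairs})
covered = sum((y1 - y0) * (x1 - x0) for _, (y0, y1, x0, x1) in pairs)
print(n_chunks, covered)
```

Since both levels are generators, only the current chunk needs to be resident in memory while its tiles are processed.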

Core Components:

  • GridSpec: Defines tiling strategy with intelligent overlap
  • TileFlow: Main processor with configure/run interface
  • Reconstruction: Seamless merge with artifact elimination
  • Backends: Support for numpy, zarr, and custom data sources
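
One common way to reconstruct without seams is to read each tile with an overlap halo, process it, and keep only the tile's core region. The sketch below shows that approach in plain NumPy; it is an assumption about the general technique, not a description of TileFlow's Reconstruction component. `process_tiled` and `box_blur3` are hypothetical names, and the processing function is assumed to return an array of the same shape as its input.

```python
import numpy as np


def process_tiled(image, func, tile_size=(256, 256), overlap=(16, 16)):
    """Apply func tile by tile; read each tile with a halo, keep only its core."""
    height, width = image.shape
    th, tw = tile_size
    oh, ow = overlap
    out = np.empty(image.shape, dtype=np.float64)
    for y0 in range(0, height, th):
        for x0 in range(0, width, tw):
            y1, x1 = min(y0 + th, height), min(x0 + tw, width)
            # Padded read window, clipped to the image bounds.
            py0, px0 = max(y0 - oh, 0), max(x0 - ow, 0)
            py1, px1 = min(y1 + oh, height), min(x1 + ow, width)
            processed = func(image[py0:py1, px0:px1])
            # Discard the halo and write back only the core region.
            out[y0:y1, x0:x1] = processed[y0 - py0:y1 - py0, x0 - px0:x1 - px0]
    return out


def box_blur3(tile):
    """3x3 mean filter with edge replication (needs a 1-pixel halo)."""
    padded = np.pad(tile, 1, mode="edge")
    h, w = tile.shape
    return sum(padded[dy:dy + h, dx:dx + w]
               for dy in range(3) for dx in range(3)) / 9.0


rng = np.random.default_rng(0)
image = rng.random((300, 500))
tiled = process_tiled(image, box_blur3, tile_size=(64, 64), overlap=(4, 4))
whole = box_blur3(image)
print(np.allclose(tiled, whole))
```

The tiled result matches the whole-image result here because the halo is at least as wide as the filter's reach. A function with a larger footprint would need a correspondingly larger overlap.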

🛠️ Development

# Clone and setup
git clone <repository>
cd TileFlow
uv sync

# Run tests
uv run pytest

# Code quality
uv run ruff check
uv run ruff format
uv run mypy src/tileflow

# Build package
uv build

🧬 Scientific Applications

Fluorescence Microscopy:

# Multi-fluorophore analysis
channels = ["DAPI", "FITC", "TRITC", "Cy5"]
for i, name in enumerate(channels):
    channel_data = image_chw[i]
    processed = specialized_processor[name].process(channel_data)
    analyze_fluorophore_distribution(processed, name)

Whole-Slide Pathology:

# Process gigapixel pathology slides
wsi_processor = TileFlow(
    tile_size=(512, 512), 
    overlap=(64, 64),
    chunk_size=(2048, 2048)  # Hierarchical processing
)
wsi_processor.configure(function=tissue_classifier)
classification_map = wsi_processor.run(whole_slide_image)

Satellite Imagery:

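# Multispectral satellite analysis
spectral_bands = ["red", "green", "blue", "nir", "swir1", "swir2"]
# NDVI combines two bands (near-infrared and red), so select both by name
red = satellite_image[spectral_bands.index("red")]
nir = satellite_image[spectral_bands.index("nir")]
vegetation_index = calculate_ndvi(nir, red)  # calculate_ndvi is a user-supplied helper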
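
`calculate_ndvi` is not defined in this README. A minimal sketch of such a helper, assuming it takes the near-infrared and red bands as separate arrays, follows the standard definition NDVI = (NIR - Red) / (NIR + Red). The `eps` parameter is an added assumption to avoid division by zero on empty pixels.

```python
import numpy as np


def calculate_ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    # eps keeps the denominator non-zero where both bands are 0.
    return (nir - red) / (nir + red + eps)


# Small reflectance arrays, made up for illustration.
nir_band = np.array([[0.8, 0.5], [0.0, 0.3]])
red_band = np.array([[0.2, 0.5], [0.0, 0.6]])
ndvi = calculate_ndvi(nir_band, red_band)
print(ndvi)
```

Since the function is element-wise, it can be passed to a tiled processor on a stacked (NIR, Red) array or applied directly to whole bands that fit in memory.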

🤝 Contributing

TileFlow is built for the scientific computing community. We welcome contributions for:

  • New Backends: TIFF, HDF5, cloud storage adapters
  • Processing Algorithms: Segmentation, enhancement, feature extraction
  • Performance: GPU acceleration, distributed processing
  • Documentation: Tutorials, use case examples

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Valentin Poque - Core development and architecture (July-September 2025)


Process any image, any size, any channel count.
TileFlow scales with your data.

