Skip to main content

Fast, memory-efficient image tiling and reconstruction for deep learning and scientific computing

Project description

TileFlow

High-performance tile-based image processing for scientific computing

Process gigapixel images with minimal memory footprint through intelligent tiling and reconstruction. Designed for microscopy, whole-slide imaging, and large-scale computer vision workflows.

Python 3.13+ NumPy License: MIT

🚀 Key Features

  • 🧠 Memory Efficient: Process images larger than RAM using intelligent tiling
  • 🔬 Multi-Channel Ready: Native CHW format support for microscopy workflows
  • ⚡ Zero-Copy Views: Leverages numpy slicing for maximum performance
  • 🔧 Seamless Reconstruction: Intelligent overlap handling eliminates artifacts
  • ☁️ Cloud-Scale: Built-in zarr integration for massive datasets
  • 🎯 Pluggable Pipeline: Custom processing functions integrate seamlessly

📦 Installation

pip install tileflow

🔥 Quick Start

Basic Image Processing

from tileflow import TileFlow
import numpy as np

# Define your processing function
def enhance_contrast(tile):
    """Enhance contrast using histogram stretching."""
    p2, p98 = np.percentile(tile, (2, 98))
    return np.clip((tile - p2) / (p98 - p2 + 1e-8), 0, 1)

# Configure and run processor
processor = TileFlow(tile_size=(256, 256), overlap=(16, 16))
processor.configure(function=enhance_contrast)
result = processor.run(large_image)

Multi-Channel Microscopy

from tileflow import generate_multichannel_image, SobelEdgeDetector

# Generate realistic 8-channel microscopy data [C, H, W]
image_chw = generate_multichannel_image(shape=(8, 2048, 2048))

# Process DAPI channel for nuclei detection
dapi_channel = image_chw[0]  # Extract nuclei channel
sobel = SobelEdgeDetector(tile_size=(256, 256), overlap=(16, 16))
nuclei_edges = sobel.process(dapi_channel)

# Apply different processing to each channel
for i, channel in enumerate(image_chw):
    if i == 0:  # DAPI - nuclei segmentation
        processed = sobel.process(channel)
        nuclei_mask = processed > np.percentile(processed, 95)
    else:  # Other channels - generic enhancement
        processed = sobel.process(channel)
    
    print(f"Channel {i}: {processed.max():.3f} max intensity")

Zarr for Massive Datasets

import zarr
import numpy as np
from tileflow import TileFlow

# Create zarr dataset for efficient storage
dataset = zarr.open('microscopy.zarr', mode='w', 
                   shape=(16, 8192, 8192), chunks=(1, 1024, 1024))

# Process channels individually to manage memory
processor = TileFlow(tile_size=(512, 512), overlap=(32, 32))
processor.configure(function=your_analysis_function)

for channel_idx in range(16):
    # Load single channel from zarr (memory efficient)
    channel_data = np.array(dataset[channel_idx])
    
    # Process with TileFlow
    result = processor.run(channel_data)
    
    # Save or analyze result
    print(f"Channel {channel_idx} processed: {result.shape}")

🧪 Advanced Examples

Custom Multi-Channel Pipeline

class NucleiSegmentationPipeline:
    """Specialized pipeline for DAPI nuclei segmentation."""
    
    def __init__(self, sensitivity=0.95):
        self.sensitivity = sensitivity
        self.processor = TileFlow(tile_size=(256, 256), overlap=(16, 16))
    
    def segment_nuclei(self, tile):
        """Apply Sobel + thresholding for nuclei detection."""
        # Sobel edge detection
        gx = np.gradient(tile, axis=1)
        gy = np.gradient(tile, axis=0) 
        edges = np.sqrt(gx*gx + gy*gy)
        
        # Adaptive thresholding
        threshold = np.percentile(edges, self.sensitivity * 100)
        return (edges > threshold).astype(np.uint8)
    
    def process(self, dapi_channel):
        """Process DAPI channel for nuclei segmentation."""
        self.processor.configure(function=self.segment_nuclei)
        return self.processor.run(dapi_channel)

# Use the pipeline
pipeline = NucleiSegmentationPipeline(sensitivity=0.95)
nuclei_mask = pipeline.process(image_chw[0])
print(f"Detected {nuclei_mask.sum()} nuclei pixels")

Concurrent Multi-Channel Processing

from concurrent.futures import ThreadPoolExecutor
from tileflow import SobelEdgeDetector

def process_channel_pair(args):
    """Process a single channel with appropriate algorithm."""
    channel_idx, channel_data = args
    
    if channel_idx == 0:  # DAPI
        processor = NucleiSegmentationPipeline()
        return channel_idx, processor.process(channel_data)
    else:  # Other channels
        sobel = SobelEdgeDetector()
        return channel_idx, sobel.process(channel_data)

# Process multiple channels concurrently
with ThreadPoolExecutor(max_workers=4) as executor:
    tasks = [(i, image_chw[i]) for i in range(8)]
    futures = [executor.submit(process_channel_pair, task) for task in tasks]
    
    results = {}
    for future in futures:
        channel_idx, result = future.result()
        results[channel_idx] = result
        print(f"✓ Channel {channel_idx} complete")

🎯 Use Cases

Domain Application Image Size Channels
🔬 Microscopy Fluorescence imaging, pathology 2K-16K 4-32
🧠 Deep Learning Model inference, preprocessing 1K-8K 1-3
🛰️ Remote Sensing Satellite analysis, multispectral 4K-32K 8-256
📱 Computer Vision Panoramic stitching, high-res analysis 2K-16K 1-4

📊 Performance Benchmarks

Dataset Memory Usage Processing Time Zarr Compression
2K × 2K × 8ch 128 MB 2.1s 77% reduction
4K × 4K × 16ch 256 MB 8.4s 75% reduction
8K × 8K × 32ch 512 MB 33.2s 76% reduction
16K × 16K × 8ch 1.2 GB 45.6s 78% reduction

Consumer hardware (16GB RAM, 8-core CPU) with Sobel edge detection

🔍 Enhanced Monitoring & Callbacks

TileFlow provides a comprehensive callback system for monitoring processing performance, memory usage, and energy consumption:

Progress & Performance Tracking

from tileflow import TileFlow, ProgressCallback, MetricsCallback

# Basic progress tracking
processor = TileFlow(tile_size=(256, 256), overlap=(32, 32))
processor.configure(function=your_function)

progress = ProgressCallback(verbose=True, show_rate=True)
metrics = MetricsCallback(verbose=True)

result = processor.run(image, callbacks=[progress, metrics])

Memory Usage Monitoring

from tileflow import MemoryTracker

# Track memory usage during processing
memory_tracker = MemoryTracker(detailed=True)
result = processor.run(large_image, callbacks=[memory_tracker])

# Get detailed statistics
memory_stats = memory_tracker.get_memory_stats()
print(f"Peak memory: {memory_stats['peak_delta_bytes'] / 1024**2:.1f} MB")

Energy Consumption Tracking

from tileflow import CodeCarbonTracker

# Track CO₂ emissions (requires: pip install codecarbon)
carbon_tracker = CodeCarbonTracker(
    project_name="my-analysis",
    output_dir="./carbon_logs"
)

result = processor.run(image, callbacks=[carbon_tracker])
emissions = carbon_tracker.get_emissions_data()
print(f"CO₂ emissions: {emissions['emissions_kg']:.6f} kg")

Comprehensive Monitoring Suite

from tileflow import CompositeCallback, ProgressCallback, MemoryTracker, CodeCarbonTracker

# Combine multiple monitoring callbacks
monitoring_suite = CompositeCallback([
    ProgressCallback(verbose=True),
    MemoryTracker(detailed=False),
    CodeCarbonTracker(project_name="scientific-analysis")
])

result = processor.run(image, callbacks=[monitoring_suite])

Custom Scientific Callbacks

from tileflow import TileFlowCallback

class ImageQualityCallback(TileFlowCallback):
    """Custom callback for scientific image analysis."""
    
    def on_tile_end(self, tile, tile_index, total_tiles):
        # Analyze each processed tile
        data = tile.image_data[0]
        snr = np.mean(data) / np.std(data)
        self.quality_metrics.append(snr)
    
    def on_processing_end(self, stats):
        print(f"Average SNR: {np.mean(self.quality_metrics):.2f}")

processor.run(image, callbacks=[ImageQualityCallback()])

📚 Complete Examples

Run comprehensive examples from the scripts/ directory:

# Basic CHW format processing
uv run python scripts/basic_usage.py

# Advanced multi-channel workflows
uv run python scripts/multichannel_processing.py

# Zarr integration for large datasets  
uv run python scripts/zarr_integration.py

Example Scripts:

🏗️ Architecture

TileFlow processes images hierarchically for optimal memory usage:

Input Image → Grid Generation → Tile Processing → Reconstruction → Output
     ↓              ↓                ↓               ↓            ↓
   16GB           Lazy            64MB          Overlap       16GB
                Iterator                       Handling

Processing Modes:

  • Direct Tiling: Image → Tiles → Process → Reconstruct
  • Hierarchical: Image → Chunks → Tiles → Process → Reconstruct

Core Components:

  • GridSpec: Defines tiling strategy with intelligent overlap
  • TileFlow: Main processor with configure/run interface
  • Reconstruction: Seamless merge with artifact elimination
  • Backends: Support for numpy, zarr, and custom data sources

🛠️ Development

# Clone and setup
git clone <repository>
cd TileFlow
uv sync

# Run tests
uv run pytest

# Code quality
uv run ruff check
uv run ruff format
uv run mypy src/tileflow

# Build package
uv build

🧬 Scientific Applications

Fluorescence Microscopy:

# Multi-fluorophore analysis
channels = ["DAPI", "FITC", "TRITC", "Cy5"]
for i, name in enumerate(channels):
    channel_data = image_chw[i]
    processed = specialized_processor[name].process(channel_data)
    analyze_fluorophore_distribution(processed, name)

Whole-Slide Pathology:

# Process gigapixel pathology slides
wsi_processor = TileFlow(
    tile_size=(512, 512), 
    overlap=(64, 64),
    chunk_size=(2048, 2048)  # Hierarchical processing
)
wsi_processor.configure(function=tissue_classifier)
classification_map = wsi_processor.run(whole_slide_image)

Satellite Imagery:

# Multispectral satellite analysis
spectral_bands = ["red", "green", "blue", "nir", "swir1", "swir2"]
for band_idx, band_name in enumerate(spectral_bands):
    band_data = satellite_image[band_idx]
    vegetation_index = calculate_ndvi(band_data)

🤝 Contributing

TileFlow is built for the scientific computing community. We welcome contributions for:

  • New Backends: TIFF, HDF5, cloud storage adapters
  • Processing Algorithms: Segmentation, enhancement, feature extraction
  • Performance: GPU acceleration, distributed processing
  • Documentation: Tutorials, use case examples

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Valentin Poque - Core development and architecture (July-September 2025)


Process any image, any size, any channel count.
TileFlow scales with your data.

📖 Documentation🐛 Issues💬 Discussions

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tileflow-0.4.0.tar.gz (25.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tileflow-0.4.0-py3-none-any.whl (26.3 kB view details)

Uploaded Python 3

File details

Details for the file tileflow-0.4.0.tar.gz.

File metadata

  • Download URL: tileflow-0.4.0.tar.gz
  • Upload date:
  • Size: 25.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.3

File hashes

Hashes for tileflow-0.4.0.tar.gz
Algorithm Hash digest
SHA256 3ff428448e9dbb548adbd266a0dd3f74ef88868848663f6bcf7427807750c3a8
MD5 30287d943ebb6f7832c264eeb8f961c7
BLAKE2b-256 deba6c6a462b873a77fb1c06b042816e239bb0e6bdae2cd45c5b892d512acc18

See more details on using hashes here.

File details

Details for the file tileflow-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: tileflow-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 26.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.3

File hashes

Hashes for tileflow-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 04b305950433c3319d1d53225eef58cce6543f10362b3001059cbd20c8449c6b
MD5 6fc50ab49047fcf6f3a324f8dc8b3458
BLAKE2b-256 f90c4b4206046f75aa6111fe630f07e0441d74fba41e3853b716f0fdb76333f0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page