Batch image directory processing pipelines with reusable steps
flowimds
Flowimds delivers reusable image-processing pipelines for entire directories—compose steps and let the tool handle the batch work for you.
✨ Highlights
- ♻️ Batch processing at scale — Traverse entire directories with optional recursive scanning.
- 🗂️ Structure-aware outputs — Mirror the input folder layout when preserving directory structures.
- 🧩 Rich step library — Combine resizing, grayscale conversion, rotations, flips, binarisation, denoising, and custom steps.
- 🔄 Flexible execution modes — Operate on folders, explicit file lists, or in-memory NumPy arrays.
- 🧪 Deterministic fixtures — Recreate test data whenever needed for reproducible pipelines.
- 🤖 Expanding step roadmap — More transformations, including AI-assisted steps, are planned.
- 📁 Flattened outputs available — Optionally disable structure preservation to write everything into a single directory.
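To make the recursive-scan and structure-preservation options concrete, here is a minimal standard-library sketch of the idea. It is not flowimds's implementation: the helper names `find_images` and `output_path_for` and the set of extensions are assumptions made purely for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; the real library may recognise others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def find_images(root: Path, recursive: bool = False) -> list[Path]:
    """Return image files under root, optionally descending into subfolders."""
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def output_path_for(
    image: Path, input_root: Path, output_root: Path, preserve_structure: bool
) -> Path:
    """Map an input image to its destination path."""
    if preserve_structure:
        # Mirror the input tree: keep the path relative to the input root.
        return output_root / image.relative_to(input_root)
    # Flattened layout: everything lands directly in the output directory.
    return output_root / image.name
```

With a flattened layout, two files that share a name in different subfolders would map to the same destination in this sketch; how flowimds itself resolves such collisions is not described here.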
🚀 Quick start
All primary classes are re-exported from the package root, so pipelines can be described through a concise namespace:
# Import the flowimds package
import flowimds as fi
# Define the pipeline
# Args:
# steps: sequence of pipeline steps
# worker_count: number of parallel workers (default: ~70% of CPU cores)
# log: whether to show progress bar (default: False)
pipeline = fi.Pipeline(
    steps=[
        fi.ResizeStep((128, 128)),
        fi.GrayscaleStep(),
    ],
)
# Run the pipeline
# Args:
# input_path: directory to scan for images
# recursive: whether to traverse subdirectories (default: False)
result = pipeline.run(input_path="samples/input", recursive=True)
# Save the results
# Args:
# output_path: destination directory
# preserve_structure: whether to mirror the input tree (default: False)
result.save("samples/output", preserve_structure=True)
# Inspect the result
# Fields:
# processed_count: number of successfully processed images
# failed_count: number of images that failed to process
# failed_files: paths of the images that failed
print(f"Processed {result.processed_count} images")
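The quick start only prints the processed count. A minimal sketch of a fuller report, using just the three result fields listed above, might look like the following. `summarize_result` is a hypothetical helper, and the `SimpleNamespace` object stands in for a real pipeline result so the snippet runs without flowimds installed.

```python
from types import SimpleNamespace


def summarize_result(result) -> str:
    """Build a short report from processed_count, failed_count and failed_files."""
    lines = [f"Processed {result.processed_count} images"]
    if result.failed_count:
        lines.append(f"Failed: {result.failed_count}")
        lines.extend(f"  {path}" for path in result.failed_files)
    return "\n".join(lines)


# Stand-in with the same field names as the pipeline result.
# Assumption: failed_files is an iterable of paths or strings.
fake_result = SimpleNamespace(
    processed_count=3,
    failed_count=1,
    failed_files=["samples/input/broken.png"],
)
print(summarize_result(fake_result))
```

Passing the object returned by `pipeline.run(...)` instead of `fake_result` should work as long as it exposes those three fields.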
📦 Installation
- Python 3.12+
- uv or pip for dependency management (uv is recommended)
uv
uv add flowimds
pip
pip install flowimds
From source
git clone https://github.com/mori-318/flowimds.git
cd flowimds
uv sync
📚 Documentation
- Usage guide — configuration tips and extended examples.
- 使用ガイド — 日本語版。
🔬 Benchmarks
Compare the legacy (v0.2.1 and earlier) and current (v1.0.2 and later) pipeline implementations with the bundled helper script. Running it via uv keeps dependencies and the virtual environment consistent:
# count: number of synthetic images to generate (default `5000`)
# workers: maximum worker threads (`0` auto-detects CPU cores)
uv run python scripts/benchmark_pipeline.py --count 5000 --workers 8
- `--count`: number of synthetic images to generate (default `5000`).
- `--workers`: maximum worker threads (`0` auto-detects CPU cores).
- `--seed`: seed for reproducible comparisons (default `42`).
The script prints processing times for each pipeline variant and cleans up temporary outputs afterward.
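As a rough illustration of the ideas behind these options, the sketch below shows how a worker count of `0` can fall back to the detected CPU count and how a fixed seed makes synthetic inputs repeatable. It does not reproduce `scripts/benchmark_pipeline.py`: `resolve_workers`, `synthetic_pixels`, and `time_call` are hypothetical names, and plain lists of integers stand in for images.

```python
import os
import random
import time


def resolve_workers(requested: int) -> int:
    """Treat 0 as 'auto-detect'; otherwise use the requested count."""
    if requested < 0:
        raise ValueError("worker count must be non-negative")
    if requested == 0:
        # os.cpu_count() can return None, so fall back to a single worker.
        return os.cpu_count() or 1
    return requested


def synthetic_pixels(count: int, seed: int = 42, size: int = 16) -> list[list[int]]:
    """Generate `count` fake images (flat lists of 0-255 values) from a seed."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(size)] for _ in range(count)]


def time_call(func, *args) -> float:
    """Return the wall-clock seconds taken by one call to func."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


images = synthetic_pixels(count=100)
elapsed = time_call(lambda batch: [sorted(img) for img in batch], images)
print(f"{resolve_workers(0)} workers available, toy workload took {elapsed:.6f}s")
```

Because the generator is seeded, two runs with the same `seed` feed identical inputs to each variant, so timing differences reflect the code under test rather than the data.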
🆘 Support
Questions and bug reports are welcome via the GitHub issue tracker.
🤝 Contributing
We follow a GitFlow-based workflow to keep the library stable while enabling parallel development:
- main — release-ready code (tagged as vX.Y.Z).
- develop — staging area for the next release.
- feature/ — focused branches for scoped work.
- release/ — branches dedicated to preparing releases.
- hotfix/ — branches for urgent fixes.
- docs/ — branches for documentation updates.
For contribution flow details, see docs/CONTRIBUTING.md or the Japanese guide docs/CONTRIBUTING_ja.md.
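For a quick local check that a branch name fits the scheme above, a small helper along these lines could be used. This is only a sketch: `branch_kind` is a hypothetical function, not part of the repository's tooling, and the example branch names are made up.

```python
from typing import Optional

# Names and prefixes taken from the branch list above.
LONG_LIVED = ("main", "develop")
BRANCH_PREFIXES = ("feature/", "release/", "hotfix/", "docs/")


def branch_kind(name: str) -> Optional[str]:
    """Return the branch category, or None if the name does not fit the scheme."""
    if name in LONG_LIVED:
        return name
    for prefix in BRANCH_PREFIXES:
        # Require something after the slash, e.g. "feature/add-blur-step".
        if name.startswith(prefix) and len(name) > len(prefix):
            return prefix.rstrip("/")
    return None


print(branch_kind("feature/add-blur-step"))
print(branch_kind("experiment"))
```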
🛠️ Development
# Install dependencies
uv sync --all-extras --dev
# Lint and format (apply fixes when needed)
uv run black .
uv run ruff format .
# Lint and format (verify)
uv run black --check .
uv run ruff check .
uv run ruff format --check .
# Regenerate deterministic fixtures when needed
uv run python scripts/generate_test_data.py
# Run tests
uv run pytest
Docker-powered environment
You can standardize the development environment inside containers built from docker/Dockerfile. Dependencies are installed with uv sync --all-extras --dev during build, so any uv command (e.g., uv run pytest) is reproducible.
Two typical workflows exist:
- Run the suite once in a disposable container (container exits when tests finish):
docker compose -f docker/docker-compose.yml up --build
- Open an interactive shell for iterative work (recommended while developing):
# Build the image (no-op if cached)
docker compose -f docker/docker-compose.yml build
# Start an interactive container with a clean shell
docker compose -f docker/docker-compose.yml run --rm app bash
# Inside the container (already at /app)
uv sync --all-extras --dev  # install deps into the mounted .venv
uv run pytest
uv run black --check .
docker compose exec app ... works only while a container started with up is still running. Because the default command runs uv run pytest and exits immediately, use run --rm app bash whenever you need an interactive session.
Dev Container
A VS Code Dev Container configuration is provided under .devcontainer/. If you use the Dev Containers extension, you can open this repository in a container and work inside a reproducible Docker-based development environment.
Using with VS Code
- Install and start Docker.
- Install the "Dev Containers" extension in VS Code (if you do not have it yet).
- Open this repository in VS Code and run "Dev Containers: Reopen in Container" from the command palette.
- Inside the container, install dependencies and run the usual development commands:
uv sync --all-extras --dev
uv run pytest
uv run black --check .
📄 License
This project is released under the MIT License.
📌 Project status
Stable releases are already published on PyPI (v1.0.2), and we continue to iterate toward upcoming updates. Watch the repository for new tags and changelog announcements.