Modosaic: A Multimodal Mosaic for In-Context Learning
Modosaic
Modosaic is a multimodal image-dataset pipeline for generating, validating, and saving complementary modalities from a shared image source. It gives you:
- A unified dataset layer for local image folders and parquet datasets.
- A configurable generation pipeline for source images, captions, segmentation masks, depth maps, and surface-normal fields.
- Validators and quality-gate constraints that decide which generated modalities are saved.
- A CLI for default runs, fully configurable runs, and config-file driven runs.
- A clean Python API for composing custom modalities, validators, and postprocessors.
- Reproducible experiment folders with generated artifacts, validation JSON, and structured logs.
Installation
uv sync
or, from an activated environment:
pip install -e .
or, from PyPI:
pip install modosaic
Python 3.13 is required. CUDA is optional but strongly recommended for the heavier text, segmentation, depth, and normals models.
Nix dev shell
The repository includes a Nix flake for a CUDA-ready development shell. Before entering it, configure the Nix daemon to trust the binary caches used by the flake; this avoids long local builds for CUDA and community packages.
Add the following to /etc/nix/nix.conf, replacing olal_gft_com with your own username:
experimental-features = nix-command flakes
trusted-users = root olal_gft_com
extra-substituters = https://cache.nixos.org https://nix-community.cachix.org https://cache.nixos-cuda.org
extra-trusted-public-keys = nix-community.cachix.org-1:mB9ZQ+4kTq9qUqM96H8P6oz+ZWHR+Hh3wlgYx9oSt1A= cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M=
Restart the Nix daemon after changing the file, then enter the shell:
nix develop
Hugging Face model access
If you use the SAM 3 segmentation model (sam3, backed by facebook/sam3),
run Modosaic with HF_TOKEN set to a Hugging Face token from an account that
has access to Meta's SAM 3 model:
export HF_TOKEN=hf_...
modosaic run --dataset local --root ./images --segmentation-model sam3
Do not commit tokens to the repository.
Quick Start
1. List supported modalities and models
modosaic models
When running directly from a checkout without installing the console script:
python -m modosaic.cli.cli models
2. Run the default pipeline on a local image folder
modosaic simple ./images --limit 10
This runs all default modalities in dependency-safe order:
image -> text -> segmentation -> depth -> normals
Artifacts are written under ./experiments/<timestamp>/.
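The dependency-safe ordering above can be illustrated with a topological sort. This is a minimal sketch, not Modosaic's actual scheduler: the DEPENDENCIES map and the dependency_safe_order helper are assumptions made for illustration, and the real dependency graph may differ.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each modality lists the modalities it needs first.
# Modosaic's real graph may differ; this only illustrates the ordering idea.
DEPENDENCIES = {
    "image": set(),
    "text": {"image"},
    "segmentation": {"image"},
    "depth": {"image", "segmentation"},
    "normals": {"depth"},
}


def dependency_safe_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return modality names so that every modality follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = dependency_safe_order(DEPENDENCIES)
print(" -> ".join(order))
```

Any order produced this way places a modality after everything it depends on, so a validator that compares depth against segmentation boundaries always finds the segmentation output available.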
3. Run a selected local-folder experiment
modosaic run \
  --dataset local \
  --root ./images \
  --modality image \
  --modality segmentation \
  --modality depth \
  --segmentation-model sam3 \
  --depth-model depth-anything-v2-small \
  --segmentation-mask-quality-min 0.70 \
  --depth-segmentation-boundary-min 0.15 \
  --limit 20 \
  --experiment-root ./experiments \
  --experiment-name local-seg-depth
Validators can be disabled with --no-validators. To run validators without
using them as save/discard gates, pass --no-constraints.
4. Run from a parquet dataset
modosaic run \
  --dataset parquet \
  --parquet-path ./data/imagenet-a \
  --image-column image.bytes \
  --metadata-column label \
  --modality image \
  --modality text \
  --text-model qwen-2-2b \
  --text-siglip-min 0.65 \
  --limit 20
Parquet image columns can contain bytes, bytearray/memoryview values,
list[int], nested fields such as image.bytes, or paths relative to the
parquet file.
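To make the accepted cell shapes concrete, here is a minimal sketch of how such values could be normalized to raw bytes. The helpers coerce_image_cell and get_nested are hypothetical names introduced for illustration; they are not part of Modosaic's API, and the library's own handling may differ.

```python
from pathlib import Path
from typing import Any


def get_nested(row: dict, dotted: str) -> Any:
    """Resolve a dotted column name such as 'image.bytes' against a row dict."""
    current: Any = row
    for key in dotted.split("."):
        current = current[key]
    return current


def coerce_image_cell(value: Any, parquet_dir: Path) -> bytes:
    """Normalize one parquet cell to raw image bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, (bytearray, memoryview)):
        return bytes(value)
    if isinstance(value, list) and all(isinstance(v, int) for v in value):
        return bytes(value)  # list[int] holding byte values 0..255
    if isinstance(value, str):
        # Assumption: strings are paths relative to the parquet file's directory.
        return (parquet_dir / value).read_bytes()
    raise TypeError(f"Unsupported image cell type: {type(value).__name__}")


row = {"image": {"bytes": [137, 80, 78, 71]}, "label": "cat"}
raw = coerce_image_cell(get_nested(row, "image.bytes"), Path("."))
print(raw)  # b'\x89PNG'
```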
Full Pipeline From Config
Pipeline config files can be JSON, TOML, or YAML (.yaml or .yml). YAML support requires PyYAML.
The config maps directly into modosaic.cli.config.RunConfig.
Example examples/config.yaml:
dataset:
  type: local
  root: ./images
  recursive: true
  extensions: [.jpg, .png]
modalities:
  enabled: [image, segmentation, depth]
models:
  segmentation: sam3
  depth: depth-anything-v2-small
validators:
  enabled: true
constraints:
  enabled: true
  segmentation_mask_quality_minimum: 0.70
  segmentation_boundary_minimum: 0.20
  depth_imagebind_minimum: 0.55
  depth_segmentation_boundary_minimum: 0.15
  segmentation_boundary_thickness: 1
  segmentation_tolerance_radius: 2
  segmentation_rgb_edge_quantile: 0.90
  depth_boundary_thickness: 1
  depth_tolerance_radius: 2
  depth_edge_quantile: 0.90
run:
  limit: 20
  experiment_root: ./experiments
  experiment_name: local-seg-depth
  log_path: ./.logs
  seed: 42
Run it with:
modosaic pipeline examples/config.yaml
Override selected config values at execution time:
modosaic pipeline examples/config.yaml --limit 5 --seed 123 --json
Python API
from pathlib import Path

from modosaic import ExperimentService, ImageDataset, LoggingService, Pipeline
from modosaic.depth.preconfigured_modality import build_preconfigured_depth_modality
from modosaic.image import build_preconfigured_image_modality
from modosaic.segmentation.preconfigured_modality import (
    build_preconfigured_segmentation_modality,
)
from modosaic.services.seeding import SeedingService

LoggingService.setup_logging()
SeedingService.set_global_seed(42)

dataset = ImageDataset.from_local_folder(Path("images"))
pipeline = Pipeline(
    dataset=dataset,
    modalities=[
        build_preconfigured_image_modality(),
        build_preconfigured_segmentation_modality(),
        build_preconfigured_depth_modality(),
    ],
    experiment=ExperimentService(
        root="experiments",
        experiment_name="local-seg-depth",
    ),
)

results = pipeline.run(limit=10)
for result in results:
    print(result.record.sample_id, result.artifact_paths)
Each ConfiguredModality owns a generator, validators, and a postprocessor.
Plain validators receive (record, generated). A ValidatorStep can also pass
generated outputs from earlier modalities as keyword dependencies.
from modosaic.core.validation_constraint import ValidationConstraint
from modosaic.core.validator_step import ValidatorStep
from modosaic.segmentation.validators.impl.mask_statistics import MaskStatsValidator

mask_quality_gate = ValidatorStep(
    validator=MaskStatsValidator(),
    constraint=ValidationConstraint(
        minimum=0.75,
        score_name="weighted_mask_quality",
        score_fn=lambda stats: (
            0.4 * stats.coverage_score
            + 0.3 * stats.distinctness_score
            + 0.3 * stats.fragmentation_score
        ),
    ),
)
A constrained modality is saved only when every configured constraint passes. Rejected modalities do not write generated artifacts or validation JSON, and later validators cannot use them as dependencies.
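The all-constraints-must-pass rule can be shown with a small standalone sketch. Gate, MaskStats, and should_save are hypothetical stand-ins written for this illustration and do not mirror Modosaic's classes; in particular, treating a score equal to the minimum as a pass is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class MaskStats:
    """Illustrative stats object with the three scores used in the example above."""

    coverage_score: float
    distinctness_score: float
    fragmentation_score: float


@dataclass(frozen=True)
class Gate:
    """Hypothetical stand-in for a constraint: a score function plus a minimum."""

    score_name: str
    minimum: float
    score_fn: Callable[[Any], float]

    def passes(self, stats: Any) -> bool:
        # Assumption: a score exactly at the minimum counts as a pass.
        return self.score_fn(stats) >= self.minimum


def should_save(gates: Iterable[Gate], stats: Any) -> bool:
    """A modality is kept only if every gate passes."""
    return all(gate.passes(stats) for gate in gates)


weighted = Gate(
    score_name="weighted_mask_quality",
    minimum=0.75,
    score_fn=lambda s: 0.4 * s.coverage_score
    + 0.3 * s.distinctness_score
    + 0.3 * s.fragmentation_score,
)
coverage = Gate("coverage", 0.5, lambda s: s.coverage_score)

good = MaskStats(0.9, 0.8, 0.7)  # weighted score is about 0.81
weak = MaskStats(0.5, 0.5, 0.5)  # weighted score is 0.5
print(should_save([weighted, coverage], good))  # True
print(should_save([weighted, coverage], weak))  # False
```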
CLI Reference
modosaic simple ROOT [--limit N]
Run the default Modosaic pipeline on a local image folder.
modosaic run [OPTIONS]
Run with CLI-provided dataset, modality, model, validator, constraint, and experiment settings.
modosaic pipeline CONFIG_PATH [--limit N] [--seed SEED] [--json]
Run from a JSON, TOML, YAML, or YML config file.
modosaic models
Print valid modality and model names for CLI options and config files.
Concepts & Extensibility
- Dataset adapters -> modosaic.providers: local folders and parquet data.
- Generators -> modosaic.<modality>.generators: model-backed modality generation.
- Validators -> modosaic.<modality>.validators: quality checks and cross-modality consistency checks.
- Constraints -> modosaic.core.validation_constraint: pass/fail gates over validator output.
- Postprocessors -> modosaic.<modality>.postprocessor: conversion from model output to saved experiment artifacts.
- Services -> logging, seeding, image conversion, artifact persistence, boundaries, edges, and tolerances.
Add custom components by subclassing:
DatasetAdapter -> providers.adapters.adapter.DatasetAdapter
ModalityGenerator -> core.modality_generator.ModalityGenerator
ModalityValidator -> core.validator.ModalityValidator
ModalityPostprocessor -> core.postprocessor.ModalityPostprocessor
Modality -> core.modality.Modality
or by composing existing pieces with ConfiguredModality.
Experiment Outputs
ExperimentService writes every accepted artifact beneath the configured run
folder. Typical output includes:
experiments/<run>/
  image/
  text/
  segmentation/
  depth/
  normals/
  validations/
Validation files include the validator name, raw value, optional score, threshold, and pass/fail result.
Documentation
MkDocs pages live in docs/, and API pages are generated from Google-style
Python docstrings through mkdocstrings.
Public classes, functions, and methods should carry useful Google-style docstrings because they form the API reference. Module docstrings are optional for simple implementation modules; add them when a file exposes important package-level behavior, re-exports public symbols, or needs context that is not clear from the documented objects inside it.
uv run --group docs mkdocs serve
uv run --group docs mkdocs build --strict
Examples
See examples/main.py for a complete local demo that loads a parquet dataset, runs the
preconfigured modality stack, and prints validation summaries.
License
This project is licensed under the MIT License. See the LICENSE file for details.