ModularML

Modular, fast, and reproducible ML experimentation built for R&D.


Why ModularML?

ModularML is a flexible, backend-agnostic machine learning framework for designing, training, and evaluating modular ML pipelines, tailored specifically for research and scientific workflows. It enables rapid experimentation with complex model architectures, supports domain-specific feature engineering, and provides full reproducibility through configuration-driven declaration.

ModularML provides a plug-and-play ecosystem of interoperable components for data preprocessing, sampling, modeling, training, and evaluation — all wrapped in a unified experiment container.

Figure 1. Overview of the ModularML framework, highlighting the three core abstractions: feature set preprocessing and splitting, modular model graph construction, and staged training orchestration.

Features

ModularML includes a comprehensive set of components for scientific ML workflows:

Data Handling

  • FeatureSet abstraction for organizing structured datasets with features, targets, tags, and metadata.
  • Data class with unified support for multiple backends (torch.Tensor, tf.Tensor, np.ndarray).
  • Built-in splitters: sample-based and rule-based splitting, with condition-based filtering by feature, target, or tag values.
  • Sample grouping and multi-part splits for paired, triplet, or grouped training tasks.
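
To make rule-based splitting concrete, here is a minimal sketch in plain Python. It does not use the ModularML API: the `split_by_rules` helper, the dict-based sample layout, and the tag names (`group`, `temp_c`) are all hypothetical stand-ins chosen for illustration. Each sample goes to the first split whose condition matches its tags.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: plain dicts stand in for the rows of a FeatureSet.
Sample = Dict[str, dict]
Rule = Callable[[Sample], bool]


def split_by_rules(samples: List[Sample], rules: Dict[str, Rule]) -> Dict[str, List[Sample]]:
    """Assign each sample to the first split whose rule matches.

    Rules are checked in dict insertion order; samples that match no rule
    are collected under "unassigned".
    """
    splits: Dict[str, List[Sample]] = {name: [] for name in rules}
    splits["unassigned"] = []
    for sample in samples:
        for name, rule in rules.items():
            if rule(sample):
                splits[name].append(sample)
                break
        else:  # no rule matched this sample
            splits["unassigned"].append(sample)
    return splits


# Toy data, for illustration only.
samples = [
    {"features": {"x": 0.1}, "targets": {"y": 1.0}, "tags": {"group": "A", "temp_c": 25}},
    {"features": {"x": 0.4}, "targets": {"y": 0.0}, "tags": {"group": "A", "temp_c": 45}},
    {"features": {"x": 0.7}, "targets": {"y": 1.0}, "tags": {"group": "B", "temp_c": 25}},
    {"features": {"x": 0.9}, "targets": {"y": 0.0}, "tags": {"group": "C", "temp_c": 10}},
]

rules = {
    "train": lambda s: s["tags"]["group"] in {"A", "B"} and s["tags"]["temp_c"] <= 25,
    "test": lambda s: s["tags"]["temp_c"] > 25,
}

splits = split_by_rules(samples, rules)
for name, members in splits.items():
    print(name, [m["tags"] for m in members])
```

Splitting on tags rather than on row indices keeps related samples (for example, all measurements from one group) on the same side of the split, which is the usual motivation for condition-based splitting.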

Advanced Sampling

  • Flexible FeatureSampler interface with support for advanced sampling during different stages of model training, including:
    • Triplet sampling (e.g., anchor/positive/negative)
    • Paired sampling
    • Class-balanced, cluster-based, or time-windowed sampling strategies
  • Condition-aware sampling using any tags or metadata fields.
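
As an illustration of the triplet strategy listed above, the following sketch draws (anchor, positive, negative) index triplets from a list of labels using only the standard library. It is not the FeatureSampler interface: the `sample_triplets` function and its parameters are assumptions made for this example.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def sample_triplets(
    labels: Sequence[Hashable], n_triplets: int, seed: int = 0
) -> List[Tuple[int, int, int]]:
    """Return index triplets where anchor and positive share a label
    and the negative has a different label."""
    rng = random.Random(seed)  # local RNG so results are repeatable

    # Group sample indices by label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    # An anchor label needs at least two members (anchor + positive).
    anchor_labels = [lab for lab, idxs in by_label.items() if len(idxs) >= 2]
    if not anchor_labels or len(by_label) < 2:
        raise ValueError("need two or more labels, one with at least two samples")

    triplets = []
    for _ in range(n_triplets):
        label = rng.choice(anchor_labels)
        anchor, positive = rng.sample(by_label[label], 2)  # two distinct indices
        neg_label = rng.choice([lab for lab in by_label if lab != label])
        negative = rng.choice(by_label[neg_label])
        triplets.append((anchor, positive, negative))
    return triplets


# Toy labels, for illustration only.
labels = ["a", "a", "b", "b", "c", "a"]
triplets = sample_triplets(labels, n_triplets=5, seed=42)
print(triplets)
```

The same grouping step works with any tag or metadata field in place of the class label, which is the idea behind condition-aware sampling.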

Model Architecture

  • ModelGraph: A Directed Acyclic Graph (DAG)-based model builder where:
    • Each node is a ModelStage (e.g., encoder, head, discriminator).
    • Each stage can use a different backend (PyTorch, TensorFlow, scikit-learn, LightGBM, etc.).
    • Mixed-backend models are supported with seamless input/output routing.
  • Stage-wise training: Custom TrainingStage configuration enables fine-tuning, freezing, and transfer learning across sub-models.
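
The DAG idea can be sketched with the standard library's `graphlib` module (Python 3.9+). This is a conceptual illustration, not the ModelGraph API: `run_graph`, the stage names, and the plain callables standing in for backend-specific models are all hypothetical. Stages run in topological order, and each stage receives the outputs of its upstream stages.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List


def run_graph(
    stages: Dict[str, Callable[..., Any]],
    upstream: Dict[str, List[str]],
    x: Any,
) -> Dict[str, Any]:
    """Run every stage once, in dependency order.

    `upstream` maps a stage name to the names whose outputs it consumes;
    the reserved name "input" refers to the raw input `x`.
    """
    outputs: Dict[str, Any] = {"input": x}
    # static_order() yields predecessors first and raises CycleError on a cycle.
    for name in TopologicalSorter(upstream).static_order():
        if name == "input":
            continue  # the raw input is already available
        args = [outputs[dep] for dep in upstream[name]]
        outputs[name] = stages[name](*args)
    return outputs


# Plain functions stand in for stages that could each use a different backend.
stages = {
    "encoder": lambda xs: [2 * v for v in xs],
    "regressor": lambda h: sum(h),
    "classifier": lambda h: max(h) > 4,
}
upstream = {
    "encoder": ["input"],
    "regressor": ["encoder"],
    "classifier": ["encoder"],
}

outputs = run_graph(stages, upstream, [1, 2, 3])
print(outputs["regressor"], outputs["classifier"])
```

Because routing depends only on each stage's inputs and outputs, a stage can be swapped for one backed by a different library without touching the rest of the graph, provided the data passed between stages is converted to a format the next stage accepts.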

Training & Experiments

  • The Experiment class encapsulates all training logic (via multiple TrainingStage objects) together with the ModelGraph and FeatureSet definitions. Its TrackingManager logs all configuration files and all training, validation, and evaluation metrics, which keeps experimentation rapid and reproducible.
  • Each TrainingStage defines training loop logic with early stopping, validation hooks, loss weighting, and optimizer configs.
  • Multi-objective loss support with configurable stage-level targets, sample-based loss functions, and weighted combinations.
  • Config-driven experiments: Every experiment is fully serializable and reproducible from a single configuration file.
  • Built-in experiment tracking via a TrackingManager, with optional integration into external managers like MLflow or other logging backends.
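
Two of the mechanisms named above, weighted loss combination and early stopping, can be sketched in a few lines of plain Python. Nothing here is the TrainingStage API: `combine_losses`, `EarlyStopping`, `run_stage`, and the loss names are assumptions for illustration, and the per-epoch validation losses are toy numbers rather than real results.

```python
from typing import Dict, List, Tuple


def combine_losses(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of named loss terms."""
    return sum(weights[name] * value for name, value in losses.items())


class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_stage(
    history: List[Dict[str, float]], weights: Dict[str, float], patience: int
) -> Tuple[int, float]:
    """Replay per-epoch validation losses; return (last epoch run, best combined loss)."""
    stopper = EarlyStopping(patience)
    for epoch, losses in enumerate(history):
        if stopper.step(combine_losses(losses, weights)):
            return epoch, stopper.best
    return len(history) - 1, stopper.best


# Toy validation losses per epoch, for illustration only.
history = [
    {"mse": 1.0, "triplet": 0.5},
    {"mse": 0.8, "triplet": 0.4},
    {"mse": 0.9, "triplet": 0.4},
    {"mse": 0.9, "triplet": 0.5},
    {"mse": 0.5, "triplet": 0.1},
]
weights = {"mse": 1.0, "triplet": 0.5}

stopped_at, best = run_stage(history, weights, patience=2)
print(stopped_at, best)
```

With a patience of 2, the replay stops at the second consecutive epoch that fails to improve on the best combined loss, so the final entry in `history` is never reached.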

Getting Started

Requires Python >= 3.9

Installation

Install from PyPI:

pip install modularml

To install the latest development version:

pip install git+https://github.com/REIL-UConn/modular-ml.git
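
After installing, one way to confirm which version is present is to query the installed distribution metadata from Python. This sketch uses only the standard library's `importlib.metadata`; the `installed_version` helper is hypothetical, and the distribution name `modularml` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("modularml"))
```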

Explore More

  • Examples – Explore complete examples of how to set up FeatureSets, samplers, model graphs, and training configurations.
  • Documentation – API reference, component explanations, YAML configuration guides, and tutorials.
  • Discussions – Join the community, ask questions, suggest features, or share use cases.

License

Apache 2.0
