A library for modular, fast, and reproducible ML experimentation built for R&D.

ModularML is a flexible, backend-agnostic machine learning framework for designing, training, and evaluating modular ML pipelines, tailored specifically for research and scientific workflows. It enables rapid experimentation with complex model architectures, supports domain-specific feature engineering, and provides full reproducibility through configuration-driven declaration.

ModularML provides a plug-and-play ecosystem of interoperable components for data preprocessing, sampling, modeling, training, and evaluation, all wrapped in a unified experiment container.

Figure 1. Overview of the ModularML framework, highlighting the three core abstractions: feature set preprocessing and splitting, modular model graph construction, and staged training orchestration.

Features

ModularML includes a comprehensive set of components for scientific ML workflows:

Data Handling

  • FeatureSet abstraction for organizing structured datasets with features, targets, tags, and metadata.
  • Data class with unified support for multiple backends (torch.Tensor, tf.Tensor, np.ndarray).
  • Built-in splitters: sample-based and rule-based splitting, with condition-based filtering by feature, target, or tag values.
  • Sample grouping and multi-part splits for paired, triplet, or grouped training tasks.
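As an illustration of rule-based splitting and tag filtering, here is a minimal standard-library sketch. It does not use the ModularML API: the `Sample` class, the `split_by_rule` helper, and the toy values are all hypothetical and exist only to show the idea of partitioning samples by a condition on their tags.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Sample:
    """Hypothetical sample record (not a ModularML class)."""
    features: Dict[str, float]
    targets: Dict[str, float]
    tags: Dict[str, Any] = field(default_factory=dict)


def split_by_rule(
    samples: List[Sample], rule: Callable[[Sample], bool]
) -> Tuple[List[Sample], List[Sample]]:
    """Partition samples into (matching, non-matching) according to `rule`."""
    matched: List[Sample] = []
    rest: List[Sample] = []
    for s in samples:
        (matched if rule(s) else rest).append(s)
    return matched, rest


# Toy data, made up for this example.
samples = [
    Sample({"voltage": 3.7}, {"soh": 0.98}, {"cell_id": "A", "temp_c": 25}),
    Sample({"voltage": 3.6}, {"soh": 0.91}, {"cell_id": "B", "temp_c": 45}),
    Sample({"voltage": 3.5}, {"soh": 0.85}, {"cell_id": "A", "temp_c": 45}),
]

# Rule-based split: hold out every sample from cell "B" as the test set.
test, train = split_by_rule(samples, lambda s: s.tags["cell_id"] == "B")

# Condition-based filtering on a tag value within the training set.
hot_train = [s for s in train if s.tags["temp_c"] >= 40]
```

Splitting on a tag such as a cell or subject identifier, rather than on individual rows, keeps all samples from one group on the same side of the split.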

Advanced Sampling

  • Flexible FeatureSampler interface with support for advanced sampling during different stages of model training, including:
    • Triplet sampling (e.g., anchor/positive/negative)
    • Paired samples
    • Class-balanced, cluster-based, or time-windowed sampling strategies.
  • Condition-aware sampling using any tags or metadata fields.
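The triplet strategy listed above can be sketched in a few lines of plain Python. This is not the FeatureSampler interface; `sample_triplets` is a hypothetical helper that assumes each sample has a single class label, picks a positive with the same label as the anchor, and picks a negative with a different label.

```python
import random
from collections import defaultdict


def sample_triplets(labels, n_triplets, seed=0):
    """Return (anchor, positive, negative) index triplets.

    Hypothetical helper: the positive shares the anchor's label,
    the negative has a different label.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    # Anchors can only come from labels with at least two members.
    anchor_labels = [lab for lab, idxs in by_label.items() if len(idxs) >= 2]
    if not anchor_labels or len(by_label) < 2:
        raise ValueError("need >= 2 labels and one label with >= 2 samples")

    triplets = []
    for _ in range(n_triplets):
        label = rng.choice(anchor_labels)
        anchor, positive = rng.sample(by_label[label], 2)
        neg_label = rng.choice([lab for lab in by_label if lab != label])
        negative = rng.choice(by_label[neg_label])
        triplets.append((anchor, positive, negative))
    return triplets


labels = ["a", "a", "b", "b", "c"]
triplets = sample_triplets(labels, n_triplets=4, seed=42)
```

Passing an explicit seed keeps the sampled triplets identical across runs, which matters when sampling is part of a reproducible experiment.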

Model Architecture

  • ModelGraph: A Directed Acyclic Graph (DAG)-based model builder where:
    • Each node is a ModelStage (e.g., encoder, head, discriminator).
    • Each stage can use a different backend (PyTorch, TensorFlow, scikit-learn, LightGBM, etc.).
    • Mixed-backend models are supported with seamless input/output routing.
  • Stage-wise training: Custom TrainingPhase configuration enables fine-tuning, freezing, and transfer learning across sub-models.
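To make the DAG idea concrete, the sketch below runs a small graph of stages in dependency order using `graphlib.TopologicalSorter` from the standard library (Python 3.9+). It is not the ModelGraph implementation: `run_graph` and the lambda "stages" are hypothetical stand-ins, and real stages would wrap backend models rather than plain functions.

```python
from graphlib import TopologicalSorter


def run_graph(stages, edges, inputs):
    """Run each stage once all of its upstream nodes have produced output.

    Hypothetical helper (not a ModularML function).
    stages: name -> callable taking a list of upstream outputs
    edges:  name -> list of upstream names (stage names or input keys)
    inputs: name -> raw input value
    """
    # static_order() yields every node after all of its predecessors.
    order = TopologicalSorter(edges).static_order()
    outputs = dict(inputs)
    for name in order:
        if name in outputs:  # raw inputs are already available
            continue
        upstream = [outputs[dep] for dep in edges[name]]
        outputs[name] = stages[name](upstream)
    return outputs


# One encoder feeding two downstream stages.
stages = {
    "encoder": lambda ups: [2.0 * v for v in ups[0]],
    "head": lambda ups: sum(ups[0]),
    "discriminator": lambda ups: max(ups[0]) > 5.0,
}
edges = {
    "encoder": ["x"],
    "head": ["encoder"],
    "discriminator": ["encoder"],
}
outputs = run_graph(stages, edges, {"x": [1.0, 2.0, 3.0]})
```

Because each stage only sees the outputs of its upstream nodes, a stage can be swapped for one built on a different backend as long as the values passed along the edges stay compatible.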

Training & Experiments

  • Experiment class encapsulates all training logic (via multiple TrainingPhase objects), the ModelGraph and FeatureSet definitions, and a TrackingManager that logs configuration files along with training, validation, and evaluation metrics for rapid, reproducible experimentation.
  • Each TrainingPhase defines training loop logic with early stopping, validation hooks, loss weighting, and optimizer configs.
  • Multi-objective loss support with configurable stage-level targets, sample-based loss functions, and weighted combinations.
  • Config-driven experiments: Every experiment is fully serializable and reproducible from a single configuration file.
  • Built-in experiment tracking via a TrackingManager, with optional integration into external managers like MLflow or other logging backends.
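The following sketch shows one way a config-driven experiment with weighted losses could look, using only dataclasses and JSON. The class names `PhaseConfig` and `ExperimentConfig`, their fields, and `combine_losses` are assumptions made for illustration; they are not ModularML's configuration schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class PhaseConfig:
    """Hypothetical settings for one training phase."""
    name: str
    epochs: int
    frozen_stages: List[str] = field(default_factory=list)
    loss_weights: Dict[str, float] = field(default_factory=dict)


@dataclass
class ExperimentConfig:
    """Hypothetical top-level experiment description."""
    seed: int
    phases: List[PhaseConfig]

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "ExperimentConfig":
        raw = json.loads(text)
        phases = [PhaseConfig(**p) for p in raw["phases"]]
        return cls(seed=raw["seed"], phases=phases)


def combine_losses(losses, weights):
    """Weighted sum of named loss terms; a missing weight defaults to 1.0."""
    return sum(weights.get(name, 1.0) * value for name, value in losses.items())


config = ExperimentConfig(
    seed=13,
    phases=[
        PhaseConfig("pretrain", epochs=20, loss_weights={"recon": 1.0}),
        PhaseConfig(
            "finetune",
            epochs=5,
            frozen_stages=["encoder"],
            loss_weights={"recon": 0.5, "mse": 1.0},
        ),
    ],
)

# Round-trip through JSON, then apply the fine-tuning phase's loss weights.
restored = ExperimentConfig.from_json(config.to_json())
total = combine_losses({"recon": 2.0, "mse": 1.0}, restored.phases[1].loss_weights)
```

If the restored object compares equal to the original, everything needed to rerun the experiment lives in the serialized text, which is the property a single configuration file is meant to guarantee.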

Getting Started

Requires Python >= 3.9

Installation

Install from PyPI:

pip install modularml

To install the latest development version:

pip install git+https://github.com/REIL-UConn/modular-ml.git
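After either install command, a quick way to confirm which version is present is to query the package metadata with the standard library. The snippet assumes only that the distribution is named `modularml`, as in the pip command above; `installed_version` is a small helper written for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("modularml")
print(f"modularml: {found or 'not installed'}")
```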

Explore More

  • Examples – Explore complete examples of how to set up FeatureSets, apply feature preprocessing, construct model graphs, and run training configurations.
  • Documentation – API reference, component explanations, configuration guides, and tutorials.
  • Discussions – Join the community, ask questions, suggest features, or share use cases.

License

Apache 2.0
