A library for modular, fast, and reproducible ML experimentation built for R&D.
ModularML
Modular, fast, and reproducible ML experimentation built for R&D.
Why ModularML?
ModularML is a flexible, backend-agnostic machine learning framework for designing, training, and evaluating modular ML pipelines, tailored specifically for research and scientific workflows. It enables rapid experimentation with complex model architectures, supports domain-specific feature engineering, and provides full reproducibility through configuration-driven declaration.
ModularML provides a plug-and-play ecosystem of interoperable components for data preprocessing, sampling, modeling, training, and evaluation — all wrapped in a unified experiment container.
Features
ModularML includes a comprehensive set of components for scientific ML workflows:
Data Handling
- FeatureSet abstraction for organizing structured datasets with features, targets, tags, and metadata.
- Data class with unified support for multiple backends (torch.Tensor, tf.Tensor, np.ndarray).
- Built-in splitters: sample-based and rule-based splitting, with condition-based filtering by feature, target, or tag values.
- Sample grouping and multi-part splits for paired, triplet, or grouped training tasks.
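To make rule-based splitting concrete, here is a minimal sketch in plain NumPy. It does not use the ModularML API: the toy arrays, the `cell_id` tag, and the `split_by_condition` helper are all hypothetical names chosen for illustration.

```python
import numpy as np

# Hypothetical toy dataset: NOT the ModularML FeatureSet API.
features = np.arange(12, dtype=float).reshape(6, 2)
targets = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.2])
tags = {"cell_id": np.array(["A", "A", "B", "B", "C", "C"])}


def split_by_condition(tags, key, allowed):
    """Return (selected, remaining) index arrays for a tag-based rule."""
    mask = np.isin(tags[key], list(allowed))
    return np.flatnonzero(mask), np.flatnonzero(~mask)


# Rule-based split: hold out every sample tagged with cell "C" for testing.
test_idx, train_idx = split_by_condition(tags, "cell_id", {"C"})

X_train, y_train = features[train_idx], targets[train_idx]
X_test, y_test = features[test_idx], targets[test_idx]
```

Splitting on a tag rather than on random indices keeps all samples from one group on the same side of the split, which is often what scientific datasets need to avoid leakage.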
Advanced Sampling
- Flexible FeatureSampler interface with support for advanced sampling during different stages of model training, including:
  - Triplet sampling (e.g., anchor/positive/negative)
  - Paired samples
  - Class-balanced, cluster-based, or time-windowed sampling strategies.
- Condition-aware sampling using any tags or metadata fields.
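As an illustration of the triplet strategy listed above, the following sketch draws (anchor, positive, negative) index triplets from a list of labels using only the standard library. The `sample_triplets` function is a hypothetical helper, not the FeatureSampler interface.

```python
import random
from collections import defaultdict


def sample_triplets(labels, n_triplets, seed=0):
    """Return (anchor, positive, negative) index triplets.

    Anchor and positive share a label; the negative has a different label.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    # Only labels with at least two samples can supply an anchor/positive pair.
    anchor_labels = [lab for lab, idxs in by_label.items() if len(idxs) >= 2]
    if not anchor_labels or len(by_label) < 2:
        raise ValueError("need >= 2 labels, one of which has >= 2 samples")

    triplets = []
    for _ in range(n_triplets):
        pos_label = rng.choice(anchor_labels)
        neg_label = rng.choice([lab for lab in by_label if lab != pos_label])
        anchor, positive = rng.sample(by_label[pos_label], 2)
        negative = rng.choice(by_label[neg_label])
        triplets.append((anchor, positive, negative))
    return triplets


labels = ["a", "a", "b", "b", "b", "c"]
triplets = sample_triplets(labels, n_triplets=4)
```

The same grouping step could key on any tag or metadata field instead of a class label, which is the idea behind condition-aware sampling.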
Model Architecture
- ModelGraph: A Directed Acyclic Graph (DAG)-based model builder where:
  - Each node is a ModelStage (e.g., encoder, head, discriminator).
  - Each stage can use a different backend (PyTorch, TensorFlow, scikit-learn, LightGBM, etc.).
  - Mixed-backend models are supported with seamless input/output routing.
- Stage-wise training: Custom TrainingStage configuration enables fine-tuning, freezing, and transfer learning across sub-models.
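The DAG idea can be sketched without any ML backend: each stage is a callable, each edge names the upstream outputs it consumes, and stages run in topological order. This is an assumption-laden toy, not the ModelGraph API. The stage names, the `edges` mapping, and `run_graph` are hypothetical, and `graphlib` requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages: plain callables standing in for encoder/head models.
stages = {
    "encoder": lambda inputs: [2.0 * x for x in inputs["input"]],
    "head_a": lambda inputs: sum(inputs["encoder"]),
    "head_b": lambda inputs: max(inputs["encoder"]),
}

# Each stage maps to the set of upstream nodes whose outputs it consumes.
edges = {
    "encoder": {"input"},
    "head_a": {"encoder"},
    "head_b": {"encoder"},
}


def run_graph(stages, edges, graph_input):
    """Run every stage once, in dependency order, and return all outputs."""
    outputs = {"input": graph_input}
    for name in TopologicalSorter(edges).static_order():
        if name in outputs:  # the raw input node has no stage to run
            continue
        upstream = {dep: outputs[dep] for dep in edges[name]}
        outputs[name] = stages[name](upstream)
    return outputs


outputs = run_graph(stages, edges, [1.0, 2.0, 3.0])
```

Because each stage only sees a dictionary of upstream outputs, a stage could in principle wrap a model from any backend, provided the values passed between stages are converted to a format the next stage accepts.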
Training & Experiments
- Experiment class encapsulates all training logic (via multiple TrainingStage objects), the ModelGraph and FeatureSet definitions, and a TrackingManager that logs all configuration files and training, validation, and evaluation metrics for rapid and reproducible ML experimentation.
- Each TrainingStage defines training loop logic with early stopping, validation hooks, loss weighting, and optimizer configs.
- Multi-objective loss support with configurable stage-level targets, sample-based loss functions, and weighted combinations.
- Config-driven experiments: Every experiment is fully serializable and reproducible from a single configuration file.
- Built-in experiment tracking via a TrackingManager, with optional integration into external managers like MLflow or other logging backends.
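Two of the ideas above, config-driven serialization and weighted loss combination, can be sketched with the standard library alone. The `StageConfig` dataclass, its fields, and the helper functions below are hypothetical and are not ModularML's configuration schema. The sketch only shows that a stage's settings can round-trip through JSON and then drive a weighted sum of loss terms.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class StageConfig:
    """Hypothetical stand-in for one training stage's settings."""
    name: str
    epochs: int
    learning_rate: float
    loss_weights: Dict[str, float] = field(default_factory=dict)


def to_json(cfg):
    """Serialize a config to a deterministic JSON string."""
    return json.dumps(asdict(cfg), sort_keys=True)


def from_json(text):
    """Rebuild a config from the JSON produced by to_json."""
    return StageConfig(**json.loads(text))


def combined_loss(losses, weights):
    """Weighted sum of named loss terms; every loss must have a weight."""
    return sum(weights[name] * value for name, value in losses.items())


cfg = StageConfig("pretrain", epochs=10, learning_rate=1e-3,
                  loss_weights={"mse": 1.0, "triplet": 0.5})
restored = from_json(to_json(cfg))
total = combined_loss({"mse": 0.2, "triplet": 0.4}, restored.loss_weights)
```

Keeping the loss weights inside the serialized config means a rerun from the saved file uses the same objective as the original run.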
Getting Started
Requirements
- Python >= 3.8
- PyTorch >= 1.10
- TensorFlow >= 2.8
- NumPy >= 1.22
Installation
Install from PyPI:
pip install modularml
To install the latest development version:
pip install git+https://github.com/REIL-UConn/modular-ml.git
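After installing, one way to confirm which version is present is to query the package metadata from Python. This sketch uses only the standard library's `importlib.metadata` (Python 3.8+). The `installed_version` helper is a hypothetical convenience wrapper, and it assumes the distribution name matches the `modularml` name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("modularml"))
```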
The following is a basic toy example of using ModularML:
# EXAMPLE COMING SOON ...
Explore More
- Examples (coming soon) – Explore complete examples of how to set up FeatureSets, samplers, model graphs, and training configurations.
- Documentation (coming soon) – API reference, component explanations, YAML configuration guides, and tutorials.
- Discussions (coming soon) – Join the community, ask questions, suggest features, or share use cases.
License
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file modularml-0.1.0.tar.gz.
File metadata
- Download URL: modularml-0.1.0.tar.gz
- Upload date:
- Size: 259.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4b05311d8f38cc0158ad5321a59273f8439d2ca8bbfeb14eea33f5c5aba35000 |
| MD5 | 112b9c96510ca37f3d90460efc0c8d2b |
| BLAKE2b-256 | 24dbabf037cd3e0368aa61679832838fcc4c124f9c566bf45dae1eadec616b87 |
File details
Details for the file modularml-0.1.0-py3-none-any.whl.
File metadata
- Download URL: modularml-0.1.0-py3-none-any.whl
- Upload date:
- Size: 59.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2b740e00e76eb753b573f659fd8916898736840471867172a3c2a2253626d982 |
| MD5 | ba1b506299bac2436985a5e4771a5dab |
| BLAKE2b-256 | 375e81671bfd6611232ffbb1d24987f3221d3f6836f2f976b10e8cb51f7be11a |
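A downloaded file can be checked against the SHA256 digests listed above with a few lines of standard-library Python. The helper names below are hypothetical, and the local file path in the commented usage line is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Return True if the file's SHA256 equals the published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the sdist was saved to the current directory):
# matches_published(
#     "modularml-0.1.0.tar.gz",
#     "4b05311d8f38cc0158ad5321a59273f8439d2ca8bbfeb14eea33f5c5aba35000",
# )
```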