Experiment workflow orchestration toolkit for PDEBench/Agentic4Sci: ClearML, Optuna, and Langfuse integration, dataset compliance, and distributed experiment dispatch.
expflow-pde
Manage ML experiments across ClearML, Optuna, and Langfuse from a single CLI — training, HPO, distributed dispatch, compliance checks, and observability.
⚠️ Alpha: Core functionality works. APIs may change as we stabilize the feature set.
Quick Start
Install
# Core CLI (no external SDKs needed)
pip install expflow-pde
# With all SDK integrations
pip install "expflow-pde[all]"
# Individual extras
pip install "expflow-pde[clearml]" # Task/queue/dataset management
pip install "expflow-pde[optuna]" # Hyperparameter optimization
pip install "expflow-pde[langfuse]" # LLM observability traces
pip install "expflow-pde[mcp]" # MCP server + all SDKs
Verify
expflow version
expflow info
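Because the core CLI installs without any SDK, it can help to confirm which optional integrations are importable in the active environment. The sketch below is not part of expflow-pde. It assumes the extras install the top-level modules `clearml`, `optuna`, and `langfuse` (the usual import names of those SDKs), and the helper name `available_integrations` is hypothetical.

```python
from importlib.util import find_spec

# Assumed import names for the SDKs behind each extra.
OPTIONAL_SDKS = ("clearml", "optuna", "langfuse")


def available_integrations(modules=OPTIONAL_SDKS):
    """Map each module name to True if it can be imported in this environment."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, present in available_integrations().items():
        print(f"{name:10s} {'installed' if present else 'missing'}")
```

A missing entry here suggests the matching extra has not been installed yet.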
What expflow-pde Solves
Running PDEBench or Agentic4Sci experiments involves coordinating multiple tools:
| Problem | How expflow-pde Helps |
|---|---|
| Train → eval → submit loop | expflow pipeline submit / submit-full (3 modes: full/fast/skip) |
| Hyperparameter search | expflow optuna run — local, distributed, or clearml-native |
| Experiment tracking | expflow clearml tasks — list, enqueue, dequeue, compare |
| Competition compliance | expflow audit validate — PIN, metrics, rules, dataset lineage |
| LLM call observability | expflow langfuse trace — trace cost, session management |
| Multi-machine dispatch | expflow run submit — git worktree + clearml-agent queue |
| System monitoring | expflow system status — component health checks |
Non-Goals
- Not a general-purpose experiment manager (use ClearML directly for that)
- Not a PDE solver (use PDEBench / PhysicsNeMo)
- Not a replacement for your existing experiment tracking
CLI Overview
expflow
├── version / info ← Package info, system overview (no SDK deps)
├── init ← Interactive config wizard
├── clearml ← Task/queue/dataset CRUD [needs clearml extra]
├── optuna ← HPO study/trial/plot [needs optuna extra]
├── langfuse ← Trace/session/cost [needs langfuse extra]
├── run ← Local experiment submit/list/status/cancel
├── audit ← Validate, compare, compliance report
├── system ← Health checks, TensorBoard
├── pin ← PIN-protect destructive operations
├── analyze ← Task intelligence, equation registry, strategy
├── mcp ← MCP server over stdio [needs mcp extra]
└── pipeline ← Train → eval → submit pipeline
Pipeline Modes
Full (HPO → Train → Eval)
expflow pipeline submit-full train_task1.py \
--queue default \
--trials 50 --parallel 4 \
--eval-script eval_task1.py \
--metric seg_total --direction maximize
Fast (Train → Eval, skip HPO)
expflow pipeline submit train_task1.py \
--queue default \
--train-param lr=0.001 --train-param epochs=80 \
--eval-script eval_task1.py
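When the fast-mode command is launched from a script or a sweep, assembling the arguments programmatically avoids quoting mistakes. The following is a minimal sketch that uses only the flags shown in the example above (`--queue`, `--train-param`, `--eval-script`). The helper `build_submit_command` is hypothetical and not an expflow-pde API.

```python
import shlex


def build_submit_command(script, queue, train_params, eval_script=None):
    """Assemble the argv list for `expflow pipeline submit`."""
    cmd = ["expflow", "pipeline", "submit", script, "--queue", queue]
    # One --train-param key=value pair per hyperparameter.
    for key, value in train_params.items():
        cmd += ["--train-param", f"{key}={value}"]
    if eval_script is not None:
        cmd += ["--eval-script", eval_script]
    return cmd


if __name__ == "__main__":
    cmd = build_submit_command(
        "train_task1.py", "default", {"lr": 0.001, "epochs": 80}, "eval_task1.py"
    )
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to run it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the command safe to hand to `subprocess.run` without a shell.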
Flexible Skip
expflow pipeline submit-full train_task1.py --skip hpo --skip eval # train only
expflow pipeline submit-full train_task1.py --skip train --skip eval # HPO only
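The skip flags can be read as removing stages from the fixed HPO → train → eval order. The sketch below models that reading only. It is an assumption about the semantics rather than expflow-pde's implementation, and `remaining_stages` is a hypothetical helper.

```python
# Stage order taken from the Full mode description (HPO → Train → Eval).
STAGES = ("hpo", "train", "eval")


def remaining_stages(skip):
    """Return the stages left to run, in pipeline order, after applying skips."""
    unknown = set(skip) - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")
    return [stage for stage in STAGES if stage not in skip]


if __name__ == "__main__":
    print(remaining_stages({"hpo", "eval"}))    # train only
    print(remaining_stages({"train", "eval"}))  # HPO only
```

Rejecting unknown stage names up front is a design choice of this sketch; it surfaces typos such as `evl` instead of silently running every stage.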
Hermes Agent Integration
expflow-pde ships with four Hermes Agent skills for AI-assisted experiment management.
Skills live in the repository's skills/ directory — install via URL:
# Install individual skills
hermes skills install https://raw.githubusercontent.com/diamond2nv/expflow/main/skills/expflow-pipeline-hpo/SKILL.md
hermes skills install https://raw.githubusercontent.com/diamond2nv/expflow/main/skills/experiment-lifecycle-governance/SKILL.md
hermes skills install https://raw.githubusercontent.com/diamond2nv/expflow/main/skills/clearml-metrics-logging-pattern/SKILL.md
hermes skills install https://raw.githubusercontent.com/diamond2nv/expflow/main/skills/competition-task-intelligence/SKILL.md
# Or tap the repo for easier access
hermes skills tap add diamond2nv/expflow
hermes skills install expflow-pipeline-hpo
This adds 4 agent skills:
| Skill | Purpose |
|---|---|
| expflow-pipeline-hpo | Competition pipeline orchestration (HPO → train → eval) |
| experiment-lifecycle-governance | PIN protection, metrics registry, compare-scores |
| clearml-metrics-logging-pattern | Standardized ClearML metric naming & reporting |
| competition-task-intelligence | PDE equation registry, task analysis, strategic advising |
MCP Server
expflow mcp # Start MCP server (stdio)
Register the server in ~/.hermes/config.yaml to give the agent access to all expflow tools:
mcp:
  servers:
    expflow:
      command: "expflow"
      args: ["mcp"]
After registration, the agent can list tasks, enqueue experiments, compare scores, and more, directly from chat.
Agent Instructions
The AGENTS.md in the repo root provides self-contained agent instructions (project map, development commands, testing conventions, pitfalls) for any AI coding assistant reading the project.
Prerequisites
- Python 3.11+
- ClearML server (optional) — for distributed experiment dispatch
- Optuna (optional) — for hyperparameter optimization
- Langfuse server (optional) — for LLM trace observability
Configure via expflow init or by placing config.yaml / .env in your project root.
Optional Dependencies
# Minimal:
pip install expflow-pde # core CLI only
# Extras:
pip install "expflow-pde[clearml]" # ClearML SDK integration
pip install "expflow-pde[optuna]" # Optuna HPO
pip install "expflow-pde[langfuse]" # Langfuse tracing
pip install "expflow-pde[pipeline]" # pipeline mode (needs clearml)
pip install "expflow-pde[mcp]" # MCP server (all above)
pip install "expflow-pde[all]" # everything
pip install "expflow-pde[dev]" # development tooling
Development
git clone https://github.com/diamond2nv/expflow.git
cd expflow
python -m venv venv && source venv/bin/activate
pip install -e ".[all,dev]"
ruff format . # Format code
ruff check --fix . # Lint + auto-fix
pyright . # Type check
python -m pytest tests/ -v # Run tests
python -m build # Build package
License
MIT
Links
- Full Usage Guide (English) | Usage Guide (Chinese)
- Architecture
- Developer Guide
- Data Layer Design
- Competition Integration
- Hermes Agent Skills — 4 skills for MLOps experiment orchestration
File details
Details for the file expflow_pde-0.4.0-py3-none-any.whl.
File metadata
- Download URL: expflow_pde-0.4.0-py3-none-any.whl
- Upload date:
- Size: 112.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2e9009e1c415ca165694a9bebd167336b29f6bc8477f9d682e36627a500b517 |
| MD5 | ae314147169aab467f40c38fd36faf2e |
| BLAKE2b-256 | d0a20f7adfb57321e59182c94db7fa574254c903c50d906210af5265afff914e |
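A downloaded wheel can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the local file path is an assumption about where the wheel was saved.

```python
import hashlib

# SHA256 digest listed for expflow_pde-0.4.0-py3-none-any.whl.
EXPECTED_SHA256 = "c2e9009e1c415ca165694a9bebd167336b29f6bc8477f9d682e36627a500b517"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so it never has to fit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, after saving the wheel to the current directory, `matches_published_hash("expflow_pde-0.4.0-py3-none-any.whl")` should return `True` for an unmodified file.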