Agentic framework for HPC orchestration
fractale
Agentic state-machine orchestrator to support MCP Tools and Science
Design
We create a robust, asynchronous orchestrator for scientific workloads. It interacts with an mcp-server to register and load tools of interest. Tools of interest come from:
- flux-mcp: MCP tools for Flux Framework
- hpc-mcp: HPC tools for a larger set of HPC and converged computing use cases.
Abstractions
The library here has the following abstractions.
- Plan: a human-generated YAML manifest that any agent can read and deploy (fractale/core/plan)
- Step: a unit of work in a plan. A step can be of type:
  - plan: an instruction to a manager to generate a larger plan
  - agent: a sub-agent task that is a specialized expert at a specific function. Sub-agents call tools, prompts, and other sub-agents (and can recurse)
  - tool: an explicit tool call that includes inputs and outputs
  - prompt: an explicit prompt endpoint that includes inputs and an output prompt for an LLM
- Engine: the orchestration engine (a native state machine) that instantiates agents (fractale/engines/native)
- Agents: independent units of the state machine; sub- or helper agents that run under a primary orchestrating (state machine) agent. Optional agents that are exposed to the LLM as possible steps live under fractale/agents, and agents that are essential to the top-level orchestration are part of fractale/engines/native/agent.
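To make the step types above concrete, here is a minimal sketch of how an engine might dispatch plan steps by type. All names here (Step, Engine, the handler strings) are illustrative stand-ins, not fractale's actual API:

```python
# Hypothetical sketch of an engine routing plan steps by type.
# The class and handler names are illustrative, not fractale's real API.
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    type: str  # one of: plan, agent, tool, prompt
    inputs: dict = field(default_factory=dict)


class Engine:
    """Minimal state machine: run steps in order, routing on step type."""

    def __init__(self):
        self.handlers = {
            "plan": lambda s: f"manager expands {s.name} into a larger plan",
            "agent": lambda s: f"sub-agent runs {s.name}",
            "tool": lambda s: f"tool call {s.name} with {s.inputs}",
            "prompt": lambda s: f"LLM prompt {s.name}",
        }

    def run(self, steps):
        # A real engine would track state transitions; this just maps each
        # step to its handler so the type-based dispatch is visible.
        return [self.handlers[s.type](s) for s in steps]


plan = [Step("discover", "tool"), Step("optimize", "agent")]
print(Engine().run(plan))
```

The point is only the dispatch pattern: each step type is a distinct state-machine transition, and agent steps may recurse into further plans.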
Client
The high-level client includes these calls:
# Run a specific, human generated plan
fractale run ./examples/plans/<plan.yaml>
# Run a specific sub-agent, first discovering and inspecting the environment to generate a more detailed prompt for it.
fractale agent optimize <Describe high level optimization task>
# Ask a general task (the agent is not an expert at anything specific, but has access to all tools)
# This is a convenience wrapper around calling fractale agent ask_question
fractale prompt <General task to use server tools and prompts>
# Show all sub-agents available
fractale list
fractale list --json
Environment
The following variables can be set in the environment.

| Name | Description | Default |
|---|---|---|
| FRACTALE_MCP_PORT | Port the MCP server is on, if using the http variant | 8089 |
| FRACTALE_MCP_TOKEN | Token for the server | unset |
| FRACTALE_LLM_PROVIDER | LLM backend to use (gemini, openai) | gemini |
| OPENAI_API_KEY | API key for an OpenAI model | unset |
| OPENAI_BASE_URL | Base URL for OpenAI | unset |
| GEMINI_API_KEY | API key to use Google Gemini | unset |
Note that the provider can also be designated on the command line. The default is Gemini (gemini). To change it:
fractale run --backend openai ./examples/plans/transform-retry.yaml
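The precedence implied above (command-line flag over environment variable over built-in default) can be sketched in a few lines. This is an illustrative stand-in, not fractale's actual configuration code; only the variable names and defaults come from the table:

```python
# Sketch of resolving configuration from the environment variables above.
# The function name and return shape are hypothetical; the variable names
# and defaults match the table in this README.
import os


def resolve_config(cli_backend=None):
    return {
        "port": int(os.environ.get("FRACTALE_MCP_PORT", "8089")),
        "token": os.environ.get("FRACTALE_MCP_TOKEN"),  # unset by default
        # A --backend flag on the command line takes precedence over the env var.
        "provider": cli_backend or os.environ.get("FRACTALE_LLM_PROVIDER", "gemini"),
    }


print(resolve_config())                      # environment defaults
print(resolve_config(cli_backend="openai"))  # --backend openai wins
```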
Agents
The fractale agent command provides means to request orchestration by a sub-agent. A sub-agent is an independent expert that is given access to all MCP endpoints and can orchestrate calls and user interaction to get to a desired outcome. Currently, sub-agent calls are passed through a manager that first inspects the environment to discover resources (compute, software, etc.) and come up with a scoped plan. This may change in the future.
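The manager pattern described above (inspect the environment first, then hand the sub-agent a scoped prompt) can be sketched as follows. Both functions are hypothetical stand-ins for MCP discovery calls and prompt construction, not fractale's code:

```python
# Illustrative sketch of the manager flow described above: discover
# resources, then build a scoped prompt for the sub-agent.
# Both functions are hypothetical stand-ins, not fractale's real API.


def discover_resources():
    """Stand-in for MCP discovery calls (compute, software, etc.)."""
    return {"nodes": 4, "software": ["flux", "lammps"]}


def scoped_prompt(task, resources):
    """Combine the user's high-level task with what discovery found."""
    return (
        f"Task: {task}\n"
        f"Environment: {resources['nodes']} nodes; "
        f"available software: {', '.join(resources['software'])}"
    )


print(scoped_prompt("optimize the run", discover_resources()))
```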
Usage
Server
Let's install mcp-server to start a server with the functions we need.
pip install --break-system-packages git+https://github.com/converged-computing/mcp-server.git#egg=mcp-server
Dependencies
To prototype with Flux, open the code in the devcontainer. Install the library and start a flux instance.
pip install -e .[all] --break-system-packages
pip install flux-mcp hpc-mcp[all] IPython --break-system-packages
flux start
Note that this needs to be run in an environment with Flux. I run both in the DevContainer. In a different terminal, export the same FLUX_URI from where your server is running. Ensure your credentials are exported.
export GEMINI_API_KEY=xxxxxxxxxx
Joke Example
Let's ask Gemini to tell us a joke. In one terminal:
mcpserver start --config ./examples/servers/run-job.yaml
fractale prompt Tell me a joke, and give me choices about the category.
Result Parser
Let's do an example where we add a one-off, on-the-fly tool, which acts like a local registry. Start an mcpserver in one terminal:
mcpserver start --config ./examples/servers/run-job.yaml
And then run fractale with our local tool defined (-r means registry to add):
fractale prompt -r ./examples/registry/parser-agents.yaml Write me a flux job that tells a joke, and then ask the result parser tool to derive a regular expression for the punchline.
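To illustrate what "derive a regular expression for the punchline" might produce, here is a sketch in Python. The job output format and the pattern are entirely hypothetical; the real parser agent would derive a pattern suited to whatever the job actually printed:

```python
# Hypothetical example of a derived punchline-extraction regex.
# The output format and pattern are illustrative only.
import re

# Stand-in for output captured from the Flux job.
job_output = "SETUP\nJOKE: Why did the job fail?\nPUNCHLINE: It lost its Flux.\n"

# A pattern the result-parser tool might derive: match the line
# starting with PUNCHLINE: and capture the rest of the line.
pattern = re.compile(r"^PUNCHLINE:\s*(.+)$", re.MULTILINE)

match = pattern.search(job_output)
print(match.group(1))  # -> It lost its Flux.
```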
LAMMPS Run
This requires LAMMPS (or similar) installed, and running a set of tools that include a database and the optimization agent.
mcpserver start --config ./examples/servers/run-job.yaml
fractale prompt -r ./examples/registry/analysis-agents.yaml Discover resources and run a LAMMPS job "lmp" with Flux using data in /code. Use the optimization agent step to optimize lammps, and keep retrying the run until the agent decides to stop.
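The "keep retrying until the agent decides to stop" loop above can be sketched as follows. The job model and stopping rule here are toy stand-ins (a stub agent that doubles task count until a cap), not the real optimization agent:

```python
# Sketch of the retry loop described above: re-run the job until the
# optimization agent signals stop. The job model and agent are stubs.


def run_job(params):
    """Toy job model: more tasks means a shorter walltime."""
    return {"walltime": 100 / params["tasks"]}


def optimize(params, result, budget=4):
    """Stub agent: double tasks until a cap, then signal stop with None."""
    if params["tasks"] >= 2 ** budget:
        return None  # agent decides to stop
    return {"tasks": params["tasks"] * 2}


params, history = {"tasks": 1}, []
while True:
    result = run_job(params)
    history.append((params["tasks"], result["walltime"]))
    nxt = optimize(params, result)
    if nxt is None:
        break
    params = nxt

print(history)  # each retry uses the agent's suggested parameters
```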
Flux JobSpec Translation
Start the server with the validation functions and prompt we need:
mcpserver start --config ./examples/servers/flux-gemini.yaml
And then:
fractale run ./examples/plans/transform-retry.yaml
Spack Install and Run
git clone --depth 1 https://github.com/spack/spack /tmp/spack
export PATH=/tmp/spack/bin:$PATH
mcpserver start --config ./examples/servers/run-spack.yaml
fractale prompt Install cowsay with spack, load it, and use it to tell a joke.
Docker Build
Let's test doing a build. I'm running this on my local machine that has Docker, and I'm using Gemini.
export GEMINI_API_KEY=xxxxxxxxxx
Also install the functions from hpc-mcp:
pip install hpc-mcp --break-system-packages
pip install -e . --break-system-packages
Start the server with the functions and prompt we need:
mcpserver start --config ./examples/servers/docker-build.yaml
# In another terminal, run a plan explicitly, or do the same with a command line prompt
fractale run ./examples/plans/build-lammps.yaml
fractale prompt Build a container for lammps with an ubuntu 24.04 base
This works very well in Google Cloud (Gemini). I am not confident our on-premises models will easily choose the right tool.
Kubernetes
Note that I have a kind cluster running.
mcpserver start --config ./examples/servers/kubernetes-job.yaml
fractale prompt Deploy a basic hello world job to Kubernetes and get the output log.
TODO
- Add saving of the graph and transitions of the state machine for research.
- Where would we add algorithms here (as tools!)?
- Getting job logs / info needs better feedback for the agent.
License
HPCIC DevTools is distributed under the terms of the MIT license. All new contributions must be made under this license.
See LICENSE, COPYRIGHT, and NOTICE for details.
SPDX-License-Identifier: MIT
LLNL-CODE-842614