Agentic tree search engine for parallelized experiment orchestration
🌳 Arborist
Tree search for ML experiments. Define a goal. Arborist branches, evaluates, prunes, and converges. Like MCTS, but for hyperparameter tuning and feature engineering.
pip install arborist-ai
Why tree search?
Linear hyperparameter sweeps (grid, random, Bayesian) explore one path at a time. When they hit a local optimum, they're stuck.
Arborist treats your experiment space as a tree. It branches into multiple directions simultaneously, evaluates results, prunes dead ends, and doubles down on what works. The same way AlphaGo explores game states, but for ML experiments.
Results from our benchmarks:
| Dataset | Strategy | F1 Score | Experiments | Wall Time |
|---|---|---|---|---|
| Forest Cover Type | Linear + LLM | 0.8659 | 50 | 2454s |
| Forest Cover Type | Tree + LLM (UCB) | 0.8683 | 50 | 1793s |
| Forest Cover Type | Hybrid (explore + exploit) | 0.8750 | 50 | 1186s |
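With the same 50-experiment budget, the hybrid strategy reached a higher F1 than the linear baseline (0.8750 vs. 0.8659) in roughly half the wall time (1186s vs. 2454s). It explores broadly with UCB, then hill-climbs from the best region.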
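To make the comparison concrete, the snippet below recomputes the speedup and F1 gain of each strategy relative to the linear baseline. The numbers are copied from the table above; the `runs` dictionary and `compare` helper are illustrative names, not part of Arborist.

```python
# F1 scores and wall times (seconds) copied from the benchmark table.
runs = {
    "Linear + LLM": (0.8659, 2454),
    "Tree + LLM (UCB)": (0.8683, 1793),
    "Hybrid (explore + exploit)": (0.8750, 1186),
}

baseline_f1, baseline_time = runs["Linear + LLM"]


def compare(name):
    """Return (speedup, F1 gain) of a strategy relative to the linear baseline."""
    f1, seconds = runs[name]
    return round(baseline_time / seconds, 2), round(f1 - baseline_f1, 4)


for name in runs:
    speedup, f1_gain = compare(name)
    print(f"{name}: {speedup}x vs. linear, F1 {f1_gain:+.4f}")
```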
Quickstart
from arborist import TreeSearch
search = TreeSearch(
    goal="Find optimal x",
    executor=lambda config: {"score": -(config["x"] - 3) ** 2 + 10},
    score=lambda r: r["score"],
    seed_configs=[{"x": 0}, {"x": 1}, {"x": 5}],
    strategy="ucb",
    max_experiments=50,
)
results = search.run()
print(f"Best: {results.best['score']:.4f}")
print(f"Config: {results.best['config']}")
How it works
Seed configs
│
├── Execute experiments (parallel)
├── Score results
├── Expand promising nodes (LLM or custom mutator)
├── Prune dead ends
└── Repeat until budget/target/plateau
Everything persists to SQLite. Kill the process, restart later, pick up where you left off.
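The loop in the diagram can be sketched in a few lines of plain Python. This is a simplified, sequential illustration and not Arborist's implementation: it assumes greedy best-first selection, a mutator that takes only the parent config, and a pruning rule that drops any child that fails to beat its parent. The function name `tree_search` is hypothetical.

```python
import heapq


def tree_search(seed_configs, executor, score, mutator, max_experiments=50):
    """Execute, score, expand the best open node, prune, repeat until the budget is spent."""
    frontier = []  # max-heap via negated score: (-score, order, config)
    best_score, best_config = float("-inf"), None
    ran = 0

    def run_one(config):
        nonlocal best_score, best_config, ran
        value = score(executor(config))
        ran += 1
        if value > best_score:
            best_score, best_config = value, config
        return value

    # Evaluate the seeds first; each becomes an open node.
    for config in seed_configs[:max_experiments]:
        value = run_one(config)
        heapq.heappush(frontier, (-value, ran, config))

    # Repeatedly expand the highest-scoring open node.
    while frontier and ran < max_experiments:
        neg_parent, _, parent = heapq.heappop(frontier)
        for child in mutator(parent):
            if ran >= max_experiments:
                break
            value = run_one(child)
            if value > -neg_parent:  # prune children that do not beat their parent
                heapq.heappush(frontier, (-value, ran, child))

    return best_score, best_config, ran


# Same toy objective as the quickstart, with a fixed-step mutator (an assumption).
best_score, best_config, ran = tree_search(
    seed_configs=[{"x": 0}, {"x": 1}, {"x": 5}],
    executor=lambda c: {"score": -(c["x"] - 3) ** 2 + 10},
    score=lambda r: r["score"],
    mutator=lambda c: [{"x": c["x"] - 0.5}, {"x": c["x"] + 0.5}],
)
print(best_score, best_config, ran)
```

The real library adds parallel execution, other selection strategies, and persistence on top of this basic shape.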
Features
- Tree search strategies: UCB1 (explore/exploit balance), best-first (greedy), breadth-first (systematic), hybrid (adaptive phase switching)
- LLM-guided mutations: Uses any model via litellm to analyze results and suggest new configs. Falls back to random perturbation if no LLM is available.
- Parallel execution: Run multiple experiments concurrently with configurable concurrency limits
- SQLite persistence: Every node, config, and result stored. Resume any search. Query history.
- Custom everything: Bring your own executor, evaluator, mutator, or strategy
- Shell executor: Point it at any training script. No code changes needed.
- CLI included: `arborist run`, `arborist status`, `arborist report`
- Budget controls: Cap by experiment count, wall time, dollar cost, or target score
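As an illustration of the parallel execution feature, here is one common way to run a batch of experiments with a concurrency cap, using only the standard library. It is a sketch under the assumption that the executor is a plain callable; Arborist's own scheduler may work differently (for example with processes or asyncio). The helper name `run_batch` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(configs, executor, concurrency=4):
    """Run executor(config) for every config with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map preserves input order, so results line up with configs.
        return list(pool.map(executor, configs))


configs = [{"x": i} for i in range(6)]
results = run_batch(configs, lambda c: {"score": c["x"] * 2}, concurrency=2)
print(results)
```

Threads suit executors that mostly wait on subprocesses or network calls; CPU-bound Python executors would need processes instead.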
Strategies
| Strategy | How it works | Best for |
|---|---|---|
| `ucb` | UCB1 bandit algorithm. Balances exploration of untried branches with exploitation of high scorers. | General use. Unknown search spaces. |
| `best_first` | Always expands the highest-scoring node. Pure exploitation. | When you already know a good region. |
| `breadth_first` | Level-by-level. Every node at depth N before any at N+1. | Systematic coverage. Small spaces. |
| `hybrid` | Starts with UCB exploration, detects plateau, switches to greedy hill-climb from the best node. Cycles back to explore if exploit stalls. | Best overall. Finds good regions fast, then squeezes out gains. |
| `llm_guided` | LLM picks which node to expand based on full tree context. | When you want the model to drive strategy. |
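For readers unfamiliar with UCB1, the sketch below shows the textbook selection rule: each node's mean score plus an exploration bonus that shrinks as the node is visited more often. The node dictionaries and the `select` helper are illustrative assumptions; how Arborist tracks visits and means internally is not documented here.

```python
import math


def ucb1(mean_score, visits, total_visits, c=math.sqrt(2)):
    """Textbook UCB1 value: mean + c * sqrt(ln(N) / n). Untried nodes rank first."""
    if visits == 0:
        return math.inf
    return mean_score + c * math.sqrt(math.log(total_visits) / visits)


def select(nodes, c=math.sqrt(2)):
    """Pick the node with the highest UCB1 value."""
    total = sum(n["visits"] for n in nodes)
    return max(nodes, key=lambda n: ucb1(n["mean"], n["visits"], total, c))


nodes = [
    {"id": "a", "mean": 0.9, "visits": 10},  # strong but well explored
    {"id": "b", "mean": 0.8, "visits": 1},   # slightly weaker, barely explored
]
print(select(nodes)["id"])        # exploration bonus favours the rarely visited node
print(select(nodes, c=0)["id"])   # c=0 removes the bonus: pure exploitation
```

Setting `c=0` reduces the rule to the `best_first` behaviour, which is one way to see the two strategies as ends of the same dial.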
Real-world example: XGBoost tuning
from arborist import TreeSearch, ShellExecutor, NumericEvaluator
search = TreeSearch(
    goal="Maximize macro F1 on multi-class classification",
    executor=ShellExecutor(
        command="python3 train.py --config {config_path}",
        timeout=300,
    ),
    evaluator=NumericEvaluator(field="f1"),
    seed_configs=[
        {"n_estimators": 100, "max_depth": 6, "learning_rate": 0.1},
        {"n_estimators": 200, "max_depth": 4, "learning_rate": 0.05},
        {"n_estimators": 500, "max_depth": 8, "learning_rate": 0.01},
    ],
    strategy="hybrid",
    concurrency=4,
    max_experiments=100,
    max_depth=5,
    plateau_window=15,
    db_path="./experiments.db",
    verbose=True,
)
results = search.run()
print(results.report())
Your training script just needs to print JSON with the metric:
# train.py
import json, sys
# Invoked as: python3 train.py --config <config_path>, so the path is sys.argv[2]
with open(sys.argv[2]) as f:
    config = json.load(f)
# ... train your model ...
print(json.dumps({"f1": 0.847, "accuracy": 0.912}))
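To show what this contract implies on the calling side, here is a rough sketch of a shell-based executor: write the config to a temporary JSON file, substitute its path into the command, and parse JSON from stdout. This is not Arborist's `ShellExecutor` source. The function name `run_shell` and the choice to parse only the last non-empty stdout line are assumptions made for illustration.

```python
import json
import os
import subprocess
import tempfile


def run_shell(command_template, config, timeout=300):
    """Write config to a temp file, run the command, and parse JSON from stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(config, f)
        config_path = f.name
    try:
        command = command_template.format(config_path=config_path)
        proc = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
            check=True,       # raises CalledProcessError on a non-zero exit code
        )
        # Assumption: the metrics JSON is the last non-empty line of stdout,
        # so earlier log lines from the training script are ignored.
        last_line = [ln for ln in proc.stdout.splitlines() if ln.strip()][-1]
        return json.loads(last_line)
    finally:
        os.remove(config_path)
```

With the `train.py` above, `run_shell("python3 train.py --config {config_path}", {"max_depth": 6})` would return the dictionary the script printed.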
API Reference
TreeSearch
search = TreeSearch(
    goal="...",                   # What you're optimizing
    executor=my_fn,               # callable(config) -> dict
    score=lambda r: r["f1"],      # callable(results) -> float
    # Search control
    strategy="hybrid",            # ucb, best_first, breadth_first, hybrid, llm_guided
    mutator=my_mutator,           # Custom mutation function (optional, defaults to LLM)
    concurrency=5,                # Max parallel experiments
    max_experiments=200,          # Total experiment budget
    max_depth=6,                  # Max tree depth
    # Termination
    target_score=0.95,            # Stop when reached
    plateau_window=20,            # Stop if no improvement for N experiments
    budget_usd=10.0,              # LLM cost cap
    # Storage
    db_path="./arborist.db",      # SQLite path (auto-created)
    # Callbacks
    on_node_complete=callback,    # Called after each experiment
    verbose=True,
)
results = search.run()
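The termination parameters interact, so a small sketch may help. The function below checks a target score, an experiment budget, and a plateau window against a list of scores in completion order. It reflects one plausible reading of "no improvement for N experiments" (the last N scores did not beat the best score seen before them); Arborist's exact rule may differ, and `should_stop` is a hypothetical helper.

```python
def should_stop(scores, max_experiments=200, target_score=None, plateau_window=None):
    """Return the reason to stop ("target", "budget", "plateau") or None to continue.

    scores: per-experiment scores in completion order, higher is better.
    """
    if not scores:
        return None
    if target_score is not None and max(scores) >= target_score:
        return "target"
    if len(scores) >= max_experiments:
        return "budget"
    if plateau_window is not None and len(scores) > plateau_window:
        best_before = max(scores[:-plateau_window])
        # Plateau: nothing in the most recent window beat the earlier best.
        if max(scores[-plateau_window:]) <= best_before:
            return "plateau"
    return None


print(should_stop([0.50, 0.96], target_score=0.95))
print(should_stop([0.80, 0.70, 0.70, 0.70], plateau_window=3))
print(should_stop([0.50, 0.60, 0.70, 0.80], plateau_window=3))
```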
Results
results.best # Best node: config, score, full results
results.top_k(5) # Top 5 nodes
results.insights # Cross-branch pattern analysis
results.report() # Markdown summary
results.tree_id # Unique ID for resuming
Resume a search
search = TreeSearch.resume(
    tree_id="abc123",
    db_path="./arborist.db",
    executor=my_fn,
    score=my_score,
)
results = search.run() # Picks up where it left off
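Resuming works because every node is written to SQLite as it is created and updated when it finishes. The sketch below shows the general idea with the standard `sqlite3` module. The `nodes` table, its columns, and the helper functions are a hypothetical schema invented for illustration; they are not Arborist's actual database layout.

```python
import json
import sqlite3


def open_db(path):
    """Open (or create) the database and make sure the hypothetical nodes table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS nodes (
               id INTEGER PRIMARY KEY,
               tree_id TEXT NOT NULL,
               parent_id INTEGER,
               config TEXT NOT NULL,
               score REAL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn


def add_node(conn, tree_id, config, parent_id=None):
    """Record a node before its experiment runs, so a crash cannot lose it."""
    cur = conn.execute(
        "INSERT INTO nodes (tree_id, parent_id, config) VALUES (?, ?, ?)",
        (tree_id, parent_id, json.dumps(config)),
    )
    conn.commit()
    return cur.lastrowid


def complete_node(conn, node_id, score):
    """Store the score and mark the node as finished."""
    conn.execute(
        "UPDATE nodes SET score = ?, status = 'done' WHERE id = ?", (score, node_id)
    )
    conn.commit()


def pending_nodes(conn, tree_id):
    """On resume, these are the experiments that still need to run."""
    rows = conn.execute(
        "SELECT id, config FROM nodes WHERE tree_id = ? AND status = 'pending' ORDER BY id",
        (tree_id,),
    ).fetchall()
    return [(node_id, json.loads(config)) for node_id, config in rows]
```

After a restart, calling `pending_nodes(conn, "abc123")` on the same file returns exactly the work that was queued but never completed.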
Custom mutator
def my_mutator(config, results, context):
    """Generate child configs from a parent experiment."""
    return [
        {**config, "lr": config["lr"] * 0.5},
        {**config, "lr": config["lr"] * 2.0},
        {**config, "n_estimators": config["n_estimators"] + 100},
    ]
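A mutator does not have to be hand-written per parameter. The sketch below follows the same `(config, results, context)` signature and applies a random perturbation to every numeric value, similar in spirit to the random fallback mentioned under Features. It is an illustration only, not Arborist's built-in fallback; the extra keyword arguments (`n_children`, `scale`, `rng`) and the rule that integers stay at 1 or above are assumptions.

```python
import random


def random_perturbation_mutator(config, results=None, context=None,
                                n_children=3, scale=0.2, rng=None):
    """Return n_children copies of config with numeric values scaled by up to +/- scale."""
    rng = rng or random.Random()
    children = []
    for _ in range(n_children):
        child = dict(config)
        for key, value in config.items():
            if isinstance(value, bool):
                continue  # bool is a subclass of int; leave flags untouched
            if isinstance(value, int):
                # Scale, round back to an integer, and keep it at 1 or above.
                child[key] = max(1, round(value * (1 + rng.uniform(-scale, scale))))
            elif isinstance(value, float):
                child[key] = value * (1 + rng.uniform(-scale, scale))
        children.append(child)
    return children


parent = {"n_estimators": 100, "max_depth": 6, "learning_rate": 0.1}
for child in random_perturbation_mutator(parent, rng=random.Random(0)):
    print(child)
```

Passing a seeded `random.Random` makes the generated children reproducible, which helps when comparing runs.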
Custom executor
from arborist import Executor, BranchContext
class MyExecutor(Executor):
    def run(self, config: dict, context: BranchContext) -> dict:
        # context.goal, context.depth, context.parent_config, etc.
        model = train(**config)  # train() stands in for your own training function
        return {"f1": model.f1, "accuracy": model.accuracy}
CLI
arborist run --config search.yaml # Run from YAML config
arborist status --db ./arborist.db # Check progress
arborist report --tree-id ID # Generate markdown report
arborist list --db ./arborist.db # List all searches
arborist node NODE_ID # Inspect a specific node
arborist prune NODE_ID --reason "..." # Manually kill a branch
YAML config
goal: "Maximize F1 for multi-class classification"
strategy: hybrid
concurrency: 4
max_experiments: 100
max_depth: 5
db_path: ./arborist.db
executor:
  type: shell
  command: "python3 train.py --config {config_path}"
  timeout: 300
evaluator:
  type: numeric
  field: f1
seed_configs:
  - n_estimators: 100
    max_depth: 6
    learning_rate: 0.1
  - n_estimators: 200
    max_depth: 4
    learning_rate: 0.05
termination:
  target_score: 0.95
  plateau_window: 15
Design
- Local-first. SQLite storage, no cloud, no accounts, no telemetry.
- LLM-agnostic. Any provider via litellm (OpenAI, Anthropic, Google, Ollama, etc).
- Composable. Swap out any component: executor, evaluator, mutator, strategy.
- Resumable. Full state in SQLite. Kill and restart without losing work.
- Observable. Verbose logging, per-node callbacks, CLI status checks.
Contributing
See CONTRIBUTING.md for development setup and guidelines.
License
MIT