
LLM-guided ML optimization — point it at a training script, it reads the curves and designs better models

Project description

neuropt

[Illustration: three robot researchers designing neural network architectures]

An LLM reads your training curves and designs your next experiment.


Point it at a training script, let it run overnight. The LLM sees full per-epoch train/val curves, spots overfitting, and proposes what to try next — like a research assistant who never sleeps and actually reads the loss plots.

vs Optuna and random search

[Benchmark plot: neuropt vs Optuna vs random search]

Same 15-eval budget, 14-parameter CNN search space. These results use Claude Haiku 4.5 (the smallest and cheapest of Anthropic's 4.5 models); we expect even stronger results with Sonnet or Opus. Optuna's TPE sampler was configured with n_startup_trials=3 for a fair comparison (the default is 10, which would make it purely random for most of the budget).
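For reference, the Optuna baseline above can be reproduced with a setup along these lines (a minimal sketch: the real benchmark searches 14 CNN parameters, only three are shown, and train_and_eval stands in for your own training-and-evaluation code):

# optuna_baseline.py
import optuna

def objective(trial):
    config = {
        "lr": trial.suggest_float("lr", 1e-4, 1e-1, log=True),
        "hidden_dim": trial.suggest_int("hidden_dim", 32, 512),
        "activation": trial.suggest_categorical("activation", ["relu", "gelu", "silu"]),
    }
    return train_and_eval(config)  # final validation loss, lower is better

sampler = optuna.samplers.TPESampler(n_startup_trials=3)  # default is 10
study = optuna.create_study(direction="minimize", sampler=sampler)
study.optimize(objective, n_trials=15)  # same 15-eval budget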

Quick start

pip install neuropt[llm]
export ANTHROPIC_API_KEY="sk-ant-..."

Option 1 — define what to search over:

# train.py
search_space = {
    "lr": (1e-4, 1e-1),                    # auto-detects log-scale
    "hidden_dim": (32, 512),                # auto-detects integer
    "activation": ["relu", "gelu", "silu"], # categorical
}

def train_fn(config):
    model = build_my_model(config["hidden_dim"], config["activation"])
    # ... train, return per-epoch losses for smarter LLM decisions ...
    return {"score": val_loss, "train_losses": [...], "val_losses": [...]}

Option 2 — just give it a model, we figure out the rest:

# train.py
model = torchvision.models.resnet18(num_classes=10)  # neuropt introspects this

def train_fn(config):
    m = config["model"].to("cuda")  # deep copy with modifications applied
    # ... train ...
    return {"score": val_loss, "train_losses": [...], "val_losses": [...]}

Then run:

neuropt run train.py

Runs until Ctrl+C. Crash-safe, resumable. Works in notebooks too:

from neuropt import ArchSearch

search = ArchSearch(train_fn=train_fn, search_space=search_space, backend="claude")
search.run(max_evals=50)

Documentation

See the full documentation for more details.

Installation

pip install neuropt                # core
pip install neuropt[llm]           # + Claude API (recommended)
pip install neuropt[llm-openai]    # + OpenAI API
pip install neuropt[all]           # everything

License

MIT



Download files

Download the file for your platform.

Source Distribution

neuropt-0.3.0.tar.gz (3.0 MB)


Built Distribution


neuropt-0.3.0-py3-none-any.whl (21.8 kB)


File details

Details for the file neuropt-0.3.0.tar.gz.

File metadata

  • Download URL: neuropt-0.3.0.tar.gz
  • Upload date:
  • Size: 3.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.6

File hashes

Hashes for neuropt-0.3.0.tar.gz
  • SHA256: 80afc1f1e293c738368fd72878abb201b43a67b6be7b12d2efc2faaf4db6a2f9
  • MD5: c2355b36808e3a89648fa7c05fb406be
  • BLAKE2b-256: e51cd202aa2ecbdbd2b2ac1bab2e43217cc673eb3465ad10cad23dce642b178a


File details

Details for the file neuropt-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: neuropt-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 21.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.6

File hashes

Hashes for neuropt-0.3.0-py3-none-any.whl
  • SHA256: 94271713e1456dcb694c7ae0b515768294d1a0dae417659f8d3ece764186c085
  • MD5: c957ac5559d5850e44ed4737450440d4
  • BLAKE2b-256: ecc4cccee824b2f7e3049143b798512f14c8ecb388844d065157bc9eec546323

