Framework for reproducible autonomous research loops

helix

Inspired by karpathy/autoresearch, helix generalizes the idea of autonomous AI research loops beyond LLM training. Give an agent a codebase, a metric, and a fixed time budget. It experiments overnight. You wake up to results.

The git history is the research trail. experiments.tsv is the proof. Anyone can clone a helix, run it on their hardware, and independently verify every result.
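The loop described above can be sketched roughly as follows. This is an illustrative toy, not helix's implementation: `run_loop`, `propose`, and `evaluate` are hypothetical names, the "codebase" is a single number, and the ledger is an in-memory list standing in for git commits and experiments.tsv.

```python
import random
import time
from typing import Callable, List, Tuple

Ledger = List[Tuple[int, float, str]]  # (experiment index, metric, status)


def run_loop(
    propose: Callable[[float], float],
    evaluate: Callable[[float], float],
    start: float,
    budget_seconds: float,
    max_experiments: int = 1000,
) -> Tuple[float, float, Ledger]:
    """Try candidates until the budget runs out; keep only improvements."""
    best, best_score = start, evaluate(start)
    ledger: Ledger = [(0, best_score, "baseline")]
    deadline = time.monotonic() + budget_seconds
    for i in range(1, max_experiments + 1):
        if time.monotonic() >= deadline:
            break  # fixed time budget exhausted
        candidate = propose(best)
        score = evaluate(candidate)
        if score > best_score:  # this sketch always maximizes
            best, best_score = candidate, score
            ledger.append((i, score, "keep"))
        else:
            ledger.append((i, score, "discard"))
    return best, best_score, ledger


# Toy usage: random perturbations searching for the maximum of -(x - 3)^2.
rng = random.Random(0)
best, score, ledger = run_loop(
    propose=lambda x: x + rng.uniform(-1.0, 1.0),
    evaluate=lambda x: -(x - 3.0) ** 2,
    start=0.0,
    budget_seconds=0.2,
    max_experiments=200,
)
print(f"best={best:.3f} score={score:.3f} experiments={len(ledger) - 1}")
```

Every attempt is recorded, including the discarded ones, which is what makes the trail auditable after the fact.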

Concepts

| Term | Meaning |
| --- | --- |
| helix | A git repo containing helix.yaml + program.md + a codebase the agent can modify |
| helix.yaml | Machine-readable spec: what to optimize, how to measure it, which files are editable |
| program.md | Human-written instructions for the agent: domain knowledge, constraints, techniques to try |
| experiments.tsv | Append-only ledger of every experiment: commit, metric, status, description |
| helix run | CLI command that launches an autonomous session on your hardware |
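As a rough illustration of how a ledger like experiments.tsv could be inspected independently, the sketch below parses tab-separated text and picks the best kept experiment. It assumes a header row with the column names commit, metric, status, and description (taken from the table above; the real file's exact header and order may differ), and the status values "keep" and "discard" as well as the sample rows are invented for the example.

```python
import csv
import io
from typing import Dict, List, Optional


def read_ledger(text: str) -> List[Dict[str, str]]:
    """Parse tab-separated ledger text into one dict per experiment."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def best_row(
    rows: List[Dict[str, str]],
    maximize: bool = True,
    kept_status: str = "keep",  # assumed status label, not confirmed by helix
) -> Optional[Dict[str, str]]:
    """Return the kept experiment with the best metric, or None if there is none."""
    kept = [r for r in rows if r["status"] == kept_status]
    if not kept:
        return None
    pick = max if maximize else min
    return pick(kept, key=lambda r: float(r["metric"]))


# Made-up sample data, for illustration only.
SAMPLE = (
    "commit\tmetric\tstatus\tdescription\n"
    "a1b2c3d\t0.812\tkeep\tbaseline\n"
    "e4f5a6b\t0.799\tdiscard\tlarger batch size\n"
    "c7d8e9f\t0.845\tkeep\tadd caching\n"
)

best = best_row(read_ledger(SAMPLE))
print(best["commit"], best["metric"])  # c7d8e9f 0.845
```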

Quick start

helix is agent-agnostic. Pick a backend or bring your own.

| Backend | Install | Requires |
| --- | --- | --- |
| ClaudeBackend (default) | pip install 'helices[claude]' | Claude Code CLI |
| GeminiBackend | pip install helices | Gemini CLI |
| Custom | pip install helices | Implement the AgentBackend protocol |
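The methods that the AgentBackend protocol requires are not listed here, so the sketch below does not use helix's API. It only illustrates the general idea of a Python protocol: a class qualifies by having the right methods, with no inheritance needed. `ToyAgentBackend`, `run_session`, and `EchoBackend` are hypothetical names; consult the helix source for the real interface.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyAgentBackend(Protocol):
    """Hypothetical stand-in; NOT the real helix AgentBackend protocol."""

    def run_session(self, instructions: str) -> str: ...


class EchoBackend:
    """Satisfies the toy protocol structurally, without subclassing it."""

    def run_session(self, instructions: str) -> str:
        return f"would act on: {instructions}"


class NotABackend:
    """Lacks run_session, so it does not satisfy the toy protocol."""


print(isinstance(EchoBackend(), ToyAgentBackend))   # True
print(isinstance(NotABackend(), ToyAgentBackend))   # False
```

Note that a runtime `isinstance` check against a protocol only verifies that the method exists, not that its signature matches; a static type checker is needed for that.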

Start from a template

helix init my-project --template generic --domain "AI/ML" --description "Optimize X for task Y."
cd my-project
git init
helix run

Run an existing helix

# from within a helix directory (one that has helix.yaml)
helix run              # start a session tagged with today's date
helix run --tag exp1   # custom tag
helix status           # show current best and recent experiments

Templates

| Template | Description |
| --- | --- |
| generic | Blank slate: solver.py + evaluate.py. Print score: <value> at the end. |
| ai-inference | LLM inference throughput on WikiText-2. Metrics: tokens_per_sec + bpb. |
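To make the generic template's contract concrete, here is a minimal sketch of what an evaluation script in that style might look like. The `solve` function stands in for whatever solver.py would provide, and the test cases are invented; the only detail taken from the template description is that the script ends by printing a line of the form score: <value>.

```python
from typing import Callable, List, Tuple


def solve(x: float) -> float:
    """Hypothetical stand-in for the editable solver (solver.py)."""
    return 2.0 * x


def evaluate(
    solver: Callable[[float], float],
    cases: List[Tuple[float, float]],
) -> float:
    """Return the fraction of cases where the solver matches the expected value."""
    correct = sum(1 for x, expected in cases if abs(solver(x) - expected) < 1e-9)
    return correct / len(cases)


# Invented cases; the last one is deliberately not satisfied by solve().
CASES = [(1.0, 2.0), (2.5, 5.0), (-3.0, -6.0), (4.0, 9.0)]

score = evaluate(solve, CASES)
print(f"score: {score}")  # score: 0.75
```

Keeping the scoring logic in a read-only file and the solver in an editable one means the agent can change how the problem is solved but not how it is graded.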

Examples

helix-examples is a curated gallery of standalone helices, each in its own repo and included as a git submodule.

git clone --recurse-submodules git@github.com:VectorInstitute/helix-examples.git
cd helix-examples/inference-opt
uv run prepare.py   # one-time: download model + dataset
helix run

The first example, helix-inference-opt, optimizes inference throughput for a causal language model on WikiText-2. The agent modifies infer.py (batching, quantization, torch.compile, etc.) and automatically merges improvements back to main.

Writing your own helix

  1. Create a new git repo.
  2. Add helix.yaml describing your metric, evaluation command, and editable scope.
  3. Add program.md with domain-specific instructions for the agent.
  4. Add your codebase.
  5. Launch a session with helix run.

Minimal helix.yaml:

name: my-helix
domain: AI/ML
description: Optimize X for task Y.

scope:
  editable: [solver.py]
  readonly: [evaluate.py, program.md, helix.yaml]

metrics:
  primary:
    name: accuracy
    optimize: maximize
  evaluate:
    command: python evaluate.py
    timeout_seconds: 120
    output_format: pattern
    patterns:
      primary: '^accuracy:\s+([\d.]+)'
