
Framework for reproducible autonomous research loops


helix


Inspired by karpathy/autoresearch, helix generalizes the idea of autonomous AI research loops beyond LLM training. Give an agent a codebase, a metric, and a fixed time budget. It experiments overnight. You wake up to results.

The git history is the research trail. experiments.tsv is the proof. Anyone can clone a helix, run it on their hardware, and independently verify every result.

Concepts

| Term | Meaning |
| --- | --- |
| helix | A git repo containing helix.yaml + program.md + a codebase the agent can modify |
| helix.yaml | Machine-readable spec: what to optimize, how to measure it, which files are editable |
| program.md | Human-written instructions for the agent: domain knowledge, constraints, techniques to try |
| experiments.tsv | Append-only ledger of every experiment: commit, metric, status, description |
| helix run | CLI command that launches an autonomous session on your hardware |
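
Because experiments.tsv is a plain tab-separated ledger, it can be inspected with a few lines of Python. The sketch below is illustrative only: it assumes the file has a header row with columns named commit, metric, status, and description (the fields listed above), and the sample rows and status values are made up for the example rather than taken from a real run.

```python
import csv
import io

# Made-up ledger contents for illustration only. The column names and the
# status values ("keep", "discard", "crash") are assumptions, not helix's
# documented format.
SAMPLE_LEDGER = (
    "commit\tmetric\tstatus\tdescription\n"
    "a1b2c3d\t0.812\tkeep\tbaseline\n"
    "d4e5f6a\t0.798\tdiscard\tlarger batch size\n"
    "b7c8d9e\t0.845\tkeep\tadd caching\n"
    "f0a1b2c\t\tcrash\tout of memory\n"
)


def best_experiment(tsv_text, maximize=True):
    """Return the row with the best numeric metric, or None if there is none."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    scored = []
    for row in reader:
        try:
            scored.append((float(row["metric"]), row))
        except (TypeError, ValueError):
            continue  # skip rows without a usable metric (e.g. crashed runs)
    if not scored:
        return None
    pick = max if maximize else min
    return pick(scored, key=lambda pair: pair[0])[1]


best = best_experiment(SAMPLE_LEDGER)
print(best["commit"], best["metric"], best["description"])
```

For a real ledger, pass the file's contents (for example `open("experiments.tsv").read()`) instead of the sample string, and set `maximize=False` when the metric is one to minimize.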

Quick start

helix is agent-agnostic. Pick a backend or bring your own.

| Backend | Install | Requires |
| --- | --- | --- |
| ClaudeBackend (default) | pip install 'helices[claude]' | Claude Code CLI |
| GeminiBackend | pip install helices | Gemini CLI: npm install -g @google/gemini-cli |
| Custom | pip install helices | Implement the AgentBackend protocol |

Start from a template

helix init my-project --template generic --domain "AI/ML" --description "Optimize X for task Y."
cd my-project
git init
helix run

Run an existing helix

# from within a helix directory (one that has helix.yaml)
helix run              # start a session tagged with today's date
helix run --tag exp1   # custom tag
helix status           # show current best and recent experiments

Templates

| Template | Description |
| --- | --- |
| generic | Blank slate: solver.py + evaluate.py. Print score: <value> at the end. |
| ai-inference | LLM inference throughput on WikiText-2. Metrics: tokens_per_sec + bpb. |
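
To make the generic template's contract concrete, here is a minimal sketch of an evaluation script that ends by printing score: <value>. It is not the template's actual code: the `solve` function stands in for whatever solver.py would expose, and the test cases and the four-decimal formatting are assumptions chosen for illustration.

```python
def solve(xs):
    """Hypothetical stand-in for the function solver.py would provide."""
    return sorted(xs)


def evaluate():
    """Score the solver on a few fixed cases: the fraction it gets right."""
    cases = [[3, 1, 2], [5, 4], [], [2, 2, 1]]
    correct = sum(solve(list(case)) == sorted(case) for case in cases)
    return correct / len(cases)


def format_score(value):
    """Render the final line in the 'score: <value>' shape."""
    return f"score: {value:.4f}"


if __name__ == "__main__":
    # The score line is printed last so it is the final thing in the output.
    print(format_score(evaluate()))
```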

Examples

helix-examples is a curated gallery of standalone helices, each in its own repo and included as a git submodule.

git clone --recurse-submodules git@github.com:VectorInstitute/helix-examples.git
cd helix-examples/inference-opt
uv run prepare.py   # one-time: download model + dataset
helix run

The first example, helix-inference-opt, optimizes inference throughput for a causal language model on WikiText-2. The agent modifies infer.py (batching, quantization, torch.compile, etc.) and automatically merges improvements back to main.
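
As a rough illustration of the two metrics named for the inference template, the sketch below computes throughput and a bits-per-byte figure. It assumes that bpb stands for bits per byte and that it is derived from a summed negative log-likelihood in nats; the example's real measurement code is not shown here and may differ. All function names are hypothetical.

```python
import math
import time


def tokens_per_sec(num_tokens, elapsed_seconds):
    """Throughput: tokens processed divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def bits_per_byte(total_nll_nats, num_bytes):
    """Convert a summed negative log-likelihood (nats) to bits per byte of text."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    return total_nll_nats / (math.log(2) * num_bytes)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Reporting both numbers together guards against an "optimization" that raises throughput while degrading model quality, which is presumably why the template tracks a quality metric alongside speed.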

Writing your own helix

  1. Create a new git repo.
  2. Add helix.yaml describing your metric, evaluation command, and editable scope.
  3. Add program.md with domain-specific instructions for the agent.
  4. Add your codebase.
  5. Run helix run.

Minimal helix.yaml:

name: my-helix
domain: AI/ML
description: Optimize X for task Y.

scope:
  editable: [solver.py]
  readonly: [evaluate.py, program.md, helix.yaml]

metrics:
  primary:
    name: accuracy
    optimize: maximize
  evaluate:
    command: python evaluate.py
    timeout_seconds: 120
    output_format: pattern
    patterns:
      primary: '^accuracy:\s+([\d.]+)'
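
The primary pattern above is an ordinary regular expression with one capture group. The sketch below shows what it would capture from typical evaluation output. How helix itself applies the pattern is not documented here, so the multiline matching and the choice of the last match are assumptions, and `extract_primary` is a hypothetical helper.

```python
import re

# Same pattern as the helix.yaml example above.
PRIMARY_PATTERN = r"^accuracy:\s+([\d.]+)"


def extract_primary(output, pattern=PRIMARY_PATTERN):
    """Return the last captured value in `output` as a float, or None."""
    # MULTILINE lets ^ match at the start of every line, not just the first.
    matches = re.findall(pattern, output, flags=re.MULTILINE)
    if not matches:
        return None
    try:
        return float(matches[-1])
    except ValueError:
        return None  # e.g. "1.2.3" satisfies [\d.]+ but is not a number


sample_output = "loading data...\nepoch 1 done\naccuracy: 0.9312\n"
print(extract_primary(sample_output))
```

Two details of the pattern are worth noting: `\s+` requires at least one whitespace character after the colon, and the `^` anchor means a line such as `val accuracy: 0.9` would not match.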
