Point it at any repo — sow ideas, run experiments, and harvest better code autonomously.

Maintainers

shatianming

🧑‍🌾 PaperFarm: Planting GPUs & APIs 🌱, Harvesting Papers & SOTAs 🌾

🌱 Sow ideas. 🚜 Run experiments. 🌾 Harvest evidence. 📄

Quick Start · How It Works · Agents · TUI Dashboard · CLI Reference · Examples

🌾 Key Features

🚀 One `run` Command: `paperfarm run .` bootstraps a scout analysis, then enters the research loop: plan, review, experiment, repeat.
🤖 Multi-Agent Support: Works with Claude Code, Codex CLI, Aider, and Gemini CLI; select one with `--agent-name`.
🔬 Skill-Based Loop: Scout → Manager → Critic → Experiment. Each phase is a markdown "skill" that an agent executes.
🖥️ Research TUI: Live dashboard with frontier status, metric charts, and a structured log viewer. Keyboard controls for pause, resume, and skip.
🛡️ Safety First: Every experiment is a git commit. Failed experiments are rolled back automatically via `rollback.sh`. Results are logged to `results.tsv`, with FileLock guarding concurrent writes.
📡 Headless Mode: `--headless` for CI, scripts, or remote servers; no TUI needed.
⚡ Parallel Workers: Run experiments across multiple GPUs in isolated git worktrees, so workers cannot interfere with each other.

🌱 Quick Start

pip install PaperFarm

cd your-project
paperfarm run .

This launches a research session:

🌱 Scout — survey the field: analyze your codebase, search related work, design evaluation metrics
🚜 Manager — plan the crop: propose hypotheses, design experiments, maintain the frontier backlog
🔍 Critic — inspect the plan: review experiment specs before execution, review evidence after
🌾 Experiment — plant, test, harvest: implement one change, evaluate, record to results.tsv
🔄 Repeat — until all frontier items are done or max_rounds is reached
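
The control flow of the session above can be sketched roughly as follows. This is a minimal illustration, not PaperFarm's actual implementation: the phase callables, the `research_loop` name, and the `pending` / `done` / `rejected` status strings are all assumptions made for the example. In the real tool each phase is an agent executing a skill.

```python
from typing import Callable, List


def research_loop(
    frontier: List[dict],
    scout: Callable[[], None],
    manager: Callable[[List[dict]], None],
    critic_preflight: Callable[[dict], bool],
    experiment: Callable[[dict], None],
    critic_postrun: Callable[[dict], None],
    max_rounds: int,
) -> int:
    """Run the loop and return the number of completed rounds."""
    scout()  # bootstrap runs once, before the loop
    rounds = 0
    while rounds < max_rounds:
        manager(frontier)  # may append new pending items (hypothetical status name)
        pending = [item for item in frontier if item["status"] == "pending"]
        if not pending:
            break  # every frontier item has been handled
        item = pending[0]
        if critic_preflight(item):
            experiment(item)
            critic_postrun(item)
            item["status"] = "done"
        else:
            item["status"] = "rejected"
        rounds += 1
    return rounds


# Usage with stub phases that approve and "run" everything.
frontier = [{"id": "F-1", "status": "pending"}, {"id": "F-2", "status": "pending"}]
completed = research_loop(
    frontier,
    scout=lambda: None,
    manager=lambda items: None,
    critic_preflight=lambda item: True,
    experiment=lambda item: None,
    critic_postrun=lambda item: None,
    max_rounds=20,
)
print(completed, [item["status"] for item in frontier])
```

The two exit conditions mirror the "Repeat" step: the loop ends when nothing is pending or when `max_rounds` is reached, whichever comes first.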

Headless Mode

paperfarm run . --headless \
  --goal "Reduce val_loss below 0.3" \
  --agent-name codex

Parallel Workers

paperfarm run . --headless --workers 4 --agent-name codex

🚜 How It Works

PaperFarm creates a .research/ directory in your repo with everything needed for autonomous research.

📂 .research/ Directory Structure

File	Purpose
`config.yaml`	Research configuration (metrics, limits, agent settings)
`graph.json`	Hypothesis → experiment spec → frontier → evidence graph
`results.tsv`	Experiment results ledger (timestamp, frontier_id, status, metric, value)
`activity.json`	Live phase/worker status for TUI polling
`log.jsonl`	Append-only structured event log
`evaluation.md`	How to measure the primary metric (written by scout)
`project-understanding.md`	Project analysis (written by scout)
`research-strategy.md`	Research direction and focus areas (written by scout)
`literature.md`	Related work and prior art (written by scout)
`scripts/record.py`	Helper script agents call to append results (FileLock-safe)
`scripts/rollback.sh`	Helper script to revert failed experiments
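
Because `results.tsv` is a plain tab-separated ledger, it can be inspected with the standard library. The sketch below assumes the file starts with a header row naming the five columns listed above; that header, the `discard` status, and the sample rows (including the timestamps) are illustrative assumptions rather than documented behavior.

```python
import csv
import io
from typing import Optional


def best_result(tsv_text: str, direction: str = "minimize") -> Optional[dict]:
    """Return the kept row with the best value, or None if nothing was kept."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [row for row in reader if row["status"] == "keep"]
    if not kept:
        return None
    pick = min if direction == "minimize" else max
    return pick(kept, key=lambda row: float(row["value"]))


# Made-up sample rows; the header line is an assumption about the file layout.
sample = (
    "timestamp\tfrontier_id\tstatus\tmetric\tvalue\n"
    "2026-03-18T10:00:00\tfrontier-001\tkeep\tval_loss\t2.62\n"
    "2026-03-18T11:00:00\tfrontier-004\tdiscard\tval_loss\t1.50\n"
    "2026-03-18T12:00:00\tfrontier-006\tkeep\tval_loss\t1.92\n"
)

best = best_result(sample, direction="minimize")
print(best["frontier_id"], best["value"])
```

To read a real session, pass the contents of `.research/results.tsv` instead of `sample`, and take `direction` from `config.yaml`.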

🔄 The Research Loop

Bootstrap
  └─ Scout — analyze codebase, define strategy and evaluation

Research Loop (repeats until done)
  ├─ Manager  — propose hypotheses, design experiments, maintain frontier
  ├─ Critic   — preflight review: approve or reject experiment specs
  ├─ Experiment — claim frontier item, implement change, evaluate, record
  └─ Critic   — post-run review: assess evidence, update claims

Each phase is a markdown skill template (`skills/*.md`). `SkillRunner` loads the template, substitutes the `[GOAL]` and `[TAG]` placeholders, and passes the result to the agent as a prompt. The agent reads and writes the `.research/` state files directly.
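
The placeholder substitution can be pictured with a few lines of Python. This is a sketch of the idea only; `render_skill` is a hypothetical helper, not the `SkillRunner` API, and the template text is invented for the example.

```python
import re


def render_skill(template: str, goal: str, tag: str) -> str:
    """Replace [GOAL] and [TAG] in a single pass over the template."""
    values = {"GOAL": goal, "TAG": tag}
    # One pass, so placeholder-like text inside the goal itself is left untouched.
    return re.sub(r"\[(GOAL|TAG)\]", lambda match: values[match.group(1)], template)


template = "Session [TAG]: plan experiments toward this goal: [GOAL]"
prompt = render_skill(template, goal="Reduce val_loss below 0.3", tag="run-01")
print(prompt)
```

A single regex pass is used instead of chained `str.replace` calls so that a goal containing the literal text `[TAG]` is not rewritten a second time.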

🧰 Skill Templates

Skill	Role	What It Does
`scout.md`	Bootstrap	Analyze project, search related work, define strategy and evaluation
`manager.md`	Planning	Propose hypotheses, design experiment specs, populate frontier
`critic.md`	Review	Pre-approve experiments (preflight), post-review evidence (post-run)
`experiment.md`	Execution	Claim frontier item, implement, evaluate, record via `record.py`

Skills reference these `.research/` files directly. The experiment agent calls `python .research/scripts/record.py --frontier-id F-1 --status keep --value 0.87` to record results, and `bash .research/scripts/rollback.sh` to revert failed changes.
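
To illustrate why a lock matters when several workers append to the same ledger, here is a minimal sketch of a locked append. It is not the contents of `record.py`: it swaps in a simple standard-library lock file (atomic exclusive create) in place of the FileLock library, and the `lock_file` and `record` names, the `.lock` suffix, and the timestamp format are assumptions.

```python
import os
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def lock_file(path: str, timeout: float = 10.0, poll: float = 0.05):
    """Hold an exclusive lock by atomically creating `path`."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another process holds the lock.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)  # release the lock


def record(results_path: str, frontier_id: str, status: str, metric: str, value: float) -> None:
    """Append one tab-separated row while holding the lock."""
    row = [datetime.now(timezone.utc).isoformat(), frontier_id, status, metric, str(value)]
    with lock_file(results_path + ".lock"):
        with open(results_path, "a", encoding="utf-8") as handle:
            handle.write("\t".join(row) + "\n")
```

Calling `record("results.tsv", "F-1", "keep", "val_loss", 0.87)` appends one row. Unlike this sketch, a purpose-built locking library also has to deal with stale locks left behind by crashed processes.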

🛡️ Field Safety

Feature	Description
Isolated git commits	Every experiment is a separate commit — nothing is lost
Auto-rollback	Failed experiments are reverted via `rollback.sh`
FileLock results	`record.py` uses FileLock for concurrent-safe writes to `results.tsv`
Max rounds	Stops after N rounds (`config.yaml: limits.max_rounds`)
Pause / Resume / Skip	TUI keyboard controls or `activity.json` control flags
Parallel isolation	Workers run in separate git worktrees — no interference

🤖 Supported Agents

Agent	Flag	How It's Invoked
Claude Code	`--agent-name claude-code`	`claude -p <prompt> --verbose`
Codex CLI	`--agent-name codex`	`codex exec --full-auto <prompt>`
Aider	`--agent-name aider`	`aider --yes-always --no-git --message-file <file>`
Gemini CLI	`--agent-name gemini`	`gemini -p <prompt>`

The default is `claude-code`. All agents receive the same skill prompt and work against the same `.research/` state files.
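
The table above amounts to a mapping from agent name to command line. A sketch of such a mapping is shown below; the `build_command` function and the `prompt.md` default path are hypothetical, and only the command shapes are taken from the table.

```python
from typing import List


def build_command(agent_name: str, prompt: str, message_file: str = "prompt.md") -> List[str]:
    """Return the argv list for an agent, following the table above."""
    if agent_name == "claude-code":
        return ["claude", "-p", prompt, "--verbose"]
    if agent_name == "codex":
        return ["codex", "exec", "--full-auto", prompt]
    if agent_name == "aider":
        # Aider reads the prompt from a file; the path here is a placeholder.
        return ["aider", "--yes-always", "--no-git", "--message-file", message_file]
    if agent_name == "gemini":
        return ["gemini", "-p", prompt]
    raise ValueError(f"unknown agent: {agent_name}")


print(build_command("codex", "Plan the next experiment"))
```

Returning an argv list rather than a shell string means the prompt can be passed to `subprocess.run` without shell quoting.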

📊 Interactive TUI Dashboard

The TUI is on by default. Launch it by omitting `--headless`:

paperfarm run . --agent-name claude-code

PaperFarm overview dashboard

┌─────────────────────────── PaperFarm ───────────────────────────┐
│ Phase: experiment | Round: 3 | Hyps: 5 | Exps: 4/7 | Best: 1.92 │
│ scout  ‣  manager  ‣  critic  ‣  EXPERIMENT                     │
├──[Execution]──[Metrics]──[Logs]─────────────────────────────────┤
│                                                                 │
│  Frontier Panel                │  Worker Panel                  │
│  frontier-001  keep   2.62     │  (idle)                        │
│  frontier-002  keep   2.40     │                                │
│  frontier-003  keep   2.31     │                                │
│  frontier-006  keep   1.92     │                                │
│                                │                                │
├─────────────────────────────────────────────────────────────────┤
│ p Pause   r Resume   s Skip   q Quit                 ^p palette │
└─────────────────────────────────────────────────────────────────┘

📑 3 Tabs & Keyboard Shortcuts

3 tabs:

Execution — Frontier items with status/priority, worker activity panel
Metrics — Experiment results chart over time
Logs — Structured event log from log.jsonl

Keyboard shortcuts: p pause, r resume, s skip current experiment, q quit.

The TUI polls the `.research/` state files every second, so you can attach to a running session at any time to monitor progress.
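
A polling reader of this kind can be sketched in a few lines. The function below is an illustration under assumptions, not the TUI's code: `poll_state` is a hypothetical name, and the `phase` key used in the usage comment is a guess at what `activity.json` might contain.

```python
import json
import os
import time
from typing import Callable, Optional


def poll_state(
    path: str,
    on_change: Callable[[dict], None],
    interval: float = 1.0,
    max_polls: Optional[int] = None,
) -> None:
    """Call on_change with the parsed JSON whenever the file's mtime changes."""
    last_mtime = None
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            mtime = os.stat(path).st_mtime_ns
            if mtime != last_mtime:
                with open(path, encoding="utf-8") as handle:
                    on_change(json.load(handle))
                last_mtime = mtime  # only updated after a successful parse
        except (FileNotFoundError, json.JSONDecodeError):
            pass  # file missing or caught mid-write; retry on the next poll
        polls += 1
        time.sleep(interval)


# Example: print the (hypothetical) phase field from activity.json for 10 seconds.
# poll_state(".research/activity.json", lambda state: print(state.get("phase")), max_polls=10)
```

Polling the files instead of holding a connection to the runner is what allows a monitor to start and stop independently of the session it watches.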

🚜 Installation

Python 3.10+ required. Supports Linux, macOS, and Windows.

pip install (recommended)

pip install PaperFarm

cd your-project
paperfarm run .

From source (for development)

git clone https://github.com/shatianming5/PaperFarm.git
cd PaperFarm
pip install -e ".[dev]"
pytest

🖥️ CLI Reference

paperfarm run REPO [OPTIONS]    Launch or resume a research session
paperfarm status REPO           Show current research state
paperfarm results REPO          Display experiment results table

`run` Options

Option	Default	Description
`--goal TEXT`	`""`	Research goal (injected into skill templates as `[GOAL]`)
`--tag TEXT`	auto	Session tag (injected as `[TAG]`)
`--workers N`	`0`	Parallel workers (0 = serial)
`--headless`	off	Run without TUI
`--agent-name TEXT`	`claude-code`	Which agent CLI to use

⚙️ Configuration

The scout agent fills in `.research/config.yaml` during bootstrap. You can also edit it manually:

protocol: research-v1

metrics:
  primary:
    name: val_loss           # or test_accuracy, ops_per_sec, etc.
    direction: minimize      # minimize | maximize

limits:
  max_rounds: 20             # max research loop iterations
  timeout_minutes: 0         # 0 = no timeout

workers:
  max: 0                     # 0 = serial
  gpu_mem_per_worker_mb: 8192

agent:
  name: claude-code
  config: {}                 # passed to agent adapter
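
The `direction` field determines what counts as a better result. The sketch below shows one way a comparison could honor it; the dict stands in for the parsed YAML (no YAML library is used here), and `is_improvement` is a hypothetical helper rather than part of PaperFarm.

```python
# Stand-in for the parsed config.yaml shown above.
config = {
    "metrics": {"primary": {"name": "val_loss", "direction": "minimize"}},
    "limits": {"max_rounds": 20, "timeout_minutes": 0},
}


def is_improvement(config: dict, candidate: float, best: float) -> bool:
    """Compare two metric values according to metrics.primary.direction."""
    direction = config["metrics"]["primary"]["direction"]
    if direction == "minimize":
        return candidate < best
    if direction == "maximize":
        return candidate > best
    raise ValueError(f"direction must be 'minimize' or 'maximize', got {direction!r}")


print(is_improvement(config, candidate=1.92, best=2.62))
```

With `direction: minimize`, a drop from 2.62 to 1.92 counts as an improvement; with `maximize`, the same change would not.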

🏡 Project Structure

src/paperfarm/
├── cli.py              # Typer CLI (run / status / results)
├── agent.py            # Agent adapters (ClaudeCode, Codex, Aider, Gemini)
├── skill_runner.py     # Loads skills, substitutes [GOAL]/[TAG], drives the loop
├── state.py            # .research/ state file access layer
├── parallel.py         # WorkerPool for multi-GPU parallel experiments
├── skills/
│   ├── protocol.yaml   # Bootstrap + loop step order
│   ├── scout.md        # 🌱 Scout skill template
│   ├── manager.md      # 🚜 Manager skill template
│   ├── critic.md       # 🔍 Critic skill template
│   ├── experiment.md   # 🌾 Experiment skill template
│   └── scripts/
│       ├── record.py   # CLI tool for recording results (FileLock-safe)
│       └── rollback.sh # Revert failed experiments
└── tui/
    ├── app.py          # Textual TUI app (polling-based)
    ├── widgets.py      # StatsBar, PhaseStrip, FrontierPanel, etc.
    └── styles.css      # TUI styling

🌽 Examples

See examples/ for ready-to-run setups:

Example	Task	Metric	Result
🎮 CartPole RL	Maximize DQN reward on CartPole-v1	avg_reward	266.7
⚡ Code Perf	Optimize JSON parser throughput	ops/sec	45K → 545K
🧠 nanoGPT	Reduce Shakespeare char-level val_loss	val_loss	2.62 → 1.92 (-27%)
🖼️ CIFAR-10	Maximize CIFAR-10 test accuracy	test_accuracy	67.7% (WIP)
📦 YOLO Tiny	Maximize YOLOv8 mAP50 on COCO8	mAP50	0.875
📝 HF GLUE	Optimize SST-2 fine-tuning	eval_accuracy	(needs GPU)
🎙️ Whisper	Reduce Whisper word error rate	WER	(needs GPU)
🔥 Liger-Kernel	Optimize Triton GPU kernels	throughput	(needs GPU)

Running an Example

cd examples/cartpole
paperfarm run . --agent-name codex --headless \
  --goal "Maximize CartPole-v1 average reward to 500"

🧑‍🌾 Contributing

Contributions are welcome! Please:

Open an issue to discuss the proposed change
Fork the repository and create your feature branch
Submit a pull request with a clear description

📄 License

This project is licensed under the MIT License.

Release history

This version

0.2.0b1 pre-release

Mar 18, 2026

Download files

Source Distribution

paperfarm-0.2.0b1.tar.gz (1.4 MB)

Uploaded Mar 18, 2026 Source

Built Distribution

paperfarm-0.2.0b1-py3-none-any.whl (40.7 kB)

Uploaded Mar 18, 2026 Python 3

File details

Details for the file paperfarm-0.2.0b1.tar.gz.

File metadata

Download URL: paperfarm-0.2.0b1.tar.gz
Upload date: Mar 18, 2026
Size: 1.4 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for paperfarm-0.2.0b1.tar.gz
Algorithm	Hash digest
SHA256	`95626efee8d7490f585b83b732428e7ea70024c6a40617917e9bd7558f08ba68`
MD5	`e731e5a8ee551a4b92eec45af4313a91`
BLAKE2b-256	`711fdc5cb9335fc762e5b611cf139187ee6d25a30edb216b13f5a6cd8eea76c5`

Provenance

The following attestation bundles were made for paperfarm-0.2.0b1.tar.gz:

Publisher: publish.yml on shatianming5/PaperFarm

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: paperfarm-0.2.0b1.tar.gz
- Subject digest: 95626efee8d7490f585b83b732428e7ea70024c6a40617917e9bd7558f08ba68
- Sigstore transparency entry: 1122699476
- Sigstore integration time: Mar 18, 2026
Source repository:
- Permalink: shatianming5/PaperFarm@1912c43ac3f97ad73d0deed867cb8bed71d62fee
- Branch / Tag: refs/tags/v0.2.0b1
- Owner: https://github.com/shatianming5
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1912c43ac3f97ad73d0deed867cb8bed71d62fee
- Trigger Event: push

File details

Details for the file paperfarm-0.2.0b1-py3-none-any.whl.

File metadata

Download URL: paperfarm-0.2.0b1-py3-none-any.whl
Upload date: Mar 18, 2026
Size: 40.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for paperfarm-0.2.0b1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fc9cbdb7a11f552b03d69bcf0a2d1b2cff2292f4871fbb973d07f973c3bbae2d`
MD5	`7849cae5b629a54aeee55c9142816a56`
BLAKE2b-256	`2ff1fcfafbcfcb40d523a8f5351ae41a9cab633526bd918e2023867ff3de1dca`

Provenance

The following attestation bundles were made for paperfarm-0.2.0b1-py3-none-any.whl:

Publisher: publish.yml on shatianming5/PaperFarm

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: paperfarm-0.2.0b1-py3-none-any.whl
- Subject digest: fc9cbdb7a11f552b03d69bcf0a2d1b2cff2292f4871fbb973d07f973c3bbae2d
- Sigstore transparency entry: 1122699483
- Sigstore integration time: Mar 18, 2026
Source repository:
- Permalink: shatianming5/PaperFarm@1912c43ac3f97ad73d0deed867cb8bed71d62fee
- Branch / Tag: refs/tags/v0.2.0b1
- Owner: https://github.com/shatianming5
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1912c43ac3f97ad73d0deed867cb8bed71d62fee
- Trigger Event: push
