Agnostic autonomous improvement engine — point it at any codebase, it makes it measurably better overnight

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

tccw

These details have not been verified by PyPI

Project description

The Cloud Clock Work - Autoresearch

A Claude Code wrapper that makes any codebase measurably better overnight.

pip install tccw-autoresearch
cd your-project
autoresearch init

Claude opens. It scans your project. Asks what you want to improve. Configures everything. Runs the first experiment. Done.

How It Works

flowchart TD
    subgraph INSTALL["<b>1 ◼ INSTALL</b>"]
        direction LR
        A1["pip install<br/>tccw-autoresearch"]:::install
        A2["cd your-project"]:::install
        A1 --> A2
    end

    subgraph ONBOARD["<b>2 ◼ ONBOARD</b>"]
        direction TB
        B1["autoresearch init"]:::cmd
        B2["Claude Code opens<br/>interactive session"]:::claude
        B3["Scans your project<br/>detects tech stack"]:::claude
        B4["Asks: what to improve?<br/>lint · coverage · build time"]:::question
        B5["Asks: which files<br/>can the agent edit?"]:::question
        B6["Writes config + measures baseline"]:::config
        B1 --> B2 --> B3 --> B4 --> B5 --> B6
    end

    subgraph CONFIG["<b>3 ◼ .autoresearch/</b>"]
        direction LR
        C1["config.yaml<br/>metric · rules · budget"]:::config
        C2["agents/default/<br/>CLAUDE.md · settings.json"]:::config
        C3["state.json<br/>progress · results"]:::state
    end

    subgraph ENGINE["<b>4 ◼ EXPERIMENT LOOP</b>"]
        direction TB
        D1["Create git worktree<br/>isolated branch"]:::engine
        D2["Spawn Claude Code agent<br/>with budget + permissions"]:::claude
        D3["Agent reads code<br/>forms hypothesis<br/>edits mutable files"]:::claude
        D4["Run metric command<br/>extract number"]:::metric
        D5{{"Improved?"}}:::decision
        D6["KEEP<br/>git commit"]:::keep
        D7["DISCARD<br/>git reset"]:::discard
        D1 --> D2 --> D3 --> D4 --> D5
        D5 -->|"Yes"| D6
        D5 -->|"No"| D7
        D6 --> D8
        D7 --> D8["Next experiment<br/>or budget exhausted"]:::engine
    end

    subgraph PUBLISH["<b>5 ◼ RESULTS</b>"]
        direction LR
        E1["git push branch"]:::publish
        E2["Create PR<br/>with audit trail"]:::publish
        E3["Auto-merge<br/>to dev"]:::publish
        E1 --> E2 --> E3
    end

    INSTALL --> ONBOARD
    ONBOARD --> CONFIG
    CONFIG --> ENGINE
    ENGINE --> PUBLISH

    classDef install fill:#1e3a5f,stroke:#4a9eff,color:#fff,stroke-width:2px
    classDef cmd fill:#0d1117,stroke:#58a6ff,color:#58a6ff,stroke-width:2px
    classDef claude fill:#6e40c9,stroke:#a371f7,color:#fff,stroke-width:2px
    classDef question fill:#8b5cf6,stroke:#c4b5fd,color:#fff,stroke-width:2px
    classDef config fill:#0e4429,stroke:#3fb950,color:#fff,stroke-width:2px
    classDef state fill:#1a4731,stroke:#56d364,color:#fff,stroke-width:1px
    classDef engine fill:#1c1917,stroke:#f97316,color:#fff,stroke-width:2px
    classDef metric fill:#713f12,stroke:#fbbf24,color:#fff,stroke-width:2px
    classDef decision fill:#854d0e,stroke:#facc15,color:#fff,stroke-width:3px
    classDef keep fill:#14532d,stroke:#22c55e,color:#fff,stroke-width:3px
    classDef discard fill:#7f1d1d,stroke:#ef4444,color:#fff,stroke-width:3px
    classDef publish fill:#1e3a5f,stroke:#38bdf8,color:#fff,stroke-width:2px

You define the metric. Claude does the work. AutoResearch decides what to keep.

Reduce lint errors. Increase test coverage. Cut build times. Fix code smells. Anything you can measure with a shell command.

`autoresearch init` in action

AutoResearch onboard wizard — Claude scans your project, checks prerequisites, and configures everything

Claude asks which files the agent can edit — mutable vs immutable boundaries

Review and submit — mutable and immutable file boundaries confirmed

Generated config.yaml — complete marker with metric, guard, loop, agent, and schedule

`autoresearch run` — live experiment progress

CLI progress panel — live-updating experiment table with kept/discarded/crashed status and metric deltas

What Is This?

AutoResearch is an orchestrator that sits on top of Claude Code. It gives Claude a metric, a budget, and permission to edit specific files — then measures whether the code got better.

Requirements

Python 3.10+
Claude Code installed and authenticated (claude --version)

That's it. No Docker. No GPU. No ML frameworks. No infrastructure.

How It Works

1. Install

pip install tccw-autoresearch

2. Initialize (AI-guided)

cd your-project
autoresearch init

This spawns an interactive Claude Code session with the /onboard skill:

Claude scans your project, detects the tech stack
Asks what you want to improve (lint, coverage, build time, custom)
Suggests the right metric command for your stack
Asks which files are mutable (agent can edit) vs immutable (never touched)
Writes .autoresearch/config.yaml
Measures baseline
Optionally runs the first experiment

Use --no-claude to skip the wizard and edit config manually.

3. Run

# Interactive TUI — discovers markers in current directory
autoresearch

# Run all active markers in current directory
autoresearch run

# Run a specific marker
autoresearch run -m lint-quality

# Headless (CI/CD, cron, scripts)
autoresearch --headless run -m lint-quality

4. What Happens

LOOP (per experiment):
  1. Create git worktree (isolated branch)
  2. Spawn Claude Code agent with your rules
  3. Agent edits mutable files, guided by issues_command
  4. Run metric command → extract number
  5. If improved → keep (git commit)
  6. If worse → discard (git reset)
  7. Budget countdown warns agent when time is low
  8. REPEAT until budget exhausted or HALT

Every kept experiment is a commit. The engine pushes the branch and creates a PR to your configured target_branch with full audit trail.

The Marker File

.autoresearch/config.yaml is the only config you need:

markers:
  - name: lint-quality
    description: "Reduce lint errors"
    status: active
    target:
      mutable: ["src/**/*.py"]        # Agent CAN edit these
      immutable: ["tests/**/*.py"]    # Agent CANNOT touch these
    metric:
      command: "ruff check src/ 2>&1"
      extract: "grep -oP 'Found \\K\\d+'"
      direction: lower                # lower = better
      baseline: 163
      issues_command: "ruff check src/ --output-format concise | head -30"
    agent:
      name: default
      model: sonnet                   # Claude model to use
      effort: medium                  # low | medium | high
      permission_mode: bypassPermissions
      budget_per_experiment: 20m      # time limit per experiment
      max_experiments: 10             # experiments per run
      env_file: null                  # path to .env file (optional)
      allowed_tools: []
      disallowed_tools: []
    auto_merge:
      enabled: true                   # auto-merge PRs when experiments succeed
      target_branch: main             # PR target branch
    schedule:
      type: on-demand                 # on-demand | overnight | weekend | cron

Common Metrics

Goal	`metric.command`	`metric.extract`	`direction`
Reduce lint errors	`ruff check src/ 2>&1`	`grep -oP 'Found \K\d+'`	lower
Increase test passes	`pytest tests/ -q --tb=no 2>&1 \| tail -1`	`grep -oP '\d+(?= passed)'`	higher
Increase coverage %	`pytest --cov=src --cov-report=term 2>&1 \| tail -1`	`grep -oP '\d+(?=%)'`	higher
Reduce build time	`bash -c 'TIMEFORMAT=%R; time make build 2>&1'`	`tail -1`	lower
Reduce ESLint errors	`npx eslint src/ 2>&1 \| tail -1`	`grep -oP '\d+(?= problem)'`	lower

Any shell command that outputs a number works.

Claude Code Integration

AutoResearch is built around Claude Code at every level:

Feature	How Claude Code Is Used
Experiments	Each experiment spawns `claude` as a subprocess with `--permission-mode`, `--allowedTools`, `--disallowedTools`
Agent profiles	`.autoresearch/agents/default/CLAUDE.md` + `settings.json` — shipped with the package
Budget countdown	PostToolUse hook injects remaining time via `additionalContext` after every tool call
`autoresearch init`	Spawns interactive Claude Code session with `/onboard` skill
`/autoresearch` skill	When you open the project in Claude Code: run, status, logs
`/onboard` skill	Interactive wizard — scans project, configures marker, measures baseline
Issues injection	`issues_command` output injected into agent prompt — exact `file:line:rule` targets
Telemetry	Parses Claude Code `stream-json` output for tokens, cost, tools, errors

Custom Agent Profiles

The default agent ships with the package. Duplicate to customize:

cp -r .autoresearch/agents/default .autoresearch/agents/my-agent
# Edit my-agent/CLAUDE.md, settings.json, hooks/
# Update config.yaml: agent.name: my-agent

Custom agents inherit default hooks (budget countdown) via symlinks.

Intelligence Features

Feature	Description
Issues command	Agent gets exact `file:line:rule` instead of exploring broadly
Budget countdown	3 tiers: working → wrap up → COMMIT NOW
Graduated escalation	3 fails → refine → 5 → pivot → search → halt
Statistical confidence	MAD-based scoring after 3+ experiments
Dual-gate guard	Metric + regression guard — prevents gaming
Ideas backlog	Failed experiments log why — future sessions don't repeat
Auto-merge	Every KEEP → push branch → PR to target_branch → squash merge
Always-commit	Engine commits after agent exits regardless of timeout

CLI Reference

autoresearch                # Interactive TUI (discovers markers in CWD)
autoresearch init           # AI-guided setup (spawns Claude Code)
autoresearch run            # Run all active markers in CWD
autoresearch run -m <name>  # Run a specific marker
autoresearch status         # Marker dashboard
autoresearch results        # Experiment history
autoresearch confidence     # Statistical confidence scores
autoresearch ideas          # Ideas backlog
autoresearch clean          # Delete stale experiment branches
autoresearch clean --remote # Also delete remote branches
autoresearch finalize       # Cherry-pick kept experiments into clean branch
autoresearch merge          # Merge finalized branch into target
autoresearch add            # Register marker in global state (for daemon)
autoresearch detach         # Unregister a marker
autoresearch skip           # Skip current experiment
autoresearch pause          # Pause a marker
autoresearch daemon start   # Scheduled overnight runs
autoresearch daemon stop    # Stop the daemon
autoresearch daemon status  # Check daemon status
autoresearch daemon logs    # View daemon logs

All commands support --headless for JSON output. No autoresearch add needed for local runs — the CLI reads .autoresearch/config.yaml from CWD.

Production Results

Deployed on antoncore (3.3k LOC Python monorepo):

Run	Baseline	Result	Delta	Audit
1	186 errors	163	-23	manual merge
2	163 errors	133	-30	PR #218 (merged)
3	133 errors	0	-133	PR #219 (merged)

186 → 0 ruff errors in 3 cycles. Full GitHub PR audit trail.

Architecture

src/autoresearch/
  cli.py             # 13 commands, interactive + headless
  engine.py          # Core loop + AgentRunner + auto-publish
  marker.py          # config.yaml schema (Pydantic)
  worktree.py        # Git worktree isolation
  metrics.py         # Metric extraction + MAD confidence
  program.py         # Runtime prompt generation
  agent_profile.py   # Claude Code settings + CLAUDE.md generation
  gates.py           # Security + test + confidence gates
  telemetry.py       # stream-json parsing
  daemon.py          # Cron scheduling
  skills/            # /onboard + /autoresearch Claude Code skills
  agents/default/    # Shipped agent profile

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

tccw

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.3.3

Apr 6, 2026

0.2.0

Apr 6, 2026

0.1.0

Apr 5, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

tccw_autoresearch-0.3.3-py3-none-any.whl (94.4 kB view details)

Uploaded Apr 6, 2026 Python 3

File details

Details for the file tccw_autoresearch-0.3.3-py3-none-any.whl.

File metadata

Download URL: tccw_autoresearch-0.3.3-py3-none-any.whl
Upload date: Apr 6, 2026
Size: 94.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tccw_autoresearch-0.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b857c2cfbad8eb30cc50a5d334789675ed239a1f47e44ecc00945a27a72a0327`
MD5	`b70905ba2552f6cb73185250c6a23ec8`
BLAKE2b-256	`e4d16e28dbcb45ef21c3e3ae481f42b30aeca30bdd64dc297bc69113586ec97f`

See more details on using hashes here.

Provenance

The following attestation bundles were made for tccw_autoresearch-0.3.3-py3-none-any.whl:

Publisher: workflow.yaml on The-Cloud-Clock-Work/tccw-autoresearch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tccw_autoresearch-0.3.3-py3-none-any.whl
- Subject digest: b857c2cfbad8eb30cc50a5d334789675ed239a1f47e44ecc00945a27a72a0327
- Sigstore transparency entry: 1244510072
- Sigstore integration time: Apr 6, 2026
Source repository:
- Permalink: The-Cloud-Clock-Work/tccw-autoresearch@db8842b6f388e723cbca574bfe40592a6913e9d0
- Branch / Tag: refs/tags/v0.3.3
- Owner: https://github.com/The-Cloud-Clock-Work
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yaml@db8842b6f388e723cbca574bfe40592a6913e9d0
- Trigger Event: push

tccw-autoresearch 0.3.3

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

The Cloud Clock Work - Autoresearch

How It Works

autoresearch init in action

autoresearch run — live experiment progress

What Is This?

Requirements

How It Works

1. Install

2. Initialize (AI-guided)

3. Run

4. What Happens

The Marker File

Common Metrics

Claude Code Integration

Custom Agent Profiles

Intelligence Features

CLI Reference

Production Results

Architecture

Links

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distribution

File details

File metadata

File hashes

Provenance

`autoresearch init` in action

`autoresearch run` — live experiment progress