Autonomous prompt optimization agents — autoresearch loop built on shonku

Autoresearcher Shonku

Autonomous prompt optimization agents built on shonku.

Implements Karpathy's autoresearch pattern for prompts: propose an improvement, shadow-test it, measure, keep or discard, repeat.

Install

pip install autoresearcher-shonku

How it works

1. ANALYZE   -- read prompt metrics and sample interactions
2. PROPOSE   -- LLM generates an improved prompt version
3. VALIDATE  -- safety rails check (similarity, length, template vars)
4. DEPLOY    -- create experiment at low traffic weight
5. EVALUATE  -- collect metrics on the new version
6. DECIDE    -- keep if improved, discard if not
7. REPEAT
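
The loop above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: `optimize`, `propose`, `validate`, and `evaluate` are hypothetical names standing in for the LLM call, the safety rails, and metric collection, and the keep rule (absolute score gain above a threshold) is assumed.

```python
from typing import Callable


def optimize(
    prompt: str,
    propose: Callable[[str], str],          # hypothetical: LLM rewrite step
    validate: Callable[[str, str], bool],   # hypothetical: safety rails
    evaluate: Callable[[str], float],       # hypothetical: metric collection
    max_iterations: int = 10,
    improvement_threshold: float = 0.01,
) -> tuple[str, float]:
    """Keep-or-discard loop over prompt candidates."""
    best, best_score = prompt, evaluate(prompt)           # ANALYZE
    for _ in range(max_iterations):                       # REPEAT
        candidate = propose(best)                         # PROPOSE
        if not validate(best, candidate):                 # VALIDATE
            continue
        score = evaluate(candidate)                       # DEPLOY + EVALUATE
        if score - best_score > improvement_threshold:    # DECIDE
            best, best_score = candidate, score           # keep
        # otherwise the candidate is discarded and `best` is unchanged
    return best, best_score


# Toy stand-ins: "improve" by appending a word, score by capped length.
result = optimize(
    "Welcome",
    propose=lambda p: p + " back",
    validate=lambda old, new: len(new) <= 3 * len(old),
    evaluate=lambda p: min(len(p), 20) / 20,
    max_iterations=5,
)
print(result)
```

In the toy run, the score stops improving once the cap is reached, so later candidates are discarded and the last kept version is returned.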

Agents

Agent	Role
`PromptAnalyzerAgent`	Analyzes metrics to identify weaknesses
`PromptOptimizerAgent`	Proposes improved prompt versions
`ExperimentManagerAgent`	Manages A/B experiment lifecycle
`AutoResearcherAgent`	Orchestrates the full loop

Usage

The autoresearcher does NOT own your data. You pass tools that wrap your storage. This works with any backend, not just autoresearch-prompt-manager.

Example: optimize email subject lines stored in a CSV

import asyncio
import json

from autoresearcher_shonku import AutoResearcherAgent
from shonku import LLMConfig
from shonku.types import ToolSpec

# Your data lives wherever you want. Wrap access as tools.
# An in-memory dict stands in for the CSV here to keep the example short.
subjects = {"welcome": {"body": "Welcome to our service", "version": 1}}
metrics = [{"quality": 5.2}, {"quality": 4.8}, {"quality": 6.0}]

# In this example every tool takes string arguments and returns a JSON string.
def get_prompt(slug: str) -> str:
    s = subjects.get(slug, {})
    return json.dumps({"slug": slug, **s})

def get_metrics(prompt_id: str, version_id: str, metric_name: str = "quality") -> str:
    vals = [m.get(metric_name, 0) for m in metrics]
    mean = sum(vals) / len(vals) if vals else None  # avoid dividing by zero
    return json.dumps({"count": len(vals), "mean": mean})

def get_sample_interactions(prompt_id: str, limit: str = "3") -> str:
    return '[{"feedback": "too generic"}, {"feedback": "boring"}]'

def create_version(slug: str, content: str) -> str:
    # Store the new content and bump the version counter.
    subjects[slug] = {"body": content, "version": subjects.get(slug, {}).get("version", 0) + 1}
    return json.dumps({"version": subjects[slug]["version"]})

def create_experiment(prompt_id: str, baseline_version_id: str, new_version_id: str, weight: str = "10") -> str:
    return '{"experiment_id": "exp-1", "status": "running"}'

def conclude_experiment(experiment_id: str) -> str:
    return '{"status": "concluded"}'

tools = [
    ToolSpec(name="get_prompt", description="Get prompt by slug", callable=get_prompt),
    ToolSpec(name="get_metrics", description="Get metrics", callable=get_metrics),
    ToolSpec(name="get_sample_interactions", description="Get samples", callable=get_sample_interactions),
    ToolSpec(name="create_version", description="Create new version", callable=create_version),
    ToolSpec(name="create_experiment", description="Create experiment", callable=create_experiment),
    ToolSpec(name="conclude_experiment", description="Conclude experiment", callable=conclude_experiment),
]

async def main() -> None:
    agent = AutoResearcherAgent()
    result = await agent.run(
        input="Optimize 'welcome' subject line. Quality is 5.3/10, target 7.0+.",
        llm_config=LLMConfig(provider="groq", model="openai/gpt-oss-120b", api_key="..."),
        tools=tools,
    )
    print(subjects["welcome"]["body"])  # improved version

# agent.run is a coroutine, so it needs an event loop when run as a script.
asyncio.run(main())
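
The example keeps its data in a dict. If the subject lines really live in a CSV file, the storage tools might look like the sketch below, using only the standard library. The column layout (`slug`, `body`, `version`), the append-only write strategy, and the extra `path` argument are assumptions made for illustration; they are not part of the package.

```python
import csv
import json
import os
import tempfile

FIELDS = ["slug", "body", "version"]  # assumed column layout


def load_rows(path: str) -> dict:
    """Read the CSV into {slug: row}; later rows overwrite earlier ones."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row["slug"]: row for row in csv.DictReader(f)}


def get_prompt(path: str, slug: str) -> str:
    row = load_rows(path).get(slug, {})
    return json.dumps({"slug": slug, **row})


def create_version(path: str, slug: str, content: str) -> str:
    """Append a new row instead of rewriting the file, so history is kept."""
    rows = load_rows(path)
    version = int(rows.get(slug, {}).get("version", 0)) + 1
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writerow({"slug": slug, "body": content, "version": version})
    return json.dumps({"version": version})


# Demo on a temporary file.
path = os.path.join(tempfile.mkdtemp(), "subjects.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({"slug": "welcome", "body": "Welcome to our service", "version": 1})

created = json.loads(create_version(path, "welcome", "Welcome aboard: your first steps"))
current = json.loads(get_prompt(path, "welcome"))
print(created, current)
```

Values read back from the CSV are strings, so `current["version"]` is `"2"` rather than `2`; a real wrapper would probably convert types before returning.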

With autoresearch-prompt-manager

When used with the full prompt-manager stack, the tools wrap the API instead of local data:

arpm-api up && arpm-api start   # start the API
arpm-example loop                # run the optimization loop


## Safety rails

The `AutoResearcherAgent` includes a built-in `check_safety_rails` tool that validates:

- Similarity to original (>= 30%)
- Non-empty content (> 10 chars)
- Within iteration budget
- Reasonable length (30%-300% of original)
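
To make the four checks concrete, here is a minimal sketch of what such a validation could look like. It is not the built-in `check_safety_rails` tool: the function name `check_rails_sketch`, its return shape, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions, since the package's actual similarity metric is not documented here.

```python
from difflib import SequenceMatcher


def check_rails_sketch(original: str, candidate: str,
                       iteration: int, max_iterations: int = 10) -> dict:
    """Hypothetical re-creation of the listed checks; thresholds from the list above."""
    # Assumed similarity measure: difflib's ratio in [0, 1].
    similarity = SequenceMatcher(None, original, candidate).ratio()
    length_ratio = len(candidate) / len(original) if original else 0.0
    checks = {
        "similarity": similarity >= 0.30,            # >= 30% similar
        "non_empty": len(candidate.strip()) > 10,    # > 10 chars
        "within_budget": iteration < max_iterations, # iteration budget
        "length": 0.30 <= length_ratio <= 3.00,      # 30%-300% of original
    }
    return {"passed": all(checks.values()), "checks": checks}


original = "Welcome to our service"
good = check_rails_sketch(original, "Welcome to our service, let's get you started", iteration=1)
bad = check_rails_sketch(original, "Hi", iteration=1)
print(good["passed"], bad["checks"])
```

Returning the individual check results alongside the overall verdict makes it easier to see which rail rejected a candidate.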

## Configuration

### LLM settings

The autoresearcher receives LLM config at runtime. When used with autoresearch-prompt-manager, set:

```bash
export PM_LLM_PROVIDER=groq              # or: anthropic, openai, gemini, openrouter
export PM_LLM_MODEL=openai/gpt-oss-120b  # model ID
export PM_LLM_API_KEY=your-api-key       # provider API key
```

### Optimization settings

from autoresearcher_shonku import AutoResearcherConfig

config = AutoResearcherConfig(
    max_iterations=10,
    improvement_threshold=0.01,
    max_edit_distance=0.5,
    canary_weight=5.0,
    rollback_on_regression=True,
)
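
The semantics of these fields are not spelled out above, so the following is only one plausible reading, sketched with hypothetical helpers. It assumes `canary_weight` is a percentage of traffic sent to the new version, `improvement_threshold` is an absolute gain in the mean metric, and `rollback_on_regression` triggers a rollback when the candidate scores below the baseline. `route` and `decide` are illustrative names, not package APIs, and `max_edit_distance` is not covered.

```python
import hashlib


def route(user_id: str, canary_weight: float = 5.0) -> str:
    """Deterministically send about canary_weight percent of users to the candidate."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000 / 100  # 0.00 .. 99.99
    return "candidate" if bucket < canary_weight else "baseline"


def decide(baseline_mean: float, candidate_mean: float,
           improvement_threshold: float = 0.01,
           rollback_on_regression: bool = True) -> str:
    """Assumed decision rule: keep on clear gain, roll back on regression."""
    delta = candidate_mean - baseline_mean
    if delta > improvement_threshold:
        return "keep"
    if delta < 0 and rollback_on_regression:
        return "rollback"
    return "discard"


assignments = [route(f"user-{i}") for i in range(10_000)]
share = assignments.count("candidate") / len(assignments)
print(f"candidate share: {share:.3f}")
print(decide(5.3, 7.1), decide(5.3, 5.0), decide(5.3, 5.305))
```

Hashing the user ID keeps each user on the same version across requests, which avoids mixing baseline and candidate interactions for one user during an experiment.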

Acknowledgements

Optimization loop inspired by Karpathy's autoresearch
Agent execution powered by agno and AgentOS

Part of autoresearch-prompt-manager

autoresearch-prompt-manager  (prompt CRUD, experiments, metrics)
  -> autoresearcher-shonku   (this package -- optimization agents)
  -> shonku                  (agent framework)
  -> agno                    (runtime -- https://agno.com)

Install via the parent package (quote the extras so shells such as zsh do not expand the brackets): pip install 'autoresearch-prompt-manager[autoresearcher]'

Contributing

1. Fork autoresearch-prompt-manager
2. cd packages/autoresearcher_shonku && pip install -e '.[dev]'
3. Make your changes, then run pytest and ruff check src/
4. Submit a PR

To add new optimization strategies, create a new agent in agents/ following the ShonkuAgent pattern.

License

MIT

Release history

0.1.2 (this version): Mar 25, 2026
0.1.1: Mar 25, 2026
0.1.0: Mar 25, 2026

Download files

Source Distribution

autoresearcher_shonku-0.1.2.tar.gz (11.4 kB)

Uploaded Mar 25, 2026 Source

Built Distribution

autoresearcher_shonku-0.1.2-py3-none-any.whl (15.3 kB)

Uploaded Mar 25, 2026 Python 3

File details

Details for the file autoresearcher_shonku-0.1.2.tar.gz.

File metadata

Download URL: autoresearcher_shonku-0.1.2.tar.gz
Upload date: Mar 25, 2026
Size: 11.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for autoresearcher_shonku-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`8e7bcb4d3b2e63947e1f9eb3d9e0facc580bc7a9a73532c51d76f336f6fc8e0e`
MD5	`f7fbfbcf4a46a0818b070ef6c431322d`
BLAKE2b-256	`5543d5292cc2ea5a15e2b9f2681e54cbadda34745f09ed49f054bf17d93ac624`

File details

Details for the file autoresearcher_shonku-0.1.2-py3-none-any.whl.

File metadata

Download URL: autoresearcher_shonku-0.1.2-py3-none-any.whl
Upload date: Mar 25, 2026
Size: 15.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for autoresearcher_shonku-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c4f32e786f37aebcabd9ca4175cec47cf1b6295cc83cedcb61a15d565cbedb46`
MD5	`db11437b46aabff296aa6fd512ceda24`
BLAKE2b-256	`86144f4f6d4d82d2416a5e6dab72aad7cf9a2655e0f327e871ed5647cb5d0155`
