bentoloop

Parallel LLM scout + confidence-gated synthesis loop for GitHub-grounded research.

pip install bentoloop

Bentoloop runs N Claude scouts in parallel against real GitHub data, synthesizes their findings with Claude Opus, reads the confidence score the synthesis reports, and loops until that score reaches your threshold or you hit your round limit.

from bentoloop import bento_loop

result = bento_loop(
    scouts=[
        {
            "id": "s1_bugs",
            "description": "Find the most-discussed open bugs in qdrant.",
            "fetch_spec": {
                "gh_search_issues": [
                    {"q": "repo:qdrant/qdrant is:issue is:open label:bug sort:comments-desc", "per_page": 8}
                ]
            },
        },
        {
            "id": "s2_changelog",
            "description": "Check qdrant CHANGELOG for recent stability fixes.",
            "fetch_spec": {
                "gh_file_contents": [
                    {"repo": "qdrant/qdrant", "path": "CHANGELOG.md", "ref": "master"}
                ]
            },
        },
    ],
    synthesis_prompt=(
        "Based on the scout reports, assess qdrant's current production stability. "
        "What are the top 3 open risks? "
        "End with: Confidence Score: N/100 and Loop Decision: PROCEED or LOOP AGAIN."
    ),
    threshold=85,
    output_dir="/tmp/bentoloop_run",
)

print(f"Confidence: {result['confidence']}/100 in {result['rounds']} round(s)")
print(result["synthesis"])

Why bentoloop?

| Problem | bentoloop solution |
| --- | --- |
| LLMs hallucinate PR numbers and file contents | Scouts fetch real GitHub data before the LLM sees it |
| Single-pass synthesis misses depth | Confidence-gated loop: re-runs until the score threshold is met |
| Research is sequential | Scouts run in parallel via ThreadPoolExecutor |
| Hard to reproduce findings | All artifacts are written to disk as a reproducible research tree |

Install

pip install bentoloop

Requires Python 3.10+.

Set environment variables:

export ANTHROPIC_API_KEY=your-anthropic-key
export GITHUB_TOKEN=your-github-token   # optional but strongly recommended for rate limits

Quickstart

See examples/minimal.py for a runnable 20-line example.


Core concepts

Scout

A scout is a dict with three fields:

{
    "id": "s1",                        # unique id for logging + artifact naming
    "description": "What to research", # injected into the LLM prompt
    "fetch_spec": { ... }              # what real GitHub data to fetch first
}
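Because the id is used for logging and artifact naming, it can help to check a scout list before starting a run. The helper below is a minimal sketch and not part of bentoloop: `check_scouts` and `REQUIRED_SCOUT_KEYS` are hypothetical names, and the rules it applies (all three fields present, ids unique) are inferred from the field descriptions above.

```python
from collections import Counter
from typing import Any

# The three fields described above. Treating all of them as required
# is an assumption of this sketch.
REQUIRED_SCOUT_KEYS = {"id", "description", "fetch_spec"}


def check_scouts(scouts: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means no issues were found."""
    problems: list[str] = []
    for index, scout in enumerate(scouts):
        missing = REQUIRED_SCOUT_KEYS - scout.keys()
        if missing:
            problems.append(f"scout #{index}: missing {sorted(missing)}")
    # Duplicate ids would collide in logs and artifact paths.
    id_counts = Counter(s["id"] for s in scouts if "id" in s)
    for scout_id, count in id_counts.items():
        if count > 1:
            problems.append(f"duplicate id {scout_id!r} used {count} times")
    return problems
```

Calling `check_scouts(scouts)` before `bento_loop(...)` and aborting on a non-empty result avoids spending API calls on a malformed configuration.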

FetchSpec

Controls what GitHub data is fetched before the LLM sees anything:

fetch_spec = {
    # Search issues / PRs via GitHub Search API
    "gh_search_issues": [
        {"q": "repo:org/repo is:issue label:bug", "per_page": 10}
    ],

    # Read a file at a specific ref
    "gh_file_contents": [
        {"repo": "org/repo", "path": "CHANGELOG.md", "ref": "main"}
    ],

    # List issues
    "gh_list_issues": [
        {"repo": "org/repo", "state": "open"}
    ],

    # Fetch comments on a specific issue
    "gh_issue_comments": [
        {"repo": "org/repo", "issue_number": 42}
    ],
}

All keys are optional. Combine freely.
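To make the four keys more concrete, the sketch below maps a fetch_spec to the GitHub REST API URLs it would plausibly translate to. The endpoints are real GitHub REST endpoints, but the mapping is an assumption: the README does not say which requests bentoloop actually issues, and `planned_requests` is a hypothetical helper, not a bentoloop function.

```python
from urllib.parse import quote, urlencode

API = "https://api.github.com"


def planned_requests(fetch_spec: dict) -> list[str]:
    """List the GitHub REST URLs a fetch_spec would plausibly map to (assumed mapping)."""
    urls: list[str] = []
    # Search API: the whole entry (q, per_page, ...) becomes the query string.
    for item in fetch_spec.get("gh_search_issues", []):
        urls.append(f"{API}/search/issues?{urlencode(item)}")
    # Contents API: the optional ref selects a branch, tag, or commit.
    for item in fetch_spec.get("gh_file_contents", []):
        url = f"{API}/repos/{item['repo']}/contents/{quote(item['path'])}"
        if "ref" in item:
            url += "?" + urlencode({"ref": item["ref"]})
        urls.append(url)
    # Issues API: everything except repo is passed as a query parameter.
    for item in fetch_spec.get("gh_list_issues", []):
        params = {k: v for k, v in item.items() if k != "repo"}
        url = f"{API}/repos/{item['repo']}/issues"
        if params:
            url += "?" + urlencode(params)
        urls.append(url)
    # Issue comments API.
    for item in fetch_spec.get("gh_issue_comments", []):
        urls.append(
            f"{API}/repos/{item['repo']}/issues/{item['issue_number']}/comments"
        )
    return urls
```

Keys that are absent simply contribute no requests, which matches the statement that all keys are optional.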

Confidence scoring

Your synthesis_prompt must instruct the model to output a confidence score:

End your response with exactly:
Confidence Score: N/100
Loop Decision: PROCEED or LOOP AGAIN

Bentoloop extracts the integer N and loops again if N < threshold.
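The extraction step can be pictured as a small regular-expression match. This is a sketch of one way to do it, not bentoloop's actual parser: `parse_confidence` and `should_loop` are hypothetical names, and both taking the last match and treating a missing score as "loop again" are assumptions.

```python
import re
from typing import Optional

# Matches the required trailer line, e.g. "Confidence Score: 72/100".
_SCORE_RE = re.compile(r"Confidence Score:\s*(\d{1,3})\s*/\s*100", re.IGNORECASE)


def parse_confidence(text: str) -> Optional[int]:
    """Return the last reported score, capped at 100, or None if none is present."""
    matches = _SCORE_RE.findall(text)
    if not matches:
        return None
    return min(int(matches[-1]), 100)


def should_loop(text: str, threshold: int) -> bool:
    """Loop again when the score is below the threshold or missing (assumed policy)."""
    score = parse_confidence(text)
    return score is None or score < threshold
```

Taking the last match guards against the model quoting an earlier score in the body of its answer before stating the final one.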

Output directory

When output_dir is set, bentoloop writes:

output_dir/
├── round1/
│   ├── s1/artifact.md
│   └── s2/artifact.md
├── round2/
│   └── ...
├── synthesis_loop1.md
└── synthesis_loop2.md
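Since everything is plain files, a finished run can be inspected without calling the library again. The helpers below are a sketch that assumes exactly the layout shown above (`roundN/<scout_id>/artifact.md` and `synthesis_loopN.md`); `load_round` and `latest_synthesis` are hypothetical names.

```python
from pathlib import Path
from typing import Optional, Union


def load_round(output_dir: Union[str, Path], round_num: int) -> dict[str, str]:
    """Map scout id to artifact text for one round; empty if the round does not exist."""
    round_dir = Path(output_dir) / f"round{round_num}"
    if not round_dir.is_dir():
        return {}
    return {
        path.parent.name: path.read_text(encoding="utf-8")
        for path in sorted(round_dir.glob("*/artifact.md"))
    }


def latest_synthesis(output_dir: Union[str, Path]) -> Optional[str]:
    """Return the text of the highest-numbered synthesis file, or None if there is none."""
    files = sorted(
        Path(output_dir).glob("synthesis_loop*.md"),
        # Sort numerically so that loop 10 comes after loop 2.
        key=lambda p: int(p.stem.removeprefix("synthesis_loop")),
    )
    return files[-1].read_text(encoding="utf-8") if files else None
```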

API Reference

bento_loop

def bento_loop(
    scouts: list[Scout],
    synthesis_prompt: str,
    threshold: int = 95,
    max_rounds: int = 5,
    context: str = "",
    output_dir: str | Path | None = None,
    scout_model: str | None = None,
    synthesis_model: str | None = None,
) -> BentoResult:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| scouts | list[Scout] | required | Scout definitions |
| synthesis_prompt | str | required | Opus synthesis prompt. Must request Confidence Score: N/100. |
| threshold | int | 95 | Stop looping when confidence ≥ threshold |
| max_rounds | int | 5 | Hard limit on iterations |
| context | str | "" | Background context injected into every scout |
| output_dir | str \| Path \| None | None | Write artifacts here; skip if None |
| scout_model | str \| None | None | Override scout model (env BENTO_MODEL) |
| synthesis_model | str \| None | None | Override synthesis model (env SYNTHESIS_MODEL) |

Returns a BentoResult TypedDict:

class BentoResult(TypedDict):
    confidence: int          # final confidence score (0–100)
    synthesis: str           # final synthesis text
    rounds: int              # number of rounds completed
    artifacts: dict[str, str] # scout_id → artifact text (last round)
    output_dir: str          # resolved output directory path
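Because the loop also stops at max_rounds, a returned result does not by itself mean the threshold was reached, so callers may want to check explicitly. The helper below is a sketch that relies only on the BentoResult keys listed above; `summarize_result` is a hypothetical name.

```python
from typing import Any, Mapping


def summarize_result(result: Mapping[str, Any], threshold: int) -> str:
    """One-line summary that states whether the confidence threshold was reached."""
    reached = result["confidence"] >= threshold
    verdict = "threshold reached" if reached else "below threshold"
    return (
        f"{verdict}: {result['confidence']}/100 after {result['rounds']} round(s), "
        f"{len(result['artifacts'])} scout artifact(s)"
    )
```

A caller could log this line after `bento_loop(...)` returns and treat "below threshold" as a signal to add scouts or raise max_rounds.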

run_scouts

def run_scouts(
    scouts: list[Scout],
    context: str = "",
    output_dir: Path | None = None,
    max_workers: int = 4,
) -> dict[str, str]:

Run one set of scouts in parallel. Returns scout_id → artifact.
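The parallel fan-out follows the usual ThreadPoolExecutor pattern. The sketch below shows that pattern with an injected worker function; it is not bentoloop's implementation. `run_in_parallel` and `run_one` are hypothetical names, and recording a failure as a placeholder string instead of raising is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_in_parallel(
    scouts: list[dict],
    run_one: Callable[[dict], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run run_one(scout) for every scout on a thread pool; return scout id -> artifact."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which scout id.
        futures = {pool.submit(run_one, scout): scout["id"] for scout in scouts}
        for future in as_completed(futures):
            scout_id = futures[future]
            try:
                results[scout_id] = future.result()
            except Exception as exc:
                # One failing scout should not discard the others' work.
                results[scout_id] = f"[scout failed: {exc}]"
    return results
```

Threads suit this workload because each scout spends most of its time waiting on HTTP responses from GitHub and the model API.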

synthesize

def synthesize(
    artifacts: dict[str, str],
    synthesis_prompt: str,
    loop_num: int = 1,
    output_dir: Path | None = None,
) -> tuple[str, int]:

Run Opus synthesis over artifacts. Returns (synthesis_text, confidence_score).
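The two lower-level functions can be combined into a custom loop when bento_loop's defaults do not fit. The sketch below shows the confidence-gated control flow with both steps passed in as callables so it runs without network access. `gated_loop` is a hypothetical name, and whether bentoloop feeds earlier syntheses back into later rounds is not specified here, so this sketch does not.

```python
from typing import Callable


def gated_loop(
    run_round: Callable[[int], dict[str, str]],
    synthesize_round: Callable[[dict[str, str], int], tuple[str, int]],
    threshold: int = 95,
    max_rounds: int = 5,
) -> tuple[str, int, int]:
    """Repeat scout and synthesis rounds until confidence >= threshold or max_rounds."""
    synthesis, confidence, rounds = "", 0, 0
    for round_num in range(1, max_rounds + 1):
        artifacts = run_round(round_num)
        synthesis, confidence = synthesize_round(artifacts, round_num)
        rounds = round_num
        if confidence >= threshold:
            break  # good enough: stop early
    return synthesis, confidence, rounds
```

With the real library, `run_round` could wrap `run_scouts(scouts, context=...)` and `synthesize_round` could wrap `synthesize(artifacts, synthesis_prompt, loop_num=round_num)`, using the signatures documented above.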


Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| ANTHROPIC_API_KEY | (none) | Required. Your Anthropic API key. |
| GITHUB_TOKEN | (none) | Strongly recommended. Raises GitHub rate limit from 60 to 5000 req/hr. |
| BENTO_MODEL | claude-sonnet-4-6 | Scout model override. |
| SYNTHESIS_MODEL | claude-opus-4-6 | Synthesis model override. |
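A pre-flight check can resolve these variables before any API call is made. The sketch below only mirrors the table above: `read_settings` is a hypothetical helper, and how bentoloop itself resolves precedence between function arguments and environment variables is not covered.

```python
import os
from typing import Mapping

# Defaults copied from the table above.
DEFAULT_SCOUT_MODEL = "claude-sonnet-4-6"
DEFAULT_SYNTHESIS_MODEL = "claude-opus-4-6"


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve model names from the environment and fail fast if the API key is missing."""
    if not env.get("ANTHROPIC_API_KEY"):
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return {
        "scout_model": env.get("BENTO_MODEL", DEFAULT_SCOUT_MODEL),
        "synthesis_model": env.get("SYNTHESIS_MODEL", DEFAULT_SYNTHESIS_MODEL),
        # Without a token, GitHub applies the much lower unauthenticated rate limit.
        "has_github_token": bool(env.get("GITHUB_TOKEN")),
    }
```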

Privacy and data handling

Bentoloop is privacy-safe by design: it collects no personal data and sends nothing to any bentoloop-operated service. Its only network traffic goes to the GitHub and Anthropic APIs.

| Claim | How it is enforced |
| --- | --- |
| Only public GitHub data is accessed | All fetchers call the public GitHub API. No private repo access. |
| No telemetry | Zero calls to any analytics endpoint. No usage tracking. |
| No central server | The package runs entirely on your machine. |
| API keys go only to their own providers | Keys are read from env vars, passed to Anthropic/GitHub in HTTPS request headers, and never logged or stored. |
| No personal data processed | GitHub issue/PR data is public. We do not process user profile data, emails, or private content. |
| Output is local-only | All artifacts are written to your output_dir. Nothing is uploaded. |

GDPR / CCPA status: There is no personal data controller relationship. No data subject rights apply because no personal data is collected, stored, or processed. No consent mechanism is required. If your application uses bentoloop and does collect personal data (e.g. you pipe user-submitted queries through it), your application is the data controller and you should document that separately.


Contributing

See CONTRIBUTING.md.


License

MIT — see LICENSE.
