bentoloop

Parallel LLM scout + confidence-gated synthesis loop for GitHub-grounded research.

pip install bentoloop

Bentoloop runs N Claude scouts in parallel against real GitHub data, synthesizes their findings with Claude Opus, reads the confidence score from the synthesis, and loops until that score reaches your threshold or you hit your round limit.

from bentoloop import bento_loop

result = bento_loop(
    scouts=[
        {
            "id": "s1_bugs",
            "description": "Find the most-discussed open bugs in qdrant.",
            "fetch_spec": {
                "gh_search_issues": [
                    {"q": "repo:qdrant/qdrant is:issue is:open label:bug sort:comments-desc", "per_page": 8}
                ]
            },
        },
        {
            "id": "s2_changelog",
            "description": "Check qdrant CHANGELOG for recent stability fixes.",
            "fetch_spec": {
                "gh_file_contents": [
                    {"repo": "qdrant/qdrant", "path": "CHANGELOG.md", "ref": "master"}
                ]
            },
        },
    ],
    synthesis_prompt=(
        "Based on the scout reports, assess qdrant's current production stability. "
        "What are the top 3 open risks? "
        "End with: Confidence Score: N/100 and Loop Decision: PROCEED or LOOP AGAIN."
    ),
    threshold=85,
    output_dir="/tmp/bentoloop_run",
)

print(f"Confidence: {result['confidence']}/100 in {result['rounds']} round(s)")
print(result["synthesis"])

Why bentoloop?

| Problem | bentoloop solution |
| --- | --- |
| LLMs hallucinate PR numbers and file contents | Scouts fetch real GitHub data before the LLM sees it |
| Single-pass synthesis misses depth | Confidence-gated loop: re-runs until the score threshold is met |
| Research is sequential | Scouts run in parallel via ThreadPoolExecutor |
| Hard to reproduce findings | All artifacts are written to disk as a reproducible research tree |

Install

pip install bentoloop

Requires Python 3.10+.

Set environment variables:

export ANTHROPIC_API_KEY=your-anthropic-key
export GITHUB_TOKEN=your-github-token   # optional but strongly recommended for rate limits

Quickstart

See examples/minimal.py for a runnable 20-line example.


Core concepts

Scout

A scout is a dict with three fields:

{
    "id": "s1",                        # unique id for logging + artifact naming
    "description": "What to research", # injected into the LLM prompt
    "fetch_spec": { ... }              # what real GitHub data to fetch first
}
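
Because a scout is a plain dict, a typo in a field name may only surface once a run has started. The sketch below checks a scout list up front. `ScoutSketch` and `check_scouts` are hypothetical names introduced here for illustration, not part of bentoloop, and the uniqueness check is an assumption based on the "unique id" comment above.

```python
from typing import Any, TypedDict


class ScoutSketch(TypedDict):
    """Hypothetical typing of the three-field scout dict described above."""

    id: str
    description: str
    fetch_spec: dict[str, list[dict[str, Any]]]


def check_scouts(scouts: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems: list[str] = []
    seen: set[str] = set()
    for i, scout in enumerate(scouts):
        # Report every missing field, not just the first one.
        for field in ("id", "description", "fetch_spec"):
            if field not in scout:
                problems.append(f"scout #{i} is missing '{field}'")
        scout_id = scout.get("id")
        if scout_id in seen:
            problems.append(f"duplicate scout id: {scout_id}")
        elif scout_id is not None:
            seen.add(scout_id)
    return problems
```

Running `check_scouts(scouts)` before calling `bento_loop` and stopping on a non-empty result keeps malformed definitions from consuming API calls.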

FetchSpec

Controls what GitHub data is fetched before the LLM sees anything:

fetch_spec = {
    # Search issues / PRs via GitHub Search API
    "gh_search_issues": [
        {"q": "repo:org/repo is:issue label:bug", "per_page": 10}
    ],

    # Read a file at a specific ref
    "gh_file_contents": [
        {"repo": "org/repo", "path": "CHANGELOG.md", "ref": "main"}
    ],

    # List issues in a repository, filtered by state
    "gh_list_issues": [
        {"repo": "org/repo", "state": "open"}
    ],

    # Fetch comments on a specific issue
    "gh_issue_comments": [
        {"repo": "org/repo", "issue_number": 42}
    ],
}

All keys are optional. Combine freely.
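
To make the four keys concrete, the sketch below maps a fetch_spec onto the public GitHub REST endpoints that each key most plausibly corresponds to. This mapping is an assumption for illustration: the README does not say which endpoints bentoloop calls, and `fetch_spec_to_urls` is a hypothetical helper. It only builds URLs and performs no network requests.

```python
from urllib.parse import quote, urlencode

API = "https://api.github.com"


def fetch_spec_to_urls(fetch_spec: dict) -> list[str]:
    """Build the GitHub REST URLs a fetch_spec would plausibly translate to."""
    urls: list[str] = []
    for item in fetch_spec.get("gh_search_issues", []):
        # GitHub's own default page size is 30 when per_page is omitted.
        params = {"q": item["q"], "per_page": item.get("per_page", 30)}
        urls.append(f"{API}/search/issues?{urlencode(params)}")
    for item in fetch_spec.get("gh_file_contents", []):
        url = f"{API}/repos/{item['repo']}/contents/{quote(item['path'])}"
        if "ref" in item:
            url += "?" + urlencode({"ref": item["ref"]})
        urls.append(url)
    for item in fetch_spec.get("gh_list_issues", []):
        params = {"state": item.get("state", "open")}
        urls.append(f"{API}/repos/{item['repo']}/issues?{urlencode(params)}")
    for item in fetch_spec.get("gh_issue_comments", []):
        urls.append(
            f"{API}/repos/{item['repo']}/issues/{item['issue_number']}/comments"
        )
    return urls
```

Printing the result for a new fetch_spec is a cheap way to sanity-check queries in a browser or with curl before spending LLM calls on them.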

Confidence scoring

Your synthesis_prompt must instruct the model to output a confidence score:

End your response with exactly:
Confidence Score: N/100
Loop Decision: PROCEED or LOOP AGAIN

Bentoloop extracts the integer N and loops again if N < threshold, up to max_rounds rounds.
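
The extraction step can be pictured as a small regular-expression match. The sketch below is not bentoloop's parser: the pattern, the choice to use the last match, and the choice to treat a missing score as "loop again" are all assumptions made for illustration.

```python
import re
from typing import Optional

# Matches e.g. "Confidence Score: 87/100"; tolerant of spacing and case.
_SCORE_RE = re.compile(r"Confidence Score:\s*(\d{1,3})\s*/\s*100", re.IGNORECASE)


def extract_confidence(text: str) -> Optional[int]:
    """Return the last confidence score found in text, capped at 100."""
    matches = _SCORE_RE.findall(text)
    if not matches:
        return None
    return min(int(matches[-1]), 100)


def should_loop(text: str, threshold: int) -> bool:
    """Loop again when the score is missing or below the threshold (assumed policy)."""
    score = extract_confidence(text)
    return score is None or score < threshold
```

A strict, fixed output format in synthesis_prompt matters for this reason: any parser of this kind only sees what the model actually prints.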

Output directory

When output_dir is set, bentoloop writes:

output_dir/
├── round1/
│   ├── s1/artifact.md
│   └── s2/artifact.md
├── round2/
│   └── ...
├── synthesis_loop1.md
└── synthesis_loop2.md
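
Since the layout is regular, a finished run can be inspected with a few lines of pathlib. The sketch below assumes exactly the tree shown above (roundN/scout_id/artifact.md and synthesis_loopN.md). `collect_artifacts` and `latest_synthesis` are hypothetical helpers, not bentoloop functions.

```python
import re
from pathlib import Path


def collect_artifacts(output_dir):
    """Map round name -> scout id -> artifact path, based on the tree shown above."""
    rounds = {}
    for artifact in sorted(Path(output_dir).glob("round*/*/artifact.md")):
        scout_dir = artifact.parent
        rounds.setdefault(scout_dir.parent.name, {})[scout_dir.name] = artifact
    return rounds


def latest_synthesis(output_dir):
    """Return the synthesis file with the highest loop number, or None."""
    best_num, best_path = -1, None
    for path in Path(output_dir).glob("synthesis_loop*.md"):
        match = re.fullmatch(r"synthesis_loop(\d+)", path.stem)
        # Compare numerically so that loop10 sorts after loop2.
        if match and int(match.group(1)) > best_num:
            best_num, best_path = int(match.group(1)), path
    return best_path
```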

API Reference

bento_loop

def bento_loop(
    scouts: list[Scout],
    synthesis_prompt: str,
    threshold: int = 95,
    max_rounds: int = 5,
    context: str = "",
    output_dir: str | Path | None = None,
    scout_model: str | None = None,
    synthesis_model: str | None = None,
) -> BentoResult:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| scouts | list[Scout] | (required) | Scout definitions |
| synthesis_prompt | str | (required) | Opus synthesis prompt. Must request Confidence Score: N/100. |
| threshold | int | 95 | Stop looping when confidence ≥ threshold |
| max_rounds | int | 5 | Hard limit on iterations |
| context | str | "" | Background context injected into every scout |
| output_dir | str \| Path \| None | None | Write artifacts here; skip if None |
| scout_model | str \| None | None | Override scout model (env BENTO_MODEL) |
| synthesis_model | str \| None | None | Override synthesis model (env SYNTHESIS_MODEL) |

Returns a BentoResult TypedDict:

class BentoResult(TypedDict):
    confidence: int          # final confidence score (0–100)
    synthesis: str           # final synthesis text
    rounds: int              # number of rounds completed
    artifacts: dict[str, str] # scout_id → artifact text (last round)
    output_dir: str          # resolved output directory path

run_scouts

def run_scouts(
    scouts: list[Scout],
    context: str = "",
    output_dir: Path | None = None,
    max_workers: int = 4,
) -> dict[str, str]:

Run one set of scouts in parallel. Returns scout_id → artifact.
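
The parallel fan-out can be illustrated with the standard library's ThreadPoolExecutor, which the README names as the mechanism. The sketch below is an approximation under stated assumptions: `run_in_parallel` and `run_one` are hypothetical names, and recording a failed scout as an "ERROR: ..." string is a choice made here, not documented bentoloop behavior.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_in_parallel(
    scouts: list[dict],
    run_one: Callable[[dict], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run run_one for every scout in a thread pool; return scout_id -> artifact."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which scout id.
        futures = {pool.submit(run_one, scout): scout["id"] for scout in scouts}
        for future in as_completed(futures):
            scout_id = futures[future]
            try:
                results[scout_id] = future.result()
            except Exception as exc:
                # One failing scout should not discard the others' work.
                results[scout_id] = f"ERROR: {exc}"
    return results
```

Threads suit this workload because each scout spends most of its time waiting on HTTP responses from GitHub and the LLM API rather than on CPU work.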

synthesize

def synthesize(
    artifacts: dict[str, str],
    synthesis_prompt: str,
    loop_num: int = 1,
    output_dir: Path | None = None,
) -> tuple[str, int]:

Run Opus synthesis over artifacts. Returns (synthesis_text, confidence_score).


Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| ANTHROPIC_API_KEY | (none) | Required. Your Anthropic API key. |
| GITHUB_TOKEN | (none) | Strongly recommended. Raises GitHub rate limit from 60 to 5000 req/hr. |
| BENTO_MODEL | claude-sonnet-4-6 | Scout model override. |
| SYNTHESIS_MODEL | claude-opus-4-6 | Synthesis model override. |
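
With both a function argument and an environment variable able to override each model, precedence matters. The sketch below shows one plausible order: explicit argument, then environment variable, then built-in default. That order is an assumption, since the README does not state it, and `resolve_model` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_model(
    override: Optional[str],
    env_var: str,
    default: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a model name: explicit override, else env var, else default."""
    env = os.environ if environ is None else environ
    # Empty strings are treated as "not set" at every level.
    return override or env.get(env_var) or default
```

For example, `resolve_model(None, "BENTO_MODEL", "claude-sonnet-4-6")` returns the default unless BENTO_MODEL is set to a non-empty value in the environment.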

Privacy and data handling

Bentoloop is privacy-safe by design. There is nothing to comply with because no personal data is collected or transmitted.

| Claim | How it is enforced |
| --- | --- |
| Only public GitHub data is accessed | All fetchers call the public GitHub API. No private repo access. |
| No telemetry | Zero calls to any analytics endpoint. No usage tracking. |
| No central server | The package runs entirely on your machine. |
| API keys are never logged or stored | Keys are read from env vars and sent only to Anthropic/GitHub in HTTPS request headers. |
| No personal data processed | GitHub issue/PR data is public. We do not process user profile data, emails, or private content. |
| Output is local-only | All artifacts are written to your output_dir. Nothing is uploaded. |

GDPR / CCPA status: There is no personal data controller relationship. No data subject rights apply because no personal data is collected, stored, or processed. No consent mechanism is required. If your application uses bentoloop and does collect personal data (e.g. you pipe user-submitted queries through it), your application is the data controller and you should document that separately.


Contributing

See CONTRIBUTING.md.


License

MIT — see LICENSE.
