Parallel LLM scout + confidence-gated synthesis loop for GitHub-grounded research
bentoloop
pip install bentoloop
Bentoloop runs N Claude scouts in parallel against real GitHub data, synthesizes their findings with Claude Opus, checks a confidence score, and loops until the answer is good enough — or you hit your round limit.
from bentoloop import bento_loop

result = bento_loop(
    scouts=[
        {
            "id": "s1_bugs",
            "description": "Find the most-discussed open bugs in qdrant.",
            "fetch_spec": {
                "gh_search_issues": [
                    {"q": "repo:qdrant/qdrant is:issue is:open label:bug sort:comments-desc", "per_page": 8}
                ]
            },
        },
        {
            "id": "s2_changelog",
            "description": "Check qdrant CHANGELOG for recent stability fixes.",
            "fetch_spec": {
                "gh_file_contents": [
                    {"repo": "qdrant/qdrant", "path": "CHANGELOG.md", "ref": "master"}
                ]
            },
        },
    ],
    synthesis_prompt=(
        "Based on the scout reports, assess qdrant's current production stability. "
        "What are the top 3 open risks? "
        "End with: Confidence Score: N/100 and Loop Decision: PROCEED or LOOP AGAIN."
    ),
    threshold=85,
    output_dir="/tmp/bentoloop_run",
)

print(f"Confidence: {result['confidence']}/100 in {result['rounds']} round(s)")
print(result["synthesis"])
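The control flow described above (run scouts, synthesize, check the score, repeat) can be pictured with a small stand-alone sketch. This is not bentoloop's source: `gated_loop`, `fake_round`, and `fake_synthesize` are hypothetical names, and the stubs stand in for the real scout and synthesis calls so the example runs without network access.

```python
from typing import Callable


def gated_loop(
    run_round: Callable[[int], dict[str, str]],
    synthesize: Callable[[dict[str, str], int], tuple[str, int]],
    threshold: int = 95,
    max_rounds: int = 5,
) -> dict:
    """Repeat scout + synthesis rounds until confidence >= threshold or rounds run out."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    artifacts: dict[str, str] = {}
    synthesis, confidence, rounds = "", 0, 0
    for round_num in range(1, max_rounds + 1):
        rounds = round_num
        artifacts = run_round(round_num)  # one parallel scout pass
        synthesis, confidence = synthesize(artifacts, round_num)
        if confidence >= threshold:  # good enough, stop early
            break
    return {
        "confidence": confidence,
        "synthesis": synthesis,
        "rounds": rounds,
        "artifacts": artifacts,
    }


# Hypothetical stubs: confidence improves by 10 points per round.
def fake_round(round_num: int) -> dict[str, str]:
    return {"s1": f"findings from round {round_num}"}


def fake_synthesize(artifacts: dict[str, str], round_num: int) -> tuple[str, int]:
    return f"summary of {len(artifacts)} artifact(s)", 60 + 10 * round_num


converged = gated_loop(fake_round, fake_synthesize, threshold=85)
capped = gated_loop(fake_round, fake_synthesize, threshold=85, max_rounds=2)
```

With these stubs the scores are 70, 80, 90, so the first call stops after round 3, while the second call returns the round-2 result because the round limit is reached first.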
Why bentoloop?
| Problem | bentoloop solution |
|---|---|
| LLMs hallucinate PR numbers and file contents | Scouts fetch real GitHub data before the LLM sees it |
| Single-pass synthesis misses depth | Confidence-gated loop: re-runs until score threshold met |
| Research is sequential | Scouts run in parallel via ThreadPoolExecutor |
| Hard to reproduce findings | All artifacts written to disk — reproducible research tree |
Install
pip install bentoloop
Requires Python 3.10+.
Set environment variables:
export ANTHROPIC_API_KEY=your-anthropic-key
export GITHUB_TOKEN=your-github-token # optional but strongly recommended for rate limits
Quickstart
See examples/minimal.py for a runnable 20-line example.
Core concepts
Scout
A scout is a dict with three fields:
{
    "id": "s1",                         # unique id for logging + artifact naming
    "description": "What to research",  # injected into the LLM prompt
    "fetch_spec": { ... }               # what real GitHub data to fetch first
}
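Because scouts are plain dicts, it can help to check them before starting a run. The sketch below is an illustration only: `ScoutDict` and `check_scouts` are hypothetical names, and the field types are an assumption based on the three fields listed above, not bentoloop's own `Scout` definition.

```python
from typing import Any, TypedDict


class ScoutDict(TypedDict):
    """Assumed shape of a scout; bentoloop's own Scout type may differ."""
    id: str
    description: str
    fetch_spec: dict[str, list[dict[str, Any]]]


REQUIRED_FIELDS = ("id", "description", "fetch_spec")


def check_scouts(scouts: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means the scouts look fine."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for index, scout in enumerate(scouts):
        for field in REQUIRED_FIELDS:
            if field not in scout:
                problems.append(f"scout #{index}: missing field {field!r}")
        scout_id = scout.get("id")
        if scout_id is not None:
            # ids name the artifacts on disk, so duplicates would collide
            if scout_id in seen_ids:
                problems.append(f"scout #{index}: duplicate id {scout_id!r}")
            seen_ids.add(scout_id)
    return problems


good = [{"id": "s1", "description": "What to research", "fetch_spec": {}}]
bad = [
    {"id": "s1", "description": "first", "fetch_spec": {}},
    {"id": "s1", "description": "second"},
]
print(check_scouts(good))
print(check_scouts(bad))
```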
FetchSpec
Controls what GitHub data is fetched before the LLM sees anything:
fetch_spec = {
    # Search issues / PRs via GitHub Search API
    "gh_search_issues": [
        {"q": "repo:org/repo is:issue label:bug", "per_page": 10}
    ],
    # Read a file at a specific ref
    "gh_file_contents": [
        {"repo": "org/repo", "path": "CHANGELOG.md", "ref": "main"}
    ],
    # List issues
    "gh_list_issues": [
        {"repo": "org/repo", "state": "open"}
    ],
    # Fetch comments on a specific issue
    "gh_issue_comments": [
        {"repo": "org/repo", "issue_number": 42}
    ],
}
All keys are optional. Combine freely.
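To make the four keys more concrete, the sketch below maps each one to the GitHub REST endpoint it most plausibly corresponds to. The endpoints themselves are standard GitHub REST API routes, but the mapping is an assumption: bentoloop's fetchers may build their requests differently, and `fetch_spec_urls` is a hypothetical helper that only builds URLs and makes no network calls.

```python
from urllib.parse import quote, urlencode

API = "https://api.github.com"


def fetch_spec_urls(fetch_spec: dict) -> list[str]:
    """Build the GitHub REST URLs a fetch_spec would plausibly translate to."""
    urls: list[str] = []
    # GET /search/issues takes the whole entry as query parameters
    for params in fetch_spec.get("gh_search_issues", []):
        urls.append(f"{API}/search/issues?{urlencode(params)}")
    # GET /repos/{owner}/{repo}/contents/{path}, optionally pinned to a ref
    for params in fetch_spec.get("gh_file_contents", []):
        url = f"{API}/repos/{params['repo']}/contents/{quote(params['path'])}"
        urls.append(f"{url}?{urlencode({'ref': params['ref']})}" if "ref" in params else url)
    # GET /repos/{owner}/{repo}/issues, remaining keys become query parameters
    for params in fetch_spec.get("gh_list_issues", []):
        query = urlencode({k: v for k, v in params.items() if k != "repo"})
        url = f"{API}/repos/{params['repo']}/issues"
        urls.append(f"{url}?{query}" if query else url)
    # GET /repos/{owner}/{repo}/issues/{issue_number}/comments
    for params in fetch_spec.get("gh_issue_comments", []):
        urls.append(f"{API}/repos/{params['repo']}/issues/{params['issue_number']}/comments")
    return urls


example_urls = fetch_spec_urls(
    {
        "gh_file_contents": [{"repo": "org/repo", "path": "CHANGELOG.md", "ref": "main"}],
        "gh_list_issues": [{"repo": "org/repo", "state": "open"}],
        "gh_issue_comments": [{"repo": "org/repo", "issue_number": 42}],
    }
)
for url in example_urls:
    print(url)
```

Since every key is read with `.get(..., [])`, an empty spec simply yields no requests, which matches the "all keys are optional" rule.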
Confidence scoring
Your synthesis_prompt must instruct the model to output a confidence score:
End your response with exactly:
Confidence Score: N/100
Loop Decision: PROCEED or LOOP AGAIN
Bentoloop extracts the integer N and loops again if N < threshold.
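A minimal sketch of how such a score could be extracted and compared against the threshold is shown below. The regular expression, the choice to take the last match, and the fallback to 0 when no score is found are all assumptions made for illustration; `parse_confidence` and `should_loop` are hypothetical names, not bentoloop's internals.

```python
import re

# Matches e.g. "Confidence Score: 72/100" (case-insensitive, flexible spacing).
CONFIDENCE_RE = re.compile(r"Confidence Score:\s*(\d{1,3})\s*/\s*100", re.IGNORECASE)


def parse_confidence(text: str) -> int:
    """Return the last reported score, capped at 100, or 0 if none is present."""
    matches = CONFIDENCE_RE.findall(text)
    if not matches:
        return 0  # assumption: a missing score counts as no confidence
    return min(int(matches[-1]), 100)


def should_loop(text: str, threshold: int) -> bool:
    """True when the reported score is below the threshold."""
    return parse_confidence(text) < threshold


reply = "Top risks: ...\nConfidence Score: 72/100\nLoop Decision: LOOP AGAIN"
print(parse_confidence(reply), should_loop(reply, threshold=85))
```

Treating a missing score as 0 means a malformed reply triggers another round rather than ending the run silently, which is one reason the prompt should ask for the exact format.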
Output directory
When output_dir is set, bentoloop writes:
output_dir/
├── round1/
│   ├── s1/artifact.md
│   └── s2/artifact.md
├── round2/
│   └── ...
├── synthesis_loop1.md
└── synthesis_loop2.md
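When post-processing a run, the layout above can be addressed with a few path helpers. This is a sketch based only on the tree shown; `artifact_path`, `synthesis_path`, and `write_text` are hypothetical helpers, not part of bentoloop's API.

```python
import tempfile
from pathlib import Path


def artifact_path(output_dir: str, round_num: int, scout_id: str) -> Path:
    """Path of one scout's artifact for a given round."""
    return Path(output_dir) / f"round{round_num}" / scout_id / "artifact.md"


def synthesis_path(output_dir: str, loop_num: int) -> Path:
    """Path of the synthesis written after a given loop."""
    return Path(output_dir) / f"synthesis_loop{loop_num}.md"


def write_text(path: Path, text: str) -> Path:
    """Create parent directories as needed, then write the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    written = write_text(artifact_path(tmp, 1, "s1"), "# scout s1\n")
    relative = written.relative_to(tmp).as_posix()
    existed = written.exists()

print(relative)
```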
API Reference
bento_loop
def bento_loop(
    scouts: list[Scout],
    synthesis_prompt: str,
    threshold: int = 95,
    max_rounds: int = 5,
    context: str = "",
    output_dir: str | Path | None = None,
    scout_model: str | None = None,
    synthesis_model: str | None = None,
) -> BentoResult:
| Parameter | Type | Default | Description |
|---|---|---|---|
| scouts | list[Scout] | (required) | Scout definitions |
| synthesis_prompt | str | (required) | Opus synthesis prompt. Must request Confidence Score: N/100. |
| threshold | int | 95 | Stop looping when confidence ≥ threshold |
| max_rounds | int | 5 | Hard limit on iterations |
| context | str | "" | Background context injected into every scout |
| output_dir | str \| Path \| None | None | Write artifacts here; skip if None |
| scout_model | str \| None | None | Override scout model (env BENTO_MODEL) |
| synthesis_model | str \| None | None | Override synthesis model (env SYNTHESIS_MODEL) |
Returns a BentoResult TypedDict:
class BentoResult(TypedDict):
    confidence: int            # final confidence score (0–100)
    synthesis: str             # final synthesis text
    rounds: int                # number of rounds completed
    artifacts: dict[str, str]  # scout_id → artifact text (last round)
    output_dir: str            # resolved output directory path
run_scouts
def run_scouts(
    scouts: list[Scout],
    context: str = "",
    output_dir: Path | None = None,
    max_workers: int = 4,
) -> dict[str, str]:
Run one set of scouts in parallel. Returns scout_id → artifact.
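The fan-out pattern behind a call like this can be sketched with `concurrent.futures.ThreadPoolExecutor`. The code below is an illustration, not bentoloop's implementation: `run_in_parallel` and `fake_scout` are hypothetical, and recording a failure as a placeholder string instead of raising is an assumed design choice.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_in_parallel(
    scouts: list[dict],
    run_one: Callable[[dict], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run run_one(scout) for every scout in a thread pool; return scout_id -> artifact."""
    artifacts: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_one, scout): scout["id"] for scout in scouts}
        for future in as_completed(futures):
            scout_id = futures[future]
            try:
                artifacts[scout_id] = future.result()
            except Exception as exc:
                # Assumption: one failing scout should not abort the whole round.
                artifacts[scout_id] = f"[scout failed: {exc}]"
    return artifacts


# Hypothetical stand-in for "fetch GitHub data, then ask the model".
def fake_scout(scout: dict) -> str:
    if scout["id"] == "bad":
        raise RuntimeError("boom")
    return scout["description"].upper()


results = run_in_parallel(
    [{"id": "a", "description": "x"}, {"id": "bad", "description": "y"}],
    fake_scout,
)
print(results)
```

Threads suit this workload because each scout spends most of its time waiting on HTTP responses rather than computing.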
synthesize
def synthesize(
    artifacts: dict[str, str],
    synthesis_prompt: str,
    loop_num: int = 1,
    output_dir: Path | None = None,
) -> tuple[str, int]:
Run Opus synthesis over artifacts. Returns (synthesis_text, confidence_score).
Environment variables
| Variable | Default | Description |
|---|---|---|
| ANTHROPIC_API_KEY | (none) | Required. Your Anthropic API key. |
| GITHUB_TOKEN | (none) | Strongly recommended. Raises GitHub rate limit from 60 to 5000 req/hr. |
| BENTO_MODEL | claude-sonnet-4-6 | Scout model override. |
| SYNTHESIS_MODEL | claude-opus-4-6 | Synthesis model override. |
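One plausible way the model settings combine is: an explicit `scout_model` or `synthesis_model` argument wins, then the environment variable, then the default from the table. That precedence is an assumption, not something the package documents, and `resolve_model` is a hypothetical helper used only to illustrate it.

```python
import os
from typing import Optional

# Defaults taken from the environment variable table.
DEFAULT_SCOUT_MODEL = "claude-sonnet-4-6"
DEFAULT_SYNTHESIS_MODEL = "claude-opus-4-6"


def resolve_model(explicit: Optional[str], env_var: str, default: str) -> str:
    """Pick a model name: explicit argument, then environment variable, then default."""
    if explicit:
        return explicit
    return os.environ.get(env_var) or default


os.environ.pop("BENTO_MODEL", None)  # start from a clean environment for the demo
from_default = resolve_model(None, "BENTO_MODEL", DEFAULT_SCOUT_MODEL)

os.environ["BENTO_MODEL"] = "my-scout-model"  # placeholder value, not a real model id
from_env = resolve_model(None, "BENTO_MODEL", DEFAULT_SCOUT_MODEL)
from_arg = resolve_model("explicit-model", "BENTO_MODEL", DEFAULT_SCOUT_MODEL)

os.environ.pop("BENTO_MODEL", None)  # leave the environment as we found it
print(from_default, from_env, from_arg)
```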
Privacy and data handling
Bentoloop is privacy-safe by design. There is nothing to comply with because no personal data is collected or transmitted.
| Claim | How it is enforced |
|---|---|
| Only public GitHub data is accessed | All fetchers call the public GitHub API. No private repo access. |
| No telemetry | Zero calls to any analytics endpoint. No usage tracking. |
| No central server | The package runs entirely on your machine. |
| API keys never leave your machine | Keys are read from env vars, passed to Anthropic/GitHub in HTTPS request headers, and never logged or stored. |
| No personal data processed | GitHub issue/PR data is public. We do not process user profile data, emails, or private content. |
| Output is local-only | All artifacts are written to your output_dir. Apart from the API requests to Anthropic and GitHub, nothing is uploaded. |
GDPR / CCPA status: There is no personal data controller relationship. No data subject rights apply because no personal data is collected, stored, or processed. No consent mechanism is required. If your application uses bentoloop and does collect personal data (e.g. you pipe user-submitted queries through it), your application is the data controller and you should document that separately.
Contributing
See CONTRIBUTING.md.
License
MIT — see LICENSE.