Multi-agent research system powered by Blackgeorge
Shandu - Multi-Agent AI Research CLI and GUI
Shandu is a Python deep-research agent that plans research loops, searches the web, scrapes webpages and documents, extracts evidence, and writes citation-backed reports. It is powered by Blackgeorge and works from both a terminal CLI and a Gradio control room.
- Architecture deep dive: ARCH.md
- Example long-form output: see the examples directory.
- DeepSeek Flash example: examples/deepseek-flash.md
Architecture
- Lead orchestrator plans iterative research loops.
- Parallel search subagents retrieve and extract web evidence.
- Citation subagent builds the final reference ledger.
- SQLite-backed memory tracks run context across steps.
- Rich CLI control deck renders run metrics and timeline.
- Gradio GUI control room provides live telemetry, task views, and report download.
- Scraper pipeline normalizes URLs, strips boilerplate HTML, and favors main-content blocks.
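The README does not spell out how the scraper normalizes URLs. As a rough illustration only, a normalizer might lowercase the scheme and host, drop default ports and fragments, and strip common tracking parameters. The function name `normalize_url` and the tracking-parameter list below are assumptions, not Shandu's actual implementation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameters to drop (an assumption for this sketch).
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of url so equivalent links compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it differs from the scheme default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    # Drop tracking parameters and sort the rest for a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    )
    path = parts.path or "/"
    # The fragment is discarded because it never reaches the server.
    return urlunsplit((scheme, host, path, urlencode(query), ""))
```

With a canonical form like this, two links that differ only in letter case, tracking parameters, or fragment map to the same key, which is what makes deduplication by URL reliable.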
Installation
Recommended for end users (no manual venv management):
pipx install shandu
Standard pip install:
pip install shandu
Install latest from GitHub:
pipx install "git+https://github.com/jolovicdev/shandu.git@main"
Quick Start
uv sync --dev
source .venv/bin/activate
cp .env.example .env
# edit .env with your provider/model settings
API Key Configuration (LiteLLM Style)
shandu configure now asks for:
- Default model (example: deepseek/deepseek-v4-flash, openrouter/minimax/minimax-m2.5)
- API key env var name (example: DEEPSEEK_API_KEY, OPENROUTER_API_KEY, ANYSUPPORTED_API_KEY)
- API key value (hidden input)
Shandu saves these in user config storage and exports the configured env var at runtime for LiteLLM if it is not already set in your shell.
Examples:
# DeepSeek
shandu configure
# model: deepseek/deepseek-v4-flash
# env var name: DEEPSEEK_API_KEY
# key value: <your key>
# OpenRouter
shandu configure
# model: openrouter/minimax/minimax-m2.5
# env var name: OPENROUTER_API_KEY
# key value: <your key>
You can still configure keys through shell env vars alone if you prefer:
export OPENROUTER_API_KEY="your_real_key"
Environment Variables (Without shandu configure)
If you prefer not to use interactive configuration, set env vars directly.
Provider/model:
- SHANDU_MODEL (primary model selector, example deepseek/deepseek-v4-flash)
- OPENAI_MODEL_NAME (compatibility fallback if SHANDU_MODEL is not set)
Provider API key routing:
- SHANDU_API_KEY_ENV (name of the provider key env var, example OPENROUTER_API_KEY)
- SHANDU_API_KEY (actual key value; at runtime Shandu exports it into the variable named by SHANDU_API_KEY_ENV if that variable is missing)
Direct LiteLLM-style provider key env vars (examples):
- DEEPSEEK_API_KEY
- OPENROUTER_API_KEY
- ANTHROPIC_API_KEY
- OPENAI_API_KEY
- Any other provider key name LiteLLM supports, for example ANYSUPPORTED_API_KEY
Generation/runtime controls:
- SHANDU_TEMPERATURE (default 0.2)
- SHANDU_MAX_TOKENS (default 16384)
- SHANDU_STORAGE_DIR (default .blackgeorge)
- SHANDU_PROXY (optional proxy for scraping)
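As a rough sketch of how these variables and their documented defaults could be read, the snippet below builds a settings object from an environment mapping. The `RuntimeSettings` class and `load_settings` function are hypothetical names used for illustration, not part of Shandu's API; only the variable names and default values come from the lists above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeSettings:  # hypothetical container, not a Shandu class
    model: Optional[str]
    temperature: float
    max_tokens: int
    storage_dir: str
    proxy: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> RuntimeSettings:
    """Read the documented variables, falling back to the documented defaults."""
    return RuntimeSettings(
        # SHANDU_MODEL wins; OPENAI_MODEL_NAME is the compatibility fallback.
        model=env.get("SHANDU_MODEL") or env.get("OPENAI_MODEL_NAME"),
        temperature=float(env.get("SHANDU_TEMPERATURE", "0.2")),
        max_tokens=int(env.get("SHANDU_MAX_TOKENS", "16384")),
        storage_dir=env.get("SHANDU_STORAGE_DIR", ".blackgeorge"),
        # An empty string is treated as "no proxy".
        proxy=env.get("SHANDU_PROXY") or None,
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check which values a given shell setup would produce.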
Precedence:
- If your provider key env var (for example OPENROUTER_API_KEY) is already set in your shell, Shandu uses it.
- Otherwise, Shandu uses SHANDU_API_KEY_ENV + SHANDU_API_KEY from config/env.
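The precedence rule can be sketched as a small function. This is a minimal illustration under stated assumptions: `export_provider_key` is a hypothetical helper, and it works on any mutable mapping rather than on the real process environment, so the behavior is easy to inspect.

```python
from typing import MutableMapping, Optional


def export_provider_key(
    env: MutableMapping[str, str],
    key_env_name: Optional[str],
    key_value: Optional[str],
) -> Optional[str]:
    """Apply the precedence rule and return where the key came from.

    key_env_name and key_value stand in for SHANDU_API_KEY_ENV and
    SHANDU_API_KEY as loaded from config or the environment.
    """
    if not key_env_name:
        return None  # nothing configured; the environment is left untouched
    if env.get(key_env_name):
        return "shell"  # an existing provider key is never overwritten
    if key_value:
        env[key_env_name] = key_value  # export only when missing
        return "config"
    return None
```

Passing `os.environ` as `env` would apply the same rule to the real process environment before any model call is made.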
CLI
shandu run "Who is the current president of the United States?" \
  --max-iterations 1 \
  --parallelism 2 \
  --max-results-per-query 2 \
  --max-pages-per-task 2 \
  --output report.md
--parallelism controls the maximum number of subagent tasks that execute concurrently inside each iteration. If set to 2, the lead planner creates at least two independent tasks when possible, and the orchestrator runs up to two tasks at the same time.
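To illustrate what a concurrency cap of this kind means, the sketch below uses `asyncio.Semaphore` so that at most `parallelism` tasks run at once. It is a generic pattern, not Shandu's orchestrator; `run_tasks`, `demo`, and `fake_task` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_tasks(
    tasks: List[Callable[[], Awaitable[str]]], parallelism: int
) -> List[str]:
    """Run every task, never more than `parallelism` at the same time."""
    gate = asyncio.Semaphore(parallelism)

    async def guarded(task: Callable[[], Awaitable[str]]) -> str:
        async with gate:  # waits here while `parallelism` tasks are active
            return await task()

    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*(guarded(t) for t in tasks))


def demo(parallelism: int = 2, count: int = 5) -> int:
    """Return the peak number of simultaneously running fake tasks."""
    active = 0
    peak = 0

    async def fake_task() -> str:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for a search subagent call
        active -= 1
        return "done"

    asyncio.run(run_tasks([fake_task] * count, parallelism))
    return peak
```

With five fake tasks and a cap of two, the observed peak is two: the remaining tasks wait at the semaphore until a slot frees up.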
During shandu run, progress events stream live in the terminal:
- BOOTSTRAP/PLAN/SEARCH/SYNTHESIZE/CITE/REPORT/COMPLETE
- Per-task search events (Task <id> started and Task <id> completed) with metrics
- Iteration index and task IDs for long-running model calls
- Run summary includes model call count across lead/subagents/citation
- Metered calls/tokens/cost appear when provider exposes billing/usage metrics
shandu aisearch "latest state of open-source browser automation in 2026" \
  --max-results 8 \
  --max-pages 3 \
  --detail-level high \
  --output aisearch.md
aisearch keeps the classic behavior: a web search followed by a synthesized explanation with source citations.
Citation behavior:
- Final reports enforce numeric citation markers ([1], [2], ...).
- Raw internal evidence IDs are removed from the rendered markdown.
- The final ## References section is rendered from the citation ledger to keep numbering stable.
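One way to picture this step: internal evidence IDs are swapped for numeric markers in order of first appearance, and the reference list is generated from the same mapping so the two cannot drift apart. The `[ev:...]` ID format, the ledger shape, and the `render_citations` function below are all assumptions made for illustration; the README does not describe Shandu's internal formats.

```python
import re
from typing import Dict, List

# Hypothetical internal marker format such as [ev:a1b2].
EVIDENCE_ID = re.compile(r"\[ev:([A-Za-z0-9_-]+)\]")


def render_citations(body: str, ledger: Dict[str, str]) -> str:
    """Replace evidence IDs with [n] markers and append a References section.

    ledger maps an evidence ID to its reference text (title, URL, ...).
    """
    order: List[str] = []  # evidence IDs in order of first appearance

    def to_number(match: "re.Match[str]") -> str:
        evidence_id = match.group(1)
        if evidence_id not in ledger:
            return ""  # unknown IDs are dropped rather than leaked
        if evidence_id not in order:
            order.append(evidence_id)
        return f"[{order.index(evidence_id) + 1}]"

    text = EVIDENCE_ID.sub(to_number, body)
    references = [f"[{n}] {ledger[eid]}" for n, eid in enumerate(order, start=1)]
    return text.rstrip() + "\n\n## References\n" + "\n".join(references) + "\n"
```

Because the markers and the reference list are both derived from `order`, a source cited several times keeps a single number throughout the report.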
Other commands:
- shandu info
- shandu configure
- shandu gui
- shandu aisearch <query>
- shandu inspect <run_id>
- shandu clean
GUI
Launch the visual control room:
shandu gui --host 127.0.0.1 --port 7860
gradio ships with the default Shandu install, so shandu gui works out of the box.
GUI features:
- live run stage timeline (BOOTSTRAP through COMPLETE)
- per-subagent task board (status, focus, last query, evidence)
- search/scrape trace stream (query start/finish, hit counts, URLs scraped, extraction/fallback signals)
- final report + citation ledger panels
- one-click markdown download button after run completion
- run cost display (usd_spent) when provider exposes cost metrics
- runtime configuration editing (model, provider env var name, key, iteration/parallelism/search limits)
GUI Preview
Screenshots: Main Screen, Tables View, Report View.
Python API
from shandu import ResearchRequest, ShanduEngine

engine = ShanduEngine.from_config()
result = engine.run_sync(
    ResearchRequest(
        query="AI inference infrastructure 2026",
        max_iterations=2,
        parallelism=3,
    )
)
print(result.report_markdown)
Development
uv run ruff check .
uv run pytest -q
Scraper Notes
- Three-layer HTML extraction: trafilatura → readability-lxml → BS4.
- Document-format support: PDF, DOCX, XLSX, CSV, plaintext, markdown.
- Structured blocks preserve headings, tables, code, blockquotes, and list items.
- Per-domain rate limiting with exponential backoff; 3 retry attempts with jitter.
- Fetch-error detection: paywall, captcha, empty JS shell, blocked, login-required.
- Publication-date extraction from OpenGraph, JSON-LD, DC, prism, sailthru, parsely meta tags.
- In-flight deduplication prevents concurrent duplicate fetches of the same URL.
- Redirect-aware: pages tracked by requested URL so redirects don't cause false misses.
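Two of the notes above, retries with exponential backoff and jitter and in-flight deduplication, follow well-known patterns that can be sketched briefly. The code below is a generic illustration, not Shandu's scraper: `backoff_delay`, `InFlightFetcher`, the base delay of 0.5 seconds, and the choice to retry on `OSError` are assumptions; only the three-attempt limit comes from the notes.

```python
import asyncio
import random
from typing import Awaitable, Callable, Dict

MAX_ATTEMPTS = 3  # from the notes above
BASE_DELAY = 0.5  # seconds; an assumed value


def backoff_delay(attempt: int, base: float = BASE_DELAY) -> float:
    """Delay before the next try: base * 2**attempt plus up to 50% jitter."""
    exponential = base * (2 ** attempt)
    return exponential + random.uniform(0, exponential / 2)


class InFlightFetcher:
    """Share one pending fetch among concurrent callers asking for the same URL."""

    def __init__(self, fetch: Callable[[str], Awaitable[str]]) -> None:
        self._fetch = fetch
        self._pending: Dict[str, "asyncio.Task[str]"] = {}

    async def get(self, url: str) -> str:
        task = self._pending.get(url)
        if task is None:
            task = asyncio.create_task(self._with_retries(url))
            self._pending[url] = task
            # Forget the entry once the fetch settles so later calls refetch.
            task.add_done_callback(lambda _t: self._pending.pop(url, None))
        return await task

    async def _with_retries(self, url: str) -> str:
        for attempt in range(MAX_ATTEMPTS):
            try:
                return await self._fetch(url)
            except OSError:
                if attempt == MAX_ATTEMPTS - 1:
                    raise  # out of attempts, surface the error
                await asyncio.sleep(backoff_delay(attempt))
        raise RuntimeError("unreachable")
```

The jitter spreads retries out so that several failed requests to the same domain do not all fire again at the same instant, and the pending-task map means a URL requested by two subagents at once is fetched only once.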
Upcoming: stronger source-quality enforcement will flag weak, undated, or advocacy sources (blog posts, LinkedIn, etc.) so the synthesizer can distinguish strong primary evidence from low-signal pages.
MIT license.