Multi-Agent Literature Review Pipeline

A multi-agent literature review pipeline built with the Google ADK (Agent Development Kit). It coordinates specialized agents to iteratively research sub-topics based on a YAML configuration. For each topic, it searches from academic and practitioner perspectives, evaluates the work through a peer review ensemble, and synthesizes a well-grounded report.

Built as a capstone submission for the Kaggle "AI Agents: Intensive Vibe Coding Capstone".

Motivation

Literature reviews are suffocating: hundreds of papers, conflicting claims, and no obvious signal in the noise. Andrej Karpathy's LLM Council showed that multi-agent debate surfaces sharper answers than a single prompt. This project takes that insight into the research domain.

LLMs researching complex topics suffer from two problems: a lack of diverse grounding and self-preference bias (favoring their own outputs).

This pipeline addresses both with four mechanisms:

  1. Tiered Orchestration: A Planner agent splits configured research topics into a multi-wave execution graph. Foundational concepts (Wave 1) run in parallel, and synthesis-dependent topics (Wave 2) run sequentially with distilled context from Wave 1.
  2. Source Isolation: Two independent tracks per topic, each with its own explorer (search) and reporter (write) agent. The academic track searches ArXiv, OpenAlex, and scholarly publishers. The practitioner track searches GitHub and engineering docs.
  3. Peer Review Ensemble: Three reviewers (Researcher, Engineer, Architect) evaluate anonymized reports. Borda-count voting aggregates rankings so no single reviewer dominates.
  4. Anti-Hallucination Guardrails: The Synthesis agent's output is parsed and validated. Dangling citations like (Author, Year) or [1] are rejected. Every URL in the final report must exist in the original source references, or the run is retried (up to 2 times). A blog-tier ratio check warns when over 50% of sources are blog/forum tier.
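The Borda-count aggregation in item 3 can be sketched as follows. This is a minimal illustration, not the project's code: the anonymized labels ("Report A", "Report B"), the function names, and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import Counter


def borda_tally(rankings: list[list[str]]) -> dict[str, int]:
    """Sum Borda points: with n candidates, rank 1 earns n-1 points, last earns 0."""
    scores: Counter = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    return dict(scores)


def pick_winner(scores: dict[str, int]) -> str:
    """Return the top-scoring label; ties go to the alphabetically first (assumption)."""
    return max(sorted(scores), key=scores.get)


# One ranking per reviewer (Researcher, Engineer, Architect), best report first.
rankings = [
    ["Report A", "Report B"],
    ["Report B", "Report A"],
    ["Report A", "Report B"],
]
scores = borda_tally(rankings)
winner = pick_winner(scores)
```

Because every reviewer contributes the same fixed number of points, one strongly opinionated reviewer cannot outweigh the other two.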

Architecture

The pipeline is organized into four stages, with topics executed across two waves to balance parallelism and sequential dependency.

Why Two Waves?

Not all research topics are independent. Some topics (e.g., foundational concepts like "truth maintenance systems") can be researched in parallel, while others (e.g., "multi-agent coordination using TMS") depend on the synthesized understanding of earlier topics.

The Planner agent reads topics.yaml and partitions topics into:

  • Wave 1 — parallel, independent topics. All topics in this wave run simultaneously through the full Stage 1→2→3 pipeline.
  • Wave 2 — sequential, dependent topics. These topics require the distilled context from Wave 1 before they can be researched accurately.
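The Planner is an LLM agent, so how it decides which wave a topic belongs to is not specified here. As a deterministic stand-in, the sketch below assumes each topic carries a hypothetical `depends_on` list (this field does not appear in the topics.yaml example) and splits topics on whether that list is empty.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    slug: str
    # Hypothetical field: slugs of topics whose findings this topic needs.
    depends_on: list[str] = field(default_factory=list)


def partition_into_waves(topics: list[Topic]) -> tuple[list[Topic], list[Topic]]:
    """Split topics into Wave 1 (no dependencies) and Wave 2 (has dependencies)."""
    known = {t.slug for t in topics}
    for topic in topics:
        missing = set(topic.depends_on) - known
        if missing:
            raise ValueError(f"{topic.slug} depends on unknown topics: {sorted(missing)}")
    wave1 = [t for t in topics if not t.depends_on]
    wave2 = [t for t in topics if t.depends_on]
    return wave1, wave2


topics = [
    Topic("truth-maintenance-systems"),
    Topic("multi-agent-coordination", depends_on=["truth-maintenance-systems"]),
]
wave1, wave2 = partition_into_waves(topics)
```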

Wave Handoff via the Distiller

After Wave 1 completes, the Distiller agent consumes the Wave 1 topic files and produces a compact summary of the foundational findings. This distilled context is injected into every Wave 2 topic's prompt as additional background, ensuring Wave 2 explorers and reporters build on top of verified Wave 1 conclusions rather than starting from scratch.

This prevents redundant searches and improves coherence across the final OKF bundle.
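The handoff can be pictured as three steps: run Wave 1 concurrently, distill the results, then run Wave 2 one topic at a time with the distilled text as background. The sketch below uses plain asyncio with stand-in functions; `research_topic`, `distill`, and `run_waves` are hypothetical names, and a real Distiller would summarize with a model rather than join strings.

```python
import asyncio


async def research_topic(slug: str, background: str = "") -> str:
    """Stand-in for the full Stage 1→2→3 run of a single topic."""
    await asyncio.sleep(0)  # placeholder for real agent calls
    prefix = f"[with background: {background}] " if background else ""
    return f"{prefix}findings for {slug}"


def distill(wave1_results: dict[str, str]) -> str:
    """Stand-in for the Distiller: joins Wave 1 findings into one compact string."""
    return "; ".join(f"{slug}: {text}" for slug, text in sorted(wave1_results.items()))


async def run_waves(wave1: list[str], wave2: list[str]) -> dict[str, str]:
    # Wave 1: independent topics run concurrently.
    wave1_texts = await asyncio.gather(*(research_topic(slug) for slug in wave1))
    results = dict(zip(wave1, wave1_texts))
    # Handoff: distill Wave 1 before any Wave 2 topic starts.
    context = distill(results)
    # Wave 2: dependent topics run one after another with the distilled context.
    for slug in wave2:
        results[slug] = await research_topic(slug, background=context)
    return results


results = asyncio.run(
    run_waves(["truth-maintenance-systems"], ["multi-agent-coordination"])
)
```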

Stage Breakdown

Stage 0 (Orchestration)
├── Planner agent organizes YAML topics into Wave 1 (parallel) and Wave 2 (sequential)
└── Distiller agent summarizes completed Wave 1 topics to provide prior context to Wave 2

Stage 1 (Parallel Fan-out per Topic)
├── Academic Track (SequentialAgent)
│   ├── academic_explorer  → searches ArXiv, OpenAlex, Tavily (scholarly domains)
│   └── academic_reporter  → writes Researcher report with structured references
└── Practitioner Track (SequentialAgent)
    ├── practitioner_explorer → searches GitHub, Tavily (engineering domains)
    └── practitioner_reporter → writes Engineer report with structured references

Stage 2 (Peer Review Ensemble per Topic)
├── researcher_reviewer  → ranks anonymized reports (Researcher perspective)
├── engineer_reviewer    → ranks anonymized reports (Engineer perspective)
└── architect_reviewer   → ranks anonymized reports (Architect perspective)
    → Borda-count tally → winning report selected

Stage 3 (Synthesis & Persistence)
├── synthesis agent → condensed final brief with YAML frontmatter
│   → citation validation loop (rejects hallucinated/dangling URLs, retries up to 2x)
└── Writes out to an interconnected Markdown OKF bundle (index.md and topic files)
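The citation validation loop in Stage 3 can be sketched as a validator plus a bounded retry. The regular expressions, function names, and the `generate` callback signature below are assumptions for illustration; the project's actual parser may differ.

```python
import re

# Hypothetical patterns for citations that point at nothing verifiable.
AUTHOR_YEAR = re.compile(r"\([A-Z][^()\d]*,?\s*\d{4}[a-z]?\)")  # e.g. (Doyle, 1979)
NUMERIC_REF = re.compile(r"\[\d+\](?!\()")  # e.g. [1], but not a link like [1](url)
URL = re.compile(r"https?://[^\s)\]>]+")


def validate_report(report: str, source_urls: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the report passes."""
    problems = []
    for pattern, label in ((AUTHOR_YEAR, "author-year"), (NUMERIC_REF, "numeric")):
        for match in pattern.finditer(report):
            problems.append(f"dangling {label} citation: {match.group(0)}")
    for raw in URL.findall(report):
        url = raw.rstrip(".,;:")  # drop sentence punctuation glued to the URL
        if url not in source_urls:
            problems.append(f"unknown URL: {url}")
    return problems


def synthesize_with_retries(generate, source_urls: set[str], max_retries: int = 2) -> str:
    """Call generate(attempt, previous_problems) until the report validates."""
    problems: list[str] = []
    for attempt in range(1 + max_retries):
        report = generate(attempt, problems)
        problems = validate_report(report, source_urls)
        if not problems:
            return report
    raise ValueError(f"report still invalid after {max_retries} retries: {problems}")
```

Passing the previous problems back into `generate` lets a retry prompt name the exact citations that were rejected, which is one plausible way to make retries converge.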

Search Providers

| Provider | Domains | Used By |
|---|---|---|
| ArXiv API | arxiv.org | Academic explorer |
| OpenAlex API | openalex.org | Academic explorer |
| Tavily (scholarly) | acm.org, ieee.org, springer.com, sciencedirect.com, nature.com, science.org, wiley.com | Academic explorer |
| GitHub API | github.com | Practitioner explorer |
| Tavily (engineering) | github.com, docs.microsoft.com, aws.amazon.com, cloud.google.com, medium.com, dev.to | Practitioner explorer |

All providers use tenacity retry with exponential backoff for 429/5xx errors.
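To show what exponential backoff on 429/5xx means in practice, here is a dependency-free sketch of the same idea. It deliberately does not use tenacity's API; `HttpStatusError`, `call_with_backoff`, and the default delays are hypothetical names and values chosen for the example.

```python
import time


class HttpStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    """Rate limiting (429) and server errors (5xx) are worth retrying."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(fetch, max_attempts: int = 4, base_delay: float = 1.0,
                      sleep=time.sleep):
    """Call fetch(), doubling the wait after each retryable failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except HttpStatusError as err:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(err.status):
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Client errors such as 404 are re-raised immediately, since repeating the same request would not change the outcome.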

Source Tiers

Every reference is classified into one of four tiers:

  • peer_reviewed: ACM/IEEE papers, conference proceedings, and ArXiv preprints (grouped into this tier although preprints are not necessarily peer reviewed)
  • established_project: GitHub repos with meaningful adoption (stars, active maintenance)
  • vendor_doc: Official documentation from a company/project
  • blog_or_forum: Medium, personal blogs, Stack Overflow, Reddit

The synthesis step warns when more than half of cited sources are blog_or_forum tier.

Setup

Prerequisites

  • uv for Python dependency management
  • An OpenRouter API key (models route through OpenRouter)

Installation

  1. Clone the repository.
  2. Copy .env.example to .env and fill in your keys:
    cp .env.example .env
    
  3. Edit .env:
    OPENROUTER_API_KEY=your_key
    ENG_MODEL=openrouter/qwen/qwen3.5-flash-02-23
    RESEARCH_MODEL=openrouter/google/gemma-4-26b-a4b-it
    JUDGE_MODEL=openrouter/deepseek/deepseek-v4-flash
    GITHUB_TOKEN=            # optional, raises rate limits
    TAVILY_API_KEY=          # optional, enables Tavily search
    OPENALEX_API_KEY=         # optional, raises rate limits
    MAX_SOURCES=5            # number of sources per agent
    

Usage

The pipeline can be executed either programmatically as an MCP tool by an AI agent, or manually via the CLI.

1. Using the MCP Server (For Agents)

If you are an agent connected to this workspace, you can use the built-in MCP server to conduct a literature review without writing any configuration files.

  1. Start the MCP server:
    uv run python src/mcp_server.py
    
  2. The server exposes the conduct_literature_review tool. Formulate the topics and pass them as a JSON array along with the overarching research question directly to the tool.
  3. See the agent skill documentation at skills/lit-review-council/SKILL.md for full instructions on framing the research question and executing the pipeline.

2. Using the CLI (Manual)

  1. Define your research topics in a topics.yaml file:

    topics:
      - slug: "truth-maintenance-systems"
        description: "Core logic and caching in truth maintenance systems."
        search_keywords: ["JTMS", "ATMS", "truth maintenance"]
    
  2. Run the orchestrator pipeline:

    uv run python main.py --config topics.yaml --output okf_output --question "Overarching Research Question"
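Before running the pipeline, it can help to sanity-check the topics file. The sketch below validates an already-parsed config (a plain dict mirroring the YAML above) and assumes that all three keys shown in the example are required; the function name and that requirement are assumptions, not documented behavior.

```python
# Expected type for each key in a topic entry (assumed from the example above).
REQUIRED_FIELDS = {"slug": str, "description": str, "search_keywords": list}


def validate_topics(config: dict) -> list[dict]:
    """Check the parsed topics config and return the topic list unchanged."""
    topics = config.get("topics")
    if not isinstance(topics, list) or not topics:
        raise ValueError("'topics' must be a non-empty list")
    seen_slugs = set()
    for index, topic in enumerate(topics):
        if not isinstance(topic, dict):
            raise ValueError(f"topics[{index}] must be a mapping")
        for key, expected in REQUIRED_FIELDS.items():
            if not isinstance(topic.get(key), expected):
                raise ValueError(f"topics[{index}].{key} must be a {expected.__name__}")
        if topic["slug"] in seen_slugs:
            raise ValueError(f"duplicate slug: {topic['slug']}")
        seen_slugs.add(topic["slug"])
    return topics


config = {
    "topics": [
        {
            "slug": "truth-maintenance-systems",
            "description": "Core logic and caching in truth maintenance systems.",
            "search_keywords": ["JTMS", "ATMS", "truth maintenance"],
        }
    ]
}
topics = validate_topics(config)
```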
    

Output

The pipeline runs all stages for each topic, executing them in waves where possible. On completion, it generates an interconnected Markdown bundle (OKF format) in the specified output directory, including an index.md linking to each specific topic file.
