Adversarial multi-agent framework for paper derivation and annotation
article-learning
Adversarial multi-agent framework for automatic paper derivation and annotation. Two agent groups argue over every claim; whatever survives becomes a structured annotation.
Feed a paper (Markdown or PDF), get back structured, confidence-graded annotations — each one stress-tested by adversarial agents before it ships.
Features
- Adversarial verification: four challenger types (logic, assumption, counterexample, citation) stress-test every proposition before it's accepted
- Streaming annotations: results are written as they're produced; no need to wait for the full run to finish
- Structured output: every annotation is a Pydantic model with confidence level, derivation, citations, and full challenge history
- PDF & Markdown input: feed a .pdf (via marker-pdf) or .md file
- Pluggable LLM backend: works with any OpenAI-compatible API (OpenAI, DeepSeek, local models via vLLM/Ollama)
- Dependency DAG: propositions are topologically sorted; circular dependencies are detected and flagged for joint verification
- Symbol table: tracks notation across sections so the same glyph isn't silently overloaded
- Fully testable: DeterministicMockLLM and ScriptedMockLLM let you run the entire pipeline without API keys
Architecture
              +-----------------------+
              |      Blackboard       |  <- single source of truth
              | (state machine, DAG,  |
              |  symbol table, log)   |
              +-----------------------+
                      ^       ^
                      |       |
          +-----------+       +-----------+
          |                               |
+-------------------+         +----------------------+
|      Group A      |         |       Group B        |
| MainAgent (DAG)   |         | LogicChallenger      |
| SubAgent (block)  |         | AssumptionChallenger |
+-------------------+         | CounterexampleConst. |
                              | CitationChecker      |
                              +----------------------+
                          |
                          v
                 streaming Annotator
               (JSON now / MCP later)
Group A
- MainAgent reads every semantic block, extracts propositions, builds a dependency DAG, maintains the global symbol table, and decides which proposition is next via topological order. Cycles (mutually-referential lemmas) are flagged for joint verification.
- SubAgent owns one proposition at a time. It produces a derivation grounded in the source block and answers Group B's questions.
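The ordering step described for MainAgent can be illustrated with the standard library alone. This is a minimal sketch, not the package's implementation: `plan_order` is a hypothetical helper, and it assumes the dependency information is available as a plain mapping from each proposition id to the ids it depends on. When a cycle exists it simply reports the members instead of ordering anything, which is a simplification of the flag-for-joint-verification behaviour described above.

```python
from graphlib import CycleError, TopologicalSorter


def plan_order(depends_on: dict[str, list[str]]) -> tuple[list[str], list[str]]:
    """Return (ordered_ids, cycle_ids) for a proposition dependency map.

    `depends_on` maps each proposition id to the ids it relies on.
    Hypothetical helper for illustration only.
    """
    try:
        # static_order() yields every node after all of its predecessors.
        return list(TopologicalSorter(depends_on).static_order()), []
    except CycleError as exc:
        # args[1] lists the nodes on the cycle; the first and last entries
        # are the same node, so de-duplicate while keeping order.
        return [], list(dict.fromkeys(exc.args[1]))


order, no_cycle = plan_order({"P0": [], "P1": ["P0"], "P2": ["P0", "P1"]})
print(order)

_, cycle = plan_order({"L1": ["L2"], "L2": ["L1"]})
print(sorted(cycle))
```

The two lemmas in the second call depend on each other, so no linear order exists and both ids come back as the cycle.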
Group B (structured, not random)
| Challenger | Mission |
|---|---|
| LogicChallenger | Hunt for unjustified leaps in the derivation |
| AssumptionChallenger | Question whether the stated premises actually hold |
| CounterexampleConstr. | Try to construct a concrete counterexample |
| CitationChecker | Verify quoted block text really supports the claim |
The orchestrator rotates through these every round, so pressure is diversified.
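One simple way to picture that rotation is a round-robin over the four challenger tags. The sketch below is an assumption about the scheduling, not the orchestrator's actual code: `challenger_for_round` is a hypothetical name, and the order of the tags is chosen here only for illustration.

```python
# Tag names as used when registering mock handlers; the order is assumed.
CHALLENGER_TAGS = ("logic", "assumption", "counterexample", "citation")


def challenger_for_round(round_index: int) -> str:
    """Pick the challenger for a zero-based round by cycling through the tags."""
    return CHALLENGER_TAGS[round_index % len(CHALLENGER_TAGS)]


print([challenger_for_round(i) for i in range(6)])
```

With four challengers and a small round limit, a proposition that runs the full number of rounds will tend to meet each challenger type once.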
State machine
PENDING -> IN_PROGRESS -> UNDER_CHALLENGE -+-> CONFIRMED
                                           +-> REFUTED
                                           +-> DOUBTFUL
                                           +-> ESCALATED
- consecutive_unbroken_challenges >= soft_pass_streak -> CONFIRMED
- consecutive_unanswered >= doubt_streak -> DOUBTFUL
- rounds_completed >= max_rounds without a streak -> ESCALATED
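The three exit rules can be read as a small decision function. The following is a sketch under stated assumptions rather than the package's state machine: the `Status` enum and `next_status` are hypothetical names, the order in which the rules are checked is a guess, and REFUTED is left out because the rule that triggers it is not spelled out here. The default thresholds mirror the documented configuration defaults.

```python
from enum import Enum


class Status(str, Enum):
    """Subset of proposition states needed for this illustration."""
    UNDER_CHALLENGE = "under_challenge"
    CONFIRMED = "confirmed"
    DOUBTFUL = "doubtful"
    ESCALATED = "escalated"


def next_status(
    clean_streak: int,
    unanswered_streak: int,
    rounds_completed: int,
    *,
    soft_pass_streak: int = 2,
    doubt_streak: int = 2,
    max_rounds: int = 4,
) -> Status:
    """Apply the exit rules; stay UNDER_CHALLENGE when none of them fires."""
    if clean_streak >= soft_pass_streak:
        return Status.CONFIRMED
    if unanswered_streak >= doubt_streak:
        return Status.DOUBTFUL
    if rounds_completed >= max_rounds:
        return Status.ESCALATED
    return Status.UNDER_CHALLENGE


print(next_status(clean_streak=2, unanswered_streak=0, rounds_completed=2).name)
print(next_status(clean_streak=1, unanswered_streak=0, rounds_completed=4).name)
```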
Confidence grades
| Level | Meaning |
|---|---|
| STRONG | Multiple challenger types passed cleanly |
| WEAK | Confirmed but with a short streak / few challenger types |
| DOUBTFUL | A group failed to respond, or escalation could not decide |
| REFUTED | A counterexample / fatal hole was found |
Streaming annotation
Orchestrator.run(...) accepts any number of Annotator sinks. Each
proposition that exits the adversarial loop is written immediately, so
you can run tail -f on the JSONL file while the workflow is still running.
A future MCP/PDF annotator will plug into the same protocol; nothing in the core needs to change.
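Because the JSONL file grows while a run is in progress, a reader should tolerate a half-written last line. The sketch below shows one way to consume such a stream; `parse_annotation_lines` is a hypothetical helper, and the sample records (including the lowercase confidence values) are made up for illustration, since the exact serialized shape is not specified here.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def parse_annotation_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per complete JSON line, skipping blanks and partial writes."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # A run in progress may leave a half-written final line; skip it.
            continue


# Made-up sample lines standing in for a JSONL file that is still being written.
sample = [
    '{"proposition_id": "P1", "confidence": "strong"}\n',
    '{"proposition_id": "P2", "confidence": "weak"}\n',
    '{"proposition_id": "P3", "confi',  # truncated line
]
records = list(parse_annotation_lines(sample))
counts = Counter(rec["confidence"] for rec in records)
print(counts)
```

An open file handle is also an iterable of lines, so the same function can be pointed at the real output file.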
Configuration
All settings are loaded from environment variables (or a .env file):
| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | (none) | API key for the LLM provider |
| OPENAI_MODEL | gpt-4o-mini | Model name |
| OPENAI_BASE_URL | (none) | Override for non-OpenAI providers (e.g. DeepSeek) |
| MAX_ROUNDS_PER_PROPOSITION | 4 | Max adversarial rounds per proposition |
| SOFT_PASS_STREAK | 2 | Consecutive clean rounds to mark CONFIRMED |
| DOUBT_STREAK | 2 | Consecutive unanswered rounds to mark DOUBTFUL |
| ARTICLE_LEARNING_LOG_LEVEL | INFO | Logging verbosity |
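To make the table concrete, here is a sketch of how these variables and defaults could be read into a typed object. It is not the package's own settings loader: the `Settings` dataclass and `load_settings` function are hypothetical, and only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    api_key: Optional[str]
    model: str
    base_url: Optional[str]
    max_rounds: int
    soft_pass_streak: int
    doubt_streak: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        api_key=env.get("OPENAI_API_KEY"),
        model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        base_url=env.get("OPENAI_BASE_URL"),
        max_rounds=int(env.get("MAX_ROUNDS_PER_PROPOSITION", "4")),
        soft_pass_streak=int(env.get("SOFT_PASS_STREAK", "2")),
        doubt_streak=int(env.get("DOUBT_STREAK", "2")),
        log_level=env.get("ARTICLE_LEARNING_LOG_LEVEL", "INFO"),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
print(load_settings({"OPENAI_MODEL": "gpt-4o", "MAX_ROUNDS_PER_PROPOSITION": "6"}))
```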
Mitigations against the spec's risks
| Risk | Mitigation |
|---|---|
| Hallucination propagation | Every proposition carries a verbatim SourceCitation; CitationChecker validates |
| Cross-section symbol clashes | Global SymbolTable with per-block scope; sub-agent re-renders on switch |
| Lemma circular dependencies | Blackboard.cycles() detects them; topological order defers them |
| Runaway adversarial loops | max_rounds_per_proposition, soft_pass_streak, doubt_streak limits |
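The first mitigation relies on quotes being verbatim. As a rough illustration of what that means in practice, the sketch below checks whether a quote occurs in a block's text after collapsing whitespace and ignoring case. It is only an assumed pre-check, not how CitationChecker works internally, and `quote_is_supported` is a hypothetical name.

```python
import re


def _normalize(text: str) -> str:
    """Collapse runs of whitespace and ignore case so line wraps do not matter."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def quote_is_supported(quote: str, block_text: str) -> bool:
    """Return True when the non-empty quote occurs in the block text."""
    needle = _normalize(quote)
    return bool(needle) and needle in _normalize(block_text)


block = "Suppose f is continuous\non [0,1]. Then f is bounded."
print(quote_is_supported("f is continuous on [0,1]", block))
print(quote_is_supported("f is differentiable", block))
```

A substring test like this cannot judge whether the quote actually supports the claim; it only catches quotes that do not appear in the source at all.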
Quick start
python3.11 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest # full test suite, mock LLM end-to-end
To use a real OpenAI model:
cp .env.example .env
# fill in OPENAI_API_KEY, optionally OPENAI_MODEL / OPENAI_BASE_URL
python -m article_learning.cli path/to/paper.md # see CLI section
Programmatic example
from article_learning import Orchestrator
from article_learning.annotators import JSONLAnnotator
from article_learning.ingest import PaperLoader
from article_learning.llm import OpenAIClient
paper = PaperLoader().from_text_file("paper.md")
sinks = [JSONLAnnotator("annotations.jsonl")]
final = Orchestrator(OpenAIClient()).run(paper, annotators=sinks)
print(f"Produced {len(final['annotations'])} annotations")
Import examples
One-shot pipeline helper
run_pipeline is a convenience function that handles parsing and execution
in a single call:
from article_learning import run_pipeline
from article_learning.annotators import JSONLAnnotator
from article_learning.llm import OpenAIClient
llm = OpenAIClient(model="gpt-4o")
annotations = run_pipeline(llm, "paper.md", annotators=[JSONLAnnotator("out.jsonl")])
for ann in annotations:
    print(f"{ann.proposition_id}: {ann.confidence.value} - {ann.statement[:80]}")
Inspecting the Blackboard
After a run, the GraphState exposes a fully-populated Blackboard with
every proposition, its status, the challenge log, and the dependency DAG:
from article_learning import Orchestrator, Blackboard, PropositionStatus
from article_learning.ingest import load_paper
from article_learning.llm import OpenAIClient
paper = load_paper("paper.md")
state = Orchestrator(OpenAIClient()).run(paper)
bb: Blackboard = state["blackboard"]
# All confirmed propositions
confirmed = bb.by_status(PropositionStatus.CONFIRMED)
print(f"{len(confirmed)} propositions confirmed")
# Walk the dependency DAG
import networkx as nx
graph: nx.DiGraph = bb.build_graph()
for node in nx.topological_sort(graph):
    prop = bb.get(node)
    print(f" {node} ({prop.type.value}): {prop.statement[:60]}")
# Inspect adversarial history for a specific proposition
for record in bb.proposition_history("P1"):
    print(f" round {record.round_index}: [{record.challenger}] {record.verdict}")
Working with individual models
Every model is a Pydantic BaseModel — you can construct, serialize, and
validate them independently:
from article_learning.models import (
    Annotation,
    ConfidenceLevel,
    Proposition,
    PropositionType,
    PropositionStatus,
    SourceCitation,
    Symbol,
    SymbolTable,
)
# Create a proposition manually
prop = Proposition(
    proposition_id="P1",
    type=PropositionType.THEOREM,
    statement="If f is continuous on [0,1] then f is bounded.",
    block_id="block-3",
    citations=[SourceCitation(block_id="block-3", quote="f is continuous on [0,1]")],
    depends_on=["P0"],
)
# Symbol table: track notation across sections
st = SymbolTable()
st.add(Symbol(
    name="f",
    description="Real-valued continuous function on [0,1]",
    introduced_in_block="block-1",
    scope_blocks=[],
))
resolved = st.lookup("f", "block-3")
print(resolved.description if resolved else "unknown symbol")
# Serialize an annotation to JSON
ann = Annotation(
    proposition_id="P1",
    block_id="block-3",
    statement=prop.statement,
    confidence=ConfidenceLevel.STRONG,
    rounds=3,
)
print(ann.model_dump_json(indent=2))
Custom annotator
Implement the Annotator protocol to write annotations to any destination
(database, stdout, websocket, etc.):
from article_learning.annotators import Annotator
from article_learning.models import Annotation
class PrintAnnotator:
    """Minimal custom annotator that prints to stdout."""
    def write(self, annotation: Annotation) -> None:
        icon = annotation.confidence.emoji
        print(f"{icon} {annotation.proposition_id}: {annotation.statement[:80]}")
    def close(self) -> None:
        pass
# Use it
from article_learning import Orchestrator
from article_learning.ingest import load_paper
from article_learning.llm import OpenAIClient
paper = load_paper("paper.md")
Orchestrator(OpenAIClient()).run(paper, annotators=[PrintAnnotator()])
Writing to both JSONL and a final JSON file
Combine multiple annotators to get streaming output and a single-file summary:
from article_learning import Orchestrator
from article_learning.annotators import JSONFileAnnotator, JSONLAnnotator
from article_learning.ingest import load_paper
from article_learning.llm import OpenAIClient
paper = load_paper("paper.md")
annotators = [
    JSONLAnnotator("stream.jsonl"),  # tail -f this while running
    JSONFileAnnotator("annotations.json"),  # single JSON array on close
]
Orchestrator(OpenAIClient()).run(paper, annotators=annotators)
Using a mock LLM for testing / development
DeterministicMockLLM dispatches on agent tags so you can exercise the
full pipeline without API keys:
import json
from article_learning import Orchestrator
from article_learning.ingest import PaperLoader
from article_learning.llm.mock import DeterministicMockLLM
mock = DeterministicMockLLM()
# Register handlers by agent tag
mock.register("main", lambda msgs: json.dumps({
    "propositions": [
        {
            "proposition_id": "P1",
            "type": "theorem",
            "statement": "Every bounded sequence has a convergent subsequence.",
            "formal_statement": None,
            "block_id": "block-0",
            "citation_quote": "bounded sequence ... convergent subsequence",
            "depends_on": [],
        }
    ],
    "symbols": [],
}))
mock.register("sub", lambda msgs: json.dumps({
    "derivation": "By the Bolzano-Weierstrass theorem.",
    "extra_citations": [],
    "notes": None,
}))
# Challengers: every handler reports "no_issue", so each round passes cleanly
for tag in ("logic", "assumption", "counterexample", "citation"):
    # t=tag binds the current tag so each lambda keeps its own value
    mock.register(tag, lambda msgs, t=tag: json.dumps({
        "verdict": "no_issue", "question": "", "rationale": f"{t} pass"
    }))
paper = PaperLoader().from_markdown("# Test\nSome math here.")
state = Orchestrator(mock).run(paper)
print(f"Annotations: {len(state['annotations'])}")
Streaming to a Rich console
StreamAnnotator writes JSON lines to any text stream — pair it with
rich.console.Console for pretty live output:
import sys
from article_learning.annotators import StreamAnnotator
from article_learning import Orchestrator
from article_learning.ingest import load_paper
from article_learning.llm import OpenAIClient
paper = load_paper("paper.md")
stream_annotator = StreamAnnotator(sys.stdout)
Orchestrator(OpenAIClient()).run(paper, annotators=[stream_annotator])
Accessing the LangGraph workflow directly
For full control over the graph (custom breakpoints, partial execution,
streaming individual nodes), use build_workflow:
from article_learning.graph import build_workflow, build_initial_state
from article_learning.ingest import load_paper
from article_learning.llm import OpenAIClient
llm = OpenAIClient()
paper = load_paper("paper.md")
workflow = build_workflow(llm, recursion_limit=300)
initial = build_initial_state(paper)
# Stream node-by-node
for event in workflow.stream(initial, stream_mode="values"):
    annotations = event.get("annotations", [])
    if annotations:
        print(f"Got {len(annotations)} annotation(s) this step")
PDF input
Install the optional pdf extra:
pip install 'article-learning[pdf]'
Then PaperLoader().from_pdf("paper.pdf") will route through marker-pdf.
Roadmap
- MCP-backed annotator that writes directly into the source PDF.
- LLM-driven semantic segmenter to replace the rule-based first pass.
- Joint verification mode for cycle-of-lemma cases.
- Human-in-the-loop checkpoint when a proposition becomes DOUBTFUL.