Evidence-graph reasoning over 2.8M arXiv papers


EviGraph-R

EviGraph-R answers scientific questions by building an evidence graph over 2.8 million arXiv papers. It retrieves relevant chunks, constructs a graph of claims and citations, verifies each claim with an NLI judge, and synthesises a grounded answer with per-sentence citations.

Getting started

There are two ways to use EviGraph-R:

| | pip install | Clone & run |
|---|---|---|
| Use case | Use the Python API or CLI in your own project | Self-host the full stack with your own data |
| Qdrant | Connects to the hosted VM (no download needed) | Runs locally via Docker |
| Setup | `pip install evigraph-r` | `git clone` + `docker compose up` |

Quick start (pip)

pip install evigraph-r

import asyncio
import evigraph

runner = evigraph.WorkflowRunner()
request = evigraph.QueryRequest(query="What is the effect of BERT pre-training on downstream NLP tasks?")
result = asyncio.run(runner.run_query(request))

print(result.answer)
# → "BERT pre-training improves GLUE score by 7.7% [arxiv:1810.04805] ..."

Or from the terminal:

evigraph query "What causes Alzheimer's disease?"
evigraph query "What causes Alzheimer's disease?" --json   # full JSON response
evigraph serve                                             # start FastAPI on :8000

Requirements

Python ≥ 3.11
An OpenAI-compatible LLM endpoint

Configuration

By default, the pip package connects to the hosted Qdrant instance on the EviGraph VM, so no local database or 700 GB data download is needed.

You must supply your own LLM credentials via environment variables:

export LLM_API_KEY=your-api-key
export LLM_API_BASE=https://your-llm-endpoint/v1
export LLM_MODEL=openai/your-model-name      # must include provider prefix, e.g. openai/gpt-4o

Provider prefix: DSPy (the LLM orchestration layer) requires the model name to start with a provider prefix such as `openai/`, `anthropic/`, or `ollama/`. A bare model name like `gpt-4o` only works when `LLM_API_BASE` points directly at a `/v1` endpoint.
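A small pre-flight check can catch a missing variable or a bare model name before the first query is sent. The sketch below is not part of EviGraph-R: `check_llm_env` is a hypothetical helper, and it flags every bare model name even though, as noted above, a bare name can work in one setup.

```python
import os

REQUIRED_VARS = ("LLM_API_KEY", "LLM_API_BASE", "LLM_MODEL")


def check_llm_env(env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    model = env.get("LLM_MODEL", "")
    prefix, sep, name = model.partition("/")
    # A prefixed name looks like "<provider>/<model>", e.g. "openai/gpt-4o".
    if model and not (sep and prefix and name):
        problems.append(f"LLM_MODEL={model!r} has no provider prefix")
    return problems


if __name__ == "__main__":
    for problem in check_llm_env():
        print("config problem:", problem)
```

Passing a plain dict instead of relying on `os.environ` makes the check easy to exercise in tests.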

Optional: override the Qdrant endpoint (e.g. to use your own instance):

export QDRANT_URL=http://your-host:6333
# or per-query:
evigraph query "..." --qdrant-url http://your-host:6333

Self-hosted (Docker)

To run everything locally with your own data:

git clone <repo-url>
cd EviGraph-R
cp .env.example .env   # fill in LLM_API_BASE, LLM_API_KEY, etc.
docker compose up -d

| Service | URL |
|---|---|
| API | http://localhost:8000 |
| Qdrant dashboard | http://localhost:6334/dashboard |

Python API

import asyncio
import evigraph

# Basic query
runner = evigraph.WorkflowRunner()
request = evigraph.QueryRequest(
    query="Does dropout improve generalisation in transformers?",
    config=evigraph.PipelineConfig(
        top_k=20,
        enable_hop=True,
        target_sections=["Results", "Discussion"],
    ),
)
result = asyncio.run(runner.run_query(request))

# Answer with citations
print(result.answer)

# Per-sentence breakdown
for sentence in result.sentences:
    print(sentence.text, "→", sentence.citations)

# Claim verdict scorecard
print(result.scorecard)
# → {"Supported": 12, "Contradicted": 2, "Inconclusive": 3}

# Evidence graph (nodes + edges)
graph: evigraph.EvidenceGraph = result.graph
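The scorecard is easier to compare across queries as shares rather than raw counts. This is a minimal sketch, assuming `result.scorecard` behaves like the plain mapping of verdict to count shown in the comment above; `verdict_shares` is a hypothetical helper, not part of the package.

```python
def verdict_shares(scorecard):
    """Convert verdict counts into fractions of all judged claims."""
    total = sum(scorecard.values())
    if total == 0:
        # No claims were judged; avoid dividing by zero.
        return {verdict: 0.0 for verdict in scorecard}
    return {verdict: count / total for verdict, count in scorecard.items()}


example = {"Supported": 12, "Contradicted": 2, "Inconclusive": 3}
for verdict, share in verdict_shares(example).items():
    print(f"{verdict}: {share:.0%}")
```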

`PipelineConfig` options

| Parameter | Type | Default | Description |
|---|---|---|---|
| `top_k` | int | `15` | Chunks retrieved per sub-query |
| `score_threshold` | float | `0.0` | Minimum retrieval score |
| `enable_hop` | bool | `True` | Multi-hop sub-question retrieval |
| `embedding_model` | str | `"bge-m3"` | One of `bge-m3`, `e5`, `qwen3`, `jina` |
| `target_sections` | list\|None | `None` | Restrict to IMRaD sections, e.g. `["Methods", "Results"]` |
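When a config is assembled from user input (for example before sending it to the REST API), a client-side sanity check against the table above can catch typos early. The sketch below only mirrors the documented names and types; `config_problems` is a hypothetical helper, and the package may perform its own, different validation.

```python
# Parameters and types as listed in the table above (an illustrative mirror,
# not imported from evigraph).
DOCUMENTED_PARAMS = {
    "top_k": int,
    "score_threshold": float,
    "enable_hop": bool,
    "embedding_model": str,
    "target_sections": (list, type(None)),
}
EMBEDDING_MODELS = {"bge-m3", "e5", "qwen3", "jina"}


def _type_ok(value, expected):
    if isinstance(value, bool):
        # bool subclasses int, so accept it only where a bool is documented.
        return expected is bool
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def config_problems(config):
    """Return a list of problems in a config dict; empty means it looks fine."""
    problems = []
    for key, value in config.items():
        if key not in DOCUMENTED_PARAMS:
            problems.append(f"{key}: unknown parameter")
        elif not _type_ok(value, DOCUMENTED_PARAMS[key]):
            problems.append(f"{key}: unexpected type {type(value).__name__}")
        elif key == "embedding_model" and value not in EMBEDDING_MODELS:
            problems.append(f"{key}: unsupported model {value!r}")
    return problems


print(config_problems({"top_k": 20, "enable_hop": True, "target_sections": ["Results"]}))
print(config_problems({"topk": 20, "embedding_model": "bert"}))
```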

REST API

These endpoints are available when running via Docker or `evigraph serve`:

Health check

GET /health

Submit a query

POST /api/v1/query
Content-Type: application/json

{
  "query": "What is the effect of BERT pre-training on downstream NLP tasks?",
  "config": { "top_k": 15, "enable_hop": true }
}

Response:

{
  "job_id": "uuid",
  "status": "completed",
  "answer": "Based on the evidence...",
  "sentences": [{ "text": "...", "citations": ["arxiv:1810.04805"] }],
  "graph": { "nodes": [...], "edges": [...] },
  "scorecard": { "Supported": 12, "Contradicted": 2, "Inconclusive": 3 },
  "elapsed_s": 14.3
}
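The request and response above can be exercised from Python with the standard library alone. This is a sketch under stated assumptions: the server is reachable at `http://localhost:8000`, the response is the JSON object shown above, and the 120-second timeout is an arbitrary choice. `build_query_request` and `submit_query` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local Docker or `evigraph serve`


def build_query_request(query, config=None, base_url=BASE_URL):
    """Build (but do not send) the POST /api/v1/query request."""
    payload = {"query": query}
    if config:
        payload["config"] = config
    return urllib.request.Request(
        url=f"{base_url}/api/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_query(query, config=None, base_url=BASE_URL, timeout=120):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_query_request(query, config, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running server):
#   result = submit_query("What causes Alzheimer's disease?", {"top_k": 15})
#   print(result["answer"])
```

Keeping request construction separate from sending makes the payload easy to inspect without a live server.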

Streaming (SSE)

GET /api/v1/query/stream?q=What+causes+Alzheimers

Events are emitted in this order: decomposed → retrieved → graph_built → judged → completed
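A client can follow these stage events with a small parser. The sketch below assumes the endpoint uses standard Server-Sent Events framing (`event:` and `data:` fields, with a blank line ending each event) and that the stage name arrives in the `event:` field; the payload shape is not documented here, so `data` is returned as a raw string. `parse_sse` is a hypothetical helper.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates one event; events without data are skipped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


sample = [
    "event: decomposed\n", "data: {}\n", "\n",
    "event: completed\n", 'data: {"status": "completed"}\n', "\n",
]
for name, payload in parse_sse(sample):
    print(name, payload)
```

To consume the live stream, the same function can be fed the lines of an open HTTP response, each decoded as UTF-8.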

How it works

A query passes through a 5-agent LangGraph pipeline:

Query
  │
  ▼
[1] Decomposer
    Breaks the query into focused sub-queries, each tagged with
    IMRaD section targets and a retrieval budget weight.
  │
  ▼
[2] Hybrid Retriever
    BGE-M3 dense + BM25 sparse retrieval, cross-encoder reranking,
    IMRaD section-aware score boosting.
  │
  ▼
[3] Evidence Graph Builder
    Builds a graph of PAPER → CHUNK → CLAIM → CONCEPT nodes.
    Expands citations via SciCite (METHOD / BACKGROUND / RESULT_COMPARISON).
  │
  ▼
[4] Judge
    Three-route verifier: NLI batch → escalate to LLM if neutral
    → direct LLM for cross-paper contradictions.
    Verdict per claim: Supported / Contradicted / Inconclusive.
  │
  ▼
[5] Answer Generator
    Synthesises answer from supported claims only.
    Each sentence carries per-chunk citations and a verdict tag.
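The Judge's three routes and the Answer Generator's "supported claims only" rule can be sketched in a few lines. This illustrates the routing idea and is not the package's implementation. The `Claim` class, the `cross_paper_conflict` flag, the NLI label names, and both callables are assumptions, and the sketch judges claims one at a time where the pipeline above runs NLI in batches.

```python
from dataclasses import dataclass

# Assumed mapping from NLI labels to the verdicts named above.
NLI_TO_VERDICT = {"entailment": "Supported", "contradiction": "Contradicted"}


@dataclass
class Claim:
    text: str
    cross_paper_conflict: bool = False  # hypothetical flag set upstream


def judge(claims, nli_label, llm_verdict):
    """Route each claim and return {claim text: verdict}.

    nli_label(claim)   -> "entailment" | "contradiction" | "neutral"
    llm_verdict(claim) -> "Supported" | "Contradicted" | "Inconclusive"
    Both callables are stand-ins supplied by the caller.
    """
    verdicts = {}
    for claim in claims:
        if claim.cross_paper_conflict:
            verdicts[claim.text] = llm_verdict(claim)   # direct LLM route
            continue
        label = nli_label(claim)                        # NLI route
        if label == "neutral":
            verdicts[claim.text] = llm_verdict(claim)   # escalation route
        else:
            verdicts[claim.text] = NLI_TO_VERDICT[label]
    return verdicts


def supported_only(verdicts):
    """Keep the claims an answer generator would be allowed to use."""
    return [text for text, verdict in verdicts.items() if verdict == "Supported"]


# Demo with fake models: a lookup table for NLI, a constant for the LLM.
labels = {"dropout helps": "entailment", "dropout hurts": "neutral"}
claims = [Claim("dropout helps"), Claim("dropout hurts"), Claim("mixed", True)]
verdicts = judge(claims, lambda c: labels[c.text], lambda c: "Inconclusive")
print(verdicts)
print(supported_only(verdicts))
```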

Stack:

| Layer | Technology |
|---|---|
| API | FastAPI + Uvicorn |
| Workflow | LangGraph |
| LLM | DSPy (OpenAI-compatible) |
| Vector DB | Qdrant |
| Embeddings | BGE-M3 |
| NLI | DeBERTa-v3-small-tasksource |

Development

git clone <repo-url>
cd EviGraph-R
uv sync
cp .env.example .env

# Run tests
uv run pytest -m "not slow and not integration and not hpc"

# Type check
uv run mypy src/

License

MIT


Release history

0.1.4 (this version): May 31, 2026
0.1.3: May 31, 2026
