AI-powered literature reading assistant with multi-agent orchestration and hybrid RAG

Maintainers

0verL1nk

📚 PaperSage

An AI agent workbench for research reading and writing

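Built on Streamlit + LangChain + LangGraph.
At its core is a project-based paper Q&A workbench: it organizes documents by project, restricts the retrieval scope, routes each question to an agent workflow automatically, and returns traceable evidence.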

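✨ Core capabilities

Capability	Description
🔀 Multi-mode agent workflows	Three workflow tiers (ReAct / Plan-Act / Plan-Act-RePlan), selected automatically by the router
🤝 Multi-agent team collaboration	Leader-centered scheduling, roles generated dynamically by the LLM, dispatch in dependency-topology order, multi-round review and replan
🔍 Local hybrid RAG	Four-stage retrieval (Dense + BM25 + RRF + Rerank); structured evidence is traceable to the source text
🧠 Long- and short-term memory	Three memory types (episodic / semantic / procedural) with differentiated TTLs and time-decayed retrieval
🛠️ 14+ built-in tools	RAG retrieval, file read/write, academic search, web search, todo management, human confirmation, and more
📝 Pluggable skills	Paper summarization, critical reading, method comparison, translation, and mind maps, loaded dynamically from SKILL.md
🗂️ Project workspaces	Isolation between projects, document binding, and separate sessions and context per project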
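
The dependency-topology dispatch mentioned for multi-agent teams can be pictured with the standard library's `graphlib`. This is only a minimal sketch, not PaperSage's actual scheduler: the function `dispatch_waves` and the task names are hypothetical, and it assumes each task simply lists the tasks it depends on.

```python
from graphlib import TopologicalSorter


def dispatch_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all prerequisites done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


# Hypothetical plan: each key maps to the set of tasks it depends on.
plan = {
    "summarize_methods": set(),
    "summarize_results": set(),
    "compare": {"summarize_methods", "summarize_results"},
    "review": {"compare"},
}
waves = dispatch_waves(plan)
```

Tasks inside one wave have no dependencies on each other, so a leader could hand them to different team members in parallel.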

🖼️ Screenshots

Agent Center: intelligent Q&A

File Center: document management

Paper Q&A: evidence tracing

Mind map: visualization

Paper summary

Context governance: visualization

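🏗️ Architecture

Workflow routing and scheduling

User question
  │
  ├─→ Smart routing (keyword fast route → LLM structured route → ReAct fallback)
  │
  ├─ ReAct mode ──────→ Single agent + tools answer directly
  ├─ Plan-Act mode ───→ Planner generates a plan → Leader executes
  └─ Plan-Act-RePlan ─→ Planner → Leader ⇄ Team (multi-role) → Reviewer → RePlan
                                                                     ↓
                                                             Quality-gate loop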
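
The three routing tiers in the diagram can be sketched as a simple cascade. This is an illustration under assumptions, not the project's router: the keyword table, the workflow identifiers, and the `llm_router` callable are all hypothetical stand-ins.

```python
from typing import Callable, Optional

WORKFLOWS = ("react", "plan_act", "plan_act_replan")

# Hypothetical keyword table; the real fast-route rules are not documented here.
KEYWORD_ROUTES = {
    "compare": "plan_act_replan",
    "survey": "plan_act_replan",
    "step by step": "plan_act",
}


def route(question: str, llm_router: Optional[Callable[[str], str]] = None) -> str:
    """Pick a workflow: keyword match first, then the LLM, then fall back to ReAct."""
    text = question.lower()
    # Tier 1: cheap keyword match, no model call needed.
    for keyword, workflow in KEYWORD_ROUTES.items():
        if keyword in text:
            return workflow
    # Tier 2: ask an LLM for a structured choice (assumed to return a workflow name).
    if llm_router is not None:
        try:
            choice = llm_router(question)
            if choice in WORKFLOWS:
                return choice
        except Exception:
            pass  # any router failure drops through to the fallback
    # Tier 3: ReAct is the fallback when nothing else decides.
    return "react"
```

For example, `route("What does section 3 claim?")` falls through to `"react"` when no LLM router is supplied.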

Hybrid RAG retrieval pipeline

User query
  │
  ├─→ Dense retrieval (FastEmbed bge-small-zh)
  ├─→ BM25 sparse retrieval
  │         │
  │         ├─→ RRF fusion ranking
  │         │         │
  │         │         ├─→ FlashRank rerank (optional)
  │         │         │         │
  │         │         │         └─→ Neighboring-chunk expansion
  │         │         │                   │
  └─────────┴─────────┴───────────────────┴─→ Structured EvidenceItem
                                                (doc_uid / chunk_id / score / page_no / offset)
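
The RRF step merges the dense and BM25 rankings by rank position rather than by raw score. Below is a minimal sketch of Reciprocal Rank Fusion; the function name `rrf_fuse`, the chunk IDs, and the constant `k = 60` (a commonly used default) are assumptions, since the values PaperSage uses are not stated here.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of chunk IDs: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical retrieval results, best match first.
dense = ["c3", "c1", "c7"]
bm25 = ["c1", "c9", "c3"]
fused = rrf_fuse([dense, bm25])
```

Chunks that appear in both lists (`c1`, `c3`) end up ahead of chunks found by only one retriever, which is the property that makes RRF useful when dense and sparse scores are not directly comparable.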

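Long- and short-term memory architecture

┌──────────────────────────────────────────────────────────────────┐
│                 Three-layer memory architecture                  │
├──────────────────────────────────────────────────────────────────┤
│  Short-term: LangGraph InMemorySaver (current session)           │
├──────────────────────────────────────────────────────────────────┤
│  Mid-term: automatic context compression                         │
│  (over token threshold → LLM summary + fact-anchor extraction)   │
├──────────────────────────────────────────────────────────────────┤
│  Long-term: SQLite persistence (isolated per project/user)       │
│  ├─ episodic (events)        TTL 30 days                         │
│  ├─ semantic (knowledge)     kept permanently                    │
│  └─ procedural (preferences) TTL 90 days                         │
│  Retrieval: term matching + time-decay scoring                   │
│  Injection: capacity circuit breaker + conflict resolution       │
│  (evidence takes precedence over memory)                         │
└──────────────────────────────────────────────────────────────────┘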
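
Long-term retrieval combines term matching with a time-decay score, after TTL filtering. The sketch below shows one way this could work. The TTL values come from the diagram above; the exponential decay, the 14-day half-life, the whitespace tokenizer, and all names are assumptions for illustration only.

```python
from dataclasses import dataclass

DAY = 86400.0  # seconds
# TTLs in days, taken from the diagram; None means kept permanently.
TTL_DAYS = {"episodic": 30, "semantic": None, "procedural": 90}


@dataclass
class Memory:
    kind: str
    text: str
    created_at: float  # Unix-style timestamp in seconds


def is_expired(memory: Memory, now: float) -> bool:
    ttl = TTL_DAYS[memory.kind]
    return ttl is not None and (now - memory.created_at) > ttl * DAY


def score(memory: Memory, query: str, now: float, half_life_days: float = 14.0) -> float:
    """Term-overlap ratio multiplied by an exponential recency decay (assumed form)."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    overlap = len(query_terms & set(memory.text.lower().split())) / len(query_terms)
    age_days = max(0.0, now - memory.created_at) / DAY
    return overlap * 0.5 ** (age_days / half_life_days)


def retrieve(memories: list[Memory], query: str, now: float, top_k: int = 3) -> list[Memory]:
    """Drop expired memories, rank the rest, and keep only those that match at all."""
    live = [m for m in memories if not is_expired(m, now)]
    ranked = sorted(live, key=lambda m: score(m, query, now), reverse=True)
    return [m for m in ranked if score(m, query, now) > 0][:top_k]


now = 100 * DAY
store = [
    Memory("episodic", "bm25 retrieval failed in the last session", 60 * DAY),  # 40 days old
    Memory("semantic", "bm25 is a sparse retrieval method", 0.0),               # never expires
    Memory("procedural", "user prefers concise answers", 95 * DAY),             # no term overlap
]
hits = retrieve(store, "bm25 retrieval", now)
```

Here the episodic entry is dropped because it is older than its 30-day TTL, the procedural entry shares no terms with the query, and only the semantic entry is returned.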

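📄 Pages

Page	Description
🤖 Agent Center (default)	Main Q&A interface with workflow visualization and evidence display
📁 File Center	Document upload, format conversion, and content preview
⚙️ Settings Center	API key, model, RAG parameters, and agent behavior configuration
🗂️ Project Center	Project management, document binding, and workspace switching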

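🚀 Quick start

Requirements

Python >= 3.10
uv (recommended package manager)

Local launch

# 1. Install dependencies
uv sync --no-install-project

# 2. Start the app
streamlit run main.py

Open http://localhost:8501 in a browser, then enter your API key and model configuration under "⚙️ Settings Center".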

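Docker deployment

docker-compose up --build

In docker-compose mode, MinerU parsing is enabled by default (DOC_PARSE_BACKEND=mineru).
Running streamlit run main.py directly does not enable MinerU; the local parsing chain (MarkItDown / PyMuPDF) is used instead.
If the mineru:latest image is not available locally, build it first following the official MinerU documentation, or change MINERU_IMAGE in .env.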

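🗂️ Project structure

.
├── main.py                     # Streamlit navigation entry point
├── pages/                      # The four feature pages
├── agent/                      # 🧠 Agent core (77 modules / 12,500+ lines)
│   ├── a2a/                    #   A2A coordination and protocol layer (state machine / routing / RePlan)
│   ├── orchestration/          #   Leader-centered orchestration (policy engine / planning / team execution)
│   ├── rag/                    #   Hybrid RAG (chunking / retrieval / evidence / fusion)
│   ├── memory/                 #   Long-term memory (classification / retrieval / storage / injection)
│   ├── skills/                 #   Pluggable skills (summary/critical_reading/...)
│   ├── tools/                  #   Built-in tools (file/todo/bash/ask_human)
│   ├── domain/                 #   Domain contracts
│   ├── application/            #   Application use-case orchestration
│   └── adapters/               #   Adapters for external dependencies
├── ui/                         # UI component layer
├── utils/                      # Shared utility functions
├── tests/                      # 53 unit tests + integration tests + evals
├── docs/                       # Design docs and development notes
├── models/embeddings/          # Local embedding model cache
├── pyproject.toml              # Project configuration (hatch + uv)
├── Dockerfile                  # Container build
└── docker-compose.yml          # Container orchestration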

⚙️ Main environment variables

# LLM access
OPENAI_COMPATIBLE_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1

# RAG
LOCAL_RAG_HYBRID_ENABLED=true
LOCAL_RAG_TOP_K=8
LOCAL_RAG_RERANK_ENABLED=false

# Agent behavior
AGENT_TEMPERATURE=0.1
AGENT_ENABLE_THINKING=false
AGENT_REASONING_EFFORT=

# Orchestration and team
AGENT_TEAM_MAX_MEMBERS=3
AGENT_TEAM_MAX_ROUNDS=2
AGENT_PLANNER_MIN_STEPS=2
AGENT_PLANNER_MAX_STEPS=4

# Routing policy thresholds
AGENT_POLICY_SCORE_PLAN=2
AGENT_POLICY_SCORE_TEAM=4

# Tool switches
AGENT_DISABLE_SEARCH_WEB=false
AGENT_TODO_FILE=.agent/todo.json
AGENT_HISTORY_PAGE_SIZE=40
AGENT_PROJECT_INDEX_CACHE_DIR=./.cache/project_indexes

# Logging
APP_LOG_LEVEL=INFO
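
How these variables are consumed is not documented here. As a rough sketch only, the snippet below reads a few of them with typed defaults and shows one plausible reading of the two routing thresholds: it assumes a complexity score at or above `AGENT_POLICY_SCORE_TEAM` selects the team workflow and a score at or above `AGENT_POLICY_SCORE_PLAN` selects Plan-Act. The helpers `env_bool` and `env_int`, the `RoutingPolicy` class, and that interpretation are hypothetical.

```python
import os
from dataclasses import dataclass


def env_bool(name: str, default: bool) -> bool:
    """Parse true/false style variables; an empty or missing value yields the default."""
    raw = os.environ.get(name, "").strip().lower()
    if not raw:
        return default
    return raw in {"1", "true", "yes", "on"}


def env_int(name: str, default: int) -> int:
    raw = os.environ.get(name, "").strip()
    return int(raw) if raw else default


@dataclass(frozen=True)
class RoutingPolicy:
    plan_score: int
    team_score: int

    def workflow_for(self, score: int) -> str:
        # ASSUMPTION: higher scores mean a more complex question.
        if score >= self.team_score:
            return "plan_act_replan"
        if score >= self.plan_score:
            return "plan_act"
        return "react"


def load_policy() -> RoutingPolicy:
    # Defaults mirror the values listed above.
    return RoutingPolicy(
        plan_score=env_int("AGENT_POLICY_SCORE_PLAN", 2),
        team_score=env_int("AGENT_POLICY_SCORE_TEAM", 4),
    )
```

Treating an empty value as "use the default" matches entries such as `AGENT_REASONING_EFFORT=`, which is listed above with no value.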

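🧩 Tool loading and schema exposure

To limit the context overhead that grows with the number of tools, the runtime registers every tool but exposes its schema at a configurable level of detail.

Item	Description
`tool_load` event	Emits only a summary (`registered/schema_ready/schema_lazy + tools preview`) instead of expanding every tool's details at once
`AGENT_TOOL_SCHEMA_LEVEL=manifest` (default)	Exposes only tool metadata (`name/description`); no parameter JSON Schema is injected
`AGENT_TOOL_SCHEMA_LEVEL=compact`	Exposes a lightweight parameter summary (field names + required)
`AGENT_TOOL_SCHEMA_LEVEL=full`	Exposes the full JSON Schema (best enabled only as needed for debugging or development)

Example:

# Recommended default: minimal context footprint
AGENT_TOOL_SCHEMA_LEVEL=manifest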
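
To make the three levels concrete, here is a small sketch of what each one might expose for a single tool. The tool definition, the `expose_tool` function, and the output keys are hypothetical; only the level names and their described contents come from the table above.

```python
import json


def expose_tool(tool: dict, level: str = "manifest") -> dict:
    """Return the view of a tool that would be placed in the model's context."""
    view = {"name": tool["name"], "description": tool["description"]}
    schema = tool.get("parameters", {})
    if level == "manifest":
        return view  # metadata only
    if level == "compact":
        view["fields"] = sorted(schema.get("properties", {}))  # field names only
        view["required"] = list(schema.get("required", []))
        return view
    if level == "full":
        view["parameters"] = schema  # complete JSON Schema
        return view
    raise ValueError(f"unknown schema level: {level!r}")


# Hypothetical tool definition, for illustration only.
search_tool = {
    "name": "rag_search",
    "description": "Search project documents.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search text"},
            "top_k": {"type": "integer", "minimum": 1, "default": 8},
        },
        "required": ["query"],
    },
}

# Serialized size per level, as a rough proxy for context cost.
sizes = {
    level: len(json.dumps(expose_tool(search_tool, level)))
    for level in ("manifest", "compact", "full")
}
```

The serialized size grows from `manifest` to `compact` to `full`, which is the trade-off the setting controls: less context spent on tools versus more parameter detail available to the model.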

🧪 Testing

# Install development dependencies
uv sync --extra dev --no-install-project

# Unit tests
uv run --extra dev python -m pytest tests/unit -q

# Integration tests
uv run --extra dev python -m pytest tests/integration -q

# Live API E2E (requires a real API key)
uv run --extra dev python -m pytest tests/integration/test_live_api_e2e.py -q

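📦 Tech stack

Technology	Purpose
Streamlit	Web UI framework
LangChain / LangGraph	LLM orchestration and agent state machine
FastEmbed (bge-small-zh)	Local vector embeddings
FlashRank	Local reranking
rank_bm25	Sparse retrieval
a2a-sdk	Google A2A protocol compatibility
SQLite	Memory and data persistence
Redis + RQ	Asynchronous task queue
pyecharts	Mind map visualization
Docker	Containerized deployment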

📄 License

MIT

🤝 Contributing

Issues and PRs are welcome ❤️

Release history

Version	Released
1.1.0	Mar 21, 2026
1.0.5	Mar 16, 2026
1.0.3	Mar 7, 2026
1.0.2 (this version)	Mar 7, 2026
1.0.1	Mar 6, 2026
1.0.0	Mar 6, 2026

Download files

Source Distribution

paper_sage-1.0.2.tar.gz (164.4 kB)
Uploaded Mar 7, 2026 Source

Built Distribution

paper_sage-1.0.2-py3-none-any.whl (204.6 kB)
Uploaded Mar 7, 2026 Python 3

File details

Details for the file paper_sage-1.0.2.tar.gz.

File metadata

Download URL: paper_sage-1.0.2.tar.gz
Upload date: Mar 7, 2026
Size: 164.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for paper_sage-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`69ff8406c120773139f4f950d3449198f4dff113c970cb11f387952b0bd2f4ef`
MD5	`2c50cefb55b385771260c98ef9fc01f7`
BLAKE2b-256	`d32345f3d9ef1f2157b6bdab03c89e99ac67806c8750986178f49884af3e6164`
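
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib`. This is a minimal sketch; the helper names `sha256_of` and `verify` are illustrative, and the file is read in chunks so large archives do not need to fit in memory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with the published one, ignoring letter case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `verify("paper_sage-1.0.2.tar.gz", "69ff8406c120773139f4f950d3449198f4dff113c970cb11f387952b0bd2f4ef")` should return `True` if the download is intact.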

Provenance

The following attestation bundles were made for paper_sage-1.0.2.tar.gz:

Publisher: publish.yml on 0verL1nk/PaperSage

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: paper_sage-1.0.2.tar.gz
- Subject digest: 69ff8406c120773139f4f950d3449198f4dff113c970cb11f387952b0bd2f4ef
- Sigstore transparency entry: 1054426524
- Sigstore integration time: Mar 7, 2026
Source repository:
- Permalink: 0verL1nk/PaperSage@efb33c4e7a9c8e93fcc0df83dbfa6bc65242bf26
- Branch / Tag: refs/tags/v1.0.2
- Owner: https://github.com/0verL1nk
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@efb33c4e7a9c8e93fcc0df83dbfa6bc65242bf26
- Trigger Event: push

File details

Details for the file paper_sage-1.0.2-py3-none-any.whl.

File metadata

Download URL: paper_sage-1.0.2-py3-none-any.whl
Upload date: Mar 7, 2026
Size: 204.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for paper_sage-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`841f60c62064e388163eeb51a328b07beca3b5f44544cb332868f258684ebd9e`
MD5	`86ec0b201deb02650c534476d3937d96`
BLAKE2b-256	`7a36c1a7f1041230427641383f98903c4532f0a28b8a5d375d05b94c5e1bbcd8`

Provenance

The following attestation bundles were made for paper_sage-1.0.2-py3-none-any.whl:

Publisher: publish.yml on 0verL1nk/PaperSage

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: paper_sage-1.0.2-py3-none-any.whl
- Subject digest: 841f60c62064e388163eeb51a328b07beca3b5f44544cb332868f258684ebd9e
- Sigstore transparency entry: 1054426554
- Sigstore integration time: Mar 7, 2026
Source repository:
- Permalink: 0verL1nk/PaperSage@efb33c4e7a9c8e93fcc0df83dbfa6bc65242bf26
- Branch / Tag: refs/tags/v1.0.2
- Owner: https://github.com/0verL1nk
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@efb33c4e7a9c8e93fcc0df83dbfa6bc65242bf26
- Trigger Event: push

paper-sage 1.0.2
