An academic paper writing agent based on LangGraph
seele-scholar-agent
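An academic paper writing agent built on LangGraph. It generates a structured paper outline, drafts each section, and supports human review with multiple revision rounds.
Features
- Research retrieval: searches ArXiv and Semantic Scholar for relevant papers
- Outline planning: generates a structured paper outline from the retrieved papers
- Section writing: drafts paper sections using RAG context
- Review and revision: human-in-the-loop review with multiple revision rounds
- Multi-model support: works with OpenAI, DeepSeek, Groq, and any OpenAI-compatible API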
Installation
# Clone the repository
git clone https://github.com/your-org/seele-scholar-agent.git
cd seele-scholar-agent
# Install with uv (recommended)
uv sync
# Or install with pip
pip install -e .
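Configuration
Agent package configuration
seele-scholar-agent manages only the few settings it needs to run. They are loaded from src/seele_scholar_agent/.env:
cp src/seele_scholar_agent/.env.example src/seele_scholar_agent/.env
| Variable | Description | Default |
|---|---|---|
| SEMANTIC_SCHOLAR_API_KEY | Semantic Scholar API key (optional; raises the rate limit) | empty |
| MAX_REVISIONS | Maximum number of revision rounds | 3 |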
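The two variables in the table above can be read with the standard library alone. The following is a minimal sketch, not the package's actual config.py: the `AgentSettings` class and `load_settings` function are hypothetical names, and only the variable names and the default of 3 come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical stand-in for the package's settings object."""

    semantic_scholar_api_key: Optional[str]
    max_revisions: int


def load_settings(env: Mapping[str, str] = os.environ) -> AgentSettings:
    # A missing or blank key is treated as "no key configured".
    api_key = env.get("SEMANTIC_SCHOLAR_API_KEY", "").strip() or None
    # MAX_REVISIONS falls back to 3, the documented default.
    max_revisions = int(env.get("MAX_REVISIONS", "3"))
    if max_revisions < 0:
        raise ValueError("MAX_REVISIONS must be non-negative")
    return AgentSettings(api_key, max_revisions)
```

Passing the environment as a mapping keeps the function easy to exercise in tests without touching the real process environment.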
Caller configuration (managed by your project)
LLM, vector database, and similar settings are managed by the caller and injected into the agent at initialization:
# Your project's .env (example)
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o
OPENAI_BASE_URL=https://api.openai.com/v1
Any OpenAI-compatible API is supported; pass its settings as ChatOpenAI constructor arguments:
DeepSeek:
OPENAI_API_KEY = "sk-..."
OPENAI_MODEL = "deepseek-chat"
OPENAI_BASE_URL = "https://api.deepseek.com/v1"
Groq (free-tier Llama):
OPENAI_API_KEY = "gsk_..."
OPENAI_MODEL = "llama-3.1-70b-versatile"
OPENAI_BASE_URL = "https://api.groq.com/openai/v1"
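One way to turn these variables into ChatOpenAI constructor arguments is sketched below. The helper name `llm_kwargs_from_env` is hypothetical, and falling back to "gpt-4o" when OPENAI_MODEL is unset is an assumption made for illustration; the README itself leaves this wiring to the caller.

```python
import os
from typing import Mapping


def llm_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Collect ChatOpenAI keyword arguments from OPENAI_* variables."""
    kwargs = {
        # Assumed fallback model; override with OPENAI_MODEL.
        "model": env.get("OPENAI_MODEL", "gpt-4o"),
        # Required: raises KeyError if the key is not set.
        "api_key": env["OPENAI_API_KEY"],
    }
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        # Only set for non-default endpoints such as DeepSeek or Groq.
        kwargs["base_url"] = base_url
    return kwargs
```

With langchain_openai installed, the result can be unpacked directly: `model = ChatOpenAI(**llm_kwargs_from_env(), temperature=0.7)`.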
Usage
Quick start
import asyncio
from datetime import datetime
from uuid import uuid4

from langchain_openai import ChatOpenAI

from seele_scholar_agent.config import settings
from seele_scholar_agent.graph import create_writing_graph
from seele_scholar_agent.state import AgentState


async def main():
    # Create the LLM (configured by the caller; any OpenAI-compatible API works)
    model = ChatOpenAI(
        model="gpt-4o",  # or "deepseek-chat", "llama-3.1-70b-versatile", etc.
        api_key="sk-...",  # your API key
        base_url="https://api.openai.com/v1",  # replace with another endpoint if needed
        temperature=0.7,
    )
    # Create the graph
    app = create_writing_graph(model=model)
    # Prepare the initial state
    initial_state: AgentState = {
        "thread_id": str(uuid4()),
        "topic": "Your research topic",
        "created_at": datetime.now(),
        "tenant_id": None,
        "papers": [],
        "search_queries": [],
        "outline": None,
        "outline_approved": False,
        "sections": [],
        "current_section_index": 0,
        "sections_completed": [],
        "review_history": [],
        "current_review": None,
        "rag_context": [],
        "status": "idle",
        "error_message": None,
        "max_revisions": settings.MAX_REVISIONS,
        "revision_count": 0,
    }
    # Run the graph
    result = await app.ainvoke(
        initial_state,
        config={"configurable": {"thread_id": initial_state["thread_id"]}},
    )
    print(f"Status: {result.get('status')}")
    if result.get("outline"):
        print(f"Title: {result['outline'].title}")
        for s in result["outline"].sections:
            print(f"  {s.order}. {s.title}")


if __name__ == "__main__":
    asyncio.run(main())
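Because every run starts from the same eighteen fields, the initial state can be built by a small factory. This is a sketch rather than part of the package: `make_initial_state` is a hypothetical helper, it returns a plain dict instead of importing `AgentState`, and its `max_revisions` default of 3 mirrors the documented MAX_REVISIONS default.

```python
from datetime import datetime
from typing import Any, Optional
from uuid import uuid4


def make_initial_state(
    topic: str,
    max_revisions: int = 3,
    tenant_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build a fresh state dict with the same keys as the quick start."""
    return {
        "thread_id": str(uuid4()),  # new thread per run
        "topic": topic,
        "created_at": datetime.now(),
        "tenant_id": tenant_id,
        "papers": [],
        "search_queries": [],
        "outline": None,
        "outline_approved": False,
        "sections": [],
        "current_section_index": 0,
        "sections_completed": [],
        "review_history": [],
        "current_review": None,
        "rag_context": [],
        "status": "idle",
        "error_message": None,
        "max_revisions": max_revisions,
        "revision_count": 0,
    }
```

Each call creates new list objects and a new thread ID, so two runs never share mutable state.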
Using with Qdrant (RAG)
from qdrant_client import QdrantClient
from langchain_openai import OpenAIEmbeddings

# Initialize the Qdrant client (configured by the caller)
qdrant = QdrantClient(url="http://localhost:6333", api_key=None)
# Initialize the embedding model (configured by the caller)
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    api_key="sk-...",
)
# Create a graph with RAG support; `model` is the ChatOpenAI instance from the quick start
app = create_writing_graph(
    model=model,
    qdrant_client=qdrant,
    embedding_model=embeddings,
)
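Project structure
seele_scholar_agent/
├── config.py          # Configuration management
├── state.py           # Pydantic models and TypedDict state definitions
├── graph.py           # LangGraph workflow definition
├── logging.py         # Structured logging configuration
└── nodes/
    ├── planner.py     # Outline planning node
    ├── researcher.py  # Paper retrieval node (ArXiv, Semantic Scholar)
    ├── writer.py      # Section writing node
    ├── reviewer.py    # Human review node
    └── prompts.py     # LLM prompts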
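AgentState fields
initial_state is an AgentState TypedDict that holds the state of the entire workflow:
| Field | Type | Description |
|---|---|---|
| thread_id | `str` | Thread ID, used for conversation persistence |
| topic | `str` | Research topic |
| created_at | `datetime` | Creation time |
| tenant_id | `str \| None` | Tenant ID (for multi-tenant setups) |
| papers | `list[PaperMetadata]` | Retrieved papers |
| search_queries | `list[str]` | Search queries issued so far |
| outline | `OutlineStructure \| None` | Generated outline |
| outline_approved | `bool` | Whether the outline has been approved |
| sections | `list[SectionDraft]` | Sections derived from the outline |
| current_section_index | `int` | Index of the section currently being written |
| sections_completed | `list[str]` | Titles of completed sections |
| review_history | `list[dict]` | Review history |
| current_review | `ReviewResult \| None` | Current review result |
| rag_context | `list[DocumentChunk]` | Context retrieved by RAG |
| status | `Literal[...]` | Current status |
| error_message | `str \| None` | Error message |
| max_revisions | `int` | Maximum number of revisions |
| revision_count | `int` | Current revision count |
Possible status values: idle | researching | planning | writing | reviewing | waiting_human | completed | failed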
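The status values listed above map naturally onto a `Literal` type. The sketch below is illustrative only: the names `Status` and `is_terminal` are hypothetical, and treating `completed` and `failed` as the terminal states is an assumption that the README does not state explicitly.

```python
from typing import Literal, get_args

# The eight status values listed in this README.
Status = Literal[
    "idle",
    "researching",
    "planning",
    "writing",
    "reviewing",
    "waiting_human",
    "completed",
    "failed",
]

# Assumption: a run in one of these states will not progress further.
TERMINAL_STATUSES = frozenset({"completed", "failed"})


def is_terminal(status: str) -> bool:
    """Return True if the run has finished, rejecting unknown values."""
    if status not in get_args(Status):
        raise ValueError(f"unknown status: {status!r}")
    return status in TERMINAL_STATUSES
```

A caller polling a long-running thread could use `is_terminal(result["status"])` to decide when to stop.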
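Workflow
START → researcher → planner → [human confirmation] → writer → reviewer
                                                                   ↓
                                                          [review approved?]
                                                                   ↓
                                                  writer (next section) or END
- Researcher: retrieves relevant papers from ArXiv and Semantic Scholar
- Planner: generates a structured paper outline from the retrieved papers
- Writer: drafts each section from the outline and RAG context
- Reviewer: human review; approves the section or requests a revision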
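The decision after the reviewer node can be pictured as a small routing function over the state fields. This is a sketch under assumptions, not the logic in graph.py: the function name, the `approved` flag, and the three route labels are hypothetical, and moving on to the next section once the revision budget is exhausted is one plausible policy among several.

```python
from typing import Any, Mapping


def route_after_review(state: Mapping[str, Any], approved: bool) -> str:
    """Pick the next step after a review: 'revise', 'next_section', or 'end'."""
    # Rejected and still within budget: send the same section back to the writer.
    if not approved and state["revision_count"] < state["max_revisions"]:
        return "revise"
    # Approved, or budget exhausted (assumed policy): advance.
    is_last = state["current_section_index"] >= len(state["sections"]) - 1
    return "end" if is_last else "next_section"
```

For example, with three sections, `current_section_index` at 0, and an approved review, the function returns `"next_section"`; on the last section it returns `"end"`.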
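Breakpoints and resuming execution
The graph sets a breakpoint after the planner node (interrupt_after=["planner"]) and pauses there until a human confirms the outline.
Human confirmation mode:
# The same config (and thread_id) must be used for every call on this thread
config = {"configurable": {"thread_id": initial_state["thread_id"]}}
# First call: runs until the breakpoint
result = await app.ainvoke(initial_state, config=config)
# Status becomes waiting_human while the outline awaits confirmation
if result["status"] == "waiting_human":
    print(f"Generated outline: {result['outline'].title}")
    # After the user confirms, update the state and continue
    app.update_state(config, {"outline_approved": True})
    result = await app.ainvoke(None, config=config)  # resume execution
Auto-confirm mode (for testing):
# Set outline_approved = True in the initial state to skip human confirmation
initial_state["outline_approved"] = True
result = await app.ainvoke(initial_state, config=config)  # full run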
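The pause-and-resume steps can be wrapped in one coroutine that takes the approval decision as a callback. This is a sketch, assuming only the `ainvoke` and `update_state` calls shown in this README; `run_with_approval` and the `approve` callback are hypothetical names, and `app` is typed loosely so that any compiled graph with those two methods fits.

```python
from typing import Any, Callable


async def run_with_approval(
    app: Any,
    initial_state: dict,
    approve: Callable[[Any], bool],
) -> dict:
    """Run to the planner breakpoint, ask for approval, then resume."""
    # Every call on this thread must share the same thread_id.
    config = {"configurable": {"thread_id": initial_state["thread_id"]}}
    result = await app.ainvoke(initial_state, config=config)
    if result.get("status") != "waiting_human":
        return result  # finished or failed before reaching the breakpoint
    if not approve(result.get("outline")):
        return result  # leave the run paused at the breakpoint
    app.update_state(config, {"outline_approved": True})
    # Passing None resumes from the saved state instead of restarting.
    return await app.ainvoke(None, config=config)
```

In an interactive tool, `approve` might print the outline and read a yes/no answer; in tests it can simply be `lambda outline: True`.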
Development
# Install development dependencies
uv sync --extra dev
# Run the linter
ruff check src/
# Run the type checker
mypy src/
# Run the tests
pytest
License
MIT