A traceable Chinese-history MCP server: 4 tools over 9 classical texts, every result cited (book→chapter→paragraph). Zero runtime dependencies.

chinese-history-mcp

A traceable Chinese-history MCP server. Four Model Context Protocol tools over 9 classical Chinese texts (pre-Qin to Wei-Jin — 史记 / 汉书 / 后汉书 / 三国志 / 左传 / 论语 / 孟子 / 吕氏春秋 / 资治通鉴). Every result carries a 【book → chapter → paragraph】 citation, and honestly reports its review_status — the server never claims per-item human review it doesn't have.

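In brief: a traceable Chinese-history MCP server. It queries nine official histories and philosophical texts from the pre-Qin through Han-Wei periods along four axes (event, person, modern place name, quality); every result carries its source citation, and machine-generated or machine-reviewed content is labeled as such.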

[Image: demo, every result is cited]

Zero runtime dependencies — pure Python standard library. No pip install of a framework, no MCP SDK; the whole server is auditable in a few files.
Read-only — opens the corpus with mode=ro + PRAGMA query_only; never writes.
Honest by construction — machine-generated punctuation / translation and machine-adjudicated status are labeled in every response (AIGC-compliant).

Why this exists: as of mid-2026 the public MCP ecosystem has no classical Chinese / Chinese-history server. This fills that gap. Income expectation is zero; the goal is a useful public good.

Contents: The four tools · Install & run · The corpus database · Honesty · Data & provenance · Design notes

The four tools

| Tool | Input | Returns |
| --- | --- | --- |
| `search_events` | `keyword` / `book` / `person` / `limit` | Cross-book fused historical events with per-source provenance (book · chapter · paragraph + role: primary/detailed/brief/comment/corroborating). `canonical_summary` is an LLM-fused machine narrative. |
| `get_person` | `name` (given name or alias) | Person profile (LLM-synthesized, `draft`) + others' appraisals (verbatim source quotes, each cited) + attributed qualities + events mentioning them. |
| `query_by_place` | `place` (today's place name) / `limit` | Ancient stories set on the land of a modern place, with citations. Same-name-different-place returns candidates for you to disambiguate; it never silently picks one. Directional/regional generic names are excluded. |
| `query_by_quality` | `quality` (from a 55-term controlled vocabulary, e.g. 忠 loyalty, 谋略 strategy) / `limit` / `include_draft` | Representative events and people for a quality, each with an original-text `evidence_quote` and rationale. |

Each tool call returns JSON. Multi-source events, person appraisals, and place/quality edges all carry the exact 【book → chapter → paragraph】 they came from — that is the point of the server.
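
To make the citation format concrete, here is a minimal client-side sketch that unpacks one `tools/call` response and renders its provenance records. It assumes the standard MCP result shape (a `content` list whose `text` item holds a JSON string). The payload keys `sources`, `book`, `chapter`, and `paragraph` are assumed names for illustration, not the server's documented schema, and the sample response is hand-written rather than real server output.

```python
import json


def format_citation(source):
    """Render one provenance record as 【book → chapter → paragraph】.

    The keys "book", "chapter", and "paragraph" are assumed names.
    """
    return "【{} → {} → {}】".format(
        source["book"], source["chapter"], source["paragraph"]
    )


def payload_from_response(line):
    """Decode the JSON payload carried by one tools/call response line.

    Assumes result.content is a list and the first item of type "text"
    holds a JSON-encoded string.
    """
    response = json.loads(line)
    for item in response["result"]["content"]:
        if item.get("type") == "text":
            return json.loads(item["text"])
    raise ValueError("no text content in response")


# Hand-written sample for illustration only; not real server output.
sample_payload = {
    "sources": [
        {"book": "史记", "chapter": "项羽本纪", "paragraph": 12, "role": "primary"}
    ]
}
sample_line = json.dumps(
    {
        "jsonrpc": "2.0",
        "id": 3,
        "result": {
            "content": [
                {"type": "text", "text": json.dumps(sample_payload, ensure_ascii=False)}
            ]
        },
    },
    ensure_ascii=False,
)

citations = [format_citation(s) for s in payload_from_response(sample_line)["sources"]]
print(citations)
```

Check the actual field names against a real response (for example from the handshake shown later) before relying on them.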

Install & run

Requires Python 3.7+ (standard library only). The server speaks MCP over stdio (newline-delimited JSON-RPC 2.0).

# 1. Get the corpus DB (see "The corpus database" below), then:
PYTHONPATH=src python3 -m storyextractor.mcp.server --db /path/to/corpus.db

Or pip install . to get a chinese-history-mcp console command:

pip install .
chinese-history-mcp --db /path/to/corpus.db
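
The newline-delimited JSON-RPC 2.0 framing mentioned above can be sketched in a few lines of standard-library Python. This is an illustrative loop, not the project's server code: `handle` and `serve` are hypothetical names, and only `ping` is implemented so the example stays short. It shows the three framing rules that matter: one JSON object per line, no reply to notifications (messages without an `id`), and standard JSON-RPC error codes for bad input.

```python
import io
import json


def handle(request):
    """Return a JSON-RPC response dict, or None when no reply is due."""
    if not isinstance(request, dict):
        return {"jsonrpc": "2.0", "id": None,
                "error": {"code": -32600, "message": "Invalid Request"}}
    if "id" not in request:
        return None  # notifications never get a response
    if request.get("method") == "ping":
        return {"jsonrpc": "2.0", "id": request["id"], "result": {}}
    return {"jsonrpc": "2.0", "id": request["id"],
            "error": {"code": -32601, "message": "Method not found"}}


def serve(stdin, stdout):
    """Read one JSON message per line and write one JSON reply per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            response = {"jsonrpc": "2.0", "id": None,
                        "error": {"code": -32700, "message": "Parse error"}}
        else:
            response = handle(request)
        if response is not None:
            stdout.write(json.dumps(response, ensure_ascii=False) + "\n")
            stdout.flush()


# Drive the loop with in-memory streams instead of real stdin/stdout.
incoming = io.StringIO(
    '{"jsonrpc":"2.0","id":1,"method":"ping"}\n'
    '{"jsonrpc":"2.0","method":"notifications/initialized"}\n'
    '{"jsonrpc":"2.0","id":2,"method":"nope"}\n'
)
outgoing = io.StringIO()
serve(incoming, outgoing)
replies = [json.loads(l) for l in outgoing.getvalue().splitlines()]
print(replies)
```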

Configure in an MCP client

Claude Desktop (claude_desktop_config.json), Cline, Continue, etc. — add one stdio server:

{
  "mcpServers": {
    "chinese-history": {
      "command": "python3",
      "args": ["-m", "storyextractor.mcp.server", "--db", "/path/to/corpus.db"],
      "env": { "PYTHONPATH": "src" },
      "cwd": "/absolute/path/to/chinese-history-mcp"
    }
  }
}

(After pip install . you can instead use "command": "chinese-history-mcp", "args": ["--db", "/path/to/corpus.db"] and drop env/cwd.)
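
For scripted setups, the server entry can be merged into an existing client config programmatically. The following is a minimal sketch assuming the post-install console-command form; `add_stdio_server` is a hypothetical helper, and locating and writing the client's config file is left to the caller because the path differs per client and OS.

```python
import json


def add_stdio_server(config, db_path, name="chinese-history"):
    """Return a copy of config with one stdio server entry added.

    Existing entries under "mcpServers" are preserved; an entry with the
    same name is overwritten.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {
        "command": "chinese-history-mcp",
        "args": ["--db", db_path],
    }
    merged["mcpServers"] = servers
    return merged


# "other" stands in for whatever servers the config already lists.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_stdio_server(existing, "/path/to/corpus.db")
print(json.dumps(updated, indent=2))
```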

Try one handshake by hand

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{}}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
  '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"query_by_quality","arguments":{"quality":"忠","limit":2}}}' \
  | PYTHONPATH=src python3 -m storyextractor.mcp.server --db /path/to/corpus.db
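
The same handshake can be driven from Python. This sketch sends the three messages above to a child process and decodes the replies. `exchange` is a hypothetical helper, and it assumes the server exits once stdin reaches end-of-file, as the shell pipeline implies. So the example runs without `corpus.db`, it is pointed at a stand-in child that simply echoes its input; swap `ECHO` for the real server command to talk to the server.

```python
import json
import subprocess
import sys

MESSAGES = [
    {"jsonrpc": "2.0", "id": 1, "method": "initialize",
     "params": {"protocolVersion": "2025-06-18", "capabilities": {}}},
    {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    {"jsonrpc": "2.0", "id": 3, "method": "tools/call",
     "params": {"name": "query_by_quality",
                "arguments": {"quality": "忠", "limit": 2}}},
]


def exchange(command, messages, timeout=30):
    """Send messages as newline-delimited JSON; return the decoded reply lines."""
    payload = "".join(json.dumps(m, ensure_ascii=False) + "\n" for m in messages)
    proc = subprocess.run(
        command,
        input=payload,
        capture_output=True,
        encoding="utf-8",
        timeout=timeout,
        check=True,
    )
    return [json.loads(line) for line in proc.stdout.splitlines() if line.strip()]


# Stand-in child process that copies stdin to stdout byte for byte.
ECHO = [sys.executable, "-c",
        "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]

replies = exchange(ECHO, MESSAGES)
print(len(replies), "lines received")
```

With the package installed, the real command would be `["chinese-history-mcp", "--db", "/path/to/corpus.db"]`.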

Demo + hallucination comparison

python3 scripts/mcp_demo.py --db /path/to/corpus.db runs a scripted tour of all four tools (also a minimal MCP-client reference). See docs/MCP_DEMO.md for a side-by-side of a bare LLM (fabricated / uncitable) vs. this server (cited) on the same questions.

The corpus database

corpus.db is not in this repository (it is a ~90 MB binary). Download it from this repo's Releases and point --db at it, or set STORYEXTRACTOR_DB=/path/to/corpus.db.
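
The two ways of locating the database can be combined as in the sketch below. The precedence shown here (the `--db` flag wins over `STORYEXTRACTOR_DB`) is an assumption, not documented behavior, and `resolve_db_path` is a hypothetical name.

```python
import argparse


def resolve_db_path(argv, environ):
    """Pick the corpus path from --db, falling back to STORYEXTRACTOR_DB.

    Assumption: the command-line flag takes precedence over the
    environment variable.
    """
    parser = argparse.ArgumentParser(prog="chinese-history-mcp")
    parser.add_argument("--db", default=None, help="path to corpus.db")
    args = parser.parse_args(argv)
    path = args.db or environ.get("STORYEXTRACTOR_DB")
    if not path:
        parser.error("no corpus DB: pass --db or set STORYEXTRACTOR_DB")
    return path


from_flag = resolve_db_path(["--db", "/data/corpus.db"],
                            {"STORYEXTRACTOR_DB": "/env/corpus.db"})
from_env = resolve_db_path([], {"STORYEXTRACTOR_DB": "/env/corpus.db"})
print(from_flag, from_env)
```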

The database is read-only at runtime. If you host it on a read-only medium, make sure the release artifact was produced with sqlite3 corpus.db "VACUUM INTO 'corpus_release.db'" (single file, no -wal/-shm sidecars).
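
Read-only access of this kind can be reproduced with the standard `sqlite3` module. The sketch below is not the project's `db.py`; `open_read_only` is a hypothetical helper that combines a `mode=ro` URI with `PRAGMA query_only`, then shows against a throwaway fixture database that reads succeed and writes are rejected.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(path):
    """Open a SQLite file so that neither the OS handle nor SQL can write."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    conn.execute("PRAGMA query_only = ON")  # second guard at the SQL layer
    return conn


with tempfile.TemporaryDirectory() as tmp:
    db_file = os.path.join(tmp, "fixture.db")

    # Build a tiny fixture with a normal read-write connection.
    writer = sqlite3.connect(db_file)
    writer.execute("CREATE TABLE t (x TEXT)")
    writer.execute("INSERT INTO t VALUES ('ok')")
    writer.commit()
    writer.close()

    conn = open_read_only(db_file)
    rows = conn.execute("SELECT x FROM t").fetchall()
    try:
        conn.execute("INSERT INTO t VALUES ('nope')")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    conn.close()

print(rows, write_blocked)
```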

Honesty (please read)

This server is designed for provenance, not to launder machine output as scholarship. Downstream clients and LLMs must not present its results as "individually human-reviewed." Every response labels what it is:

Events review_status='approved' — mostly machine bulk-approved credible inferences, not per-item human review.
Person profiles review_status='draft' — LLM-synthesized, not human-vetted.
Quality mappings — auto_approved = multi-LLM machine consensus, draft = pending review; evidence_quote is a real substring of the source, rationale is an LLM's reasoning.
Place mappings — mostly multi-LLM machine consensus (auto_approved), a few human-approved; confidence is bucketed high/medium/doubtful.
Text — original is public-domain 白文 with machine-generated punctuation/segmentation; vernacular translation is fully machine-generated.
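
A downstream client can carry these labels through to its users instead of dropping them. The sketch below is one possible approach, not part of the server: the record kinds (`event`, `person`, `quality`, `place`) are hypothetical keys chosen for illustration, while the status values and their meanings are taken from the list above. It also shows the substring check that the `evidence_quote` guarantee makes possible.

```python
# Meanings copied from the honesty list above; the "kind" keys are hypothetical.
STATUS_NOTES = {
    ("event", "approved"): "mostly machine bulk-approved inference, not per-item human review",
    ("person", "draft"): "LLM-synthesized profile, not human-vetted",
    ("quality", "auto_approved"): "multi-LLM machine consensus",
    ("quality", "draft"): "pending review",
    ("place", "auto_approved"): "multi-LLM machine consensus",
}


def describe_status(kind, status):
    """Return a reader-facing note for a record's review_status."""
    default = "review_status '{}' as reported by the server".format(status)
    return STATUS_NOTES.get((kind, status), default)


def quote_is_verbatim(evidence_quote, source_text):
    """True when the quote is a non-empty exact substring of the source text."""
    return bool(evidence_quote) and evidence_quote in source_text


print(describe_status("event", "approved"))
print(describe_status("person", "draft"))
```

Running `quote_is_verbatim` against the cited paragraph is a cheap way to catch a quote that was altered somewhere between the server and the final answer.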

The server also does not eliminate downstream hallucination: it gives you citable retrieval facts; an LLM built on top can still confabulate around them. The citations are anchors for human verification.

Scope is the 9 texts above — "not found" means "not in this corpus," not "did not happen."

Data & provenance

Original text: public-domain classical Chinese 白文 (unpunctuated base text from public-domain editions), with self-produced, machine-generated punctuation and segmentation (not copied from any modern annotated/collated edition).
Vernacular translation: machine-generated across the whole corpus.
Annotations (events / entities / places / qualities): machine-assisted, with human review gating on selected layers; status is reported per record.

License

Code (this repository): MIT — see LICENSE.
Corpus data (corpus.db, distributed via Releases): CC BY 4.0.

The text layer is self-produced (punctuation/segmentation) over public-domain base text, so it is distributed freely; machine-generated attributes are labeled throughout for AIGC compliance.

Design notes

Pure stdlib hand-written stdio JSON-RPC 2.0 (initialize / tools/list / tools/call + ping / notifications). No third-party MCP SDK.
Read-only DB access (src/storyextractor/mcp/db.py): mode=ro + PRAGMA query_only; the migration-running db.connect is never used at serve time.
Tests: python3 tests/test_mcp_server.py (read-only enforcement, protocol shapes/error codes, honest review_status, alias token-exact matching + disambiguation, LIKE-wildcard escaping) — builds a temporary fixture DB, so it runs without corpus.db.
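
The LIKE-wildcard escaping mentioned in the test list is worth illustrating, because an unescaped `%` or `_` in a user keyword silently widens the match. This is a generic sketch of the technique with SQLite's `ESCAPE` clause, not the project's query code; `escape_like`, `search`, the `events` table, and its rows are fixture names invented for the example.

```python
import sqlite3


def escape_like(term, escape="\\"):
    """Escape LIKE metacharacters so the term matches literally."""
    return (
        term.replace(escape, escape + escape)  # escape the escape char first
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (title TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("a_b",), ("aXb",), ("a%b",), ("plain",)],
)


def search(keyword):
    """Substring search in which the keyword's % and _ are literal."""
    pattern = "%" + escape_like(keyword) + "%"
    rows = conn.execute(
        "SELECT title FROM events WHERE title LIKE ? ESCAPE '\\' ORDER BY rowid",
        (pattern,),
    )
    return [row[0] for row in rows]


def search_unescaped(keyword):
    """Same query without escaping, kept only to show the difference."""
    rows = conn.execute(
        "SELECT title FROM events WHERE title LIKE ? ORDER BY rowid",
        ("%" + keyword + "%",),
    )
    return [row[0] for row in rows]


print(search("a_b"), search_unescaped("a_b"))
```

Without escaping, the `_` in the keyword acts as a single-character wildcard, so the second query also returns rows the user did not ask for.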

Contributing & project meta

CONTRIBUTING.md — how to run tests/lint and the principles this project holds to.
CHANGELOG.md — release history.
SECURITY.md — threat surface (read-only, no network) and how to report issues.

Issues and pull requests are welcome. Please keep the constraints in mind: zero runtime dependencies, read-only, every result cited, honest review_status.

Release history

0.1.1: Jul 4, 2026
0.1.0: Jul 4, 2026


