Bidirectional Chinese-history time mapping with a transparent cleaning pipeline (import as `chhiskit`).
🏛️ Chinese History Toolkits
Map any year to its Chinese dynasty / reign-era / epoch — and back.
Built on the Shanghai Library open data platform (879 raw records), with a transparent cleaning pipeline so every change traces back to the source — and a typed, dependency-light Python API on top.
✨ Features
- 🔁 Two-way lookup: name → `(begin, end)` years; year → list of all parallel polities.
- 🪨 Full timeline: 旧石器 / 新石器 prehistoric brackets through 清.
- 🧹 Pre-cleaned data: F1 truncations dropped, F2 split-spans merged, missing endYears hand-filled, ancient-name reuses disambiguated (`夏(窦建德)` ≠ `夏`).
- 📜 Auditable: every cleaning decision is one row in `dynasty_drops.md` with a clickable upstream URI; every modification at clean time emits one `RawDataModifiedWarning`.
- 🌐 Alias-friendly: pass `aliases={"新石器": {"Neolithic", "Neo"}}` to accept foreign-language input.
- ⚡ Lightweight: only `pandas` at runtime; data ships in the repo, no network needed.
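The alias feature can be pictured as a reverse lookup from each accepted spelling to its canonical name. The following is only a minimal sketch of that idea, not the package's actual implementation: `build_alias_index` and `resolve_name` are hypothetical helpers, and case-insensitive matching is an assumption.

```python
from typing import Dict, Set


def build_alias_index(aliases: Dict[str, Set[str]]) -> Dict[str, str]:
    """Invert {canonical: {alias, ...}} into {casefolded alias: canonical}."""
    index: Dict[str, str] = {}
    for canonical, names in aliases.items():
        for name in names:
            key = name.casefold()
            # Refuse ambiguous aliases rather than silently picking one.
            if key in index and index[key] != canonical:
                raise ValueError(f"alias {name!r} maps to two canonical names")
            index[key] = canonical
    return index


def resolve_name(name: str, aliases: Dict[str, Set[str]]) -> str:
    """Return the canonical name for an alias, or the input unchanged."""
    return build_alias_index(aliases).get(name.casefold(), name)


aliases = {"新石器": {"Neolithic", "Neo"}}
print(resolve_name("Neolithic", aliases))  # → 新石器
print(resolve_name("唐", aliases))         # → 唐 (no alias, passed through)
```

Names without an alias pass through unchanged, so Chinese input keeps working when an alias table is supplied.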
🚀 Quickstart
# From PyPI — distribution name is `chinese_history_toolkits`
pip install chinese_history_toolkits
# Or from source
git clone https://github.com/SongshGeoLab/chinese_history_toolkits.git
cd chinese_history_toolkits
uv sync --all-extras
Note: the PyPI distribution name is `chinese_history_toolkits` (verbose), but the import name is the short acronym `chhiskit` for daily use.
import chhiskit
# Name → years
chhiskit.get_age_from_cultural_period("康熙") # → (1662.0, 1722.0)
chhiskit.get_age_from_cultural_period("唐", level="dynasty") # → (618.0, 907.0)
chhiskit.get_age_from_cultural_period("新石器", level="epoch") # → (-10000.0, -2070.0)
# Year → matching polities (multiple are normal — 三国, 隋末, etc.)
[m.dynasty_id for m in chhiskit.get_cultural_periods_from_year(250)]
# → ['三国', '吴', '蜀', '魏']
# BP convention (radiocarbon, 1950 reference)
chhiskit.get_age_from_cultural_period("商", level="dynasty", anno_domini=False)
# → (3509.0, 3073.0)
# Foreign aliases
chhiskit.get_age_from_cultural_period(
    "Neolithic", level="epoch",
    aliases={"新石器": {"Neolithic", "Neo"}},
)
# → (-10000.0, -2070.0)
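The `anno_domini=False` example above returns years Before Present (BP), counted back from 1950. Here is a minimal sketch of the arithmetic, assuming a plain `1950 - year` conversion with no year-zero correction (the package may handle this differently). `ad_to_bp` and `bp_to_ad` are hypothetical helpers, not part of `chhiskit`.

```python
BP_REFERENCE_YEAR = 1950  # radiocarbon "present", per the BP comment in the Quickstart


def ad_to_bp(year: float) -> float:
    """Convert an AD/CE year (negative for BCE) to years Before Present."""
    return BP_REFERENCE_YEAR - year


def bp_to_ad(bp: float) -> float:
    """Convert years Before Present back to an AD/CE year."""
    return BP_REFERENCE_YEAR - bp


# Assumed convention only: map the BP pair from the Quickstart back to AD years.
print(bp_to_ad(3509.0), bp_to_ad(3073.0))  # → -1559.0 -1123.0
```

Under this assumption the conversion is its own inverse, so a BP range always lists the larger number first when the AD range is in chronological order.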
🏗️ Architecture at a glance
data.library.sh.cn ─► scrape ─► validate ─► clean ─► dynasty_clean.csv (854)
                                                     dynasty_drops.md (audit)
                                                             │
                                                             ▼
                                              src/chhiskit/core/dynasties.py
                                               (runtime API, two functions)
| Stage | Script | Output |
|---|---|---|
| scrape | `scripts/dynasties/scrape_dynasty.py` | `dynasty_temporal.csv` (879 rows) |
| validate | `scripts/dynasties/validate_dynasties.py` | `dynasty_issues.csv` (189 flagged) |
| clean | `scripts/dynasties/clean_dynasties.py` | `dynasty_clean.csv` + `dynasty_drops.md` |
The runtime API only reads dynasty_clean.csv. To change a cleaning rule, edit the script and re-run — the diff in dynasty_clean.csv and dynasty_drops.md makes the change reviewable.
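Because the runtime API only reads `dynasty_clean.csv`, a year lookup boils down to an interval filter over that table. A minimal stdlib-only sketch of that idea follows. The column names (`name`, `begin`, `end`), the inclusive bounds, and the sample rows are illustrative assumptions, not the real file's schema or contents, and the package itself uses pandas.

```python
import csv
import io
from typing import List

# Illustrative stand-in for dynasty_clean.csv; columns and rows are assumed.
SAMPLE_CSV = """name,begin,end
魏,220,265
蜀,221,263
吴,222,280
唐,618,907
"""


def periods_for_year(year: float, csv_text: str = SAMPLE_CSV) -> List[str]:
    """Return every name whose [begin, end] interval contains the year."""
    rows = csv.DictReader(io.StringIO(csv_text))
    hits = [
        row["name"]
        for row in rows
        if float(row["begin"]) <= year <= float(row["end"])  # inclusive bounds (assumed)
    ]
    return sorted(hits)


print(periods_for_year(250))  # → ['吴', '蜀', '魏']
print(periods_for_year(700))  # → ['唐']
```

Returning a list rather than a single match reflects the point made in the Quickstart: several parallel polities for one year are normal.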
📚 Documentation
| Page | Contents |
|---|---|
| 📖 Quick Start | Install + first lookup, 5 minutes |
| 🧹 Data Pipeline | Scrape → validate → clean explained |
| 📚 API Reference | Every parameter, with worked examples |
| 🗺️ Epochs Reference | EPOCH_MAP + PREHISTORIC_EPOCHS |
Build the site locally:
make docs # serve at http://127.0.0.1:8000
make docs-build # static build
🧪 Development
make test # pytest
pre-commit run --all-files # black + ruff + flake8 + mypy + interrogate
make tox # Python 3.10–3.13 matrix
The test file tests/test_dynasties.py is the executable spec — 103 cases organized into one class per behavior cluster, each with a docstring explaining what it pins.
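The "one class per behavior cluster, one docstring per case" layout could look roughly like the sketch below. It is not copied from `tests/test_dynasties.py`: the class and method names are invented, and `lookup` is a tiny stand-in so the example runs without the package installed. The expected values are the ones shown in the Quickstart.

```python
from typing import Dict, Tuple

# Stand-in for the real lookup so the sketch is self-contained;
# the two entries reuse values from the Quickstart examples.
_FAKE_TABLE: Dict[str, Tuple[float, float]] = {
    "康熙": (1662.0, 1722.0),
    "唐": (618.0, 907.0),
}


def lookup(name: str) -> Tuple[float, float]:
    """Hypothetical helper: return (begin, end) for a known name."""
    return _FAKE_TABLE[name]


class TestNameToYears:
    """Behavior cluster: name → (begin, end)."""

    def test_reign_era(self) -> None:
        """Pins that a reign-era name resolves to its begin/end years."""
        assert lookup("康熙") == (1662.0, 1722.0)

    def test_dynasty(self) -> None:
        """Pins that a dynasty name resolves to its begin/end years."""
        assert lookup("唐") == (618.0, 907.0)
```

pytest collects classes prefixed with `Test` and methods prefixed with `test_`, so each docstring sits next to the behavior it pins.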
📄 Data attribution
Source: 上海图书馆开放数据平台 (data.library.sh.cn). All cleaning decisions are documented in data/dynasties/dynasty_drops.md with a clickable URI back to the source for each modified row.
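The audit trail pairs every clean-time modification with one warning. Below is a rough sketch of that pattern using only the standard `warnings` module. The `RawDataModifiedWarning` class is a local stand-in (its real base class and import path are not shown in this README), and `fill_missing_end`, the record fields, and the URI are hypothetical.

```python
import warnings
from typing import Dict, Optional

Record = Dict[str, Optional[float]]


class RawDataModifiedWarning(UserWarning):
    """Local stand-in; the package's real class may be defined differently."""


def fill_missing_end(record: Record, end: float, uri: str) -> Record:
    """Return a copy with `end` filled in, emitting one warning if it changed."""
    if record.get("end") is not None:
        return dict(record)  # nothing modified, so no warning
    warnings.warn(
        f"end year set to {end} for {uri}",
        RawDataModifiedWarning,
        stacklevel=2,
    )
    return {**record, "end": end}


PLACEHOLDER_URI = "https://example.org/record/1"  # hypothetical, not a real source URI

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # do not deduplicate repeated warnings
    cleaned = fill_missing_end({"begin": 618.0, "end": None}, 907.0, PLACEHOLDER_URI)
    untouched = fill_missing_end({"begin": 618.0, "end": 907.0}, 907.0, PLACEHOLDER_URI)

print(len(caught), cleaned["end"])  # → 1 907.0
```

Capturing the warnings with `catch_warnings(record=True)` is one way a cleaning script could collect them into an audit file; whether the project does it this way is not stated here.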
🤝 Contributing
PRs welcome. Please:
- Run `pre-commit run --all-files` and `make test` (must pass).
- If you change cleaning rules, regenerate `dynasty_clean.csv` and `dynasty_drops.md` and commit both; the diff is your change's audit trail.
- New behavior → new test in `tests/test_dynasties.py` with a docstring describing what it pins.
📜 License
MIT — see LICENSE.
👤 Author
Download files
File details
Details for the file chinese_history_toolkits-0.1.1.tar.gz.
File metadata
- Download URL: chinese_history_toolkits-0.1.1.tar.gz
- Upload date:
- Size: 42.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.9 (`uv publish`, run in CI on Ubuntu 24.04)
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `e32857485919ff5c8d694bce7d6d44e354cf844e9bd0549a7cfc51fc3496a17e` |
| MD5 | `96db7dcdbb73d28cc1a4f50ca9a348d4` |
| BLAKE2b-256 | `823774e6cdb4d38bd76ded3a01f8a16dc44cece0a5b023a74aa27326df127319` |
File details
Details for the file chinese_history_toolkits-0.1.1-py3-none-any.whl.
File metadata
- Download URL: chinese_history_toolkits-0.1.1-py3-none-any.whl
- Upload date:
- Size: 39.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.9 (`uv publish`, run in CI on Ubuntu 24.04)
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `57ed5453cc5a7f200d9e04c016857e219ab68732fed05b59e93e32e9308ab903` |
| MD5 | `67b6a9de39bfd9d911ae33cf59a79b56` |
| BLAKE2b-256 | `560868650b8ee4c206554bc74a3af4237c0d3ba50c94237e35fdc163845a2014` |