Bidirectional Chinese-history time mapping with a transparent cleaning pipeline (import as `chhiskit`).

🏛️ Chinese History Toolkits

Map any year to its Chinese dynasty / reign-era / epoch — and back.

Built on the Shanghai Library open data platform (879 raw records), with a transparent cleaning pipeline so every change traces back to the source — and a typed, dependency-light Python API on top.

✨ Features

  • 🔁 Two-way lookup — name → (begin, end) years; year → list of all parallel polities.
  • 🪨 Full timeline — 旧石器 / 新石器 prehistoric brackets through 清.
  • 🧹 Pre-cleaned data — F1 truncations dropped, F2 split-spans merged, missing endYears hand-filled, ancient-name reuses disambiguated (夏(窦建德)).
  • 📜 Auditable — every cleaning decision is one row in dynasty_drops.md with a clickable upstream URI; every modification at clean time emits one RawDataModifiedWarning.
  • 🌐 Alias-friendly — pass aliases={"新石器": {"Neolithic", "Neo"}} to accept foreign-language input.
  • Lightweight — only pandas at runtime; data ships in the repo, no network needed.
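The two-way lookup and alias handling listed above can be pictured as interval queries over (begin, end) spans. The following is only a minimal sketch of that idea, not the package's implementation: `Period`, `PERIODS`, `resolve_alias`, `span_of`, and `periods_in` are hypothetical names, and the four rows use commonly cited approximate dates rather than values from dynasty_clean.csv.

```python
from typing import NamedTuple


class Period(NamedTuple):
    """One polity with an inclusive [begin, end] span in signed calendar years."""

    name: str
    begin: int
    end: int


# Illustrative rows only (commonly cited approximate dates), not the package's data.
PERIODS = [
    Period("三国", 220, 280),
    Period("魏", 220, 266),
    Period("蜀", 221, 263),
    Period("吴", 222, 280),
]


def resolve_alias(name, aliases=None):
    """Map an alias back to its canonical name; unknown names pass through."""
    for canonical, alternatives in (aliases or {}).items():
        if name in alternatives:
            return canonical
    return name


def span_of(name, aliases=None):
    """Name -> (begin, end); raises KeyError when the name is not in PERIODS."""
    canonical = resolve_alias(name, aliases)
    for period in PERIODS:
        if period.name == canonical:
            return (period.begin, period.end)
    raise KeyError(name)


def periods_in(year):
    """Year -> sorted names of every period whose span contains the year."""
    return sorted(p.name for p in PERIODS if p.begin <= year <= p.end)


print(span_of("Shu", aliases={"蜀": {"Shu", "Shu Han"}}))  # → (221, 263)
print(periods_in(250))  # → ['三国', '吴', '蜀', '魏']
```

Returning a list from the year lookup, rather than a single name, is what lets overlapping polities coexist in the result.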

🚀 Quickstart

# From PyPI — distribution name is `chinese_history_toolkits`
pip install chinese_history_toolkits

# Or from source
git clone https://github.com/SongshGeoLab/chinese_history_toolkits.git
cd chinese_history_toolkits
uv sync --all-extras

Note: the package is published on PyPI as chinese_history_toolkits, but it is imported under the shorter name chhiskit.

import chhiskit

# Name → years
chhiskit.get_age_from_cultural_period("康熙")                              # → (1662.0, 1722.0)
chhiskit.get_age_from_cultural_period("唐", level="dynasty")               # → (618.0, 907.0)
chhiskit.get_age_from_cultural_period("新石器", level="epoch")             # → (-10000.0, -2070.0)

# Year → matching polities (multiple are normal — 三国, 隋末, etc.)
[m.dynasty_id for m in chhiskit.get_cultural_periods_from_year(250)]
# → ['三国', '吴', '蜀', '魏']

# BP convention (radiocarbon, 1950 reference)
chhiskit.get_age_from_cultural_period("商", level="dynasty", anno_domini=False)
# → (3509.0, 3073.0)

# Foreign aliases
chhiskit.get_age_from_cultural_period(
    "Neolithic", level="epoch",
    aliases={"新石器": {"Neolithic", "Neo"}},
)
# → (-10000.0, -2070.0)
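The BP ("before present") values in the quickstart use 1950 as the reference year. A minimal sketch of that arithmetic follows, assuming the conversion is a plain subtraction from 1950 on signed years with no year-zero adjustment; the README does not spell out the library's exact rule, and `ad_to_bp`, `bp_to_ad`, and `span_to_bp` are hypothetical helpers, not part of chhiskit.

```python
BP_REFERENCE_YEAR = 1950  # "present" in the radiocarbon convention


def ad_to_bp(year):
    """Signed calendar year (negative = BCE) -> years before 1950."""
    return BP_REFERENCE_YEAR - year


def bp_to_ad(bp):
    """Years before 1950 -> signed calendar year (inverse of ad_to_bp)."""
    return BP_REFERENCE_YEAR - bp


def span_to_bp(span):
    """Convert a (begin, end) span; the older bound keeps the larger BP value."""
    return tuple(ad_to_bp(year) for year in span)


# The Neolithic span from the quickstart, converted under the assumed rule.
print(span_to_bp((-10000.0, -2070.0)))  # → (11950.0, 4020.0)

# Under the same rule, the BP pair shown for 商 maps back to these signed years.
print(bp_to_ad(3509.0), bp_to_ad(3073.0))  # → -1559.0 -1123.0
```

Because the same subtraction converts in both directions, the two helpers are inverses of each other.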

🏗️ Architecture at a glance

data.library.sh.cn  ─►  scrape  ─►  validate  ─►  clean  ─►  dynasty_clean.csv (854)
                                                              dynasty_drops.md   (audit)
                                                                       │
                                                                       ▼
                                                       src/chhiskit/core/dynasties.py
                                                       (runtime API, two functions)

| Stage | Script | Output |
| --- | --- | --- |
| scrape | scripts/dynasties/scrape_dynasty.py | dynasty_temporal.csv (879 rows) |
| validate | scripts/dynasties/validate_dynasties.py | dynasty_issues.csv (189 flagged) |
| clean | scripts/dynasties/clean_dynasties.py | dynasty_clean.csv + dynasty_drops.md |

The runtime API only reads dynasty_clean.csv. To change a cleaning rule, edit the script and re-run — the diff in dynasty_clean.csv and dynasty_drops.md makes the change reviewable.
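To illustrate the audit idea (one audit entry and one warning per modification), here is a hedged sketch of a single cleaning step. It is not the code in clean_dynasties.py: the `RawDataModifiedWarning` class below is a local stand-in for the warning the README names, and `fill_missing_end_year`, the `uri` and `beginYear` field names, and the sample record are all hypothetical.

```python
import warnings


class RawDataModifiedWarning(UserWarning):
    """Local stand-in for the warning named in the README; not imported from chhiskit."""


def fill_missing_end_year(row, manual_end_years, audit_log):
    """Return a copy of row with endYear filled from a hand-maintained table.

    Every modification appends one audit entry and emits one warning.
    """
    if row.get("endYear") is not None:
        return dict(row)  # untouched rows produce no audit entry and no warning
    fixed = dict(row)
    fixed["endYear"] = manual_end_years[row["uri"]]
    audit_log.append(
        {"uri": row["uri"], "rule": "fill_end_year", "new": fixed["endYear"]}
    )
    warnings.warn(
        f"endYear filled for {row['uri']}", RawDataModifiedWarning, stacklevel=2
    )
    return fixed


# Made-up record purely for demonstration.
audit = []
row = {"uri": "example:record-1", "beginYear": 100, "endYear": None}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = fill_missing_end_year(row, {"example:record-1": 200}, audit)

print(cleaned["endYear"], len(audit), len(caught))  # → 200 1 1
```

Keeping the audit entry and the warning in the same code path means a modification cannot happen without leaving both traces.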

📚 Documentation

  • 📖 Quick Start: Install + first lookup, 5 minutes
  • 🧹 Data Pipeline: Scrape → validate → clean explained
  • 📚 API Reference: Every parameter, with worked examples
  • 🗺️ Epochs Reference: EPOCH_MAP + PREHISTORIC_EPOCHS

Build the site locally:

make docs       # serve at http://127.0.0.1:8000
make docs-build # static build

🧪 Development

make test                       # pytest
pre-commit run --all-files      # black + ruff + flake8 + mypy + interrogate
make tox                        # Python 3.10–3.13 matrix

The test file tests/test_dynasties.py is the executable spec — 103 cases organized into one class per behavior cluster, each with a docstring explaining what it pins.

📄 Data attribution

Source: 上海图书馆开放数据平台 (data.library.sh.cn). All cleaning decisions are documented in data/dynasties/dynasty_drops.md with a clickable URI back to the source for each modified row.

🤝 Contributing

PRs welcome. Please:

  1. Run `pre-commit run --all-files` and `make test`; both must pass.
  2. If you change cleaning rules, regenerate dynasty_clean.csv and dynasty_drops.md and commit both — the diff is your change's audit trail.
  3. New behavior → new test in tests/test_dynasties.py with a docstring describing what it pins.

📜 License

MIT — see LICENSE.

👤 Author

SongshGeo · GitHub · Website
