Open-source framework for reading Andean khipus — numerical (Locke) and textual (ALBA syllabary) channels.

KhipuReader

Read in Spanish on GitHub

An open-source tool for reading Andean khipus — translation engine, validated corpus, and reproducibility scripts for the ALBA syllabary (Sivan, 2026).

619 khipus survive in museums around the world. Each one is a knotted-cord document from the Inca Empire — a tax record, an astronomical journal, a legal proceeding, a census, a map. We can read the numbers (Locke 1923). We are starting to read the words (ALBA syllabary, Sivan 2026).

KhipuReader translates any khipu from the Open Khipu Repository — install it, pick a khipu, and read what's on it.

For those who want to go further, the project also hosts a community effort to build the reconstructed library of the Inca Empire — one khipu at a time.

Community progress

[======>                        ] 70/619 khipus analyzed (11.3%)

Khipu   Type                       Summary
AS069   Astronomical catalog       Observation catalog, dated April 1453 CE, Lluta Valley
AS075   Pilgrimage register        Pachacamac oracle ceremonies, 186 cords
AS076   Naming ceremony            Identity declaration (rutuchikuy), Paris
AS077   Zone inventory             4 geographic zones, regular 4-column format, Paris
AS080   Cadastral survey           6-step surveyor's route with landmarks, Paris
HP020   Cadastral survey           Location instruction, Pachacamac
UR006   Astronomical journal       24 months x 9 columns, dated June 1473 CE, Leymebamba
UR050   Lineage land registry      Cadastral record on 164 all-white cords
UR051   Labor corvée               Corvée register with chiastic poetic structure, 98 cords
UR054   Physical assets register   Blue twin of UR050 (white land registry), 115 cords
UR055   Succession oracle          Administrator consults oracle, 180 cords, dated Feb 1519 CE
UR176   Judicial proceeding        Murder of Chuquitanta: mother slaughtered, the falcon condemned
UR193   Oracle consultation        Consultation register from Pachacamac, 41 sessions
UR1091  Judicial proceeding        Murder case: death periphrasis (tana mana...llapa)

Run khipu progress to generate the full progress report, or see PROGRESS.md.


Quick start

Install

pip install khipu-reader

On first use, the tool automatically downloads the Open Khipu Repository database (~50 MB).

Translate a khipu

khipu translate UR039 --lang en

Find similar khipus

khipu suggest UR039

Compare two khipus

khipu compare UR039 UR144

Contribute your reading

khipu submit UR039          # generates contributions/UR039.json
# Edit the file, add your analysis
# Submit a Pull Request

See what's left to do

khipu unclaimed             # 549 khipus waiting to be read

How it works

Khipus encode information on two channels:

Channel  Component                  Decoding
Numbers  Simple knots (S-type)      Locke decimal system (1923), established
Text     Long knots + figure-eight  ALBA syllabary (Sivan 2026), proposed

5.4% of knotted cords carry multiple long/figure-eight knots, making them incompatible with the decimal system. These "STRING" cords are candidates for textual encoding.
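
As a rough illustration of the two channels, the sketch below reads a cord as a Locke decimal number and flags any cord with more than one long or figure-eight knot as a STRING candidate. The `Knot` class and both function names are hypothetical, invented for this example; they are neither the package's API nor the OKR schema. The sketch follows the conventional Locke reading, in which simple knots count tens and higher powers, while a single long knot (one unit per turn) or a figure-eight knot (one unit) fills the units position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Knot:
    """Hypothetical, simplified knot record (not the OKR schema)."""
    kind: str       # "S" simple, "L" long, "E" figure-eight
    turns: int = 1  # number of turns, meaningful for long knots
    power: int = 0  # decimal position of a simple knot (1 = tens, 2 = hundreds)


def is_string_candidate(knots):
    """True when a cord carries more than one long or figure-eight knot."""
    return sum(k.kind in ("L", "E") for k in knots) > 1


def locke_value(knots):
    """Decimal value of a cord, or None for a STRING candidate."""
    if is_string_candidate(knots):
        return None
    total = 0
    for k in knots:
        if k.kind == "S":
            total += 10 ** k.power   # each simple knot counts once at its position
        elif k.kind == "L":
            total += k.turns         # a long knot of n turns is n units
        elif k.kind == "E":
            total += 1               # a figure-eight knot is a single unit
    return total


# Three knots in the hundreds, two in the tens, one long knot of five turns.
numeric = [Knot("S", power=2)] * 3 + [Knot("S", power=1)] * 2 + [Knot("L", turns=5)]
# Two long knots on one cord: no decimal reading, so a STRING candidate.
textual = [Knot("L", turns=3), Knot("L", turns=4)]
print(locke_value(numeric))  # 325
print(locke_value(textual))  # None
```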

The ALBA syllabary v3

Knot  Turns  Onset  Coda  Confidence
L0    0      lla    lla   High
L2    2      chi    ki    High
L3    3      ma     ma    High
L4    4      ka     ka    High
L5    5      ta     ta    High
L6    6      pa     pa    High
L7    7      wa     y     High
L8    8      cha    na    High
L9    9      pi     pi    High
L10   10     si     si    Medium
L11   11     ti     ti    Low
L12   12     ku     ku    Low
E     fig-8  qa           High

16 effective symbols: 13 knot types, three of which take a different value in onset and coda position (L2 chi/ki, L7 wa/y, L8 cha/na). These three variants follow natural phonological patterns.

Research hypothesis. The ALBA syllabary is a proposed decipherment (p = 0.001) — not a confirmed reading system. Use with scholarly caution.
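
The syllabary can be written down as a small lookup table. The sketch below does so under one loud assumption: that the "Onset" value applies to every knot except the last on a cord, and the "Coda" value to the last. The source does not spell out this rule, and the names `ALBA_V3`, `read_cord`, and `effective_symbols` are illustrative rather than the package's API. The table lists a single value for the figure-eight knot, so the sketch uses qa in both positions.

```python
# Symbol -> (onset reading, coda reading), copied from the v3 table.
ALBA_V3 = {
    "L0": ("lla", "lla"),
    "L2": ("chi", "ki"),
    "L3": ("ma", "ma"),
    "L4": ("ka", "ka"),
    "L5": ("ta", "ta"),
    "L6": ("pa", "pa"),
    "L7": ("wa", "y"),
    "L8": ("cha", "na"),
    "L9": ("pi", "pi"),
    "L10": ("si", "si"),
    "L11": ("ti", "ti"),
    "L12": ("ku", "ku"),
    "E": ("qa", "qa"),  # single value in the table; assumed for both positions
}


def read_cord(symbols):
    """Join syllables. ASSUMPTION: the last knot takes its coda value, the rest onset."""
    last = len(symbols) - 1
    return "".join(ALBA_V3[s][1 if i == last else 0] for i, s in enumerate(symbols))


def effective_symbols():
    """Distinct syllable values across both positions."""
    return {value for pair in ALBA_V3.values() for value in pair}


print(read_cord(["L3", "L3"]))   # mama
print(read_cord(["L7", "L8"]))   # wana
print(len(effective_symbols()))  # 16
```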


CLI reference

Command                Description
khipu translate ID     Translate a khipu (summary + optional exports)
khipu suggest ID       Find the 5 most similar khipus
khipu compare ID1 ID2  Side-by-side comparison
khipu unclaimed        List unanalyzed khipus
khipu submit ID        Generate contribution template (JSON)
khipu progress         Generate PROGRESS.md
khipu list             List all 619 khipus
khipu search KEYWORD   Search by provenance, museum, ID
khipu info ID          Show khipu metadata
khipu syllabary        Print the ALBA syllabary

All commands accept --db path/to/khipu.db to use a local database.
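
For scripted use, the commands above can be driven from Python through `subprocess`. This is a minimal sketch and not part of the package: `khipu_command` and `run_khipu` are hypothetical helpers, and placing `--db` after the positional arguments is an assumption about argument order.

```python
import subprocess


def khipu_command(subcommand, *args, db=None):
    """Build the argument list for one CLI call (hypothetical helper)."""
    cmd = ["khipu", subcommand, *args]
    if db is not None:
        cmd += ["--db", str(db)]  # ASSUMPTION: --db may follow the positionals
    return cmd


def run_khipu(subcommand, *args, db=None):
    """Run the CLI and return its standard output as text."""
    result = subprocess.run(
        khipu_command(subcommand, *args, db=db),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

With the `khipu` executable on the PATH, `run_khipu("compare", "UR039", "UR144")` would return whatever the comparison prints to standard output.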


How to contribute

You don't need to be a Quechua speaker. There are 4 levels:

Level 1 — Triage

Run khipu translate and describe what you see: "looks like a cadastre", "purely numerical", "lots of kinship words". Anyone can do this.

Level 2 — Context

Research the provenance. What site is it from? What museum? Are there other khipus from the same place? Historians and archaeologists shine here.

Level 3 — Interpretation

Propose column names, identify the document type, cross-reference with colonial sources. Linguists and Andean specialists needed.

Level 4 — Reconstruction

Create the "library" Excel file — the khipu as the Inca would have written it in a spreadsheet. The expert level.

The workflow

khipu submit UR039                    # 1. Generate template
nano contributions/UR039.json         # 2. Add your analysis
git add contributions/UR039.json      # 3. Stage the file
git commit -m "Add UR039 reading"     #    and commit it
# Submit a Pull Request                # 4. Community reviews

See CONTRIBUTING.md for the full guide.
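
Because step 2 is a manual edit, it can help to confirm that the file still parses as JSON before committing. The helper below is a small sketch for that check only; `load_contribution` is a hypothetical name, and the code makes no assumption about the fields inside the template.

```python
import json
from pathlib import Path


def load_contribution(path):
    """Parse a contribution file and return (khipu ID from the file name, parsed data)."""
    path = Path(path)
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)  # raises json.JSONDecodeError on malformed JSON
    return path.stem, data
```

Calling `load_contribution("contributions/UR039.json")` after editing raises an error if a stray comma or quote broke the file, and otherwise returns the ID `"UR039"` together with the parsed content.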


Project structure

KhipuReader/
├── src/khipu_translator/    # Core translation engine
├── contributions/           # One JSON per analyzed khipu (community-built)
├── reproducibility/         # Validation and reproducibility scripts
├── library/                 # Reconstructed Excel files
├── tests/                   # Unit tests
├── PROGRESS.md              # Auto-generated progress report
├── CONTRIBUTING.md          # How to contribute
└── README.md

Reproducibility

All validation scripts referenced in the preprint (Sivan 2026) are in reproducibility/. Each number in the paper maps to exactly one script below.

Script                       Paper section   What it does
build_kaikki_dictionary.py   2.5             Builds the 2,074-entry Kaikki-derived lexicon consumed by other scripts
brute_force_derivation.py    2.5             Exhaustive search of P(28,4) = 491,400 mappings on UR039; winner ma/ka/ta/pa (19/19, p = 2.0×10⁻⁶)
cross_validation.py          2.5             Top-3 candidate mappings tested on held-out UR050 + UR055
extension_analysis.py        2.6, 2.7        L9 q→pi reassignment and L2/L7/L8 positional polyphony gain/loss tallies
polyphony_context_test.py    2.10            Chi-square test of context-dependent polyphony (backs the L8-L7 bigram exception note)
negative_controls.py         2.8             Aymara control, pseudo-dictionary (N=100), length-preserving shuffle (N=5,000)
replication_ur112.py         3.3             Independent replication on UR112 against the curated ALBA glossary (125 higher, p < 0.001) + Figure 3
replication_ur112_kaikki.py  3.3             Parallel replication on UR112 against Kaikki 2,074 (212 higher, p = 0.0014)
corpus_coverage.py           3.2, 3.4        Corpus-wide lexical coverage of the frozen v3 syllabary on the 70 validated khipus
export_translations.py       Suppl. Table 1  Generates cord_by_cord.csv (14,835 rows) and summary.csv for the 70 khipus
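
The search-space figure quoted for brute_force_derivation.py is the number of ordered ways to assign 4 distinct syllables out of 28 candidates: 28 × 27 × 26 × 25 = 491,400. The sketch below checks that count and runs the same kind of exhaustive search on toy data. It is not the script's actual code: `best_mapping`, the five-syllable inventory, the cords, and the lexicon are all invented for illustration, and scoring by lexicon hits is an assumption about how candidate mappings are ranked.

```python
from itertools import permutations
from math import perm

print(perm(28, 4))  # 491400 = 28 * 27 * 26 * 25


def best_mapping(cords, knots, syllables, lexicon):
    """Try every ordered assignment of syllables to knots; keep the best scorer."""
    best, best_hits = None, -1
    for assignment in permutations(syllables, len(knots)):
        mapping = dict(zip(knots, assignment))
        # Score: number of cords whose decoded string is in the lexicon.
        hits = sum("".join(mapping[k] for k in cord) in lexicon for cord in cords)
        if hits > best_hits:
            best, best_hits = mapping, hits
    return best, best_hits


# Toy data, invented for illustration only.
knots = ["L3", "L4", "L5", "L6"]
syllables = ["ma", "ka", "ta", "pa", "si"]
lexicon = {"mama", "kaka", "kata", "pata"}
cords = [["L3", "L3"], ["L4", "L4"], ["L4", "L5"], ["L6", "L5"]]

mapping, hits = best_mapping(cords, knots, syllables, lexicon)
print(mapping, hits)
```

On this toy input only one assignment decodes all four cords into lexicon entries, so the search is unambiguous. On real data ties are possible, and this sketch would simply keep the first best-scoring mapping it meets.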

Data

reproducibility/data/ contains:

  • quechua_kaikki_2060.txt — frozen 2026-03-20 Kaikki extract (base lexicon, 2,060 entries)
  • quechua_kaikki_2074.txt — consolidated dictionary (2,074 entries) produced by build_kaikki_dictionary.py

reproducibility/cord_by_cord.csv — full cord-by-cord translation output for all validated khipus (Supplementary Table 1), regenerated by export_translations.py.

Running the scripts

# One-time: install the package and (re)build the Kaikki dictionary
pip install -e .
python reproducibility/build_kaikki_dictionary.py

# Derivation and cross-validation
python reproducibility/brute_force_derivation.py
python reproducibility/cross_validation.py
python reproducibility/extension_analysis.py
python reproducibility/polyphony_context_test.py

# Negative controls
python reproducibility/negative_controls.py

# Independent replication (both variants)
python reproducibility/replication_ur112.py
python reproducibility/replication_ur112_kaikki.py

# Corpus-wide coverage and cord-by-cord export
python reproducibility/corpus_coverage.py
python reproducibility/export_translations.py

On first run, the scripts automatically download the Open Khipu Repository database (~50 MB, requires git) into ~/.khipu-translator/.
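
To give a feel for what a length-preserving shuffle control measures, here is a minimal sketch of one plausible reading of the term: knot symbols are pooled and redistributed at random while every cord keeps its original length, and the observed score is compared with the scores of the shuffled corpora. This is an assumption about the procedure, not a copy of negative_controls.py; the function names, the toy score, and the toy cords are hypothetical.

```python
import random


def shuffle_preserving_lengths(cords, rng):
    """Redistribute all symbols at random, keeping each cord's length."""
    pool = [symbol for cord in cords for symbol in cord]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for cord in cords:
        shuffled.append(pool[start:start + len(cord)])
        start += len(cord)
    return shuffled


def empirical_p(cords, score, n_shuffles=5000, seed=0):
    """Share of shuffles scoring at least as high as the real corpus."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    observed = score(cords)
    at_least = sum(
        score(shuffle_preserving_lengths(cords, rng)) >= observed
        for _ in range(n_shuffles)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_shuffles + 1)


def count_doubled(cords):
    """Toy score: cords that start with a repeated symbol."""
    return sum(len(c) >= 2 and c[0] == c[1] for c in cords)


toy_cords = [["L3", "L3"], ["L4", "L4"], ["L5", "L6", "L9"]]
print(empirical_p(toy_cords, count_doubled, n_shuffles=1000))
```

In a real control the score would be the lexicon-hit count of the decoded corpus, and a small p-value would indicate that the observed hits are unlikely to arise from symbol frequencies and cord lengths alone.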


Citation

@article{sivan2026khipu,
  title={The Khipu as a Layered Information System: Document Types, Metadata,
         and a Proposed Syllabic Content Channel},
  author={Sivan, Julien},
  journal={ALBA Project Preprint},
  year={2026},
  doi={10.5281/zenodo.19184002}
}

Data source

This tool reads khipus from the Open Khipu Repository (OKR):

Urton, G. & Brezine, C. (2009–2024). Open Khipu Repository. Harvard University. DOI: 10.5281/zenodo.5037551

The OKR database (619 khipus, ~50 MB) is downloaded automatically on first use.

License

MIT — see LICENSE.
