
Open-source tools for ancient epigraphy: built for Etruscan, designed to be copied.


𐌏𐌐𐌄𐌍 𐌄𐌕𐌓𐌖𐌔𐌂𐌀𐌍

OpenEtruscan


License: MIT · Python 3.10+ · Data: CC0

Normalize · Search · Export · Contribute


What Is This?

OpenEtruscan is a Python toolkit that addresses the transcription chaos in Etruscan studies. The same word appears in five or more incompatible forms across publications, which makes cross-corpus search impossible.

We fix that. One pip install, zero servers, works offline forever.

from openetruscan import normalize

# Input in ANY transcription system
result = normalize("LARTHAL")       # CIE standard
result = normalize("Larθal")        # Philological
result = normalize("𐌋𐌀𐌓𐌈𐌀𐌋")  # Unicode Old Italic

# Always get the same canonical output
print(result.canonical)   # → "larθal"
print(result.phonetic)    # → "/lar.tʰal/"
print(result.old_italic)  # → "𐌋𐌀𐌓𐌈𐌀𐌋"
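
To make the folding step concrete, here is a minimal sketch of how variant transcriptions could be reduced to one canonical string. It is not OpenEtruscan's implementation: the `normalize_sketch` function, the `Normalized` class, the digraph rules, and the partial letter table are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed, partial letter table: canonical letter -> Unicode Old Italic letter.
# Only the letters needed for the demo are listed.
CANONICAL_TO_OLD_ITALIC = {
    "a": "\U00010300",  # OLD ITALIC LETTER A
    "c": "\U00010302",  # OLD ITALIC LETTER KE
    "e": "\U00010304",  # OLD ITALIC LETTER E
    "θ": "\U00010308",  # OLD ITALIC LETTER THE
    "l": "\U0001030B",  # OLD ITALIC LETTER EL
    "n": "\U0001030D",  # OLD ITALIC LETTER EN
    "r": "\U00010313",  # OLD ITALIC LETTER ER
    "s": "\U00010314",  # OLD ITALIC LETTER ES
}
OLD_ITALIC_TO_CANONICAL = {v: k for k, v in CANONICAL_TO_OLD_ITALIC.items()}

# Assumed ASCII digraph conventions for the aspirated stops.
DIGRAPHS = {"th": "θ", "ph": "φ", "ch": "χ"}


@dataclass(frozen=True)
class Normalized:
    canonical: str
    old_italic: str


def normalize_sketch(text: str) -> Normalized:
    # 1. Transliterate any Old Italic letters, then lowercase everything.
    folded = "".join(OLD_ITALIC_TO_CANONICAL.get(ch, ch) for ch in text).lower()
    # 2. Fold ASCII digraphs into single canonical letters.
    for digraph, letter in DIGRAPHS.items():
        folded = folded.replace(digraph, letter)
    # 3. Render the canonical form back into Old Italic where a mapping exists.
    old_italic = "".join(CANONICAL_TO_OLD_ITALIC.get(ch, ch) for ch in folded)
    return Normalized(canonical=folded, old_italic=old_italic)


forms = ["LARTHAL", "Larθal", "\U0001030B\U00010300\U00010313\U00010308\U00010300\U0001030B"]
for form in forms:
    print(normalize_sketch(form).canonical)
```

All three inputs fold to the same canonical string, which is the property that makes cross-corpus search possible. A real normalizer would need a full letter table and rules for ambiguous sequences.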

Quick Start

pip install openetruscan

Normalize a text

openetruscan normalize "LARTHAL LECNES"

Batch process a file

openetruscan batch corpus.txt --format csv --output clean.csv
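
If you would rather script a batch run than call the CLI, the same idea can be sketched in plain Python. The `batch_to_csv` helper and its column names are assumptions, not part of the package. The `normalize_fn` parameter stands in for a real normalizer so that the sketch runs without OpenEtruscan installed.

```python
import csv
import io
from typing import Callable, Iterable


def batch_to_csv(lines: Iterable[str], normalize_fn: Callable[[str], str]) -> str:
    """Normalize each non-empty line and return the result as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["line", "original", "canonical"])  # assumed column names
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # skip blank lines but keep the original line numbering
        writer.writerow([number, text, normalize_fn(text)])
    return buffer.getvalue()


# Stand-in normalizer for the demo; swap in a real one as needed.
print(batch_to_csv(["LARTHAL LECNES", "", "Larθal"], str.lower))
```

With the package installed, `lambda s: normalize(s).canonical` would presumably be the natural choice for `normalize_fn`.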

Validate encoding

openetruscan validate my_transcription.txt
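
This README does not spell out what `validate` checks. As a rough illustration of encoding validation in general, the sketch below flags text that is not in Unicode NFC form and characters outside an assumed allow-list. The allow-list, the function names, and the message wording are all hypothetical.

```python
import unicodedata

# Assumed allow-list: ASCII letters, a few philological letters, and the
# Unicode Old Italic block (U+10300 to U+1032F).
EXTRA_LETTERS = set("θφχśΘΦΧŚ")
OLD_ITALIC = range(0x10300, 0x10330)


def is_allowed(ch: str) -> bool:
    if ch.isspace() or ch in ".:·-":
        return True
    if ch.isascii() and ch.isalpha():
        return True
    return ch in EXTRA_LETTERS or ord(ch) in OLD_ITALIC


def validate_text(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not unicodedata.is_normalized("NFC", text):
        problems.append("text is not NFC-normalized")
    for index, ch in enumerate(text):
        if not is_allowed(ch):
            problems.append(f"unexpected character {ch!r} at index {index}")
    return problems


print(validate_text("larθal"))
print(validate_text("lar7al"))
```

Checking the normalization form matters because a letter such as ś can be stored either as one code point or as `s` plus a combining accent, and the two do not compare equal.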

Why?

| Problem | Today | With OpenEtruscan |
| --- | --- | --- |
| "Where else does this word appear?" | Flip through 300 pages of print volumes | corpus.search(text="*al lecn*") |
| "Is this spelling a dialect variant?" | An entire journal article to pose the question | One query, 30 seconds |
| "I need to publish my thesis data" | Word doc, usable only by the author | openetruscan batch thesis.txt --format epidoc → PR → global corpus |
| "How widespread was this clan?" | Months of manual index-reading | corpus.names.search(gens="lecne") → map |

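A wildcard query like the one in the comparison above can be approximated with the standard library. The sketch below is not the `corpus.search` implementation. It assumes shell-style `*` wildcards and a hypothetical in-memory list of records whose IDs are invented and whose texts reuse only words from this README.

```python
from fnmatch import fnmatchcase

# Hypothetical records: (record id, canonical text). Not real corpus entries.
RECORDS = [
    ("demo-001", "larθal lecnes"),
    ("demo-002", "larθal"),
    ("demo-003", "lecnes larθal"),
]


def search(records, pattern: str):
    """Return records whose text matches a shell-style wildcard pattern."""
    needle = pattern.lower()
    return [(rid, text) for rid, text in records if fnmatchcase(text.lower(), needle)]


print(search(RECORDS, "*al lecn*"))
```

Because every text is already in canonical form, one pattern covers what would otherwise be several spelling variants of the same query.
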
Architecture

The core engine is language-agnostic. Each language is a YAML config file:

openetruscan/
├── engine/          # Universal normalizer, parser, exporter
├── adapters/
│   ├── etruscan.yaml    # Etruscan alphabet, phonotactics, names
│   ├── oscan.yaml       # Same engine, different YAML
│   ├── rhaetic.yaml     # ... add any ancient script
│   └── YOUR_LANG.yaml   # Fork this pattern
├── corpus/          # Structured dataset (SQLite, Git-native)
├── prosopography/   # Name parser + kinship graph
└── exporters/       # EpiDoc XML, CSV, JSON-LD, GeoJSON

Want to support another language? Write 50 lines of YAML. The engine does the rest.
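
The adapter file format is not reproduced here, so the following is only a sketch of the general pattern: an engine function that contains no language knowledge and takes everything it needs from a config object. A plain dict stands in for a parsed YAML file, and the field names (`name`, `alphabet`, `equivalences`) are assumptions rather than the real schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Adapter:
    name: str
    alphabet: frozenset
    equivalences: dict = field(default_factory=dict)


def load_adapter(config: dict) -> Adapter:
    """Build an Adapter from a dict shaped like a parsed config file."""
    return Adapter(
        name=config["name"],
        alphabet=frozenset(config["alphabet"]),
        equivalences=dict(config.get("equivalences", {})),
    )


def normalize_with(adapter: Adapter, text: str):
    """Fold variants using the adapter's rules; report letters it does not know."""
    folded = text.lower()
    # Apply longer variants first so that short rules cannot split them.
    for variant in sorted(adapter.equivalences, key=len, reverse=True):
        folded = folded.replace(variant, adapter.equivalences[variant])
    unknown = sorted({ch for ch in folded if not ch.isspace() and ch not in adapter.alphabet})
    return folded, unknown


# Hypothetical, incomplete config for demonstration only.
demo = load_adapter({
    "name": "etruscan-demo",
    "alphabet": "acehilnrstuθφχ",
    "equivalences": {"th": "θ", "ph": "φ", "ch": "χ"},
})
print(normalize_with(demo, "LARTHAL LECNES"))
```

Supporting another script then means supplying a different config to the same `normalize_with` function, which is the design the directory layout above implies.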

Zero Infrastructure

  • No servers. No databases. No grants.
  • Data ships inside the pip package (SQLite).
  • Web tools run as static HTML (GitHub Pages).
  • Updates via pip install --upgrade.
  • Total hosting cost: $0/month. Forever.

Contributing

Add Data

Found a new inscription? Have a dissertation corpus?

  1. Fork this repo
  2. Add entries to data/contributions/your_name.csv
  3. Run openetruscan validate data/contributions/your_name.csv
  4. Open a Pull Request
  5. CI validates encoding + duplicates
  6. We merge → your data is in the next release

Your name stays in the Git history. Your discovery becomes searchable worldwide.
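
The duplicate check in step 5 can be pictured with a short sketch. This is not the project's CI code. The CSV column names (`id`, `text`) and the rule for what counts as a duplicate (same text after trimming and lowercasing) are assumptions.

```python
import csv
import io
from collections import defaultdict


def find_duplicates(csv_text: str, key_column: str = "text") -> dict[str, list[int]]:
    """Map each repeated value in key_column to the data-row numbers using it."""
    seen = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # Compare case-insensitively and ignore surrounding whitespace.
        key = row[key_column].strip().lower()
        seen[key].append(row_number)
    return {key: rows for key, rows in seen.items() if len(rows) > 1}


# Hypothetical contribution file with one repeated text.
sample = "id,text\nc1,larθal lecnes\nc2,LARΘAL LECNES\nc3,larθal\n"
print(find_duplicates(sample))
```

A real check would more likely compare normalized canonical forms, so that the same inscription submitted in two transcription systems is also caught.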

Add a Language

  1. Copy src/openetruscan/adapters/etruscan.yaml
  2. Fill in your language's alphabet, variants, phonotactics
  3. Run pytest to verify
  4. Open a Pull Request

Improve the Code

git clone https://github.com/open-etruscan/openetruscan.git
cd openetruscan
pip install -e ".[dev]"
pytest

See CONTRIBUTING.md for details.

Packages

| Package | Description | Status |
| --- | --- | --- |
| openetruscan (core) | Normalizer + CLI + adapters | ✅ Released |
| openetruscan[corpus] | Structured dataset + query API | ✅ Released |
| openetruscan[prosopography] | Name parser + kinship graph | ✅ Released |
| openetruscan[all] | Everything | ✅ Released |

Roadmap

✅ Done

  • Normalizer engine: auto-detect 5 transcription systems, fold to canonical, phonotactic validation
  • CLI: normalize, batch, convert, validate, adapters commands
  • Etruscan adapter: 23 letters, 35+ known names, equivalence classes
  • Corpus database: SQLite-backed, 4,700+ inscriptions from Larth dataset
  • Prosopography: name parser, 633 clans, kinship graph, GraphML/JSON export
  • Web converter: static HTML/CSS/JS, runs in any browser, zero backend
  • GitHub Actions: CI (Python 3.10-3.13 + Ruff), Pages deploy, PyPI publish
  • 64 tests passing across all modules
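
As an illustration of what auto-detection could look like, the sketch below classifies an input by the characters it contains. It covers only the three systems named in the example near the top of this README, its return labels are invented, and the real detector may use entirely different rules.

```python
def detect_system(text: str) -> str:
    """Guess the transcription system of a string from its characters."""
    letters = [ch for ch in text if not ch.isspace()]
    # Any code point in the Unicode Old Italic block (U+10300 to U+1032F).
    if any(0x10300 <= ord(ch) <= 0x1032F for ch in letters):
        return "old_italic"
    # Special letters typical of philological transcription (assumed set).
    if any(ch in "θφχśΘΦΧŚ" for ch in letters):
        return "philological"
    # All-uppercase ASCII, as in the CIE-style example.
    if letters and all(ch.isascii() and ch.isupper() for ch in letters):
        return "cie"
    return "unknown"


for sample in ["LARTHAL", "Larθal", "\U0001030B\U00010300\U00010313"]:
    print(sample, detect_system(sample))
```

The order of the checks matters: script-specific characters are tested first because they are the least ambiguous signal.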

🔜 Next (v0.2)

  • Faliscan + Oscan adapters: prove the multi-language architecture (one YAML each)
  • Web language selector: switch between languages in the web converter
  • GeoJSON map viewer: static HTML page showing inscription findspots on an interactive map
  • EpiDoc XML exporter: interoperability with the digital classics ecosystem
  • PyPI release: first public pip install openetruscan
  • Corpus CLI: openetruscan search, openetruscan import, openetruscan export commands
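
For the map viewer, findspots would need to be serialized as GeoJSON. The sketch below shows one way to do that with the standard library only. The record fields (`id`, `text`, `lon`, `lat`) are assumptions, and the coordinates are placeholders rather than real findspots.

```python
import json


def to_geojson(inscriptions):
    """Build a GeoJSON FeatureCollection from records that have coordinates."""
    features = []
    for item in inscriptions:
        if item.get("lon") is None or item.get("lat") is None:
            continue  # no findspot coordinates, nothing to plot
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [item["lon"], item["lat"]]},
            "properties": {"id": item["id"], "text": item["text"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder records: the IDs and coordinates are made up for the demo.
records = [
    {"id": "demo-1", "text": "larθal", "lon": 0.0, "lat": 0.0},
    {"id": "demo-2", "text": "lecnes", "lon": None, "lat": None},
]
print(json.dumps(to_geojson(records), ensure_ascii=False, indent=2))
```

A static page can load a file like this directly, which fits the zero-backend approach described above.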

🗓️ Planned (v0.3)

  • CLTK Etruscan module: contribute to the Classical Language Toolkit
  • Linked Open Data: publish to Pelagios/Pleiades gazetteers
  • Statistical tools: letter frequency analysis, dialect clustering, dating heuristics
  • Web search interface: search the corpus from the browser (static, no backend)
  • Rhaetic + Lemnian adapters: expand to the Tyrsenian language family
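
Letter frequency analysis is the simplest of the planned statistical tools to picture. A minimal sketch, assuming the input texts are already in canonical form; the function name is hypothetical and nothing here is taken from the project's code.

```python
from collections import Counter


def letter_frequencies(texts):
    """Return each letter's share of all letters, most frequent first."""
    counts = Counter(ch for text in texts for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: count / total for letter, count in counts.most_common()}


for letter, share in letter_frequencies(["larθal lecnes"]).items():
    print(f"{letter}: {share:.2f}")
```

Running the same count per findspot or per period is one plausible starting point for the dialect and dating work listed alongside it.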

Vision

  • Community data contributions: scholars submit inscriptions via PR, CI validates
  • Cross-language scholar search: query across Etruscan, Oscan, Faliscan from one interface
  • Academic citation: dataset cited in peer-reviewed publications
  • Template repo: one-click fork to set up tools for any underdocumented ancient script

License

  • Code: MIT (do whatever you want)
  • Data: CC0 (public domain, no restrictions)

Acknowledgments

OpenEtruscan builds on decades of work by epigraphers and Etruscologists. We are especially grateful to:


Built for Etruscan. Designed to be copied.

𐌀 𐌁 𐌂 𐌃 𐌄 𐌅 𐌆 𐌇 𐌈 𐌉 𐌊 𐌋 𐌌 𐌍 𐌎 𐌏 𐌐 𐌑 𐌓 𐌔 𐌕 𐌖 𐌗 𐌘 𐌙 𐌚


File details

Details for the file openetruscan-0.1.0.tar.gz.

File metadata

  • Download URL: openetruscan-0.1.0.tar.gz
  • Upload date:
  • Size: 40.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for openetruscan-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 1ceaa4c800b52581744aefe8bf2381ea7bf19ae928874901098c2647e0dda330 |
| MD5 | 1a120f1ad0157d75ec45847ae571b794 |
| BLAKE2b-256 | d330c3dd7e3eac0dab517456bb4504249330bb16a6ab6937d786662187b93026 |


Provenance

The following attestation bundles were made for openetruscan-0.1.0.tar.gz:

Publisher: workflow.yaml on Eddy1919/openEtruscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file openetruscan-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: openetruscan-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 31.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for openetruscan-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 3781829df0fb50de86f8ffa41257b1d5b6668fe532e121ffc591ffffa83fd0d3 |
| MD5 | 677639e9627f09e79c0f8b18f6be0b50 |
| BLAKE2b-256 | aa0a7c6080ae120d958b25d4d7b889a4c53c09ec9ad0f61d8ca1f3d8c97b1f14 |


Provenance

The following attestation bundles were made for openetruscan-0.1.0-py3-none-any.whl:

Publisher: workflow.yaml on Eddy1919/openEtruscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
