Open-source tools for ancient epigraphy: built for Etruscan, designed to be copied.

𐌏𐌐𐌄𐌍 𐌄𐌕𐌓𐌖𐌔𐌂𐌀𐌍

OpenEtruscan

License: MIT · Python 3.10+ · Data: CC0

🌐 Live at openetruscan.com

Normalize · Search · Map · Export · Contribute


What Is This?

OpenEtruscan is a Python toolkit that addresses the transcription chaos in Etruscan studies. The same word can appear in five or more incompatible forms across publications, which makes cross-corpus search impossible.

We fix that. One pip install, zero servers, works offline forever.

from openetruscan import normalize

# Input in ANY transcription system
result = normalize("LARTHAL")       # CIE standard
result = normalize("Larθal")        # Philological
result = normalize("𐌋𐌀𐌓𐌈𐌀𐌋")       # Unicode Old Italic

# Always get the same canonical output
print(result.canonical)   # → "larθal"
print(result.phonetic)    # → "/lar.tʰal/"
print(result.old_italic)  # → "𐌋𐌀𐌓𐌈𐌀𐌋"
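How three differently spelled inputs can collapse to one canonical string can be sketched in a few lines. This is a minimal illustration, not OpenEtruscan's implementation: the function name `fold_to_canonical`, the mapping tables, and the assumption that ASCII transcriptions write theta as "th" are all hypothetical, and only the letters of this one word are covered.

```python
# Hypothetical sketch: fold three transcription styles to one canonical string.
# The tables cover only the letters of "larθal"; they are not the library's data.

OLD_ITALIC_TO_CANONICAL = {
    "\U0001030B": "l",  # OLD ITALIC LETTER EL
    "\U00010300": "a",  # OLD ITALIC LETTER A
    "\U00010313": "r",  # OLD ITALIC LETTER ER
    "\U00010308": "θ",  # OLD ITALIC LETTER THE
}

# Assumption: ASCII-only transcriptions write theta as the digraph "th".
ASCII_DIGRAPHS = {"th": "θ"}


def fold_to_canonical(text: str) -> str:
    """Map Old Italic letters, lowercase, then expand ASCII digraphs."""
    mapped = "".join(OLD_ITALIC_TO_CANONICAL.get(ch, ch) for ch in text)
    lowered = mapped.lower()
    for digraph, letter in ASCII_DIGRAPHS.items():
        lowered = lowered.replace(digraph, letter)
    return lowered


old_italic = "\U0001030B\U00010300\U00010313\U00010308\U00010300\U0001030B"
forms = ["LARTHAL", "Larθal", old_italic]
print({fold_to_canonical(f) for f in forms})  # → {'larθal'}
```

A blind digraph replacement like this would also merge a genuine "t" followed by "h", which is one reason a real normalizer needs to detect the input system first.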

Quick Start

pip install openetruscan

Normalize a text

openetruscan normalize "LARTHAL LECNES"

Batch process a file

openetruscan batch corpus.txt --format csv --output clean.csv
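Conceptually, a batch run reads one text per line, normalizes it, and writes one CSV row per input. The sketch below shows that shape with the standard library only. The column names and the `batch_to_csv` helper are assumptions, not the CLI's actual output format, and `str.lower` stands in for a real normalizer.

```python
import csv
import io


def batch_to_csv(lines, normalize, out):
    """Write one CSV row per non-empty input line: line number, input, canonical."""
    writer = csv.writer(out)
    writer.writerow(["line", "input", "canonical"])  # assumed column names
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines but keep the original line numbers
        writer.writerow([number, raw, normalize(raw)])


# Usage with an in-memory buffer; str.lower is a placeholder normalizer.
buffer = io.StringIO()
batch_to_csv(["LARTHAL\n", "\n", "LECNES\n"], str.lower, buffer)
print(buffer.getvalue())
```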

Validate encoding

openetruscan validate my_transcription.txt
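One plausible reading of "validate encoding" is a check that every character belongs to an expected set. The sketch below takes that reading. The allowed sets and the `find_invalid` helper are assumptions for illustration; what the `validate` command actually checks is not specified here.

```python
import unicodedata

OLD_ITALIC = range(0x10300, 0x10330)  # Unicode Old Italic block
EXTRA_LETTERS = set("θχφ")            # assumed philological letters
PUNCTUATION = set(" .:·-")            # assumed spacing and word dividers


def find_invalid(text):
    """Return (index, character, Unicode name) for every unexpected character."""
    text = unicodedata.normalize("NFC", text)
    problems = []
    for index, ch in enumerate(text):
        ok = (
            (ch.isascii() and ch.isalpha())
            or ch in EXTRA_LETTERS
            or ch in PUNCTUATION
            or ord(ch) in OLD_ITALIC
        )
        if not ok:
            problems.append((index, ch, unicodedata.name(ch, "UNKNOWN")))
    return problems


print(find_invalid("larθal lecnes"))  # → []
print(find_invalid("lar?al"))         # → [(3, '?', 'QUESTION MARK')]
```

Reporting the Unicode name alongside the position makes wrongly decoded characters easy to spot, since they usually come from an unrelated script.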

Run the API + Web UI locally

pip install "openetruscan[all]"
uvicorn openetruscan.server:app --reload
# Open http://localhost:8000/web/index.html

Why?

| Problem | Today | With OpenEtruscan |
|---|---|---|
| "Where else does this word appear?" | Flip through 300 pages of print volumes | corpus.search(text="*al lecn*") |
| "Is this spelling a dialect variant?" | An entire journal article to pose the question | One query, 30 seconds |
| "I need to publish my thesis data" | Word doc, usable only by the author | openetruscan batch thesis.txt --format epidoc → PR → global corpus |
| "How widespread was this clan?" | Months of manual index-reading | corpus.names.search(gens="lecne") → interactive map |

Architecture

The core engine is language-agnostic. Each language is a YAML config file:

openetruscan/
├── engine/          # Universal normalizer, parser, exporter
├── adapters/
│   ├── etruscan.yaml    # Etruscan alphabet, phonotactics, names
│   ├── oscan.yaml       # Same engine, different YAML
│   ├── faliscan.yaml    # ... add any ancient script
│   └── YOUR_LANG.yaml   # Fork this pattern
├── corpus/          # Structured dataset (Local SQLite / Cloud PostgreSQL)
├── prosopography/   # NLP name parser + kinship graph + Neo4j export
└── exporters/       # EpiDoc XML, CSV, JSON-LD, GeoJSON

Want to support another language? Write 50 lines of YAML. The engine does the rest.
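The "same engine, different data" idea can be illustrated without the real adapter files. In this sketch a small dataclass stands in for what a YAML adapter might contain. The `Adapter` fields, the `normalize_with` function, and both demo adapters are assumptions, not the project's schema or its actual Etruscan data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Adapter:
    """Language-specific data; in the project this would live in a YAML file."""
    name: str
    digraphs: dict = field(default_factory=dict)  # ASCII digraph -> canonical letter
    alphabet: frozenset = frozenset()             # letters allowed in canonical output


def normalize_with(adapter, text):
    """Language-agnostic engine: returns (canonical text, unknown characters)."""
    out = text.lower()
    # Longest sources first, so a longer digraph is never split by a shorter one.
    for source in sorted(adapter.digraphs, key=len, reverse=True):
        out = out.replace(source, adapter.digraphs[source])
    unknown = sorted({ch for ch in out if not ch.isspace() and ch not in adapter.alphabet})
    return out, unknown


# Two hypothetical adapters: only the data differs, the engine is shared.
ETRUSCAN_DEMO = Adapter(
    name="etruscan-demo",
    digraphs={"th": "θ", "kh": "χ", "ph": "φ"},
    alphabet=frozenset("acehiklmnprstuvzθχφ"),
)
OTHER_DEMO = Adapter(name="other-demo", digraphs={"th": "t"}, alphabet=frozenset("ahlrt"))

print(normalize_with(ETRUSCAN_DEMO, "LARTHAL LECNES"))  # → ('larθal lecnes', [])
print(normalize_with(OTHER_DEMO, "LARTHAL"))            # → ('lartal', [])
```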

Web Interface

OpenEtruscan ships with three interconnected browser-based tools, live at openetruscan.com:

| Page | Description |
|---|---|
| Normalizer (index.html) | Convert any Etruscan text between transcription systems in real time |
| Search (search.html) | Full-text search across the corpus with clickable Clan network badges |
| Map (map.html) | Interactive Leaflet map of inscription findspots across Etruria |
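A Leaflet map is typically fed GeoJSON, which is also one of the listed export formats. The sketch below builds a FeatureCollection from records with coordinates. The record fields, the `to_feature_collection` helper, and the placeholder coordinates are assumptions, not the project's exporter or real findspot data.

```python
import json


def to_feature_collection(records):
    """Build a GeoJSON FeatureCollection; records without coordinates are skipped."""
    features = []
    for rec in records:
        if rec.get("lat") is None or rec.get("lon") is None:
            continue
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [rec["lon"], rec["lat"]]},
            "properties": {"id": rec["id"], "text": rec["text"]},
        })
    return {"type": "FeatureCollection", "features": features}


SAMPLE = [
    {"id": "demo-1", "text": "larθal lecnes", "lat": 42.0, "lon": 12.0},  # placeholder coordinates
    {"id": "demo-2", "text": "larθal", "lat": None, "lon": None},         # unknown findspot
]
geojson_text = json.dumps(to_feature_collection(SAMPLE), ensure_ascii=False)
print(geojson_text)
```

The longitude-first order is the most common source of misplaced markers, because Leaflet's own `LatLng` type uses the opposite order.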

Self-Hosting

OpenEtruscan is designed for easy self-hosting. Clone the repository, create a .env file, and start the stack with Docker Compose:

git clone https://github.com/Eddy1919/openEtruscan.git
cd openEtruscan
cp .env.example .env    # Edit with your DATABASE_URL
docker compose up -d --build

See .env.example for all configuration options. By default, OpenEtruscan runs in zero-config mode with a bundled SQLite database, so no external database is required.
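The zero-config behaviour amounts to a fallback: use DATABASE_URL when it is set, otherwise a local SQLite file. A minimal sketch of that pattern follows. The default URL and both helper names are hypothetical; the server's real resolution logic may differ.

```python
import os

DEFAULT_SQLITE_URL = "sqlite:///openetruscan.db"  # hypothetical default path


def resolve_database_url(env=None):
    """Return DATABASE_URL if it is set and non-empty, else the SQLite default."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip()
    return url or DEFAULT_SQLITE_URL


def backend_name(url):
    """Classify a database URL by its scheme, ignoring any '+driver' suffix."""
    scheme = url.split("://", 1)[0]
    return "postgresql" if scheme.startswith("postgres") else scheme.split("+", 1)[0]


print(backend_name(resolve_database_url({})))  # → sqlite
print(backend_name(resolve_database_url({"DATABASE_URL": "postgresql://user:pw@db/oe"})))  # → postgresql
```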

Packages

| Package | Description | Status |
|---|---|---|
| openetruscan (core) | Normalizer + CLI + adapters | ✅ Released |
| openetruscan[corpus] | Structured dataset + query API | ✅ Released |
| openetruscan[prosopography] | NLP name parser + kinship graph | ✅ Released |
| openetruscan[server] | FastAPI backend + Web UI | ✅ Released |
| openetruscan[all] | Everything | ✅ Released |

Contributing

We welcome contributions from epigraphers, linguists, and developers alike. See CONTRIBUTING.md for full details.

Quick paths:

  • 📝 Add data: submit new inscriptions via CSV or CLI
  • 🔤 Improve mappings: fix transliteration rules in the YAML adapters
  • 🌐 Add a language: copy a YAML adapter, fill in your script
  • 🐛 Report bugs: open an issue with reproduction steps
  • 💻 Write code: fork, hack, PR

git clone https://github.com/Eddy1919/openEtruscan.git
cd openEtruscan
pip install -e ".[dev]"
pytest

Roadmap

✅ Done

  • Normalizer engine: auto-detects 5 transcription systems, folds to canonical form, validates phonotactics
  • CLI: normalize, batch, convert, validate, adapters commands
  • Multi-language adapters: Etruscan, Oscan, Faliscan (23+ letters, 35+ names, equivalence classes)
  • Corpus database: SQLite + PostgreSQL, 4,700+ inscriptions from the Larth dataset
  • Prosopography engine: NLP name parser, 633 clans, kinship graph, Neo4j Cypher + GraphML export
  • Web UI: Normalizer, full-text Search, interactive Map with clan network visualization
  • Production deployment: Docker + Nginx + HTTPS on openetruscan.com
  • CI/CD: GitHub Actions (test on Python 3.10–3.13, auto-deploy, PyPI publish)
  • 65 tests passing across all modules
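A kinship graph exported to Neo4j, as mentioned in the list above, boils down to a series of Cypher statements. The sketch below generates parameterized statements from parsed records. The node labels, relationship types, record fields, and the `kinship_statements` helper are all assumptions, not the project's actual graph schema, and the two records are placeholders.

```python
def kinship_statements(records):
    """Yield (cypher, params) pairs for people, clans, and father links.

    `records` must be a list, because it is iterated twice.
    """
    for rec in records:
        yield (
            "MERGE (p:Person {id: $id}) SET p.name = $name",
            {"id": rec["id"], "name": rec["name"]},
        )
        yield (
            "MATCH (p:Person {id: $id}) "
            "MERGE (c:Clan {name: $clan}) "
            "MERGE (p)-[:MEMBER_OF]->(c)",
            {"id": rec["id"], "clan": rec["clan"]},
        )
    # Second pass, so both ends of every FATHER_OF edge already exist.
    for rec in records:
        if rec.get("father"):
            yield (
                "MATCH (f:Person {id: $father}), (c:Person {id: $child}) "
                "MERGE (f)-[:FATHER_OF]->(c)",
                {"father": rec["father"], "child": rec["id"]},
            )


# Placeholder records; ids and names are illustrative only.
RECORDS = [
    {"id": "p1", "name": "person one", "clan": "lecne"},
    {"id": "p2", "name": "person two", "clan": "lecne", "father": "p1"},
]
statements = list(kinship_statements(RECORDS))
for query, params in statements:
    print(query, params)
```

Passing values as parameters, not interpolating them into the query text, avoids quoting problems with names that contain apostrophes or non-ASCII letters.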

🔜 Next

  • EpiDoc XML exporter: interoperability with the digital classics ecosystem
  • Corpus CLI: openetruscan search, openetruscan import, openetruscan export commands
  • Rhaetic + Lemnian adapters: expand to the Tyrsenian language family

🗓️ Planned

  • CLTK Etruscan module: contribute to the Classical Language Toolkit
  • Linked Open Data: publish to Pelagios/Pleiades gazetteers
  • Statistical tools: letter frequency analysis, dialect clustering, dating heuristics
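Letter frequency analysis, the simplest of the planned statistical tools, needs little more than a counter over canonical text. A minimal sketch follows; the `letter_frequencies` helper is hypothetical and says nothing about how the planned feature will be designed.

```python
from collections import Counter


def letter_frequencies(texts):
    """Return relative letter frequencies over canonical texts, most common first."""
    counts = Counter(ch for text in texts for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: n / total for letter, n in counts.most_common()}


frequencies = letter_frequencies(["larθal lecnes"])
print(frequencies)
```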

License

  • Code: MIT (do whatever you want)
  • Data: CC0 (public domain, no restrictions)

Acknowledgments

OpenEtruscan builds on decades of work by epigraphers and Etruscologists. We are especially grateful to:


Built for Etruscan. Designed to be copied.

𐌀 𐌁 𐌂 𐌃 𐌄 𐌅 𐌆 𐌇 𐌈 𐌉 𐌊 𐌋 𐌌 𐌍 𐌎 𐌏 𐌐 𐌑 𐌓 𐌔 𐌕 𐌖 𐌗 𐌘 𐌙 𐌚

Download files

Release 0.2.0 was uploaded via twine/6.1.0 (CPython/3.13.7) using Trusted Publishing. Publisher: workflow.yaml on Eddy1919/openEtruscan.

Source distribution: openetruscan-0.2.0.tar.gz (73.6 kB)

  • SHA256: c5d20a4e6c7db82fadbd3e3b78b6c748266cc57e999e0ab8bb9d21677a236e4f
  • MD5: d2589ae46bb31c618ad7166039fc190e
  • BLAKE2b-256: caa6718a9c683c85d5ad5ac395bf59de37a5be9ebb29410a36207c2bbed6a695

Built distribution: openetruscan-0.2.0-py3-none-any.whl (41.5 kB, Python 3)

  • SHA256: 4da2432de4dfe22ef54106ed6ef85b0f03d9444f5b2522f686283e605dc4b3a8
  • MD5: 8a2ba9131a82797f2254d86a27cd6c71
  • BLAKE2b-256: 0dfac3709d0dd7d2526f8cc679aebf9ab76c89d78df615fb4a46a9ee327f577d
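The SHA256 digests listed above can be checked against a downloaded file with the standard library. This is a minimal sketch: the helper names are hypothetical, and the path in the usage comment is wherever the file was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA-256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (path assumed):
# matches("openetruscan-0.2.0.tar.gz",
#         "c5d20a4e6c7db82fadbd3e3b78b6c748266cc57e999e0ab8bb9d21677a236e4f")
```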
