Open-source tools for ancient epigraphy: built for Etruscan, designed to be copied.
OpenEtruscan
Live at openetruscan.com
Normalize · Search · Map · Export · Contribute
What Is This?
OpenEtruscan is a Python toolkit that addresses the transcription chaos in Etruscan studies. The same word can appear in five or more incompatible forms across publications, which makes cross-corpus search impossible.
We fix that. One pip install, zero servers, works offline forever.
from openetruscan import normalize
# Input in ANY transcription system
result = normalize("LARTHAL") # CIE standard
result = normalize("Larθal") # Philological
result = normalize("𐌋𐌀𐌓𐌈𐌀𐌋") # Unicode Old Italic
# Always get the same canonical output
print(result.canonical) # → "larθal"
print(result.phonetic) # → "/lar.tʰal/"
print(result.old_italic) # → "𐌋𐌀𐌓𐌈𐌀𐌋"
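How this kind of folding might work can be sketched in a few lines. The snippet below is an illustration only, not OpenEtruscan's implementation: the `DIGRAPHS` and `OLD_ITALIC` tables and the `fold_to_canonical` helper are hypothetical names, and the mappings cover just the letters needed for this one word.

```python
import unicodedata

# Hypothetical lookup tables, just large enough for the example word.
# ASCII digraphs used by some transcription systems -> one canonical letter.
DIGRAPHS = {"th": "θ"}

# Old Italic code points -> canonical letters (assumed mapping).
OLD_ITALIC = {
    "\U0001030B": "l",  # OLD ITALIC LETTER EL
    "\U00010300": "a",  # OLD ITALIC LETTER A
    "\U00010313": "r",  # OLD ITALIC LETTER ER
    "\U00010308": "θ",  # OLD ITALIC LETTER THE
}


def fold_to_canonical(text: str) -> str:
    """Fold one token from any of the three input styles to a single form."""
    # Normalize Unicode composition so equivalent input compares equal.
    text = unicodedata.normalize("NFC", text)
    # Replace Old Italic letters one by one; other characters pass through.
    text = "".join(OLD_ITALIC.get(ch, ch) for ch in text)
    # Lowercase, then collapse ASCII digraphs such as "th" into "θ".
    text = text.lower()
    for digraph, letter in DIGRAPHS.items():
        text = text.replace(digraph, letter)
    return text


variants = [
    "LARTHAL",
    "Larθal",
    "\U0001030B\U00010300\U00010313\U00010308\U00010300\U0001030B",
]
print({fold_to_canonical(v) for v in variants})  # → {'larθal'}
```

All three spellings collapse to one string, which is the property that makes cross-corpus search possible. A real normalizer would also need to detect the input system and handle ambiguous digraphs.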
Quick Start
pip install openetruscan
Normalize a text
openetruscan normalize "LARTHAL LECNES"
Batch process a file
openetruscan batch corpus.txt --format csv --output clean.csv
Validate encoding
openetruscan validate my_transcription.txt
Run the API + Web UI locally
pip install "openetruscan[all]"
uvicorn openetruscan.server:app --reload
# Open http://localhost:8000/web/index.html
Why?
| Problem | Today | With OpenEtruscan |
|---|---|---|
| "Where else does this word appear?" | Flip through 300 pages of print volumes | corpus.search(text="*al lecn*") |
| "Is this spelling a dialect variant?" | An entire journal article to pose the question | One query, 30 seconds |
| "I need to publish my thesis data" | Word doc, usable only by the author | openetruscan batch thesis.txt --format epidoc → PR → global corpus |
| "How widespread was this clan?" | Months of manual index-reading | corpus.names.search(gens="lecne") → interactive map |
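The `corpus.search(text="*al lecn*")` query in the table reads like a shell-style wildcard pattern. As a rough sketch of that idea, assuming nothing about how the corpus module actually implements it, the standard-library `fnmatch` module can run the same pattern over an in-memory list. The `RECORDS` list and the `search` helper are hypothetical, and the strings are toy examples rather than real corpus entries.

```python
from fnmatch import fnmatchcase

# Toy records for illustration only; these are not real corpus entries.
RECORDS = [
    {"id": "demo-1", "text": "larθal lecnes"},
    {"id": "demo-2", "text": "larθ lecne"},
    {"id": "demo-3", "text": "lecnes larθal"},
]


def search(records, pattern):
    """Return records whose text matches a shell-style wildcard pattern."""
    return [r for r in records if fnmatchcase(r["text"], pattern)]


hits = search(RECORDS, "*al lecn*")
print([r["id"] for r in hits])  # → ['demo-1']
```

Only the first record contains the sequence `al lecn`, so it is the only hit. Wildcard search like this depends on the text having been normalized first, otherwise each transcription system would need its own pattern.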
Architecture
The core engine is language-agnostic. Each language is a YAML config file:
openetruscan/
├── engine/ # Universal normalizer, parser, exporter
├── adapters/
│   ├── etruscan.yaml # Etruscan alphabet, phonotactics, names
│   ├── oscan.yaml # Same engine, different YAML
│   ├── faliscan.yaml # ... add any ancient script
│   └── YOUR_LANG.yaml # Fork this pattern
├── corpus/ # Structured dataset (Local SQLite / Cloud PostgreSQL)
├── prosopography/ # NLP name parser + kinship graph + Neo4j export
└── exporters/ # EpiDoc XML, CSV, JSON-LD, GeoJSON
Want to support another language? Write 50 lines of YAML. The engine does the rest.
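The split between a generic engine and per-language data can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the `Adapter` class, the `load_adapter` and `normalize_with` helpers, and the `name` / `alphabet` / `equivalences` keys are all hypothetical, and a plain dict stands in for a parsed YAML file so the example needs no third-party packages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    """Language-specific data; stands in for one parsed YAML file."""
    name: str
    alphabet: frozenset   # canonical letters the language allows
    equivalences: dict    # variant spelling -> canonical letter


def load_adapter(config: dict) -> Adapter:
    """Build an Adapter from a dict shaped like a (hypothetical) YAML file."""
    return Adapter(
        name=config["name"],
        alphabet=frozenset(config["alphabet"]),
        equivalences=dict(config["equivalences"]),
    )


def normalize_with(text: str, adapter: Adapter) -> str:
    """Language-agnostic step: apply the adapter's rules, then validate."""
    out = text.lower()
    # Apply longer variants first so multi-letter rules win over shorter ones.
    for variant in sorted(adapter.equivalences, key=len, reverse=True):
        out = out.replace(variant, adapter.equivalences[variant])
    unknown = {ch for ch in out if not ch.isspace() and ch not in adapter.alphabet}
    if unknown:
        raise ValueError(f"{adapter.name}: letters outside alphabet: {sorted(unknown)}")
    return out


# Assumed, deliberately tiny config; a real adapter would list every letter.
etruscan = load_adapter({
    "name": "etruscan",
    "alphabet": "acehilnrsθ",
    "equivalences": {"th": "θ"},
})
print(normalize_with("LARTHAL LECNES", etruscan))  # → larθal lecnes
```

Nothing in `normalize_with` mentions Etruscan. Supporting another script in this sketch means passing a different config, which is the design the directory layout above suggests.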
Web Interface
OpenEtruscan ships with three interconnected browser-based tools, live at openetruscan.com:
| Page | Description |
|---|---|
| Normalizer (index.html) | Convert any Etruscan text between transcription systems in real time |
| Search (search.html) | Full-text search across the corpus with clickable Clan network badges |
| Map (map.html) | Interactive Leaflet map of inscription findspots across Etruria |
Self-Hosting
OpenEtruscan is designed for easy self-hosting. Copy the docker-compose.yml and you're done:
git clone https://github.com/Eddy1919/openEtruscan.git
cd openEtruscan
cp .env.example .env # Edit with your DATABASE_URL
docker compose up -d --build
See .env.example for all configuration options. By default, OpenEtruscan runs in zero-config mode with a bundled SQLite database; no external database is required.
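A zero-config default of this kind usually comes down to a fallback when `DATABASE_URL` is unset. The sketch below shows one way that choice could be made. It is an assumption about the pattern, not a description of the server's code: `pick_backend` and the default file name `openetruscan.db` are hypothetical, and only the `DATABASE_URL` variable comes from the setup steps above.

```python
import os
from urllib.parse import urlparse

# Assumed default; the real bundled database path may differ.
DEFAULT_SQLITE_URL = "sqlite:///openetruscan.db"


def pick_backend(env=None) -> tuple:
    """Return (backend, url): PostgreSQL if DATABASE_URL says so, else SQLite."""
    env = os.environ if env is None else env
    # An unset or blank variable falls back to the local SQLite file.
    url = env.get("DATABASE_URL", "").strip() or DEFAULT_SQLITE_URL
    # Drop any driver suffix: "postgresql+psycopg" -> "postgresql".
    scheme = urlparse(url).scheme.split("+")[0]
    if scheme in ("postgres", "postgresql"):
        return "postgresql", url
    if scheme == "sqlite":
        return "sqlite", url
    raise ValueError(f"unsupported DATABASE_URL scheme: {scheme!r}")


print(pick_backend({}))
print(pick_backend({"DATABASE_URL": "postgresql://user:pass@localhost:5432/etruscan"}))
```

Passing the environment as an argument keeps the function easy to test without touching the real process environment.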
Packages
| Package | Description | Status |
|---|---|---|
| openetruscan (core) | Normalizer + CLI + adapters | Released |
| openetruscan[corpus] | Structured dataset + query API | Released |
| openetruscan[prosopography] | NLP name parser + kinship graph | Released |
| openetruscan[server] | FastAPI backend + Web UI | Released |
| openetruscan[all] | Everything | Released |
Contributing
We welcome contributions from epigraphers, linguists, and developers alike. See CONTRIBUTING.md for full details.
Quick paths:
- Add data: submit new inscriptions via CSV or CLI
- Improve mappings: fix transliteration rules in the YAML adapters
- Add a language: copy a YAML adapter, fill in your script
- Report bugs: open an issue with reproduction steps
- Write code: fork, hack, PR
git clone https://github.com/Eddy1919/openEtruscan.git
cd openEtruscan
pip install -e ".[dev]"
pytest
Roadmap
Done
- Normalizer engine: auto-detect 5 transcription systems, fold to canonical, phonotactic validation
- CLI: normalize, batch, convert, validate, adapters commands
- Multi-language adapters: Etruscan, Oscan, Faliscan (23+ letters, 35+ names, equivalence classes)
- Corpus database: SQLite + PostgreSQL, 4,700+ inscriptions from Larth dataset
- Prosopography engine: NLP name parser, 633 clans, kinship graph, Neo4j Cypher + GraphML export
- Web UI: Normalizer, full-text Search, interactive Map with clan network visualization
- Production deployment: Docker + Nginx + HTTPS on openetruscan.com
- CI/CD: GitHub Actions (test on Python 3.10–3.13, auto-deploy, PyPI publish)
- 65 tests passing across all modules
Next
- EpiDoc XML exporter: interoperability with the digital classics ecosystem
- Corpus CLI: openetruscan search, openetruscan import, openetruscan export commands
- Rhaetic + Lemnian adapters: expand to the Tyrsenian language family
Planned
- CLTK Etruscan module: contribute to the Classical Language Toolkit
- Linked Open Data: publish to Pelagios/Pleiades gazetteers
- Statistical tools: letter frequency analysis, dialect clustering, dating heuristics
License
Acknowledgments
OpenEtruscan builds on decades of work by epigraphers and Etruscologists. We are especially grateful to:
- The compilers of the Corpus Inscriptionum Etruscarum
- The Etruscan Texts Project (UMass Amherst)
- The Larth Dataset (Vico & Spanakis, 2023)
- The EpiDoc community
- The Classical Language Toolkit
Built for Etruscan. Designed to be copied.