OpenEtruscan
Open-source tools for ancient epigraphy: built for Etruscan, designed to be copied.
Normalize · Search · Export · Contribute
What Is This?
OpenEtruscan is a Python toolkit that solves the transcription chaos in Etruscan studies. The same word appears in 5+ incompatible forms across publications, making cross-corpus search impossible.
We fix that. One pip install, zero servers, works offline forever.
from openetruscan import normalize
# Input in ANY transcription system
result = normalize("LARTHAL") # CIE standard
result = normalize("Larθal") # Philological
result = normalize("𐌋𐌀𐌓𐌈𐌀𐌋") # Unicode Old Italic
# Always get the same canonical output
print(result.canonical) # → "larθal"
print(result.phonetic) # → "/lar.tʰal/"
print(result.old_italic) # → "𐌋𐌀𐌓𐌈𐌀𐌋"
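The folding step can be pictured as a lookup table plus a case fold. The sketch below is not the library's implementation: the `VARIANTS` table and the function names `fold` and `to_old_italic` are made up for illustration, and only the letters needed for this one word are listed. The Old Italic forms are looked up by their standard Unicode character names.

```python
import unicodedata

# Hypothetical variant table: multi-letter spellings folded to one canonical letter.
VARIANTS = {"th": "θ"}

# Canonical letter -> Unicode character name in the Old Italic block.
# Only the letters needed for the example are listed.
OLD_ITALIC_NAMES = {
    "l": "OLD ITALIC LETTER EL",
    "a": "OLD ITALIC LETTER A",
    "r": "OLD ITALIC LETTER ER",
    "θ": "OLD ITALIC LETTER THE",
}
# Reverse table so Old Italic input folds back to canonical letters.
FROM_OLD_ITALIC = {unicodedata.lookup(n): c for c, n in OLD_ITALIC_NAMES.items()}


def fold(text: str) -> str:
    """Fold a transcription to lowercase canonical letters."""
    # Map Old Italic code points to canonical letters first.
    text = "".join(FROM_OLD_ITALIC.get(ch, ch) for ch in text)
    text = text.lower()
    for variant, canonical in VARIANTS.items():
        text = text.replace(variant, canonical)
    return text


def to_old_italic(canonical: str) -> str:
    """Render canonical letters as Old Italic code points."""
    return "".join(unicodedata.lookup(OLD_ITALIC_NAMES[ch]) for ch in canonical)


forms = ["LARTHAL", "Larθal", to_old_italic("larθal")]
print({fold(f) for f in forms})  # all three inputs collapse to one form
```

A plain string replace like this would also fold a `t` that merely happens to precede an `h`, so a real normalizer presumably needs to know which transcription system it is reading before applying digraph rules.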
Quick Start
pip install openetruscan
Normalize a text
openetruscan normalize "LARTHAL LECNES"
Batch process a file
openetruscan batch corpus.txt --format csv --output clean.csv
Validate encoding
openetruscan validate my_transcription.txt
Why?
| Problem | Today | With OpenEtruscan |
|---|---|---|
| "Where else does this word appear?" | Flip through 300 pages of print volumes | corpus.search(text="*al lecn*") |
| "Is this spelling a dialect variant?" | An entire journal article to pose the question | One query, 30 seconds |
| "I need to publish my thesis data" | Word doc, usable only by the author | openetruscan batch thesis.txt --format epidoc → PR → global corpus |
| "How widespread was this clan?" | Months of manual index-reading | corpus.names.search(gens="lecne") → map |
Architecture
The core engine is language-agnostic. Each language is a YAML config file:
openetruscan/
├── engine/ # Universal normalizer, parser, exporter
├── adapters/
│   ├── etruscan.yaml # Etruscan alphabet, phonotactics, names
│   ├── oscan.yaml # Same engine, different YAML
│   ├── rhaetic.yaml # ... add any ancient script
│   └── YOUR_LANG.yaml # Fork this pattern
├── corpus/ # Structured dataset (SQLite, Git-native)
├── prosopography/ # Name parser + kinship graph
└── exporters/ # EpiDoc XML, CSV, JSON-LD, GeoJSON
Want to support another language? Write 50 lines of YAML. The engine does the rest.
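To illustrate the adapter idea, here is a small sketch in which a Python dict stands in for a parsed YAML file. The keys (`name`, `alphabet`, `variants`), the sample letters, and the `Adapter` class are assumptions for illustration and may not match the real adapter schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    """Language-specific data; in the real project this would come from YAML."""
    name: str
    alphabet: frozenset[str]
    variants: dict[str, str]


def load_adapter(config: dict) -> Adapter:
    """Build an Adapter from a parsed config mapping (hypothetical schema)."""
    return Adapter(
        name=config["name"],
        alphabet=frozenset(config["alphabet"]),
        variants=dict(config.get("variants", {})),
    )


def normalize(text: str, adapter: Adapter) -> str:
    """Language-agnostic folding: the engine only reads the adapter's tables."""
    folded = text.lower()
    # Apply longer variants first so digraphs win over single letters.
    for variant in sorted(adapter.variants, key=len, reverse=True):
        folded = folded.replace(variant, adapter.variants[variant])
    return folded


def invalid_letters(text: str, adapter: Adapter) -> list[str]:
    """Return characters that are not in the adapter's alphabet."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in adapter.alphabet})


# Illustrative subset only, not the full 23-letter Etruscan adapter.
etruscan = load_adapter({
    "name": "etruscan",
    "alphabet": "aceilnrstθ",
    "variants": {"th": "θ"},
})

canonical = normalize("LARTHAL LECNES", etruscan)
print(canonical)
print(invalid_letters(canonical, etruscan))
print(invalid_letters("bona", etruscan))
```

Because `normalize` and `invalid_letters` never mention Etruscan, supporting another script in this sketch means passing a different config dict, which is the point the YAML layout is making.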
Zero Infrastructure
- No servers. No databases. No grants.
- Data ships inside the `pip` package (SQLite).
- Web tools run as static HTML (GitHub Pages).
- Updates via `pip install --upgrade`.
- Total hosting cost: $0/month. Forever.
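Because the corpus is a plain SQLite file, it can in principle be queried with Python's standard library alone. The sketch below builds a throwaway in-memory database; the table name `inscriptions`, its columns, and the sample rows are invented for illustration and are not the package's real schema. It shows how a wildcard pattern such as `*al lecn*` maps onto SQLite's `GLOB` operator.

```python
import sqlite3

# In-memory stand-in for the SQLite file bundled with the package.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inscriptions (id TEXT PRIMARY KEY, canonical TEXT)")
conn.executemany(
    "INSERT INTO inscriptions VALUES (?, ?)",
    [
        ("demo-1", "larθal lecnes"),  # made-up sample rows
        ("demo-2", "arnθ lecne"),
        ("demo-3", "velθur larθal lecnes"),
    ],
)


def search(pattern: str) -> list[str]:
    """Return ids whose canonical text matches a * / ? wildcard pattern."""
    rows = conn.execute(
        "SELECT id FROM inscriptions WHERE canonical GLOB ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


print(search("*al lecn*"))
```

`GLOB` is case-sensitive, which suits text that has already been folded to a canonical lowercase form.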
Contributing
Add Data
Found a new inscription? Have a dissertation corpus?
- Fork this repo
- Add entries to `data/contributions/your_name.csv`
- Run `openetruscan validate data/contributions/your_name.csv`
- Open a Pull Request
- CI validates encoding + duplicates
- We merge → your data is in the next release
Your name stays in the Git history. Your discovery becomes searchable worldwide.
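As a rough idea of what an encoding and duplicate check on a contribution file might involve, here is a minimal sketch. The column names `id` and `text` and the function `check_contribution` are assumptions; this is not the project's actual validator or CI step.

```python
import csv
import io
import unicodedata


def check_contribution(raw: bytes) -> list[str]:
    """Return a list of human-readable problems found in a contribution CSV."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]

    problems = []
    seen = {}
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        value = row.get("text") or ""
        # Flag text that is not in NFC, a common source of invisible mismatches.
        if value != unicodedata.normalize("NFC", value):
            problems.append(f"line {line_no}: text is not NFC-normalized")
        key = value.strip().lower()
        if key in seen:
            problems.append(f"line {line_no}: duplicate of line {seen[key]}")
        else:
            seen[key] = line_no
    return problems


sample = "id,text\nx1,larθal lecnes\nx2,Larθal Lecnes\n".encode("utf-8")
print(check_contribution(sample))
```

The duplicate test here only compares lowercased text; a real check would more plausibly compare canonical forms after normalization.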
Add a Language
- Copy `src/openetruscan/adapters/etruscan.yaml`
- Fill in your language's alphabet, variants, phonotactics
- Run `pytest` to verify
- Open a Pull Request
Improve the Code
git clone https://github.com/open-etruscan/openetruscan.git
cd openetruscan
pip install -e ".[dev]"
pytest
See CONTRIBUTING.md for details.
Packages
| Package | Description | Status |
|---|---|---|
| `openetruscan` (core) | Normalizer + CLI + adapters | Released |
| `openetruscan[corpus]` | Structured dataset + query API | Released |
| `openetruscan[prosopography]` | Name parser + kinship graph | Released |
| `openetruscan[all]` | Everything | Released |
Roadmap
Done
- Normalizer engine: auto-detect 5 transcription systems, fold to canonical, phonotactic validation
- CLI: `normalize`, `batch`, `convert`, `validate`, `adapters` commands
- Etruscan adapter: 23 letters, 35+ known names, equivalence classes
- Corpus database: SQLite-backed, 4,700+ inscriptions from Larth dataset
- Prosopography: name parser, 633 clans, kinship graph, GraphML/JSON export
- Web converter: static HTML/CSS/JS, runs in any browser, zero backend
- GitHub Actions: CI (Python 3.10-3.13 + Ruff), Pages deploy, PyPI publish
- 64 tests passing across all modules
Next (v0.2)
- Faliscan + Oscan adapters: prove the multi-language architecture (one YAML each)
- Web language selector: switch between languages in the web converter
- GeoJSON map viewer: static HTML page showing inscription findspots on an interactive map
- EpiDoc XML exporter: interoperability with the digital classics ecosystem
- PyPI release: first public `pip install openetruscan`
- Corpus CLI: `openetruscan search`, `openetruscan import`, `openetruscan export` commands
Planned (v0.3)
- CLTK Etruscan module: contribute to the Classical Language Toolkit
- Linked Open Data: publish to Pelagios/Pleiades gazetteers
- Statistical tools: letter frequency analysis, dialect clustering, dating heuristics
- Web search interface: search the corpus from the browser (static, no backend)
- Rhaetic + Lemnian adapters: expand to the Tyrsenian language family
Vision
- Community data contributions: scholars submit inscriptions via PR, CI validates
- Cross-language scholar search: query across Etruscan, Oscan, Faliscan from one interface
- Academic citation: dataset cited in peer-reviewed publications
- Template repo: one-click fork to set up tools for any underdocumented ancient script
License
Acknowledgments
OpenEtruscan builds on decades of work by epigraphers and Etruscologists. We are especially grateful to:
- The compilers of the Corpus Inscriptionum Etruscarum
- The Etruscan Texts Project (UMass Amherst)
- The Larth Dataset (Vico & Spanakis, 2023)
- The EpiDoc community
- The Classical Language Toolkit
Built for Etruscan. Designed to be copied.
File details
Details for the file openetruscan-0.1.0.tar.gz.
File metadata
- Download URL: openetruscan-0.1.0.tar.gz
- Upload date:
- Size: 40.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1ceaa4c800b52581744aefe8bf2381ea7bf19ae928874901098c2647e0dda330 |
| MD5 | 1a120f1ad0157d75ec45847ae571b794 |
| BLAKE2b-256 | d330c3dd7e3eac0dab517456bb4504249330bb16a6ab6937d786662187b93026 |
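A downloaded file can be checked against a published SHA256 digest with Python's standard `hashlib` module. The helper name `sha256_of` and the temporary demo file are illustrative only; in practice you would pass the path of the downloaded archive and compare the result with the digest listed for it.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a temporary file so the snippet runs without any download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"demo payload")
    demo_path = tmp.name

actual = sha256_of(demo_path)
expected = hashlib.sha256(b"demo payload").hexdigest()
print("match" if actual == expected else "MISMATCH")
os.remove(demo_path)
```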
Provenance
The following attestation bundles were made for openetruscan-0.1.0.tar.gz:
Publisher: workflow.yaml on Eddy1919/openEtruscan
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: openetruscan-0.1.0.tar.gz
- Subject digest: 1ceaa4c800b52581744aefe8bf2381ea7bf19ae928874901098c2647e0dda330
- Sigstore transparency entry: 1155567906
- Sigstore integration time:
- Permalink: Eddy1919/openEtruscan@8b5f0767dc4d765237a02d57c3f13cba164312fe
- Branch / Tag: refs/tags/0.1.0
- Owner: https://github.com/Eddy1919
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yaml@8b5f0767dc4d765237a02d57c3f13cba164312fe
- Trigger Event: release
File details
Details for the file openetruscan-0.1.0-py3-none-any.whl.
File metadata
- Download URL: openetruscan-0.1.0-py3-none-any.whl
- Upload date:
- Size: 31.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3781829df0fb50de86f8ffa41257b1d5b6668fe532e121ffc591ffffa83fd0d3 |
| MD5 | 677639e9627f09e79c0f8b18f6be0b50 |
| BLAKE2b-256 | aa0a7c6080ae120d958b25d4d7b889a4c53c09ec9ad0f61d8ca1f3d8c97b1f14 |
Provenance
The following attestation bundles were made for openetruscan-0.1.0-py3-none-any.whl:
Publisher: workflow.yaml on Eddy1919/openEtruscan
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: openetruscan-0.1.0-py3-none-any.whl
- Subject digest: 3781829df0fb50de86f8ffa41257b1d5b6668fe532e121ffc591ffffa83fd0d3
- Sigstore transparency entry: 1155567914
- Sigstore integration time:
- Permalink: Eddy1919/openEtruscan@8b5f0767dc4d765237a02d57c3f13cba164312fe
- Branch / Tag: refs/tags/0.1.0
- Owner: https://github.com/Eddy1919
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yaml@8b5f0767dc4d765237a02d57c3f13cba164312fe
- Trigger Event: release