AGILe: Ancient Greek Inscriptions Lemmatizer

AGILe is a lemmatizer for Ancient Greek inscriptions developed at the University of Groningen. Details can be found in:

de Graaf, E., Stopponi, S., Bos, J., Peels-Matthey, S. & Nissim, M. (2022). AGILe: The First Lemmatizer for Ancient Greek Inscriptions. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), Marseille, 20-25 June 2022. pp. 5334–5344. http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.571.pdf

Peels-Matthey, S., de Graaf, E., Nissim, M., Bos, J. & Stopponi, S. (2024). Automatic lemmatization of ancient Greek inscriptions: A presentation of AGILe. Journal of Epigraphic Studies, 7, pp. 29–50. https://pure.rug.nl/ws/portalfiles/portal/1054237044/Peels-Matthey_et_al_2024_Automatic_lemmatization_of_Ancient_Greek_inscriptions_-_A_presentationof_AGILe.pdf

Installation

1. Clone this repository

git clone https://github.com/agile-gronlp/agile

2. Install dependencies

AGILe works with version 1.0.21 of the CLTK. If you use a more recent version of the CLTK elsewhere, install the required packages in a virtual environment so the two versions do not conflict.

AGILe supports Python 3.7 or later on POSIX-compliant operating systems. To install all required dependencies, run:

cd agile
pip install -r requirements.txt
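
After installation, it can be useful to confirm that the active environment has the CLTK version mentioned above. The following is a minimal sketch, not part of AGILe: `cltk_status` is a hypothetical helper name, and it assumes the CLTK distribution is published under the name `cltk`. It relies on `importlib.metadata`, which is in the standard library from Python 3.8 onwards; on Python 3.7 the `importlib_metadata` backport would be needed instead.

```python
from importlib.metadata import PackageNotFoundError, version

# Version stated in the installation notes above.
REQUIRED_CLTK = "1.0.21"


def cltk_status(required: str = REQUIRED_CLTK) -> str:
    """Return 'ok', 'missing', or 'mismatch' for the installed CLTK.

    Assumes the distribution name is 'cltk' (hypothetical helper,
    not provided by AGILe itself).
    """
    try:
        installed = version("cltk")
    except PackageNotFoundError:
        # CLTK is not installed in the active environment.
        return "missing"
    return "ok" if installed == required else "mismatch"


print(cltk_status())
```

A result of `mismatch` suggests that a separate virtual environment is the safer choice.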

3. Download Stanza models

To download the Ancient Greek models from Stanza, run the following in an interactive Python interpreter:

>>> import stanza
>>> stanza.download('grc')

Running AGILe

Below is an example of performing lemmatization on a short inscription:

>>> from agile import lemmatize

>>> doc = lemmatize("αἲξ θύεται τάδε μὴ ἐσφέρεν ἐς τὸ τέμενος τοῦ Ἀπόλλωνος τοῦ Οὐλίου εἱμάτιον")
>>> for sent in doc.sentences:
...    for word in sent.words:
...        print(f'word: {word.text + " ":15}lemma: {word.lemma}')

This demo gives the following output:

word: αἲξ            lemma: αἴξ
word: θύεται         lemma: θύω
word: τάδε           lemma: ὅδε
word: μὴ             lemma: μή
word: ἐσφέρεν        lemma: εἰσφέρω
word: ἐς             lemma: εἰς
word: τὸ             lemma: τε
word: τέμενος        lemma: τέμενος
word: τοῦ            lemma: ποῦ
word: Ἀπόλλωνος      lemma: Ἀπόλλων
word: τοῦ            lemma: ποῦ
word: Οὐλίου         lemma: οὔλιος
word: εἱμάτιον       lemma: ἱμάτιον
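
To work with the results programmatically rather than printing them, the loop above can be wrapped in a small helper. This is a minimal sketch that assumes only the attributes used in the demo (`doc.sentences`, `sent.words`, `word.text`, `word.lemma`); `word_lemma_pairs` is a hypothetical name and not part of AGILe.

```python
from typing import Any, List, Tuple


def word_lemma_pairs(doc: Any) -> List[Tuple[str, str]]:
    """Flatten a lemmatized document into (word, lemma) pairs.

    `doc` is expected to expose `sentences`, each with `words`,
    each with `text` and `lemma`, as in the demo above.
    """
    return [
        (word.text, word.lemma)
        for sent in doc.sentences
        for word in sent.words
    ]
```

With AGILe installed, `word_lemma_pairs(lemmatize(text))` would return the same pairs that the demo prints, in order.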

The lexicon lookup can be disabled by setting the use_lexicon parameter of the lemmatize function to False.
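
One way to see what the lexicon lookup contributes is to lemmatize the same text twice and compare the results. The sketch below is illustrative only: `lexicon_differences` is a hypothetical helper, the lemmatizer is passed in as an argument so the snippet does not depend on AGILe being importable, and it assumes that both calls tokenize the text identically so that words can be paired by position.

```python
from typing import Any, Callable, List, Tuple


def lexicon_differences(
    text: str, lemmatize_fn: Callable[..., Any]
) -> List[Tuple[str, str, str]]:
    """Return (word, lemma_with_lexicon, lemma_without_lexicon) triples
    for every word whose lemma changes when the lexicon is disabled.

    Assumes `lemmatize_fn` accepts a `use_lexicon` keyword argument and
    that both runs produce the same tokenization (an assumption).
    """
    def flatten(doc: Any) -> list:
        return [word for sent in doc.sentences for word in sent.words]

    with_lexicon = flatten(lemmatize_fn(text))
    without_lexicon = flatten(lemmatize_fn(text, use_lexicon=False))

    return [
        (a.text, a.lemma, b.lemma)
        for a, b in zip(with_lexicon, without_lexicon)
        if a.lemma != b.lemma
    ]
```

With AGILe installed, this would be called as `lexicon_differences(text, lemmatize)` after `from agile import lemmatize`.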

Interactive Notebook on Google Colab

To try AGILe without installing it, use the Colab notebook: https://colab.research.google.com/drive/1YZMGxF8ORCrk_tyD1muHkgVsMXxeWHJJ?usp=drive_link

Acknowledgements

The lexicon (lexicon.p) is extracted from an XML edition of the LSJ in composed Unicode, as transformed by Giuseppe G. A. Celano. The original text is provided under a CC BY-SA license by the Perseus Digital Library, http://www.perseus.tufts.edu, with funding from the National Endowment for the Humanities. Data accessed from https://github.com/PerseusDL/lexica/.

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

BibTeX

@InProceedings{degraaf-EtAl:2022:LREC,
  author    = {de Graaf, Evelien  and  Stopponi, Silvia  and  Bos, Jasper K.  and  Peels-Matthey, Saskia  and  Nissim, Malvina},
  title     = {AGILe: The First Lemmatizer for Ancient Greek Inscriptions},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference},
  month     = {June},
  year      = {2022},
  address   = {Marseille, France},
  publisher = {European Language Resources Association},
  pages     = {5334--5344},
  url       = {https://aclanthology.org/2022.lrec-1.571}
}
@article{peels2024automatic,
  title     = {Automatic lemmatization of ancient Greek inscriptions: A presentation of AGILe},
  author    = {Peels-Matthey, Saskia and de Graaf, Evelien and Nissim, Malvina and Bos, Jasper and Stopponi, Silvia},
  journal   = {Journal of Epigraphic Studies},
  volume    = {7},
  pages     = {29--50},
  year      = {2024},
  publisher = {Fabrizio Serra}
}

