LingRex: Linguistic Reconstruction with LingPy

LingRex offers the code needed for the automatic inference of sound correspondence patterns as described in the following paper:

List, J.-M. (2019): Automatic inference of sound correspondence patterns across multiple languages. Computational Linguistics 45.1. 137-161. DOI: 10.1162/coli_a_00344

To test this workflow, please check the workflow code example in tests/workflows/list-2019.
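
To give a rough intuition for what a sound correspondence pattern is, the following is a minimal sketch in plain Python. It does not use LingRex's API and is not the algorithm from the paper: the `MISSING` marker, the function names, the simplified notion of compatibility, and the example data are all assumptions made for illustration. Each alignment site is a tuple with one sound per language, and sites are grouped greedily when they overlap in at least one language and never conflict.

```python
MISSING = "Ø"  # hypothetical marker: the language has no reflex in this cognate set


def compatible(site_a, site_b):
    """True if the sites overlap in at least one language and never conflict."""
    shared = [(a, b) for a, b in zip(site_a, site_b) if MISSING not in (a, b)]
    return bool(shared) and all(a == b for a, b in shared)


def merge(site_a, site_b):
    """Fill the missing cells of site_a with the values of site_b."""
    return tuple(b if a == MISSING else a for a, b in zip(site_a, site_b))


def greedy_patterns(sites):
    """Assign each site to the first pattern it is compatible with."""
    patterns = []  # each entry: [consensus site, list of member sites]
    # Process the best-attested sites first (fewest missing cells).
    for site in sorted(sites, key=lambda s: s.count(MISSING)):
        for entry in patterns:
            if compatible(entry[0], site):
                entry[0] = merge(entry[0], site)
                entry[1].append(site)
                break
        else:
            patterns.append([site, [site]])
    return patterns


# Invented alignment sites for three languages (one cell per language).
sites = [
    ("p", "p", "f"),
    ("p", "Ø", "f"),
    ("Ø", "p", "f"),
    ("p", "b", "f"),
    ("t", "t", "Ø"),
]

patterns = greedy_patterns(sites)
for consensus, members in patterns:
    print(consensus, len(members))
```

In this toy data, the site `("p", "Ø", "f")` is compatible with both `("p", "p", "f")` and `("p", "b", "f")`, and the greedy pass simply assigns it to the first match. Resolving such ambiguities in a principled way is the kind of problem the paper addresses, so the workflow in tests/workflows/list-2019 remains the reference for the actual method.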

LingRex also offers the code needed for a baseline algorithm for supervised automatic word prediction and automatic phonological reconstruction:

List, J.-M., R. Forkel, and N. W. Hill (2022): A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns. Proceedings of the 3rd International Workshop on Computational Approaches to Historical Language Change (LChange 2022). Dublin, Ireland. https://aclanthology.org/2022.lchange-1.9

This algorithm is also used as a baseline for a Shared Task on the Prediction of Cognate Reflexes (https://sigtyp.github.io/st2022.html), organized as part of the SIGTYP Workshop at NAACL 2022.

List, J.-M., E. Vylomova, R. Forkel, N. Hill, and R. Cotterell (2022): The SIGTYP shared task on the prediction of cognate reflexes. In: Proceedings of the 4th Workshop on Computational Typology and Multilingual NLP. Association for Computational Linguistics 52-62. https://aclanthology.org/2022.sigtyp-1.7
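
The supervised idea behind the baseline mentioned above can be illustrated with a toy sketch. This is not LingRex's implementation: a simple lookup table stands in for whatever classifier the published method uses, and the function names, the `unknown` marker, and the training data are invented for illustration. The sketch assumes that reflexes and proto-forms are already aligned to the same length.

```python
from collections import Counter, defaultdict


def train(examples):
    """Map each column of reflex sounds to its most frequent proto-sound.

    `examples` is a list of (reflex_alignment, proto_form) pairs, where the
    alignment has one row per language and the proto-form has one sound per
    alignment column.
    """
    counts = defaultdict(Counter)
    for reflexes, proto in examples:
        for column, proto_sound in zip(zip(*reflexes), proto):
            counts[column][proto_sound] += 1
    return {column: c.most_common(1)[0][0] for column, c in counts.items()}


def predict(model, reflexes, unknown="?"):
    """Predict one proto-sound per alignment column; unseen columns get `unknown`."""
    return [model.get(column, unknown) for column in zip(*reflexes)]


# Invented training data: two languages, two cognate sets with known proto-forms.
training = [
    ([["p", "a"], ["f", "a"]], ["p", "a"]),
    ([["t", "a"], ["t", "o"]], ["t", "a"]),
]

model = train(training)
print(predict(model, [["p", "a"], ["f", "o"]]))
print(predict(model, [["k"], ["k"]]))
```

The second call shows the obvious limitation of a lookup table: a column that never occurred in training cannot be predicted, which is one reason the published framework is more elaborate than this sketch.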

Methods for the handling of partial cognates were introduced in a study by Wu and List (2023):

Wu, M.-S. and J.-M. List (2023): Annotating cognates in phylogenetic studies of South-East Asian languages. Language Dynamics and Change. https://doi.org/10.1163/22105832-bja10023

Methods for the trimming of phonetic alignments were introduced in a study by Blum and List (2023):

Blum, F. and J.-M. List (2023): Trimming phonetic alignments improves the inference of sound correspondence patterns from multilingual wordlists. In: Proceedings of the 5th Workshop on Computational Typology and Multilingual NLP. Association for Computational Linguistics 52-64. https://aclanthology.org/2023.sigtyp-1.6.pdf
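
As a rough illustration of what trimming an alignment can mean, the sketch below drops alignment columns that consist mostly of gaps. It is a simplified stand-in written in plain Python, not the procedure from the study and not LingRex's API: the function names, the `threshold` parameter, and the example alignment are assumptions.

```python
def gap_profile(alignment, gap="-"):
    """Return the proportion of gaps in each column of the alignment."""
    rows = len(alignment)
    return [
        sum(1 for row in alignment if row[i] == gap) / rows
        for i in range(len(alignment[0]))
    ]


def trim_by_gaps(alignment, threshold=0.5, gap="-"):
    """Keep only the columns whose gap proportion does not exceed `threshold`."""
    keep = [
        i for i, ratio in enumerate(gap_profile(alignment, gap))
        if ratio <= threshold
    ]
    return [[row[i] for i in keep] for row in alignment]


# Invented alignment of three words; the last column is mostly gaps.
alignment = [
    ["t", "a", "k", "-"],
    ["t", "a", "-", "-"],
    ["d", "a", "k", "s"],
]

trimmed = trim_by_gaps(alignment)
for row in trimmed:
    print(" ".join(row))
```

With the default threshold of 0.5, the third column (one gap in three rows) is kept and the fourth column (two gaps in three rows) is removed.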

Methods for the handling and creation of fuzzy / uncertain phonological reconstructions were introduced in a study by List et al. (forthcoming):

List, J.-M., N. W. Hill, F. Blum, and R. Forkel (forthcoming): A New Framework for the Representation and Computation of Uncertainty in Phonological Reconstruction. To appear in: Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change.

When using this package in your research, please cite the papers that correspond to the algorithms you use, and cite the software package as follows:

List, J.-M. and R. Forkel (2023): LingRex: Linguistic Reconstruction with LingPy. [Computer software, Version 1.4.0]. With contributions by Frederic Blum and Mei-Shin Wu. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://pypi.org/project/lingrex

Since this package relies on LingPy's alignment algorithms, you should also cite LingPy:

List, J.-M. and R. Forkel (2023): LingPy. A Python library for quantitative tasks in historical linguistics. Version 2.6.10. Max Planck Institute for Evolutionary Anthropology: Leipzig. https://lingpy.org

Installation

Install the package via pip:

pip install lingrex
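
To confirm that the installation succeeded, one option is to query the installed distribution's version with the standard library. The helper name `installed_version` is made up for this sketch; `importlib.metadata` requires Python 3.8 or later.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version("lingrex"))
```

The call prints the version string if the package is installed and `None` otherwise.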

Further Examples

The borrowing detection algorithm implemented in LingRex is introduced in the paper:

List, J.-M. and R. Forkel (2021): Automated identification of borrowings in multilingual wordlists [version 1; peer review: 3 approved, 1 approved with reservations]. Open Research Europe 1.79. 1-11. DOI: 10.12688/openreseurope.13843.1

If you use this algorithm, please cite LingRex and this paper.

In addition to the paper in which the correspondence pattern inference algorithm was first introduced, LingRex also offers the code to run the workflow described in the following paper:

Wu, M.-S., N. Schweikhard, T. Bodt, N. Hill, and J.-M. List (2020): Computer-Assisted Language Comparison. State of the Art. Journal of Open Humanities Data 6.2. 1-14. DOI: 10.5334/johd.12

To test this workflow, please check the workflow code example in tests/workflows/wu-2020.

If you use this workflow in your work, please cite this paper as well.

In addition, our experiment (with T. Bodt) on predicting words with the help of sound correspondence patterns also made use of the LingRex package:

Bodt, T. and J.-M. List (2021): Reflex prediction. A case study of Western Kho-Bwa. Diachronica 0.0. 1-38. DOI: 10.1075/dia.20009.bod

To test this workflow, please check the workflow code example in tests/workflows/bodt-2019.
