Web-Based Tool for Computer-Assisted Language Comparison

LingRex: Linguistic Reconstruction with LingPy

LingRex offers the code needed for the automatic inference of sound correspondence patterns as described in the following paper:

List, J.-M. (2019): Automatic inference of sound correspondence patterns across multiple languages. Computational Linguistics 45.1. 137-161. DOI: 10.1162/coli_a_00344

To test this workflow, please check the workflow code example in tests/workflows/list-2019.
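As a rough illustration of the idea behind correspondence pattern inference, the following toy sketch checks whether two alignment sites (one sound per language, in a fixed language order) are compatible, that is, whether they agree wherever both are attested. This is not LingRex's API or implementation: the function name `compatible`, the missing-data symbol, and the example data are assumptions made only for this sketch.

```python
# Toy sketch (not LingRex code): compatibility of two alignment sites.
# MISSING, compatible() and the example sites are hypothetical names and data.
MISSING = "Ø"


def compatible(site_a, site_b, missing=MISSING):
    """Return True if the sites never conflict and match in at least one language."""
    matches = 0
    for a, b in zip(site_a, site_b):
        if a == missing or b == missing:
            continue  # a gap in the data cannot cause a conflict
        if a != b:
            return False  # two attested sounds disagree
        matches += 1
    return matches > 0


# One sound per language; the language order is the same for every site.
site_1 = ["p", "p", "f", MISSING]
site_2 = ["p", MISSING, "f", "f"]
site_3 = ["p", "b", "f", "f"]

print(compatible(site_1, site_2))  # True
print(compatible(site_2, site_3))  # True
print(compatible(site_1, site_3))  # False
```

In this example `site_2` is compatible with both other sites although those two conflict with each other, so compatibility is not transitive. This is why sites cannot simply be merged greedily and the paper treats the grouping of sites into patterns as a clique cover problem.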

LingRex also offers the code for a baseline algorithm for supervised automatic word prediction and automatic phonological reconstruction:

List, J.-M. and R. Forkel and N. W. Hill (2022): A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns. Proceedings of the 3rd International Workshop on Computational Approaches to Historical Language Change (LChange 2022). Dublin. Ireland. https://aclanthology.org/2022.lchange-1.9

This algorithm is also used as a baseline for a Shared Task on the Prediction of Cognate Reflexes (https://sigtyp.github.io/st2022.html), organized as part of the SIGTYP Workshop at NAACL 2022.

List, J.-M., E. Vylomova, R. Forkel, N. Hill, and R. Cotterell (2022): The SIGTYP shared task on the prediction of cognate reflexes. In: Proceedings of the 4th Workshop on Computational Typology and Multilingual NLP. Association for Computational Linguistics 52-62. https://aclanthology.org/2022.sigtyp-1.7

Methods for the handling of partial cognates were introduced in a study by Wu and List (2023):

Wu, M.-S. and J.-M. List (2023): Annotating cognates in phylogenetic studies of South-East Asian languages. Language Dynamics and Change. https://doi.org/10.1163/22105832-bja10023

Methods for the trimming of phonetic alignments were introduced in a study by Blum and List (2023):

Blum, F. and J.-M. List (2023): Trimming phonetic alignments improves the inference of sound correspondence patterns from multilingual wordlists. In: Proceedings of the 5th Workshop on Computational Typology and Multilingual NLP. Association for Computational Linguistics. 52-64. https://aclanthology.org/2023.sigtyp-1.6.pdf

Methods for the handling and creation of fuzzy / uncertain phonological reconstructions were introduced in a study by List et al. (2023):

List, J.-M.; Hill, N. W.; Blum, F.; and Forkel, R. (2023): A New Framework for the Representation and Computation of Uncertainty in Phonological Reconstruction. Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change. 22-32. https://aclanthology.org/2023.lchange-1.3

When using this package in your research, please cite the papers that correspond to the algorithms you use, and cite the software package as follows:

List, J.-M. and R. Forkel (2023): LingRex: Linguistic Reconstruction with LingPy. [Computer software, Version 1.4.0]. With contributions by Frederic Blum and Mei-Shin Wu. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://pypi.org/project/lingrex

Since this software package makes use of LingPy's alignment algorithms, you should also cite the LingPy package:

List, J.-M. and R. Forkel (2023): LingPy. A Python library for quantitative tasks in historical linguistics. Version 2.6.10. Max Planck Institute for Evolutionary Anthropology: Leipzig. https://lingpy.org

Installation

Install the package via pip:

pip install lingrex
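To confirm that the installation succeeded, one option is to query the installed distribution's version from Python's standard library. The helper name `installed_version` is an assumption made for this sketch and is not part of LingRex.

```python
# Minimal sketch using only the standard library (Python 3.8+).
# installed_version() is a hypothetical helper, not a LingRex function.
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if LingRex is installed, otherwise None.
print(installed_version("lingrex"))
```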

Further Examples

The borrowing detection algorithm implemented in LingRex is introduced in the paper:

List, J.-M. and R. Forkel (2021): Automated identification of borrowings in multilingual wordlists [version 1; peer review: 3 approved, 1 approved with reservations]. Open Research Europe 1.79. 1-11. DOI: 10.12688/openreseurope.13843.1

If you use this algorithm, please cite LingRex and this paper.

In addition to the paper in which the correspondence pattern inference algorithm was first introduced, LingRex also offers the code to compute the workflow described in the following paper:

Wu, M.-S., N. Schweikhard, T. Bodt, N. Hill, and J.-M. List (2020): Computer-Assisted Language Comparison. State of the Art. Journal of Open Humanities Data 6.2. 1-14. DOI: 10.5334/johd.12

To test this workflow, please check the workflow code example in tests/workflows/wu-2020.

If you use this workflow in your work, please cite this paper as well.

In addition, our experiment (with T. Bodt) on predicting words with the help of sound correspondence patterns also made use of the LingRex package.

Bodt, T. and J.-M. List (2021): Reflex prediction. A case study of Western Kho-Bwa. Diachronica 0.0. 1-38. DOI: 10.1075/dia.20009.bod

To test this workflow, please check the workflow code example in tests/workflows/bodt-2019.
