Framework for detecting old loanwords
Project description
loanpy is a tool for historical linguists. It extracts sound changes and constraints from an etymological dictionary, generates pseudo-roots for L1 and pseudo sound substitutions for L2, searches for phonetically identical lexemes, and ranks the matches by semantic similarity.
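As a rough illustration of the last two steps (finding phonetically identical forms and ranking them by semantic similarity), the toy sketch below uses hand-written three-dimensional vectors in place of real word embeddings. The function names, the (form, meaning) layout, the placeholder forms, and the vectors are all invented for illustration. This is not loanpy's implementation or API.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_matches(recipient, donor, vectors):
    """Pair entries whose forms are identical, then sort by meaning similarity.

    recipient, donor: lists of (form, meaning) tuples (hypothetical layout).
    vectors: dict mapping a meaning to its embedding.
    """
    donor_by_form = {}
    for form, meaning in donor:
        donor_by_form.setdefault(form, []).append(meaning)

    matches = []
    for form, meaning in recipient:
        for donor_meaning in donor_by_form.get(form, []):
            score = cosine(vectors[meaning], vectors[donor_meaning])
            matches.append((form, meaning, donor_meaning, score))
    # Most similar meanings first.
    return sorted(matches, key=lambda m: m[3], reverse=True)


# Invented placeholder forms and toy vectors, not real dictionary data.
toy_vectors = {
    "water": [1.0, 0.0, 0.0],
    "river": [0.9, 0.1, 0.0],
    "stone": [0.0, 1.0, 0.0],
}
recipient_forms = [("aba", "water"), ("kota", "stone")]
donor_forms = [("aba", "river"), ("kota", "water"), ("mela", "stone")]

ranked = rank_matches(recipient_forms, donor_forms, toy_vectors)
for form, meaning, donor_meaning, score in ranked:
    print(form, meaning, donor_meaning, round(score, 3))
```

In this toy data, two forms match exactly; the pair with closely related meanings ranks above the pair with unrelated meanings, which is the kind of ordering the ranking step aims for.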
Installation
$ python -m pip install loanpy
Getting started
>>> from loanpy import loanfinder as lf
Download and unpack the pretrained Google News vectors (about 3 GB). Move GoogleNews-vectors-negative300.bin to the “data” folder, whose full path can be retrieved via:
>>> import os
>>> print(os.path.join(os.path.dirname(lf.__file__), "data"))
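Before running anything that needs the vectors, it can help to confirm the file actually ended up in that folder. The following is a minimal sketch using only the standard library; the helper name find_vectors and its data_dirname parameter are hypothetical and not part of loanpy.

```python
from pathlib import Path

VECTORS_NAME = "GoogleNews-vectors-negative300.bin"


def find_vectors(module_file, data_dirname="data"):
    """Return the expected vectors path and whether the file is present.

    module_file: path of the installed module, e.g. lf.__file__.
    data_dirname: name of the data folder next to the module (assumed "data").
    """
    path = Path(module_file).resolve().parent / data_dirname / VECTORS_NAME
    return path, path.is_file()


# Usage once loanpy is installed (commented out so the sketch runs without it):
# from loanpy import loanfinder as lf
# path, present = find_vectors(lf.__file__)
# print(path, "found" if present else "missing")
```

Building the path with pathlib keeps the check independent of the operating system's path separator.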
The following code compares a set of Gothic words (data/dfgot.csv) with Hungarian words (data/zaicz.csv) and evaluates which entries are the most likely loanword candidates. The result can be viewed in data/results/matches.csv:
>>> lf.loandf()
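To take a quick look at the output without opening a spreadsheet, the first few rows can be read with the standard csv module. This is only a sketch: the helper name preview_csv is hypothetical, the file is assumed to be UTF-8 encoded with a header row, and the column names are whatever loanpy writes (none are assumed here).

```python
import csv
from itertools import islice


def preview_csv(path, n=5):
    """Return the first n data rows of a CSV file as a list of dicts."""
    # Assumes UTF-8 and a header row; adjust if the file differs.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(islice(csv.DictReader(handle), n))


# Usage, with the path adjusted to the "data" folder of the installation
# (commented out so the sketch runs without the results file):
# for row in preview_csv("data/results/matches.csv"):
#     print(row)
```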
Data Sources
Gábor Zaicz’s Hungarian etymological dictionary from 2006
Gerhard Köbler’s Gothic database
Hungarian Academy of Science’s online version of Uralisches Etymologisches Wörterbuch
License
Academic Free License (AFL) (Creative Commons Attribution 4.0 International)