Library for interfacing with data from the GLED project

Project description

Arca Verborum

Arca Verborum is a Python library for working with the data from the GLED package.

The main function currently available performs weighted sampling of languages. The weights are based on three factors: the phylogenetic distance between languages; their geographic distance, which accounts for areal effects (currently computed as a simple Haversine distance between the coordinates); and how often each language has appeared in previous random samples.
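As an illustration of the geographic component, the Haversine distance mentioned above can be sketched as follows. This is a minimal sketch, not the library's own code: the function name haversine_km, the Earth radius constant, and the example coordinates are assumptions made for illustration, and the library may rescale or otherwise transform the distances it stores.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Illustrative coordinates only (roughly Paris and Berlin), not taken from GLED.
print(round(haversine_km(48.85, 2.35, 52.52, 13.40)), "km")
```

The distance is symmetric and is zero for identical coordinates, which is what a pairwise distance matrix needs.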

When drawing random samples over multiple iterations, it is strongly recommended to obtain all of them in a single pass, so that the library can account for the potential oversampling of languages belonging to outgroups.
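The reason a single pass matters can be shown with a toy sampler that remembers how often each item has already been drawn. This is only a sketch under stated assumptions: the function sample_sets and its 1 / (1 + count) weighting are hypothetical, it ignores the phylogenetic and geographic distances entirely, and it is not the algorithm the library uses.

```python
import random
from collections import Counter


def sample_sets(items, k, n, seed=None):
    """Yield n tuples of k distinct items, down-weighting items drawn earlier."""
    items = list(items)
    if k > len(items):
        raise ValueError("k cannot exceed the number of items")
    rng = random.Random(seed)
    counts = Counter()  # lives only for the duration of this one call
    for _ in range(n):
        pool = list(items)
        chosen = []
        for _ in range(k):
            # The weight of an item falls as its count in earlier samples grows.
            weights = [1.0 / (1 + counts[item]) for item in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            pool.remove(pick)  # no repeats within a single sample
            chosen.append(pick)
        counts.update(chosen)
        yield tuple(chosen)


for idx, group in enumerate(sample_sets("abcdefgh", k=3, n=4, seed=1)):
    print(idx, group)
```

Because the counts exist only inside one call, ten separate calls asking for one sample each would never see each other's history, while one call asking for ten samples would.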

Note that loading the distance matrices, particularly the geographic one, can take up to a minute on slower machines.

See the code for more documentation. The example below draws 10 samples of 4 languages each in a single call to sample():

>>> import arcaverborum
>>> sampler = arcaverborum.GLED_Sampler()
WARNING:root:Loading the phylogenetic matrix from GLED...
WARNING:root:Loading the geographic matrix from GLED...
WARNING:root:Rescaling the phylogenetic matrix...
WARNING:root:Rescaling the geographic matrix...
>>> for idx, langset in enumerate(sampler.sample(4, 10)):
...   print(idx, langset)
... 
0 ('TlamacazapaNahuatl_tlam1239', 'GaviaoDoJiparana_gavi1246', 'Tubar_tuba1279', 'Pei_peii1238')
1 ('IslandCarib_isla1278', 'Samburu_samb1315', 'Dahalo_daha1245', 'Potawatomi_pota1247')
2 ('VlaxRomani_vlax1238', 'Gwahatike_gwah1244', 'NezPerce_nezp1238', 'Kwakwala_kwak1269')
3 ('AnaTingaDogon_anat1248', 'Zulgo-Gemzek_zulg1242', 'SkoltSaami_skol1241', 'Xokleng_xokl1240')
4 ('Mangarrayi_mang1381', 'Narak_nara1264', 'Matses_mats1244', 'Ionic-AtticAncientGreek_anci1242')
5 ('Jeli_jeri1242', 'Burum-Mindik_buru1306', 'Kistane_kist1241', 'Bongo_bong1285')
6 ('Patwin_patw1250', 'WesternTamang_west2415', 'Kapori_kapo1250', 'Sakha_yaku1245')
7 ('Kuy_kuyy1240', 'Kistane_kist1241', 'Kuruaya_kuru1309', 'Bolivar-NorthChimborazoHighlandQuichua_chim1302')
8 ('Betaf_beta1253', 'Bargam_barg1252', 'Pengo_peng1244', 'Wuding-LuquanYi_wudi1238')
9 ('NuclearWintu_nucl1651', 'Munit_muni1257', 'Nyawaygi_nyaw1247', 'MadaCameroon_mada1293')
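Each label in the output above appears to join a language name and a Glottolog-style code with an underscore. That reading is an inference from the sample output rather than documented behaviour, and the helper split_label below is hypothetical, not part of the library. Under that assumption, a sample can be unpacked like this:

```python
def split_label(label):
    """Split 'Name_code' at the last underscore into (name, code)."""
    name, sep, code = label.rpartition("_")
    if not sep:
        raise ValueError(f"no underscore in label: {label!r}")
    return name, code


# One of the tuples printed in the session above.
langset = ("Kuy_kuyy1240", "Kistane_kist1241", "Kuruaya_kuru1309",
           "Bolivar-NorthChimborazoHighlandQuichua_chim1302")

for name, code in map(split_label, langset):
    print(code, name)
```

Splitting at the last underscore keeps the sketch safe if a name ever contains an underscore itself, which is an assumption about the label format and not something the sample output confirms.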

Download files

Source distribution: arcaverborum-0.2.tar.gz (96.5 MB)
Built distribution: arcaverborum-0.2-py3-none-any.whl (97.0 MB, Python 3)
