Library for interfacing with data from the GLED project

Arca Verborum

Arca Verborum is a library for interfacing with the data from the GLED package.

The main function currently available performs weighted sampling of languages. The weights are based on the phylogenetic distance between languages, on their geographic distance as a way of accounting for areal effects (currently computed as a simple Haversine distance between the languages' coordinates), and on how often each language has appeared in previous random samples.
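As an illustration of the geographic component, the Haversine formula gives the great-circle distance between two points from their latitudes and longitudes. The following is a minimal sketch and not the library's own code: the function name `haversine_km`, the choice of kilometers, and the Earth radius constant are assumptions made for this example.

```python
import math

# Mean Earth radius in kilometers (an assumption of this sketch; the
# library may use a different constant or unit).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    a = min(1.0, a)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Because the formula treats the Earth as a sphere and ignores terrain, water, and political barriers, it is only a rough proxy for contact between speech communities, which matches the "simple" qualification above.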

When drawing random samples over multiple iterations, it is strongly recommended to obtain all the samples in a single pass, so that the library can account for the potential oversampling of languages belonging to outgroups.
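To see why a single pass matters, consider a toy sampler that lowers a language's weight each time it has already been drawn. This is only a sketch of the general idea and not the algorithm used by the library: the function `sample_sets`, the `1 / (1 + count)` weighting, and the `seed` parameter are hypothetical, and the phylogenetic and geographic weights are left out entirely.

```python
import random
from collections import Counter


def sample_sets(languages, k, n, seed=None):
    """Draw n sets of k distinct languages, penalizing frequent picks.

    Hypothetical helper for illustration only; it ignores phylogenetic
    and geographic distances and uses frequency alone.
    """
    rng = random.Random(seed)
    counts = Counter()  # how often each language was drawn so far
    results = []
    for _ in range(n):
        pool = list(languages)
        chosen = []
        for _ in range(k):
            # Languages drawn often in earlier sets get a smaller weight.
            weights = [1.0 / (1 + counts[lang]) for lang in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            chosen.append(pick)
            pool.remove(pick)  # no repeats within one set
        counts.update(chosen)
        results.append(tuple(chosen))
    return results
```

The counts persist only for the duration of one call, so requesting all sets together lets earlier draws influence later ones. Separate calls would each start from an empty history.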

Note that the loading of the distance matrices, particularly of the geographic one, can take up to a minute on slower machines.

See the source code for further documentation. A short example session follows.

>>> import arcaverborum
>>> sampler = arcaverborum.GLED_Sampler()
WARNING:root:Loading the phylogenetic matrix from GLED...
WARNING:root:Loading the geographic matrix from GLED...
WARNING:root:Rescaling the phylogenetic matrix...
WARNING:root:Rescaling the geographic matrix...
>>> for idx, langset in enumerate(sampler.sample(4, 10)):
...   print(idx, langset)
... 
0 ('TlamacazapaNahuatl_tlam1239', 'GaviaoDoJiparana_gavi1246', 'Tubar_tuba1279', 'Pei_peii1238')
1 ('IslandCarib_isla1278', 'Samburu_samb1315', 'Dahalo_daha1245', 'Potawatomi_pota1247')
2 ('VlaxRomani_vlax1238', 'Gwahatike_gwah1244', 'NezPerce_nezp1238', 'Kwakwala_kwak1269')
3 ('AnaTingaDogon_anat1248', 'Zulgo-Gemzek_zulg1242', 'SkoltSaami_skol1241', 'Xokleng_xokl1240')
4 ('Mangarrayi_mang1381', 'Narak_nara1264', 'Matses_mats1244', 'Ionic-AtticAncientGreek_anci1242')
5 ('Jeli_jeri1242', 'Burum-Mindik_buru1306', 'Kistane_kist1241', 'Bongo_bong1285')
6 ('Patwin_patw1250', 'WesternTamang_west2415', 'Kapori_kapo1250', 'Sakha_yaku1245')
7 ('Kuy_kuyy1240', 'Kistane_kist1241', 'Kuruaya_kuru1309', 'Bolivar-NorthChimborazoHighlandQuichua_chim1302')
8 ('Betaf_beta1253', 'Bargam_barg1252', 'Pengo_peng1244', 'Wuding-LuquanYi_wudi1238')
9 ('NuclearWintu_nucl1651', 'Munit_muni1257', 'Nyawaygi_nyaw1247', 'MadaCameroon_mada1293')
