Functionality to retrieve CLDF datasets deposited on Zenodo
cldfzenodo
cldfzenodo provides programmatic access to CLDF data deposited on Zenodo.
Install
pip install cldfzenodo
CLI
cldfzenodo provides a subcommand to be run from cldfbench. To use this command, you have to install cldfbench, which can be done via
pip install cldfzenodo[cli]
Then you can download CLDF datasets from Zenodo, using the DOI for identification. E.g.
cldfbench zenodo.download 10.5281/zenodo.4683137 --directory wals-2020.1/
will download WALS Online as a CLDF dataset into wals-2020.1:
$ tree wals-2020.1/
wals-2020.1/
├── areas.csv
├── chapters.csv
├── codes.csv
├── contributors.csv
├── countries.csv
├── examples.csv
├── language_names.csv
├── languages.csv
├── parameters.csv
├── sources.bib
├── StructureDataset-metadata.json
└── values.csv
0 directories, 12 files
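After a download, it can be useful to confirm that the target directory actually contains CLDF metadata. The following is a minimal sketch using only the standard library. The helper name find_cldf_metadata is hypothetical and not part of cldfzenodo, and the sketch assumes that metadata files follow the "*-metadata.json" naming seen in the listing above.

```python
from pathlib import Path


def find_cldf_metadata(directory):
    """Return the sorted paths of files named like '*-metadata.json'.

    Hypothetical helper, not part of cldfzenodo. The naming pattern is an
    assumption based on the directory listing above. Subdirectories are
    searched as well.
    """
    return sorted(Path(directory).rglob('*-metadata.json'))
```

Calling find_cldf_metadata('wals-2020.1') on the directory shown above would be expected to return a single path ending in StructureDataset-metadata.json.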
API
Metadata and data of (potential) CLDF datasets deposited on Zenodo are accessed via cldfzenodo.Record objects. Such objects can be obtained in various ways:
- Via DOI:
import cldfzenodo
rec = cldfzenodo.Record.from_doi('https://doi.org/10.5281/zenodo.4762034')
- From deposits grouped into a Zenodo community (and obtained through OAI-PMH):
import cldfzenodo.oai
for rec in cldfzenodo.oai.iter_records('dictionaria'):
    print(rec)
- From search results using keywords:
import cldfzenodo
for rec in cldfzenodo.search_wordlists():
    print(rec)
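The examples in this document write a DOI both as a resolver URL and as a bare identifier. When DOIs come from mixed sources, for instance when comparing them or deriving directory names, it may help to bring them into one form first. The helper below is a hypothetical sketch and not part of cldfzenodo; the set of prefixes it strips is an assumption.

```python
def normalize_doi(doi):
    """Return the bare DOI, stripping a leading resolver URL or 'doi:' prefix.

    Hypothetical helper, not part of cldfzenodo. The list of prefixes is an
    assumption and may need extending for other input sources.
    """
    doi = doi.strip()
    for prefix in ('https://doi.org/', 'http://doi.org/', 'doi:'):
        # Compare case-insensitively, but slice the original string.
        if doi.lower().startswith(prefix):
            return doi[len(prefix):]
    return doi
```

With this sketch, both 'https://doi.org/10.5281/zenodo.4762034' and '10.5281/zenodo.4762034' normalize to the same bare DOI.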
cldfzenodo.Record objects provide sufficient metadata to allow identification and data access:
>>> from cldfzenodo import Record
>>> print(Record.from_doi('10.5281/zenodo.4762034').bibtex)
@misc{zenodo-4762034,
author = {Hammarström, Harald and Forkel, Robert and Haspelmath, Martin and Bank, Sebastian},
title = {glottolog/glottolog: Glottolog database 4.4 as CLDF},
keywords = {cldf:StructureDataset, linguistics},
publisher = {Zenodo},
year = {2021},
doi = {10.5281/zenodo.4762034},
url = {https://doi.org/10.5281/zenodo.4762034},
copyright = {Creative Commons Attribution 4.0}
}
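Since each record exposes its citation through the bibtex attribute, the entries of several records can be collected into one bibliography file. The following is a minimal sketch; write_bibliography is a hypothetical helper, not part of cldfzenodo, and it only assumes that each object passed in has a bibtex string attribute as shown above.

```python
def write_bibliography(records, path):
    """Write the BibTeX entry of each record to `path`, separated by blank lines.

    Hypothetical helper, not part of cldfzenodo. `records` is any iterable of
    objects with a `bibtex` string attribute. Returns the number of entries
    written.
    """
    count = 0
    with open(path, 'w', encoding='utf-8') as fp:
        for rec in records:
            fp.write(rec.bibtex.strip() + '\n\n')
            count += 1
    return count
```

For example, passing a list of Record objects obtained via Record.from_doi and the path 'datasets.bib' would write one entry per record to that file.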
One can download the full deposit (and access the possibly multiple CLDF datasets it contains):
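from pycldf import iter_datasets

# `record` is a cldfzenodo.Record obtained in one of the ways shown above.
record.download('my_directory')
for cldf in iter_datasets('my_directory'):
    pass  # work with each CLDF dataset found in the deposit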
But often, only the "pure" CLDF data is of interest, and not the additional metadata and curation context, e.g. of cldfbench-curated datasets. In this case, only the dataset can be downloaded via
cldf = record.download_dataset('my_directory')
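When a script is run repeatedly, it may be desirable to skip the download if the data is already present. The following is a minimal sketch of such a guard. download_dataset_once is a hypothetical wrapper, not part of cldfzenodo, and it assumes that a completed download leaves a "*-metadata.json" file at the top level of the target directory.

```python
from pathlib import Path


def download_dataset_once(record, directory):
    """Call record.download_dataset(directory) unless CLDF metadata is present.

    Hypothetical wrapper, not part of cldfzenodo. Assumes a finished download
    leaves a '*-metadata.json' file at the top level of `directory`.
    Returns True if a download was triggered, False if it was skipped.
    """
    directory = Path(directory)
    if directory.is_dir() and any(directory.glob('*-metadata.json')):
        return False
    record.download_dataset(str(directory))
    return True
```

The wrapper only relies on the download_dataset method shown above, so any object providing that method can be passed in.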