A client for Nerdpool-Api
nerdpool-client
A Python client for downloading data from https://nerdpool-api.acdh-dev.oeaw.ac.at
install
pip install nerdpool_client
usage
list data set titles
from nerdpool_client import NerdPoolClient
client = NerdPoolClient()
print(client.data_sets)
# ['RTA', 'RITA', 'MRP', 'Chronik Aldersbach', 'DIPKO']
download samples as .jsonl file
- Go to nerdpool-api and create/filter your preferred data sample, e.g. all samples from MRP:
from nerdpool_client import NerdPoolClient
url = "https://nerdpool-api.acdh-dev.oeaw.ac.at/api/ner-sample/?format=json&ner_ent_type__contains=&ner_source__title=MRP"
client = NerdPoolClient()
client.dump_to_jsonl(url)
# 'out.jsonl'
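Once the dump has finished, the resulting file can be inspected with the standard library alone. The sketch below is not part of nerdpool_client: the helper name read_jsonl is hypothetical, and it assumes only that the file holds one JSON value per line (the usual JSON Lines convention). It makes no assumption about the field names inside each sample.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a .jsonl file.

    Hypothetical helper for illustration, not part of nerdpool_client.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines instead of failing on them
                yield json.loads(line)
```

For example, list(read_jsonl("out.jsonl")) would load the whole dump into memory; for large dumps, iterating over the generator directly is likely the better choice.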
download samples as train.jsonl and eval.jsonl files
- With file_name_prefix you can add a custom prefix to the default file names train.jsonl and eval.jsonl
- The param split defines that every split-th sample is saved into eval.jsonl instead of train.jsonl (e.g. with split=10, every tenth sample)
from nerdpool_client import NerdPoolClient
url = "https://nerdpool-api.acdh-dev.oeaw.ac.at/api/ner-sample/?format=json&ner_ent_type__contains=&ner_source__title=MRP"
client = NerdPoolClient()
client.dump_to_train_eval(url, file_name_prefix="mrp__", split=10)
# ['mrp__train.jsonl', 'mrp__eval.jsonl']
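To make the effect of split more concrete, here is a minimal sketch of one way such a rule could work. It is an assumption for illustration, not the library's actual implementation: the function name split_train_eval is hypothetical, and which sample of each group of split samples lands in the eval set (here, the last one) is a guess.

```python
def split_train_eval(samples, split=10):
    """Route every `split`-th sample to eval and the rest to train.

    Hypothetical illustration; nerdpool_client may pick a different
    sample of each group for the eval set.
    """
    if split < 1:
        raise ValueError("split must be a positive integer")
    train, evaluation = [], []
    # Count positions from 1 so that the 10th, 20th, ... sample is selected.
    for position, sample in enumerate(samples, start=1):
        if position % split == 0:
            evaluation.append(sample)
        else:
            train.append(sample)
    return train, evaluation


train, evaluation = split_train_eval(range(1, 21), split=10)
print(len(train), len(evaluation))  # 18 2
```

Under this reading, split=10 sends roughly one tenth of the samples to the eval set and keeps the rest for training.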