Package for working with the OAPapers dataset.

OAPapersLoader

This repository contains Python loaders for the OAPapers corpus and its derived datasets. It accompanies the repository https://github.com/KNOT-FIT-BUT/OAPapers and offers a more lightweight way to load the corpus and derived datasets, without that repository's extensive dependencies.

Install

pip install oapapersloader

Usage

An example of loading the OARelatedWork dataset together with its references:

from oapapersloader.datasets import OARelatedWork, OADataset

# Each dataset is opened from a .jsonl file and its .index file.
with OARelatedWork("train.jsonl", "train.jsonl.index") as dataset, \
        OADataset("references.jsonl", "references.jsonl.index") as references:
    d = dataset[0]  # first target paper
    print("Document:", d.title)
    # Look up the first cited paper by its ID in the references dataset.
    print("Cited paper:", references.get_by_id(d.citations[0]).title)

OARelatedWork loads the target papers together with their related work sections. OADataset loads the dataset of all references, which is used to look up the papers cited by a target paper.
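Both loaders take a .jsonl file and a companion .index file. The format of that index is not described here, so the following is only a generic sketch of how a byte-offset index can give random access to records in a JSONL file; it is not the library's implementation. The helper names (build_offset_index, read_record), the toy records, and their field names are hypothetical and chosen for illustration only.

```python
import json
import os
import tempfile


def build_offset_index(jsonl_path):
    """Return {record id: byte offset of its line} for a JSONL file."""
    index = {}
    with open(jsonl_path, "rb") as f:
        while True:
            offset = f.tell()      # position where the next line starts
            line = f.readline()
            if not line:           # end of file
                break
            index[json.loads(line)["id"]] = offset
    return index


def read_record(jsonl_path, index, record_id):
    """Seek straight to one record instead of scanning the whole file."""
    with open(jsonl_path, "rb") as f:
        f.seek(index[record_id])
        return json.loads(f.readline())


# Hypothetical toy data; the field names are illustrative, not the OAPapers schema.
records = [
    {"id": 10, "title": "Paper A", "citations": [20]},
    {"id": 20, "title": "Paper B", "citations": []},
]

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "toy.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

index = build_offset_index(path)
first = read_record(path, index, 10)
cited = read_record(path, index, first["citations"][0])
print("Document:", first["title"])
print("Cited paper:", cited["title"])
```

With an index of this kind, fetching a cited paper by ID costs one seek and one line read, which is presumably why a large references file would be paired with an index rather than loaded fully into memory.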


