Efficient relational database queries over the entire Crossref and ORCID data sets

Project description

Alexandria3k

The alexandria3k package supplies a library and a command-line tool that provide efficient relational query access to diverse open publication data sets:

  • Crossref (157 GB compressed, 1 TB uncompressed): the largest data set, containing metadata for about 134 million publications from all major international publishers, with full citation data for 60 million of them.
  • PubMed (43 GB compressed, 327 GB uncompressed): more than 36 million citations for biomedical literature from MEDLINE, life science journals, and online books, with rich domain-specific metadata, such as MeSH indexing, funding, genetic, and chemical details.
  • ORCID summary data set (25 GB compressed, 435 GB uncompressed): about 78 million author records.
  • DataCite (22 GB compressed, 197 GB uncompressed): about 50 million work entries for research outputs and resources, such as data, pre-prints, images, and samples.
  • United States Patent Office issued patents (11 GB compressed, 115 GB uncompressed): about 5.4 million records.
  • Data sets of funder bodies, journal names, open access journals, and research organizations.

These data sets can be used individually or linked together.

The alexandria3k package installation contains all elements required to run it. It does not require the installation, configuration, and maintenance of a third-party relational or graph database. It can therefore be used out of the box for performing reproducible publication research on the desktop.
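
To illustrate what relational query access without a database server can look like, the following is a minimal sketch using only Python's standard `sqlite3` module. It assumes that a subset of the data has already been stored in an SQLite database. The table and column names (`works`, `work_authors`, `published_year`, `work_id`), the helper functions, and the sample rows are illustrative stand-ins built in memory here; they are assumptions, not a statement of the package's actual schema or API.

```python
import sqlite3


def build_demo_database():
    """Create a tiny in-memory stand-in for a populated database.

    The schema and rows are hypothetical and exist only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE works ("
        "id INTEGER PRIMARY KEY, doi TEXT, published_year INTEGER)"
    )
    conn.execute(
        "CREATE TABLE work_authors ("
        "id INTEGER PRIMARY KEY, work_id INTEGER, family TEXT)"
    )
    conn.executemany(
        "INSERT INTO works (id, doi, published_year) VALUES (?, ?, ?)",
        [
            (1, "10.0000/example.a", 2021),
            (2, "10.0000/example.b", 2021),
            (3, "10.0000/example.c", 2023),
        ],
    )
    conn.executemany(
        "INSERT INTO work_authors (work_id, family) VALUES (?, ?)",
        [(1, "A"), (1, "B"), (2, "C"), (3, "D"), (3, "E"), (3, "F")],
    )
    return conn


def works_and_authors_per_year(conn):
    """Return (year, number of works, number of author records) per year."""
    query = """
        SELECT w.published_year,
               COUNT(DISTINCT w.id) AS works_count,
               COUNT(a.id) AS author_records
        FROM works AS w
        LEFT JOIN work_authors AS a ON a.work_id = w.id
        GROUP BY w.published_year
        ORDER BY w.published_year
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = build_demo_database()
    for year, works_count, author_records in works_and_authors_per_year(connection):
        print(year, works_count, author_records)
```

With a real database file, `sqlite3.connect(":memory:")` would be replaced by a path to that file, and the query adapted to the schema described in the project's documentation.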

Documentation

The complete reference and use documentation for alexandria3k can be found here.

Major contributors

Publication

Details about the rationale, design, implementation, and use of this software can be found in the following paper.

Diomidis Spinellis. Open reproducible scientometric research with Alexandria3k. PLoS ONE 18(11): e0294946. November 2023. doi: 10.1371/journal.pone.0294946

Download files

Source Distribution

alexandria3k-3.3.0.tar.gz (655.1 kB)

Built Distribution

alexandria3k-3.3.0-py3-none-any.whl (123.5 kB, Python 3)

File details

Details for the file alexandria3k-3.3.0.tar.gz.

File metadata

  • Size: 655.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.7

File hashes

Hashes for alexandria3k-3.3.0.tar.gz
Algorithm Hash digest
SHA256 6314d720bc0bc3b206e90d4c6f112c743cc721d17d229efb963fdbdd9d911a87
MD5 99d3d7ca8ed7787890aee44165fbeaad
BLAKE2b-256 c352a0d8c5a44303cbbac00021065feaee17552bf32200f8bde9a5832d12cf9e

File details

Details for the file alexandria3k-3.3.0-py3-none-any.whl.

File metadata

  • Size: 123.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.7

File hashes

Hashes for alexandria3k-3.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6af938bb8aac902c71b3349ba3af8b2ea17170f80aa37a549dd12696f81e4548
MD5 6f50a79183a0b835551059adde3a50ea
BLAKE2b-256 fb437228c744e0658251efc70181b70716555b4457f311954c85b8869c86ceba
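
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the SHA256 values are copied from the listings above, while the helper names `sha256_of` and `matches_published` are hypothetical and chosen only for this example.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" listings above.
PUBLISHED_SHA256 = {
    "alexandria3k-3.3.0.tar.gz":
        "6314d720bc0bc3b206e90d4c6f112c743cc721d17d229efb963fdbdd9d911a87",
    "alexandria3k-3.3.0-py3-none-any.whl":
        "6af938bb8aac902c71b3349ba3af8b2ea17170f80aa37a549dd12696f81e4548",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of the file at path.

    The file is read in chunks so that large files do not need to fit
    in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file's digest equals the published one.

    The file is looked up by its name, so it must keep the name under
    which it was distributed; an unknown name raises KeyError.
    """
    expected = PUBLISHED_SHA256[Path(path).name]
    return sha256_of(path) == expected
```

For example, `matches_published("alexandria3k-3.3.0.tar.gz")` is expected to return `True` only for an unmodified copy of the source distribution.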
