A command-line tool to aggregate research output metadata from DataCite, OpenAIRE and OpenAlex based on an organisation's ROR ID

Research output aggregator

The goal of this project is to provide a script that summarizes the research output of a research organization.
The first target is to query and process information from DataCite.

The script produces a list of research outputs in which the organization is mentioned as:

  • publisher
  • creator with affiliation to the organization
  • contributor with affiliation to the organization

Input: a ROR ID and a list of variants of the organization name.
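
As an illustration of how a ROR ID and a list of name variants could be used to detect the three roles above, here is a minimal sketch. It is not roagg's actual implementation: the helper names (`normalize_ror`, `name_matches`, `affiliation_flags`) and the simplified record layout are hypothetical, and the real tool may match names differently (for example by substring).

```python
import re

# Assumption: a ROR ID is "0" followed by 8 alphanumeric characters,
# optionally prefixed with "https://ror.org/".
ROR_PATTERN = re.compile(r"(?:https?://)?(?:ror\.org/)?(0[a-z0-9]{8})$", re.IGNORECASE)


def normalize_ror(value):
    """Return the bare 9-character ROR ID in lowercase, or None if not a ROR ID."""
    match = ROR_PATTERN.match((value or "").strip())
    return match.group(1).lower() if match else None


def name_matches(candidate, name_variants):
    """Exact comparison after collapsing whitespace and ignoring case."""
    def norm(text):
        return " ".join(text.split()).casefold()

    return norm(candidate) in {norm(name) for name in name_variants}


def affiliation_flags(record, ror, name_variants):
    """Compute role flags for a hypothetical, simplified record dict."""
    target = normalize_ror(ror)

    def org_matches(org):
        org_ror = normalize_ror(org.get("ror"))
        by_ror = org_ror is not None and org_ror == target
        return by_ror or name_matches(org.get("name") or "", name_variants)

    return {
        "isPublisher": org_matches(record.get("publisher") or {}),
        "haveCreatorAffiliation": any(
            org_matches(org) for org in record.get("creatorAffiliations", [])
        ),
        "haveContributorAffiliation": any(
            org_matches(org) for org in record.get("contributorAffiliations", [])
        ),
    }
```

Normalizing the ROR ID first means that `https://ror.org/040wg7k59` and `040wg7k59` compare as equal, so the name variants are only needed for records that carry no ROR ID.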

Properties to collect for each research output:

| Field | Type | Comment |
|---|---|---|
| publicationYear | integer | The year of publication; can be empty in some cases |
| resourceType | string | The resource type (free-text string) |
| title | string | Title of the resource (the first one if there are several) |
| publisher | string | Publisher (free text) |
| createdAt | string | Created date, if available |
| updatedAt | string | Updated date, if available |
| isPublisher | bool | True if the publisher matches the requested organisation |
| isFunder | bool | True if the funder matches the requested organisation |
| haveCreatorAffiliation | bool | True if any creator matches the requested organisation |
| haveContributorAffiliation | bool | True if any contributor matches the requested organisation |
| isLatestVersion | bool | True if the DataCite metadata indicates that this is the latest version |
| isConceptDoi | bool | True if the DataCite metadata indicates that this is a concept DOI |
| matchPublisherRor | bool | True if the ROR ID for the publisher matches the ROR in the provided argument |
| matchCreatorAffiliationRor | bool | True if the ROR ID for a creator affiliation matches the ROR in the provided argument |
| matchContributorAffiliationRor | bool | True if the ROR ID for a contributor affiliation matches the ROR in the provided argument |
| matchFunderRor | bool | True if the ROR ID for the funder matches the ROR in the provided argument |
| matchPublisherName | bool | True if any of the supplied names matches the publisher name in the resource |
| matchCreatorName | bool | True if any of the supplied names matches the creator name in the resource |
| matchContributorName | bool | True if any of the supplied names matches the contributor name in the resource |
| matchFunderName | bool | True if any of the supplied names matches the funder name in the resource |
| inDataCite | bool | True if the DOI was matched in DataCite |
| inOpenAire | bool | True if the DOI was matched in OpenAIRE |
| inOpenAlex | bool | True if the DOI was matched in OpenAlex |
| inCrossRef | bool | True if the DOI was matched in Crossref |
| dataCiteClientId | string | The client ID of the organisation minting the DOI |
| dataCiteClientName | string | The human-readable name of the minting organisation |
| dataCiteCitationCount | integer | Citation count for the resource provided by the DataCite API |
| dataCiteReferenceCount | integer | Reference count for the resource provided by the DataCite API |
| dataCiteViewCount | integer | View count for the resource provided by the DataCite API |
| dataCiteDownloadCount | integer | Download count for the resource provided by the DataCite API |
| openAireBestAccessRight | string | Access rights for the resource indicated by the OpenAIRE API |
| openAireIndicatorsUsageCountsDownloads | integer | Download count for the resource provided by the OpenAIRE API |
| openAireIndicatorsUsageCountsViews | integer | View count for the resource provided by the OpenAIRE API |
| openAireId | string | ID of the resource in OpenAIRE |
| openAlexId | string | ID of the resource in OpenAlex |
| openAlexCitedByCount | integer | Citation count for the resource provided by the OpenAlex API |
| openAlexReferencedWorksCount | integer | Reference count for the resource provided by the OpenAlex API |
| titleWordCount | integer | Number of words in the title (useful for sorting in some cases) |
| referencedByDoi | string | DOIs of objects (for instance papers) referencing this object, as a JSON list |
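
A minimal sketch of how the resulting file could be summarized afterwards. It assumes that the CSV has a header row whose column names equal the field names listed here and that booleans are written as text such as `True`/`False`; neither is confirmed by this description, so check against a real output file. The helper names are hypothetical.

```python
import csv
import io
from collections import Counter


def is_true(value):
    """Interpret a CSV cell as a boolean (assumed encodings: True/1/yes)."""
    return str(value).strip().lower() in {"true", "1", "yes"}


def outputs_per_year(rows, flag="matchCreatorAffiliationRor"):
    """Count rows per publicationYear where the given flag column is true."""
    counts = Counter()
    for row in rows:
        if is_true(row.get(flag, "")):
            # publicationYear can be empty, so group those rows separately.
            counts[row.get("publicationYear") or "unknown"] += 1
    return counts


# Small in-memory sample standing in for a real output file.
sample = io.StringIO(
    "publicationYear,title,matchCreatorAffiliationRor\n"
    "2023,Dataset A,True\n"
    "2023,Dataset B,False\n"
    ",Dataset C,True\n"
)
counts = outputs_per_year(csv.DictReader(sample))
```

For a real file, replace the sample with `open("chalmers.csv", newline="", encoding="utf-8")` and pass the resulting `csv.DictReader` to the same function.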

Install

pip install roagg

Run

List arguments:
roagg --help

Install dev

git clone git@github.com:snd-sweden/roagg.git
cd roagg
pip install -e .

Tests

Some tests are available. To run them:
python -m pytest

Development stuff to do

  • ROR: get name variants from ROR
  • CLI: add an option to read the name list from a text file
  • DataCite API: build a query for matching publisher and affiliation
  • Publish as a command-line tool on PyPI
  • Crossref API: build a query for matching publisher and affiliation
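
To illustrate what a DataCite query for publisher and affiliation could look like, here is a rough sketch that only builds the request URL and performs no network call. It is not the query roagg actually sends. The function names are hypothetical, and the searchable field names (`creators.affiliation.affiliationIdentifier`, `publisher`, and so on) are assumptions that should be verified against the DataCite REST API documentation.

```python
from urllib.parse import urlencode

DATACITE_DOIS = "https://api.datacite.org/dois"


def build_datacite_query(ror_url, names):
    """Combine ROR and name clauses into one OR query string (assumed field names)."""
    clauses = [
        f'creators.affiliation.affiliationIdentifier:"{ror_url}"',
        f'contributors.affiliation.affiliationIdentifier:"{ror_url}"',
    ]
    for name in names:
        quoted = name.replace('"', '\\"')  # escape quotes inside the phrase
        clauses.append(f'publisher:"{quoted}"')
        clauses.append(f'creators.affiliation.name:"{quoted}"')
        clauses.append(f'contributors.affiliation.name:"{quoted}"')
    return " OR ".join(clauses)


def build_datacite_url(ror_url, names, page_size=100):
    """Return a URL for the DataCite /dois endpoint with the query encoded."""
    params = {
        "query": build_datacite_query(ror_url, names),
        "page[size]": page_size,
    }
    return f"{DATACITE_DOIS}?{urlencode(params)}"
```

Keeping the query builder separate from the URL builder makes it easy to unit-test the clauses without touching the network.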

Some example arguments

Chalmers with ROR and name list:

roagg --ror https://ror.org/040wg7k59 --name-txt tests/name-lists/chalmers.txt --output chalmers.csv

GU with ROR, name list and an extra name that is not in the text file:

roagg --name "Department of Nephrology Gothenburg" --ror https://ror.org/01tm6cn81 --name-txt tests/name-lists/gu.txt --output data/gu.csv

KTH with ROR and name list:

roagg --ror https://ror.org/026vcq606 --name-txt tests/name-lists/kth.txt --output data/kth.csv

KAU with ROR:

roagg --ror https://ror.org/05s754026 --output kau.csv
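
To run several organisations in one go, the commands above could be wrapped in a small script. This is only a sketch: it uses the `--ror`, `--name`, `--name-txt` and `--output` options exactly as they appear in the examples, while the helper names `build_command` and `run_all` are hypothetical and `roagg` is assumed to be installed and on the `PATH`.

```python
import subprocess


def build_command(ror, output, name_txt=None, name=None):
    """Build the roagg argument list for one organisation."""
    cmd = ["roagg", "--ror", ror]
    if name:
        cmd += ["--name", name]
    if name_txt:
        cmd += ["--name-txt", name_txt]
    cmd += ["--output", output]
    return cmd


# Jobs mirroring two of the example invocations above.
JOBS = [
    {"ror": "https://ror.org/040wg7k59",
     "name_txt": "tests/name-lists/chalmers.txt",
     "output": "chalmers.csv"},
    {"ror": "https://ror.org/05s754026", "output": "kau.csv"},
]


def run_all(jobs):
    """Run roagg once per job; stop at the first failing command."""
    for job in jobs:
        subprocess.run(build_command(**job), check=True)

# Call run_all(JOBS) to execute the jobs.
```

Passing the arguments as a list avoids shell quoting problems with names that contain spaces, such as "Department of Nephrology Gothenburg".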

License

MIT License
