Rich Context API integrations for federating discovery services and metadata exchange across multiple scholarly infrastructure providers
richcontext.scholapi
Rich Context API integrations for federating discovery services and metadata exchange across multiple scholarly infrastructure providers.
Development of the Rich Context knowledge graph uses this library to:
- identify dataset links to research publications
- locate open access publications
- reconcile journal references
- reconcile author profiles
- reconcile keyword mesh
This library has been guided by collaborative work on community building and metadata exchange to improve Scholarly Infrastructure, held at the 2019 Rich Context Workshop.
Installation
Prerequisites:
- Python 3.x
- Beautiful Soup
- Biopython.Entrez
- Crossref Commons
- Dimensions CLI
- Requests
- Requests-Cache
- Selenium
- xmltodict
To install from PyPI:
pip install richcontext.scholapi
If you install directly from this Git repo, be sure to install the dependencies as well:
pip install -r requirements.txt
Then copy the configuration file template rc_template.cfg to rc.cfg and populate it with your credentials.
NB: be careful not to commit the rc.cfg file in Git since it contains sensitive data such as passwords.
Parameters used in the configuration file include:
parameter | value |
---|---|
chrome_exe_path | path/to/chrome.exe |
dimensions_password | Dimensions API password |
elsevier_api_key | Elsevier API key |
email | personal email address |
repec_token | RePEc API token |
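Before running anything that needs credentials, it can help to confirm that rc.cfg actually contains the parameters in the table above. The following is a minimal sketch, assuming the file uses the INI syntax that Python's configparser reads; the helper name `missing_params` is hypothetical and not part of this library, and the check looks in every section because the section layout of rc_template.cfg is not described here.

```python
import configparser

# Parameter names taken from the table above.
EXPECTED_PARAMS = [
    "chrome_exe_path",
    "dimensions_password",
    "elsevier_api_key",
    "email",
    "repec_token",
]


def missing_params(config_path, expected=EXPECTED_PARAMS):
    """Return the expected parameters that are absent or empty in the file.

    Hypothetical helper for illustration; not part of richcontext.scholapi.
    """
    # Disable interpolation so that a "%" inside a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # a missing file is silently ignored

    # Collect every key with a non-empty value, whichever section holds it.
    found = set()
    sections = [parser.defaults()] + [parser[name] for name in parser.sections()]
    for section in sections:
        for key, value in section.items():
            if value and value.strip():
                found.add(key)

    return [name for name in expected if name not in found]


if __name__ == "__main__":
    print(missing_params("rc.cfg"))
```

An empty list means every parameter has a value. The check says nothing about whether the credentials themselves are valid.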
Download the Chrome webdriver to enable use of Selenium.
For a good (although slightly dated) tutorial for installing and testing Selenium on Ubuntu Linux, see: https://christopher.su/2015/selenium-chromedriver-ubuntu/
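A quick way to confirm that the webdriver is reachable is to look it up on the PATH. This is a small sketch using only the standard library; the binary name "chromedriver" is an assumption about how the download is named on your system, and `find_executable` is a hypothetical helper rather than anything this library provides.

```python
import shutil


def find_executable(candidates):
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    # "chromedriver" is an assumed binary name; adjust it if yours differs.
    driver = find_executable(["chromedriver"])
    print(driver or "chromedriver not found on PATH")
```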
Usage
from richcontext import scholapi as rc_scholapi
import pprint
# initialize the federated API access
schol = rc_scholapi.ScholInfraAPI(config_file="rc.cfg", logger=None)
# search parameters for example publications
title = "Deal or no deal? The prevalence and nutritional quality of price promotions among U.S. food and beverage purchases."
# run it...
meta = schol.openaire.title_search(title)
# report results
pprint.pprint(meta)
print("\ntime: {:.3f} ms - {}".format(schol.openaire.elapsed_time, schol.openaire.name))
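The usage example reads `elapsed_time` and `name` from `schol.openaire` after calling `title_search()`. The same pattern can be wrapped in a small function, sketched here under the assumption that other API wrappers on the `schol` object expose the same three members; that is not confirmed for every integration. `search_and_time` and `StubAPI` are hypothetical names used only for illustration.

```python
def search_and_time(api, title):
    """Run a title search on one API wrapper and report its timing.

    `api` is any object exposing `title_search()`, `elapsed_time`, and
    `name`, as `schol.openaire` does in the usage example.
    """
    # Search first: the timing is only meaningful after the call returns.
    meta = api.title_search(title)
    return {
        "source": api.name,
        "elapsed_ms": api.elapsed_time,
        "meta": meta,
    }


class StubAPI:
    """Stand-in used only for illustration; not part of the library."""

    name = "stub"
    elapsed_time = 0.0

    def title_search(self, title):
        self.elapsed_time = 1.5
        return {"title": title}


if __name__ == "__main__":
    print(search_and_time(StubAPI(), "Deal or no deal?"))
```

With a configured library, the call would look like `search_and_time(schol.openaire, title)`.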
API Integrations
APIs used to retrieve metadata:
See docs/enrich_pubs.ipynb for example API usage to pull the federated metadata for a publication.
For more background about open access publications see:
Piwowar H, Priem J, Larivière V, Alperin JP, Matthias L, Norlander B, Farley A, West J, Haustein S. 2017. The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles. PeerJ Preprints 5:e3119v1. https://doi.org/10.7287/peerj.preprints.3119v1
Testing
First, be sure that you're testing the source and not from an installed library.
Then run unit tests for the APIs which do not require credentials:
nose2 -v --pretty-assert
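To check whether the tests would import the source tree or an installed copy, one option is to ask Python where it would load the module from. This is a sketch using only the standard library; the helper names are hypothetical. Treating any path under site-packages or dist-packages as an installed copy is a common heuristic, not something this project defines.

```python
import importlib.util
from pathlib import PurePath


def is_installed_copy(origin):
    """True if a module path sits under site-packages or dist-packages."""
    parts = PurePath(origin).parts
    return "site-packages" in parts or "dist-packages" in parts


def module_origin(module_name):
    """Return the file path a module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name cannot be found.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    origin = module_origin("richcontext.scholapi")
    if origin is None:
        print("richcontext.scholapi not found on sys.path")
    elif is_installed_copy(origin):
        print("WARNING: testing the installed library:", origin)
    else:
        print("testing the source tree:", origin)
```

Running this from the repository root before nose2 shows which copy will be picked up.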
To Do
- parse HTML-embedded metadata from the web pages for PMC/Pubmed, NIH, etc.
- Scholexplorer
- DataCite integration
- PapersWithCode
- Springer
If you'd like to contribute, please see our listings of good first issues.
Kudos
Contributors: @ceteri, @srand525, @IanMulvany, plus many thanks for the inspiring 2019 Rich Context Workshop notes by @metasj, and guidance from @claytonrsh, @Juliaingridlane.