Parse URLs for DOIs, PubMed identifiers, PMC identifiers, arXiv identifiers, etc.

citation-url

This module has a single parse() function that takes in a URL and gives back a parse status (success, unknown, or irreconcilable), a prefix, and an identifier. If the status is unknown or irreconcilable, the prefix will be left as None and the identifier will match the input:

>>> from citation_url import parse, Status

>>> parse("https://joss.theoj.org/papers/10.21105/joss.01708")
(Status.success, 'doi', '10.21105/joss.01708')

>>> parse("http://www.ncbi.nlm.nih.gov/pubmed/34739845")
(Status.success, 'pubmed', '34739845')

>>> parse("https://example.com/true-garbage")
(Status.unknown, None, 'https://example.com/true-garbage')

>>> parse("http://msb.embopress.org/content/13/11/954.full.pdf")
(Status.irreconcilable, None, 'http://msb.embopress.org/content/13/11/954.full.pdf')
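
Because parse() always returns a (status, prefix, identifier) triple, a batch of URLs can be sorted into buckets by prefix, with everything that could not be resolved landing under None. The following is a minimal sketch of that idea, not part of the package: group_by_prefix is a hypothetical helper name, and the parser is passed in as an argument so the helper makes no assumptions beyond the return shape shown above.

```python
from collections import defaultdict


def group_by_prefix(urls, parse):
    """Group parsed identifiers by prefix.

    `parse` is any callable returning (status, prefix, identifier),
    such as citation_url.parse. URLs whose status is unknown or
    irreconcilable come back with a prefix of None, so they are
    collected under the None key with the original URL as the value.
    """
    grouped = defaultdict(list)
    for url in urls:
        _status, prefix, identifier = parse(url)
        grouped[prefix].append(identifier)
    return dict(grouped)


# Usage with the real parser (requires `pip install citation_url`):
#   from citation_url import parse
#   group_by_prefix(["http://www.ncbi.nlm.nih.gov/pubmed/34739845"], parse)
```

Keeping the unresolved URLs under None makes it easy to review them by hand afterwards.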

🕵️ Why?

I wanted to be able to curate a list of papers in Zotero, Mendeley, or any other modern citation manager, make an XML export in the EndNote format, extract and normalize the messy contents in the electronic-resource-num, text-urls, and pdf-urls fields, then ensure that there are corresponding entries on Wikidata using the Su Lab's Wikidata Integrator.

Reuse this functionality with:

$ python -m citation_url.endnote --help
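
For orientation, the extraction step described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in citation_url.endnote: extract_candidates is a hypothetical name, and the sketch assumes the export wraps each entry in a record element and that text-urls and pdf-urls may hold one or more url children.

```python
import xml.etree.ElementTree as ET

# Field names taken from the description above.
FIELDS = ("electronic-resource-num", "text-urls", "pdf-urls")


def extract_candidates(xml_text):
    """Collect the raw strings found in the three EndNote fields.

    Assumes each entry is a <record> element (an assumption about
    the export layout). Text is gathered with itertext() so values
    wrapped in nested formatting elements are still picked up.
    """
    root = ET.fromstring(xml_text)
    candidates = []
    for record in root.iter("record"):
        for field in FIELDS:
            for element in record.iter(field):
                # Use <url> children when present, else the field itself.
                nodes = element.findall("url") or [element]
                for node in nodes:
                    text = "".join(node.itertext()).strip()
                    if text:
                        candidates.append(text)
    return candidates
```

Each returned string can then be handed to parse() to obtain a status, prefix, and identifier.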

🚀 Installation

The most recent release can be installed from PyPI with:

$ pip install citation_url

The most recent code and data can be installed directly from GitHub with:

$ pip install git+https://github.com/cthoyt/citation-url.git

👐 Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

👋 Attribution

⚖️ License

The code in this package is licensed under the MIT License.

🍪 Cookiecutter

This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.

🛠️ For Developers

See developer instructions

The final section of the README is for those who want to get involved by making a code contribution.

Development Installation

To install in development mode, use the following:

$ git clone https://github.com/cthoyt/citation-url.git
$ cd citation-url
$ pip install -e .

🥼 Testing

After cloning the repository and installing tox with pip install tox, the unit tests in the tests/ folder can be run reproducibly with:

$ tox

Additionally, these tests are automatically re-run with each commit in a GitHub Action.

📖 Building the Documentation

$ tox -e docs

📦 Making a Release

After installing the package in development mode and installing tox with pip install tox, the commands for making a new release are contained within the finish environment in tox.ini. Run the following from the shell:

$ tox -e finish

This script does the following:

  1. Uses Bump2Version to remove the -dev suffix from the version number in setup.cfg and src/citation_url/version.py
  2. Packages the code as both a tar archive and a wheel
  3. Uploads to PyPI using twine. Be sure to have a .pypirc file configured to avoid the need for manual input at this step
  4. Pushes to GitHub. You'll need to create a release from the commit where the version was bumped.
  5. Bumps the version to the next patch. If you made big changes and want to bump the minor version instead, you can use tox -e bumpversion minor afterwards.
