
PubMed nbib citation format parser


nbib

A parser of the nbib citation format exported by PubMed & other NCBI tools.

About

nbib is opinionated in what data it parses and how it is structured, with the aim of supporting the most common use cases. Unlike other parsers, which produce "flat" data (i.e. string key-value pairs), nbib:

  • Parses strings into their correct data type, where possible
  • Creates hierarchical and list objects when appropriate
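
To make the contrast concrete, here is a minimal, self-contained sketch of "flat" versus typed, hierarchical output. It does not use nbib and is not its implementation; the helper names (`parse_flat`, `parse_structured`) and the output keys (`pubmed_id`, `title`, `authors`, `name`) are assumptions chosen for illustration, and the toy line handling ignores continuation lines.

```python
from collections import defaultdict

# A tiny hand-written record in the tag/value layout used by nbib exports.
SAMPLE = (
    "PMID- 1337\n"
    "TI  - An example title\n"
    "AU  - Doe J\n"
    "AU  - Roe R\n"
)


def _tag_value_pairs(text):
    """Yield (tag, value) for each 'TAG - value' line (toy: no continuation lines)."""
    for line in text.splitlines():
        if len(line) > 6 and line[4:6] == "- ":
            yield line[:4].strip(), line[6:].strip()


def parse_flat(text):
    """Flat output: every value is a string, and a repeated tag overwrites the last."""
    return dict(_tag_value_pairs(text))


def parse_structured(text):
    """Typed, hierarchical output (hypothetical key names, for illustration only)."""
    values = defaultdict(list)
    for tag, value in _tag_value_pairs(text):
        values[tag].append(value)
    return {
        "pubmed_id": int(values["PMID"][0]),                 # string -> int
        "title": values["TI"][0],
        "authors": [{"name": n} for n in values["AU"]],      # repeated tag -> ordered list
    }


flat = parse_flat(SAMPLE)
structured = parse_structured(SAMPLE)
print(flat)
print(structured)
```

In the flat result the second `AU` line silently replaces the first and the PubMed ID stays a string; the structured result keeps both authors in input order and converts the ID to an integer.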

Install

Install the latest production release from PyPI:

pip install nbib

To install the latest dev version:

pip install git+https://github.com/holub008/nbib.git

Using

Example

import nbib
refs = nbib.read("""PMID- 1337\nTI  - `nbib` Rocks!\n\n""")

General

nbib does:

  • Provide parsing of both nbib files (read_file()) and strings (read())
  • Guarantee that the output format will remain backwards compatible, within a major release
    • The type of an attribute will never change within a major release
    • An attribute will never change name within a major release
    • New attributes may be added with a minor release
  • Guarantee that the order of output refs matches the incoming order. Moreover, this holds for all list attributes (e.g. authors)

nbib does not:

  • Allow users to customize parsing methods
    • nbib opines that performing the "obvious" parsing covers 99% of use cases, so this work is not pushed onto the client
  • Play nicely with improperly formatted files - exceptions are aggressively thrown for unexpected inputs
    • Given PubMed is effectively the sole producer of these files, the risk is minimal
    • Please report any issues encountered!
  • Have great run time performance
    • As of writing, a 10,000-ref file (the PubMed max export size) of 829K lines took 9.2 seconds to parse on a standard laptop. For comparison, the RIS parsing package rispy, which produces flat string data, took 2.2 seconds for 10K refs (670K lines).
    • If your use case needs faster performance, please file an issue!
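
To check run time on your own data before filing an issue, a small timing harness along these lines may help. This is a sketch, not part of nbib: `time_parser` and `count_records` are hypothetical helpers, and `count_records` is only a stand-in so the example runs without nbib installed. In practice you would pass `nbib.read` (or any other string parser) as the `parse` argument.

```python
import time


def time_parser(parse, text, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of parse(text)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse(text)
        best = min(best, time.perf_counter() - start)
    return best


def count_records(text):
    """Stand-in parser: counts blank-line-separated records instead of parsing them."""
    return sum(1 for block in text.split("\n\n") if block.strip())


# Synthetic input: 10,000 two-line records, each followed by a blank line.
synthetic = "".join(f"PMID- {i}\nTI  - Title {i}\n\n" for i in range(10_000))

elapsed = time_parser(count_records, synthetic)
print(f"best of 3 runs: {elapsed:.4f} s")
```

Taking the best of several runs reduces noise from other processes; the absolute numbers will still vary by machine and by how many tags each record carries.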

Developing

Issues and pull requests are always welcome.

Testing

To set up the project:

pipenv install --dev

To run tests:

pipenv run python -m pytest
