Python interface to NCBI's eutilities API

Project description

eutils -- simplified interface to NCBI E-Utilities

eutils is a Python package that simplifies searching, fetching, and parsing records from NCBI through the E-utilities interface.
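
For orientation, E-utilities is a set of HTTP endpoints that take query parameters such as db and term. The sketch below only builds an esearch URL with the standard library and makes no network request. build_esearch_url is a hypothetical helper written for illustration; it is not part of eutils, which issues such requests for you.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, api_key=None, retmax=20):
    """Hypothetical helper: build an esearch request URL (no network access)."""
    params = {"db": db, "term": term, "retmax": retmax}
    if api_key:
        # The API key is passed as an ordinary query parameter.
        params["api_key"] = api_key
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


url = build_esearch_url("gene", "tumor necrosis factor")
print(url)
```

The Client shown in Example Usage wraps this kind of request, so application code normally never assembles URLs by hand.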

Features

  • Simple Pythonic interface for searching and fetching
  • Support for NCBI API keys, and rate throttling when no key is available
  • Optional sqlite-based caching of compressed replies
  • "Façades" that facilitate access to essential attributes in XML replies
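
To make the throttling feature concrete, here is a minimal sketch of a request limiter. It assumes NCBI's documented limits of 3 requests per second without an API key and 10 with one. The Throttle class, its parameters, and its wait method are hypothetical names chosen for illustration, and this is not how eutils implements throttling internally.

```python
import time


class Throttle:
    """Hypothetical limiter: spaces calls at least 1/rate seconds apart."""

    def __init__(self, api_key=None, clock=time.monotonic, sleep=time.sleep):
        # Assumed limits: 3 requests/second without a key, 10 with one.
        self.rate = 10 if api_key else 3
        self.min_interval = 1.0 / self.rate
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None    # time at which the previous request was released

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

A caller would invoke throttle.wait() immediately before each HTTP request. The clock and sleep arguments exist only so the behaviour can be checked with a fake clock.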

Example Usage

$ uv pip install eutils
$ export NCBI_API_KEY=8d4b...
$ ipython

>>> import os
>>> from biocommons.eutils import Client

# Initialize a client. This client handles all caching and query
# throttling.  For example:
>>> ec = Client(api_key=os.environ.get("NCBI_API_KEY", None))

# search for tumor necrosis factor genes
# any valid NCBI query may be used
>>> esr = ec.esearch(db='gene',term='tumor necrosis factor')

# esearch returns a list of entity IDs associated with your search. Preview some of them:
>>> esr.ids[:5]
[136114222, 136113226, 136112112, 136111930, 136111620]

# fetch data for an ID (gene id 7157 is human TP53)
>>> egs = ec.efetch(db='gene', id=7157)

# One may fetch multiple genes at a time. These are returned as an
# EntrezgeneSet. We'll grab the first (and only) child, which returns
# an instance of the Entrezgene class.
>>> eg = egs.entrezgenes[0]

# Easily access some basic information about the gene
>>> eg.hgnc, eg.maploc, eg.description, eg.type, eg.genus_species
('TP53', '17p13.1', 'tumor protein p53', 'protein-coding', 'Homo sapiens')

# get a list of genomic references
>>> sorted([(r.acv, r.label) for r in eg.references])
[('NC_000017.11', 'Chromosome 17 Reference GRCh38...'),
('NC_018928.2', 'Chromosome 17 Alternate ...'),
('NG_017013.2', 'RefSeqGene')]

# Get the first three products defined on GRCh38
>>> [p.acv for p in eg.references[0].products][:3]
['NM_001126112.2', 'NM_001276761.1', 'NM_000546.5']

# As a sample, grab the first mRNA product defined on this reference (order is arbitrary)
>>> mrna = [i for i in eg.references[0].products if i.type == "mRNA"][0]
>>> str(mrna)
'GeneCommentary(acv=NM_001126112.2,type=mRNA,heading=Reference,label=transcript variant 2)'

# mrna.genomic_coords provides access to the exon definitions on this reference
>>> mrna.genomic_coords.gi, mrna.genomic_coords.strand
('568815581', -1)

>>> mrna.genomic_coords.intervals
[(7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
(7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
(7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
(7670608, 7670714), (7668401, 7669689)]

# and if the mrna has a product, the resulting protein:
>>> str(mrna.products[0])
'GeneCommentary(acv=NP_001119584.1,type=peptide,heading=Reference,label=isoform a)'
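
As a small worked example, the exon intervals printed above can be summed to estimate the transcript length. The README does not say whether the end coordinate of each interval is inclusive, so the sketch below computes both readings. exon_lengths is a hypothetical helper and not part of eutils.

```python
# Exon intervals copied from the mrna.genomic_coords.intervals output above.
intervals = [
    (7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
    (7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
    (7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
    (7670608, 7670714), (7668401, 7669689),
]


def exon_lengths(intervals, end_inclusive=False):
    """Hypothetical helper: length of each (start, end) interval."""
    extra = 1 if end_inclusive else 0
    return [end - start + extra for start, end in intervals]


half_open_total = sum(exon_lengths(intervals))
inclusive_total = sum(exon_lengths(intervals, end_inclusive=True))
print(len(intervals), half_open_total, inclusive_total)
# prints: 11 2577 2588
```

The two totals differ by exactly one base per exon. Note also that the intervals are listed in descending genomic order, which matches the strand of -1 reported above.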

Developer Setup

Install Prerequisites

These tools are required to get started:

  • git: Version control system
  • GNU make: Current mechanism for consistent invocation of developer tools.
  • uv: An extremely fast Python package and project manager, written in Rust.

macOS or Linux Systems

Linux (Debian-based systems)

You may also install using distribution packages:

sudo apt install git make

Then install uv using the uv installation instructions.

One-time developer setup

Create a Python virtual environment, install dependencies, install pre-commit hooks, and install the package in editable mode:

make devready

Development

N.B. Developers are strongly encouraged to use make to invoke tools to ensure consistency with the CI/CD pipelines. Type make to see a list of supported targets. A subset are listed here:

» make
🌟🌟 biocommons conventional make targets 🌟🌟

Using these targets promotes consistency between local development and ci/cd commands.

usage: make [target ...]

BASIC USAGE
help                Display help message

SETUP, INSTALLATION, PACKAGING
devready            Prepare local dev env: Create virtual env, install the pre-commit hooks
build               Build package
publish             publish package to PyPI

FORMATTING, TESTING, AND CODE QUALITY
cqa                 Run code quality assessments
test                Test the code with pytest

DOCUMENTATION
docs-serve          Build and serve the documentation
docs-test           Test if documentation can be built without warnings or errors

CLEANUP
clean               Remove temporary and backup files
cleaner             Remove files and directories that are easily rebuilt
cleanest            Remove all files that can be rebuilt
distclean           Remove untracked files and other detritus

Download files


Source Distribution

eutils-0.6.1.tar.gz (435.7 kB)

Uploaded Source

Built Distribution


eutils-0.6.1-py3-none-any.whl (40.9 kB)

Uploaded Python 3

File details

Details for the file eutils-0.6.1.tar.gz.

File metadata

  • Download URL: eutils-0.6.1.tar.gz
  • Size: 435.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.14

File hashes

Hashes for eutils-0.6.1.tar.gz
Algorithm Hash digest
SHA256 68d4e007996d4b08171a936413f6ec2cd4c045ac83acf7df9e9b7110df06c030
MD5 6f8d5060bec4537d3ed67c1a779c221e
BLAKE2b-256 117773b46f2f1c1f714456dd84d36e2b1c1c989c88423f74d5773f884606f3a9


File details

Details for the file eutils-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: eutils-0.6.1-py3-none-any.whl
  • Size: 40.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.14

File hashes

Hashes for eutils-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 6916efd10f397f20ba0e6bd5b84d4e868e077161509e240d7c4ab1d98fb2d3b1
MD5 ef1247f4858d66ff5e0bae9b4ca19104
BLAKE2b-256 26b5d343da782460999bd3e7c3c367b91d7b77f2eaf424bff7b315ce72bb4e54

