
Structured Python interface to NCBI E-Utilities.


eutils is a Python package to simplify searching, fetching, and parsing records from NCBI using their E-utilities interface.

STATUS: This code is alpha. There are no known bugs, but the code supports only a limited subset of E-Utilities replies. PubMed, Gene, RefSeq (nucleotide), and dbSNP data are well-represented; others are not represented at all.


Features

  • simple Pythonic interface for searching and fetching

  • automatic query rate throttling per NCBI guidelines

  • optional sqlite-based caching of compressed replies

  • “façades” that facilitate access to essential attributes in replies
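
The throttling and caching features above can be illustrated with a small standalone sketch. This is not the eutils implementation: the names Throttle, CompressedCache, and cached_fetch are hypothetical, as is the table layout. The default of 3 requests per second reflects NCBI's published limit for clients without an API key; everything else is an assumption chosen for illustration.

```python
import sqlite3
import time
import zlib


class Throttle:
    """Space successive calls at least 1/requests_per_second seconds apart."""

    def __init__(self, requests_per_second=3):
        self.min_interval = 1.0 / requests_per_second
        self._last = None  # monotonic timestamp of the previous call

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


class CompressedCache:
    """Keep reply text zlib-compressed in a sqlite table keyed by a string."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS replies (key TEXT PRIMARY KEY, body BLOB)"
        )

    def get(self, key):
        row = self._db.execute(
            "SELECT body FROM replies WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else zlib.decompress(row[0]).decode("utf-8")

    def put(self, key, text):
        self._db.execute(
            "INSERT OR REPLACE INTO replies (key, body) VALUES (?, ?)",
            (key, zlib.compress(text.encode("utf-8"))),
        )
        self._db.commit()


def cached_fetch(key, fetch, cache, throttle):
    """Return the cached reply for key; on a miss, call fetch() after throttling."""
    text = cache.get(key)
    if text is None:
        throttle.wait()
        text = fetch()
        cache.put(key, text)
    return text
```

In this sketch only cache misses reach the throttle, so repeated queries return immediately and do not count against the rate limit.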

A Quick Example

$ pip install eutils
$ ipython

>>> import eutils.client

# Initialize a client. This client handles all caching and query
# throttling
>>> ec = eutils.client.Client()

# search for tumor necrosis factor genes
# any valid NCBI query may be used
>>> esr = ec.esearch(db='gene',term='tumor necrosis factor')

# fetch a gene record by id (gene id 7157 is human TP53)
>>> egs = ec.efetch(db='gene', id=7157)

# One may fetch multiple genes at a time. These are returned as an
# EntrezgeneSet. We'll grab the first (and only) child, which returns
# an instance of the Entrezgene class.
>>> eg = egs.entrezgenes[0]

# Easily access some basic information about the gene
>>> eg.hgnc, eg.maploc, eg.description, eg.type, eg.genus_species
('TP53', '17p13.1', 'tumor protein p53', 'protein-coding', 'Homo sapiens')

# get a list of genomic references
>>> sorted([(r.acv, r.label) for r in eg.references])
[('NC_000017.11', 'Chromosome 17 Reference GRCh38...'),
 ('NC_018928.2', 'Chromosome 17 Alternate ...'),
 ('NG_017013.2', 'RefSeqGene')]

# Get the first three products defined on GRCh38
#>>> [p.acv for p in eg.references[0].products][:3]
#['NM_001126112.2', 'NM_001276761.1', 'NM_000546.5']

# As a sample, grab the first product defined on this reference (order is arbitrary)
>>> mrna = eg.references[0].products[0]
>>> str(mrna)
'GeneCommentary(acv=NM_001126112.2,type=mRNA,heading=Reference,label=transcript variant 2)'

# mrna.genomic_coords provides access to the exon definitions on this reference

>>> mrna.genomic_coords.gi, mrna.genomic_coords.strand
('568815581', -1)

>>> mrna.genomic_coords.intervals
[(7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
(7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
(7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
(7670608, 7670714), (7668401, 7669689)]

# and the mrna has a product, the resulting protein:
>>> str(mrna.products[0])
'GeneCommentary(acv=NP_001119584.1,type=peptide,heading=Reference,label=isoform a)'
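
As a follow-up to the session above, the interval list can be summarized with plain Python, without any further eutils calls. The sketch below copies the intervals and strand from the output shown earlier; summarize_exons is a hypothetical helper, not part of eutils. It deliberately avoids computing exon lengths, because the README does not state whether the coordinates are 0-based or inclusive.

```python
# Values copied from the mrna.genomic_coords output in the quick example.
intervals = [
    (7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
    (7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
    (7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
    (7670608, 7670714), (7668401, 7669689),
]
strand = -1


def summarize_exons(intervals, strand):
    """Summarize exon intervals (hypothetical helper, not an eutils API)."""
    ordered = sorted(intervals)  # ascending genomic order
    return {
        "exon_count": len(ordered),
        # Outermost coordinates covered by the transcript on the reference.
        "span": (ordered[0][0], ordered[-1][1]),
        # Minus-strand transcripts read from high to low coordinates.
        "transcript_order": ordered if strand >= 0 else ordered[::-1],
    }


summary = summarize_exons(intervals, strand)
print(summary["exon_count"], summary["span"])
```

For this transcript the intervals are already listed in transcript order, which for a minus-strand gene means descending genomic coordinates.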

Important Notes

  • You are encouraged to browse issues. Please report any issues you find.

  • Use a pip version specifier to stay within a single minor release series for API stability. For example, eutils>=0.1,<0.2.

Developing and Contributing

Contributions of bug reports, code patches, and documentation are welcome!

Development occurs in the default branch. Please work in feature branches or bookmarks from the default branch. Feature branches should be named for the eutils issue they fix, as in 121-update-xml-facades. When merging, use a commit message like “closes #121: update xml facades to new-style interface”. (“closes #n” is recognized automatically and closes the ticket upon pushing.)

The included Makefile automates many tasks. In particular, make develop prepares a development environment and make test runs the unit tests. (Please run the tests before committing!)

Again, thanks for your contributions.

