Python interface to NCBI's eutilities API
eutils -- simplified interface to NCBI E-Utilities
eutils is a Python package that simplifies searching, fetching, and parsing records from NCBI using their E-utilities interface.
Features
- simple Pythonic interface for searching and fetching
- support for NCBI API keys, and rate throttling when no key is available
- optional sqlite-based caching of compressed replies
- "façades" that facilitate access to essential attributes in XML replies
- GitHub repository: https://github.com/biocommons/eutils/
- Documentation: https://eutils.readthedocs.io/en/stable/
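To make the throttling and caching features above more concrete, here is a minimal standard-library sketch of both ideas. It is not the package's implementation: the names `Throttle`, `CompressedCache`, and `cached_fetch` are hypothetical, and the `fetch` callable stands in for whatever actually performs the HTTP request.

```python
import sqlite3
import time
import zlib


class Throttle:
    """Space out calls so that at most `rate` of them happen per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            delay = self.min_interval - (now - self._last)
            if delay > 0:
                self._sleep(delay)
                now = self._clock()
        self._last = now


class CompressedCache:
    """Key/value store that keeps zlib-compressed replies in sqlite."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS replies (key TEXT PRIMARY KEY, body BLOB)"
        )

    def get(self, key):
        row = self._db.execute(
            "SELECT body FROM replies WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else zlib.decompress(row[0])

    def put(self, key, body):
        self._db.execute(
            "INSERT OR REPLACE INTO replies (key, body) VALUES (?, ?)",
            (key, zlib.compress(body)),
        )
        self._db.commit()


def cached_fetch(key, fetch, cache, throttle):
    """Return the cached reply for `key`, calling `fetch(key)` only on a miss."""
    body = cache.get(key)
    if body is None:
        throttle.wait()  # only real requests count against the rate limit
        body = fetch(key)
        cache.put(key, body)
    return body
```

NCBI's published limits are 3 requests per second without an API key and 10 with one, so a caller of this sketch might construct `Throttle(rate=10 if api_key else 3)`. Cache hits skip the throttle entirely, which is the main reason to combine the two.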
Example Usage
$ uv pip install eutils
$ export NCBI_API_KEY=8d4b...
$ ipython
>>> import os
>>> from biocommons.eutils import Client
# Initialize a client. The client handles all caching and query
# throttling.
>>> ec = Client(api_key=os.environ.get("NCBI_API_KEY"))
# Search for tumor necrosis factor genes.
# Any valid NCBI query may be used.
>>> esr = ec.esearch(db='gene', term='tumor necrosis factor')
# esearch returns a list of entity IDs associated with your search. Preview some of them:
>>> esr.ids[:5]
[136114222, 136113226, 136112112, 136111930, 136111620]
# fetch data for an ID (gene id 7157 is human TP53)
>>> egs = ec.efetch(db='gene', id=7157)
# One may fetch multiple genes at a time. These are returned as an
# EntrezgeneSet. We'll grab the first (and only) child, which returns
# an instance of the Entrezgene class.
>>> eg = egs.entrezgenes[0]
# Easily access some basic information about the gene
>>> eg.hgnc, eg.maploc, eg.description, eg.type, eg.genus_species
('TP53', '17p13.1', 'tumor protein p53', 'protein-coding', 'Homo sapiens')
# get a list of genomic references
>>> sorted([(r.acv, r.label) for r in eg.references])
[('NC_000017.11', 'Chromosome 17 Reference GRCh38...'),
('NC_018928.2', 'Chromosome 17 Alternate ...'),
('NG_017013.2', 'RefSeqGene')]
# Get the first three products defined on GRCh38
>>> [p.acv for p in eg.references[0].products][:3]
['NM_001126112.2', 'NM_001276761.1', 'NM_000546.5']
# As a sample, grab the first mRNA product defined on this reference (order is arbitrary)
>>> mrna = [i for i in eg.references[0].products if i.type == "mRNA"][0]
>>> str(mrna)
'GeneCommentary(acv=NM_001126112.2,type=mRNA,heading=Reference,label=transcript variant 2)'
# mrna.genomic_coords provides access to the exon definitions on this reference
>>> mrna.genomic_coords.gi, mrna.genomic_coords.strand
('568815581', -1)
>>> mrna.genomic_coords.intervals
[(7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
(7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
(7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
(7670608, 7670714), (7668401, 7669689)]
# and if the mrna has a product, the resulting protein:
>>> str(mrna.products[0])
'GeneCommentary(acv=NP_001119584.1,type=peptide,heading=Reference,label=isoform a)'
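The attribute access shown in the session above (`eg.hgnc`, `eg.maploc`, and so on) is what the feature list calls a façade: a thin object that wraps a parsed XML reply and exposes selected fields as plain attributes. The following sketch illustrates the pattern with the standard library only. The class name `GeneFacade` and the XML element names are hypothetical and far simpler than real Entrezgene XML; only the field values are taken from the example output above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified reply. Real Entrezgene XML is far more
# deeply nested and uses different element names.
SAMPLE_XML = """
<Gene>
  <Symbol>TP53</Symbol>
  <MapLocation>17p13.1</MapLocation>
  <Description>tumor protein p53</Description>
  <Product acv="NM_001126112.2" type="mRNA"/>
  <Product acv="NP_001119584.1" type="peptide"/>
</Gene>
"""


class GeneFacade:
    """Read-only view exposing selected fields of a parsed XML element."""

    def __init__(self, element):
        self._element = element

    @property
    def symbol(self):
        return self._element.findtext("Symbol")

    @property
    def maploc(self):
        return self._element.findtext("MapLocation")

    @property
    def description(self):
        return self._element.findtext("Description")

    @property
    def product_accessions(self):
        # Collect the accession.version attribute of every Product element.
        return [p.get("acv") for p in self._element.iter("Product")]


gene = GeneFacade(ET.fromstring(SAMPLE_XML))
print(gene.symbol, gene.maploc, gene.description)
print(gene.product_accessions)
```

Because each property looks up its value on demand, callers never touch the XML paths directly, and a change in reply layout only needs a fix in one place.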
Developer Setup
Install Prerequisites
These tools are required to get started:
- git: Version control system
- GNU make: Current mechanism for consistent invocation of developer tools.
- uv: An extremely fast Python package and project manager, written in Rust.
macOS or Linux Systems
- Install brew
brew install git make uv
Linux (Debian-based systems)
You may also install using distribution packages:
sudo apt install git make
Then install uv using the uv installation instructions.
One-time developer setup
Create a Python virtual environment, install dependencies, install pre-commit hooks, and install the package in editable mode:
make devready
Development
N.B. Developers are strongly encouraged to use make to invoke tools to
ensure consistency with the CI/CD pipelines. Type make to see a list of
supported targets. A subset is listed here:
» make
🌟🌟 biocommons conventional make targets 🌟🌟
Using these targets promotes consistency between local development and ci/cd commands.
usage: make [target ...]
BASIC USAGE
help Display help message
SETUP, INSTALLATION, PACKAGING
devready Prepare local dev env: Create virtual env, install the pre-commit hooks
build Build package
publish publish package to PyPI
FORMATTING, TESTING, AND CODE QUALITY
cqa Run code quality assessments
test Test the code with pytest
DOCUMENTATION
docs-serve Build and serve the documentation
docs-test Test if documentation can be built without warnings or errors
CLEANUP
clean Remove temporary and backup files
cleaner Remove files and directories that are easily rebuilt
cleanest Remove all files that can be rebuilt
distclean Remove untracked files and other detritus
File details
Details for the file eutils-0.6.1.tar.gz.
File metadata
- Download URL: eutils-0.6.1.tar.gz
- Upload date:
- Size: 435.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 68d4e007996d4b08171a936413f6ec2cd4c045ac83acf7df9e9b7110df06c030 |
| MD5 | 6f8d5060bec4537d3ed67c1a779c221e |
| BLAKE2b-256 | 117773b46f2f1c1f714456dd84d36e2b1c1c989c88423f74d5773f884606f3a9 |
File details
Details for the file eutils-0.6.1-py3-none-any.whl.
File metadata
- Download URL: eutils-0.6.1-py3-none-any.whl
- Upload date:
- Size: 40.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6916efd10f397f20ba0e6bd5b84d4e868e077161509e240d7c4ab1d98fb2d3b1 |
| MD5 | ef1247f4858d66ff5e0bae9b4ca19104 |
| BLAKE2b-256 | 26b5d343da782460999bd3e7c3c367b91d7b77f2eaf424bff7b315ce72bb4e54 |