pybio genomics
pybio: basic genomics toolset
pybio is a Python framework for streamlining genomics operations. It offers a direct interface to Ensembl genome assemblies and annotations and also accommodates custom genomes supplied as FASTA/GTF files. Its primary objective is to simplify genome management: it automatically downloads Ensembl genome assemblies and annotations, provides genomic feature search and sequence retrieval from the managed genomes in Python, supports STAR indexing and mapping, and more.
Quick Start
Install pybio, then download and prepare the human genome:
# Option 1: install from PyPI
pip install pybio
# Option 2: install from this repository
pip install git+https://github.com/grexor/pybio.git@master
# Option 3: run via Singularity / Apptainer / Docker (only if you don't need Python imports)
singularity run docker://ghcr.io/grexor/pybio:master pybio
# Download and process the Homo sapiens genome
pybio genome homo_sapiens
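If several genomes need to be prepared, the `pybio genome <species>` command above can be scripted. The following is a minimal sketch, not part of pybio: `genome_command` and `prepare_genomes` are hypothetical helper names, and any species name other than `homo_sapiens` is assumed to follow the same Ensembl-style naming.

```python
import shutil
import subprocess
from typing import List, Sequence


def genome_command(species: str) -> List[str]:
    """Build the CLI call shown above: `pybio genome <species>`."""
    return ["pybio", "genome", species]


def prepare_genomes(species_list: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Hypothetical helper: prepare several genomes one after another.

    With dry_run=True (the default) nothing is executed; the commands
    that would be run are only returned for inspection.
    """
    commands = [genome_command(species) for species in species_list]
    if not dry_run:
        # Fail early if the pybio executable is not installed.
        if shutil.which("pybio") is None:
            raise RuntimeError("pybio executable not found on PATH")
        for command in commands:
            # check=True raises CalledProcessError if pybio exits with an error.
            subprocess.run(command, check=True)
    return commands
```

Calling `prepare_genomes(["homo_sapiens"], dry_run=False)` would run the same command as the shell line above.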
Search genome features (exons, transcripts, genes) from Python:
import pybio
result = pybio.core.genomes.annotate("homo_sapiens", "1", "+", 11012344)
genes, transcripts, exons, UTR5, UTR3 = result
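To annotate more than one position, the call above can be wrapped in a small loop. This is a sketch under assumptions: `Annotation` and `annotate_sites` are hypothetical names, the five-element return value is taken from the unpacking shown above, and the types of the individual elements are not specified here, so they are left as `Any`. The annotate function is passed in as an argument so the helper can be tried without a prepared genome.

```python
from typing import Any, Callable, Dict, Iterable, NamedTuple, Tuple


class Annotation(NamedTuple):
    """Named view of the five values unpacked in the example above."""
    genes: Any
    transcripts: Any
    exons: Any
    utr5: Any
    utr3: Any


Site = Tuple[str, str, int]  # (chromosome, strand, position)


def annotate_sites(
    annotate: Callable[..., Any], species: str, sites: Iterable[Site]
) -> Dict[Site, Annotation]:
    """Call `annotate(species, chrom, strand, pos)` for every site."""
    results: Dict[Site, Annotation] = {}
    for chrom, strand, pos in sites:
        results[(chrom, strand, pos)] = Annotation(*annotate(species, chrom, strand, pos))
    return results


# With pybio installed and the genome prepared, usage would look like:
#   import pybio
#   hits = annotate_sites(pybio.core.genomes.annotate, "homo_sapiens",
#                         [("1", "+", 11012344)])
#   print(hits[("1", "+", 11012344)].genes)
```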
Retrieve genomic sequences from Python:
import pybio
seq = pybio.core.genomes.seq("homo_sapiens", "1", "+", 450000, -20, 20)
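The last two arguments in the call above look like offsets around the given position. As an illustration of what strand-aware window extraction generally involves, here is a toy version on an in-memory string. It is not pybio's implementation: the function names are hypothetical, the position is 0-based and the window is inclusive by assumption, and the minus strand is assumed to return the reverse complement. pybio's own conventions are described in its documentation.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def window(chrom_seq: str, strand: str, pos: int, start: int, stop: int) -> str:
    """Return bases from pos+start to pos+stop (inclusive), read along `strand`.

    `pos` is a 0-based index into `chrom_seq` (an assumption of this sketch).
    Negative offsets are upstream, positive offsets downstream, both relative
    to the strand being read.
    """
    if strand == "+":
        lo, hi = pos + start, pos + stop
    else:
        # On the minus strand, upstream lies at higher forward coordinates.
        lo, hi = pos - stop, pos - start
    # Clip the window to the ends of the sequence.
    lo = max(lo, 0)
    hi = min(hi, len(chrom_seq) - 1)
    sub = chrom_seq[lo:hi + 1]
    return sub if strand == "+" else reverse_complement(sub)


print(window("AACCGGTT", "+", 3, -1, 2))  # CCGG
print(window("AACCGGTT", "-", 3, -1, 2))  # CGGT
```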
See the documentation for more examples.
Documentation
- PDF reference manual
- Google Docs version of the PDF above (comments welcome)
Authors
pybio is developed and supported by Gregor Rot.
Issues and Suggestions
Use the issues page to report issues and leave suggestions.
Change log
v0.8: May 2025
- aimux: added a demultiplexing tool for short-read paired-end sequencing
v0.7: February 2025
- alignIntronMax option for STAR
- other small fixes
v0.6.3: December 2024
- updated setup.py to use an entry point instead of a script
- removed pybioscripts
v0.6: November 2024
- updated Ensembl search and genome versioning offline
- updated custom genome interface
v0.5: May 2024
- refreshed Ensembl (112) and Ensembl Genomes (58) databases
v0.4: April 2024
- refreshed Ensembl (111) and Ensembl Genomes (58) databases
v0.3.12: November 2023
- updated docs
Citation
If you are using pybio in your research, please cite:
Rot, G., Wehling, A., Schmucki, R., Berntenis, N., Zhang, J. D., & Ebeling, M. (2024).
splicekit: an integrative toolkit for splicing analysis from short-read RNA-seq.
Bioinformatics Advances, 4(1). https://doi.org/10.1093/bioadv/vbae121
File details
Details for the file pybio-0.8.4.tar.gz.
File metadata
- Download URL: pybio-0.8.4.tar.gz
- Upload date:
- Size: 425.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cf2e078d5496fa58a5365dbce6615c8873cf976987475db31c42aba26c64f962 |
| MD5 | 60d25cc3af40f530e2fdc4a52fd3c979 |
| BLAKE2b-256 | b47fe7a7b33ac68fab76994503a5e78ce91f3a52ae9c7a30b5e6d0353272025e |
File details
Details for the file pybio-0.8.4-py3-none-any.whl.
File metadata
- Download URL: pybio-0.8.4-py3-none-any.whl
- Upload date:
- Size: 43.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4e0be2ba509b5f4fd3136624fffef6b0e81e735c5bd28f0a6ad69719926604a8 |
| MD5 | c7c5db7daba075c02b56b1a943b13b00 |
| BLAKE2b-256 | db34dd01e1fec1ef336ce1cadb17e5812077893a6f7a3bfbef8de58cde613bfc |
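A downloaded file can be checked against the SHA256 digests listed above using only the Python standard library. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming pybio-0.8.4.tar.gz was downloaded to the current directory:
#   matches("pybio-0.8.4.tar.gz",
#           "cf2e078d5496fa58a5365dbce6615c8873cf976987475db31c42aba26c64f962")
```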