Represent genomic annotations in Python. Equivalent to Bioconductor's [GRanges](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).
GenomicRanges
A container class that represents genomic locations and supports genomic analysis in Python, similar to Bioconductor's GenomicRanges.
Install
The package is published on PyPI:
pip install genomicranges
Usage
The package provides several ways to represent genomic annotations and intervals.
Initialize a GenomicRanges object
From UCSC or GTF file
Methods are available to access UCSC genomes or to load a genome annotation from a GTF file.
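import genomicranges

# Load annotations from a local GTF file (replace the placeholder with a real path)
gr = genomicranges.fromGTF("<PATH TO GTF>")
# OR load the annotation for a UCSC genome build
gr = genomicranges.fromUCSC(genome="hg19")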
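For readers unfamiliar with GTF, the sketch below shows which fields of a GTF record carry the genomic coordinates. It is an illustration only, not how genomicranges.fromGTF is implemented; the helper parse_gtf_line and the example record are hypothetical. GTF records have nine tab-separated fields, and their start and end coordinates are 1-based and inclusive.

```python
# Illustrative only: this is NOT the implementation of genomicranges.fromGTF.
# The nine standard GTF fields, in order.
GTF_COLUMNS = (
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
)


def parse_gtf_line(line):
    """Split one GTF record into a dict keyed by the standard field names."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    # Coordinates are stored as text in the file; convert them to integers.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    return record


# Made-up record for demonstration purposes.
example = 'chr1\texample\texon\t101\t112\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'
record = parse_gtf_line(example)
print(record["seqname"], record["start"], record["end"], record["strand"])
```

The seqname, start, end and strand fields correspond to the kind of coordinate columns the package works with.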
Pandas DataFrame
A common representation for tabular datasets in Python is a pandas DataFrame, which can be converted into a GenomicRanges object. Intervals are inclusive on both ends.
Note: The DataFrame must contain the columns seqnames, starts and ends to represent genomic coordinates.
from random import random

import genomicranges
import pandas as pd

# seqnames, starts and ends are required; the remaining columns are extra metadata
df = pd.DataFrame(
    {
        "seqnames": ["chr1", "chr2", "chr1", "chr3", "chr2"],
        "starts": [101, 102, 103, 104, 109],
        "ends": [112, 103, 128, 134, 111],
        "strand": ["*", "-", "*", "+", "-"],
        "score": range(0, 5),
        "GC": [random() for _ in range(5)],
    }
)
gr = genomicranges.fromPandas(df)
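Because intervals are inclusive on both ends, the width of an interval is end - start + 1 rather than end - start. The following is a minimal pure-Python sketch of that arithmetic, together with a check for the three required columns. It does not use the genomicranges API; the helper inclusive_widths and the constant REQUIRED_COLUMNS are hypothetical names chosen for illustration.

```python
# Hypothetical helper, independent of the genomicranges package.
REQUIRED_COLUMNS = ("seqnames", "starts", "ends")


def inclusive_widths(table):
    """Return the width of each interval, treating both ends as inclusive."""
    missing = [col for col in REQUIRED_COLUMNS if col not in table]
    if missing:
        raise KeyError(f"missing required columns: {missing}")
    # With inclusive ends, an interval from 102 to 103 covers two positions.
    return [end - start + 1 for start, end in zip(table["starts"], table["ends"])]


# Same coordinates as the DataFrame example above, as a plain dict of lists.
table = {
    "seqnames": ["chr1", "chr2", "chr1", "chr3", "chr2"],
    "starts": [101, 102, 103, 104, 109],
    "ends": [112, 103, 128, 134, 111],
}
widths = inclusive_widths(table)
print(widths)  # [12, 2, 26, 31, 3]
```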
Interval Operations
The package currently supports the most commonly used interval-based operations.
subject = genomicranges.fromUCSC(genome="hg38")
query = genomicranges.fromPandas(
    pd.DataFrame(
        {
            "seqnames": ["chr1", "chr2", "chr3"],
            "starts": [100, 115, 119],
            "ends": [103, 116, 120],
        }
    )
)
hits = subject.nearest(query)
print(hits)
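To make the idea of a nearest-interval search concrete, here is a simplified pure-Python sketch. It assumes that, for each query interval, the closest subject interval on the same sequence is reported; the form in which the library returns its hits may differ. The functions gap and nearest below are hypothetical and say nothing about how the package implements the operation, and the subject intervals are made up.

```python
# Simplified illustration of a nearest-interval search; not the library's code.
def gap(a_start, a_end, b_start, b_end):
    """Positions separating two inclusive intervals (0 if they overlap or touch)."""
    if a_end < b_start:
        return b_start - a_end - 1
    if b_end < a_start:
        return a_start - b_end - 1
    return 0


def nearest(subject, query):
    """For each query interval, return the index of the closest subject interval
    on the same sequence, or None if the subject has no interval on that sequence."""
    hits = []
    for q_seq, q_start, q_end in query:
        candidates = [
            (gap(q_start, q_end, s_start, s_end), idx)
            for idx, (s_seq, s_start, s_end) in enumerate(subject)
            if s_seq == q_seq
        ]
        # min() picks the smallest gap; ties go to the lowest subject index.
        hits.append(min(candidates)[1] if candidates else None)
    return hits


# Made-up subject intervals; the query matches the example above.
subject = [("chr1", 10, 50), ("chr1", 200, 300), ("chr2", 100, 110)]
query = [("chr1", 100, 103), ("chr2", 115, 116), ("chr3", 119, 120)]
hits = nearest(subject, query)
print(hits)  # [0, 2, None]
```

This brute-force version compares every query against every subject interval, which is fine for small inputs but too slow for whole-genome annotations.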
Check out the documentation for more use cases.
Note
This project has been set up using PyScaffold 4.1.1. For details and usage information on PyScaffold see https://pyscaffold.org/.
Hashes for GenomicRanges-0.2.9-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 42c3383fc3a8a93f5be170c63c349d1f72cd517558b410a99a910a8f4968b081
MD5 | 6336a2e1a085cc2bee71793957513bd3
BLAKE2b-256 | c2b8a4ff6c8aed9625bfb7e574c1eb47d4595ab4d97b333d21667ca97d952caa