
Represent genomic annotations in Python. Equivalent to Bioconductor's [GRanges](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).


GenomicRanges

A container class to represent genomic locations and support genomic analysis in Python, similar to Bioconductor's GenomicRanges.

Install

The package is published to PyPI:

pip install genomicranges

Usage

The package provides several ways to represent genomic annotations and intervals.

Initialize a GenomicRanges object

From UCSC or GTF file

Methods are available to access UCSC genomes or to load a genome annotation from a GTF file.

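import genomicranges

# Load annotations from a local GTF file (replace the placeholder with a real path)
gr = genomicranges.fromGTF("<PATH TO GTF>")
# OR
# Fetch annotations for a UCSC genome build
gr = genomicranges.fromUCSC(genome="hg19")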
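
For readers unfamiliar with the format, the sketch below shows what a single GTF record looks like once it is split into fields. GTF is tab-separated with nine fields per record and uses 1-based coordinates that are inclusive on both ends. This is not the package's own loader: `parse_gtf_line`, `GTF_COLUMNS`, and the sample record are hypothetical names and data used only for illustration.

```python
# Illustrative only: a tiny GTF line parser, not how genomicranges.fromGTF works.
GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
]


def parse_gtf_line(line):
    """Split one GTF record into a dict; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    # Coordinates are 1-based and inclusive on both ends.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # The attribute field is a list of `key "value";` pairs.
    attributes = {}
    for item in record["attribute"].split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')
    record["attribute"] = attributes
    return record


# A made-up record, not taken from any real annotation file.
example = "\t".join([
    "chr1", "example_source", "gene", "100", "250", ".", "+", ".",
    'gene_id "GENE0001"; gene_name "EXAMPLE1";',
])
rec = parse_gtf_line(example)
print(rec["seqname"], rec["start"], rec["end"], rec["strand"], rec["attribute"]["gene_name"])
```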

Pandas DataFrame

A pandas DataFrame is a common representation for tabular datasets in Python, and it can be converted into a GenomicRanges object. Intervals are inclusive on both ends.

Note: The DataFrame must contain columns seqnames, starts and ends to represent genomic coordinates.

from random import random

import genomicranges
import pandas as pd

df = pd.DataFrame(
    {
        "seqnames": ["chr1", "chr2", "chr1", "chr3", "chr2"],
        "starts": [101, 102, 103, 104, 109],
        "ends": [112, 103, 128, 134, 111],
        "strand": ["*", "-", "*", "+", "-"],
        "score": range(0, 5),
        "GC": [random() for _ in range(5)],
    }
)

gr = genomicranges.fromPandas(df)
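
The two requirements stated above (the `seqnames`, `starts` and `ends` columns, and intervals that are inclusive on both ends) can be checked before conversion. The sketch below uses a plain dict of lists so that it runs without any dependencies. `check_columns` and `inclusive_widths` are hypothetical helpers, not part of the package.

```python
# Illustrative helpers; these names are assumptions, not the genomicranges API.
REQUIRED_COLUMNS = ("seqnames", "starts", "ends")


def check_columns(table):
    """Raise ValueError if any required coordinate column is missing."""
    missing = [name for name in REQUIRED_COLUMNS if name not in table]
    if missing:
        raise ValueError(f"missing required columns: {missing}")


def inclusive_widths(starts, ends):
    """Width of each interval when both ends are included: end - start + 1."""
    return [end - start + 1 for start, end in zip(starts, ends)]


table = {
    "seqnames": ["chr1", "chr2", "chr1", "chr3", "chr2"],
    "starts": [101, 102, 103, 104, 109],
    "ends": [112, 103, 128, 134, 111],
}

check_columns(table)
widths = inclusive_widths(table["starts"], table["ends"])
print(widths)  # [12, 2, 26, 31, 3]
```

Because both ends are included, the interval 102 to 103 covers two positions, not one.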

Interval Operations

The package currently supports the most commonly used interval-based operations.

subject = genomicranges.fromUCSC(genome="hg38")

query = genomicranges.fromPandas(
    pd.DataFrame(
        {
            "seqnames": ["chr1", "chr2", "chr3"],
            "starts": [100, 115, 119],
            "ends": [103, 116, 120],
        }
    )
)

hits = subject.nearest(query)
print(hits)
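
To make the idea behind a nearest-interval query concrete, here is a minimal pure-Python sketch. It is not the package's implementation: the return shape, tie-breaking, and strand handling of the real `nearest` method are not described here, so everything below (the `gap` and `nearest` functions, the tuple layout, and the subject intervals) is an assumption made for illustration.

```python
# Conceptual sketch of a nearest-interval lookup; not the genomicranges algorithm.
# Each interval is a (seqname, start, end) tuple, inclusive on both ends.


def gap(a_start, a_end, b_start, b_end):
    """Positions strictly between two intervals; 0 if they overlap or are adjacent."""
    if a_end < b_start:
        return b_start - a_end - 1
    if b_end < a_start:
        return a_start - b_end - 1
    return 0


def nearest(subject, query):
    """For each query interval, return the index of the closest subject interval
    on the same sequence, or None if that sequence has no subject intervals.
    Ties are resolved in favour of the first subject encountered."""
    hits = []
    for q_seq, q_start, q_end in query:
        best_index, best_gap = None, None
        for i, (s_seq, s_start, s_end) in enumerate(subject):
            if s_seq != q_seq:
                continue
            d = gap(q_start, q_end, s_start, s_end)
            if best_gap is None or d < best_gap:
                best_index, best_gap = i, d
        hits.append(best_index)
    return hits


# Made-up subject intervals; the query reuses the coordinates from the example above.
subject_intervals = [("chr1", 50, 90), ("chr1", 110, 150), ("chr2", 10, 20)]
query_intervals = [("chr1", 100, 103), ("chr2", 115, 116), ("chr3", 119, 120)]

print(nearest(subject_intervals, query_intervals))  # [1, 2, None]
```

This linear scan is quadratic in the worst case. It is meant to show the semantics, not to be used on genome-scale data.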

Check out the documentation for more use cases.

Note

This project has been set up using PyScaffold 4.1.1. For details and usage information on PyScaffold see https://pyscaffold.org/.
