Container class to represent genomic locations and support genomic analysis.

GenomicRanges

GenomicRanges is a Python container class designed to represent genomic locations and support genomic analysis. It is similar to Bioconductor's GenomicRanges.

Install

The package is published to PyPI:

pip install genomicranges

Usage

The package provides several ways to represent genomic annotations and intervals.

Initialize a GenomicRanges object

From UCSC or GTF file

You can easily access UCSC genomes or load a genome annotation from a GTF file using the following methods:

import genomicranges

gr = genomicranges.from_gtf("<PATH TO GTF>")  # replace the placeholder with the path to your GTF file
# OR
gr = genomicranges.from_ucsc(genome="hg19")
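For orientation, the sketch below shows what a single GTF record contains. It is not the parser used by genomicranges; `GTF_COLUMNS` and `parse_gtf_line` are hypothetical names, and the example record is made up. It relies only on the standard GTF layout of nine tab-separated fields with 1-based, inclusive coordinates.

```python
# Illustrative only: this is NOT the genomicranges GTF parser.
# The nine standard GTF fields, in file order.
GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
]


def parse_gtf_line(line):
    """Split one GTF record into a dict keyed by the standard field names."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    # GTF coordinates are 1-based and inclusive on both ends.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    return record


# A made-up record, used purely for demonstration.
example = 'chr1\texample\tgene\t101\t112\t.\t+\t.\tgene_id "g1";'
rec = parse_gtf_line(example)
print(rec["seqname"], rec["start"], rec["end"], rec["strand"])
```

The seqname, start, end, and strand fields carry the same kind of information as the seqnames, starts, ends, and strand columns used in the DataFrame example in this README.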

Pandas DataFrame

Tabular data in Python is commonly represented as a pandas DataFrame, and you can convert a DataFrame into a GenomicRanges object. Note that intervals are inclusive on both ends, and the DataFrame must contain the columns seqnames, starts, and ends to represent genomic coordinates.

Here's an example:

from random import random

import genomicranges
import pandas as pd

df = pd.DataFrame(
    {
        "seqnames": ["chr1", "chr2", "chr1", "chr3", "chr2"],
        "starts": [101, 102, 103, 104, 109],
        "ends": [112, 103, 128, 134, 111],
        "strand": ["*", "-", "*", "+", "-"],
        "score": range(0, 5),
        "GC": [random() for _ in range(5)],
    }
)

gr = genomicranges.from_pandas(df)
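Because intervals are inclusive on both ends, widths and overlap checks must count both endpoints. The following is a minimal pure-Python sketch of that convention and of the required-column check. The helper names (`missing_columns`, `width`, `overlaps`) are hypothetical and are not part of the genomicranges API.

```python
# Hypothetical helpers illustrating the conventions described above;
# none of these functions come from genomicranges.
REQUIRED_COLUMNS = {"seqnames", "starts", "ends"}


def missing_columns(columns):
    """Return the required column names that are absent, sorted."""
    return sorted(REQUIRED_COLUMNS - set(columns))


def width(start, end):
    """Number of positions in an interval that includes both endpoints."""
    return end - start + 1


def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


starts = [101, 102, 103, 104, 109]
ends = [112, 103, 128, 134, 111]
widths = [width(s, e) for s, e in zip(starts, ends)]
print(widths)  # [12, 2, 26, 31, 3]

# Intervals that merely touch at one endpoint still overlap.
print(overlaps(101, 112, 112, 120))  # True
print(missing_columns(["seqnames", "starts", "score"]))  # ['ends']
```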

Interval Operations

GenomicRanges currently supports most of the commonly used interval-based operations.

import genomicranges
import pandas as pd

subject = genomicranges.from_ucsc(genome="hg38")

query = genomicranges.from_pandas(
    pd.DataFrame(
        {
            "seqnames": ["chr1", "chr2", "chr3"],
            "starts": [100, 115, 119],
            "ends": [103, 116, 120],
        }
    )
)

hits = subject.nearest(query)
print(hits)
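To make the idea of a nearest-interval lookup concrete, here is a brute-force pure-Python sketch. It is an assumption about the general technique, not the genomicranges implementation: the library's lookup direction, tie-breaking, strand handling, and return format may all differ. The functions `gap` and `nearest` and the subject intervals are hypothetical and exist only for this illustration.

```python
# Conceptual sketch only: NOT how genomicranges implements nearest().
def gap(a, b):
    """Positions strictly between two inclusive (start, end) intervals.

    Returns 0 when the intervals overlap or are directly adjacent.
    """
    if a[0] <= b[1] and b[0] <= a[1]:
        return 0
    if a[1] < b[0]:
        return b[0] - a[1] - 1
    return a[0] - b[1] - 1


def nearest(subject, query):
    """For each query, index of the closest subject on the same sequence.

    Intervals are (seqname, start, end) tuples. Returns None for a query
    whose sequence has no subject intervals. Ties keep the first match.
    """
    hits = []
    for q_seq, q_start, q_end in query:
        best_idx, best_gap = None, None
        for i, (s_seq, s_start, s_end) in enumerate(subject):
            if s_seq != q_seq:
                continue
            d = gap((s_start, s_end), (q_start, q_end))
            if best_gap is None or d < best_gap:
                best_idx, best_gap = i, d
        hits.append(best_idx)
    return hits


# Made-up subject intervals; the query mirrors the example above.
subject = [("chr1", 50, 90), ("chr1", 200, 300), ("chr2", 110, 120)]
query = [("chr1", 100, 103), ("chr2", 115, 116), ("chr3", 119, 120)]
print(nearest(subject, query))  # [0, 2, None]
```

This scan costs O(len(subject) × len(query)); for genome-scale data an indexed structure such as sorted starts or an interval tree would be the usual choice.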

For more usage examples, check out the documentation.

Note

This project has been set up using PyScaffold 4.1.1. For details and usage information on PyScaffold see https://pyscaffold.org/.

Download files

Source Distribution

GenomicRanges-0.3.6.tar.gz (55.3 kB)

Uploaded Source

Built Distribution

GenomicRanges-0.3.6-py3-none-any.whl (35.0 kB)

Uploaded Python 3