Skip to main content

Container to represent data from genomic experiments

Project description

Project generated with PyScaffold PyPI-Server Unit tests

SummarizedExperiment

Container to represent genomic experiments, follows Bioconductor's SummarizedExperiment.

Install

Package is published to PyPI,

pip install summarizedexperiment

Usage

Currently supports SummarizedExperiment & RangedSummarizedExperiment classes

First create necessary sample data

from random import random
import pandas as pd
import numpy as np
from biocframe import BiocFrame

nrows = 200
ncols = 6
counts = np.random.rand(nrows, ncols)
row_data = BiocFrame(
    {
        "seqnames": [
            "chr1",
            "chr2",
            "chr2",
            "chr2",
            "chr1",
            "chr1",
            "chr3",
            "chr3",
            "chr3",
            "chr3",
        ]
        * 20,
        "starts": range(100, 300),
        "ends": range(110, 310),
        "strand": ["-", "+", "+", "*", "*", "+", "+", "+", "-", "-"] * 20,
        "score": range(0, 200),
        "GC": [random() for _ in range(10)] * 20,
    }
)

col_data = pd.DataFrame(
    {
        "treatment": ["ChIP", "Input"] * 3,
    }
)

To create a SummarizedExperiment,

from summarizedexperiment import SummarizedExperiment

tse = SummarizedExperiment(
    assays={"counts": counts}, row_data=row_data, column_data=col_data,
    metadata={"seq_platform": "Illumina NovaSeq 6000"},
)
## output
class: SummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(1): seq_platform

To create a RangedSummarizedExperiment

from summarizedexperiment import RangedSummarizedExperiment
from genomicranges import GenomicRanges

trse = RangedSummarizedExperiment(
    assays={"counts": counts}, row_data=row_data,
    row_ranges=GenomicRanges.from_pandas(row_data.to_pandas()), column_data=col_data
)
## output
class: RangedSummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(0):

For more examples, checkout the documentation.

Note

This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

SummarizedExperiment-0.4.0.tar.gz (1.0 MB view hashes)

Uploaded Source

Built Distribution

SummarizedExperiment-0.4.0-py3-none-any.whl (20.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page