Container to represent data from genomic experiments

Project description

SummarizedExperiment

This package provides containers to represent genomic experimental data as 2-dimensional matrices, following Bioconductor's SummarizedExperiment. In these matrices, the rows typically denote features or genomic regions of interest, while the columns represent samples or cells.

The package currently includes representations for both SummarizedExperiment and RangedSummarizedExperiment. The difference is that a RangedSummarizedExperiment provides an additional slot that stores the genomic region for each feature; this slot is expected to be a GenomicRanges object (see the genomicranges package).

Install

To get started, install the package from PyPI:

pip install summarizedexperiment

Usage

A SummarizedExperiment contains three key attributes:

  • assays: A dictionary of matrices with assay names as keys, e.g. counts, logcounts etc.
  • row_data: Feature information e.g. genes, transcripts, exons, etc.
  • column_data: Sample information about the columns of the matrices.
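
The three attributes are tied together by their dimensions: every assay matrix has one row per feature in row_data and one column per sample in column_data. The sketch below illustrates that rule with plain Python lists. It is not the package's own validation code, and `check_dimensions` is a hypothetical helper name used only for illustration.

```python
# Illustrative only: a plain-Python stand-in for the dimension rule,
# not the summarizedexperiment implementation. `check_dimensions` is a
# hypothetical helper name.

def check_dimensions(assays, n_features, n_samples):
    """Return the names of assays whose shape is not (n_features, n_samples)."""
    mismatched = []
    for name, matrix in assays.items():
        n_rows = len(matrix)
        row_lengths = {len(row) for row in matrix}
        if n_rows != n_features or row_lengths != {n_samples}:
            mismatched.append(name)
    return mismatched


# Three features (rows) by two samples (columns), as nested lists.
assays = {
    "counts": [[5, 0], [3, 8], [1, 2]],
    "logcounts": [[2.6, 0.0], [2.0, 3.2]],  # deliberately one row short
}
row_data = {"gene": ["g1", "g2", "g3"]}         # one entry per feature
column_data = {"treatment": ["ChIP", "Input"]}  # one entry per sample

bad = check_dimensions(
    assays,
    n_features=len(row_data["gene"]),
    n_samples=len(column_data["treatment"]),
)
print(bad)  # ['logcounts']
```

Here the "logcounts" assay is reported because it has two rows while row_data describes three features.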

First, let's mock feature and sample data:

from random import random
import pandas as pd
import numpy as np
from biocframe import BiocFrame

nrows = 200
ncols = 6
counts = np.random.rand(nrows, ncols)
row_data = BiocFrame(
    {
        "seqnames": [
            "chr1",
            "chr2",
            "chr2",
            "chr2",
            "chr1",
            "chr1",
            "chr3",
            "chr3",
            "chr3",
            "chr3",
        ]
        * 20,
        "starts": range(100, 300),
        "ends": range(110, 310),
        "strand": ["-", "+", "+", "*", "*", "+", "+", "+", "-", "-"] * 20,
        "score": range(0, 200),
        "GC": [random() for _ in range(10)] * 20,
    }
)

col_data = pd.DataFrame(
    {
        "treatment": ["ChIP", "Input"] * 3,
    }
)
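
The mock data relies on ten-element patterns repeated 20 times so that every row_data column has one value per feature (200), and on a two-element pattern repeated 3 times so that column_data has one row per sample (6). As a quick sanity check, the lengths can be verified with the standard library alone. This is a sketch that re-declares a few of the columns rather than inspecting the BiocFrame itself.

```python
# Re-declare a few of the mock columns and confirm each has one value
# per feature (nrows) or per sample (ncols). Standard library only.
nrows = 200
ncols = 6

strand_pattern = ["-", "+", "+", "*", "*", "+", "+", "+", "-", "-"]
columns = {
    "strand": strand_pattern * 20,  # 10 values repeated 20 times
    "starts": range(100, 300),
    "ends": range(110, 310),
    "score": range(0, 200),
}

lengths = {name: len(values) for name, values in columns.items()}
print(lengths)  # {'strand': 200, 'starts': 200, 'ends': 200, 'score': 200}

all_match = all(n == nrows for n in lengths.values())
treatment = ["ChIP", "Input"] * 3  # one label per sample
print(all_match, len(treatment) == ncols)  # True True
```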

To create a SummarizedExperiment:

from summarizedexperiment import SummarizedExperiment

tse = SummarizedExperiment(
    assays={"counts": counts}, row_data=row_data, column_data=col_data,
    metadata={"seq_platform": "Illumina NovaSeq 6000"},
)
## output
class: SummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(1): seq_platform

To create a RangedSummarizedExperiment:

from summarizedexperiment import RangedSummarizedExperiment
from genomicranges import GenomicRanges

trse = RangedSummarizedExperiment(
    assays={"counts": counts}, row_data=row_data,
    row_ranges=GenomicRanges.from_pandas(row_data.to_pandas()), column_data=col_data
)
## output
class: RangedSummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(0):

For more examples, check out the documentation.

Note

This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.

Download files

Source Distribution

summarizedexperiment-0.4.6.tar.gz (1.0 MB view details)

Uploaded Source

Built Distribution

SummarizedExperiment-0.4.6-py3-none-any.whl (21.8 kB view details)

Uploaded Python 3

File details

Details for the file summarizedexperiment-0.4.6.tar.gz.

File metadata

  • Download URL: summarizedexperiment-0.4.6.tar.gz
  • Size: 1.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for summarizedexperiment-0.4.6.tar.gz
Algorithm Hash digest
SHA256 a83a52868dfc87f854534f0e105ae27ed4bee5df95a5bddc21b84b9e017870b8
MD5 5d7828aad835a19a7f8f84b96f0ccbba
BLAKE2b-256 6a6825eb77a189249384176faea870279b6cf5aeac547d59e045bee2e5e50649

File details

Details for the file SummarizedExperiment-0.4.6-py3-none-any.whl.

File hashes

Hashes for SummarizedExperiment-0.4.6-py3-none-any.whl
Algorithm Hash digest
SHA256 b4534db6b1cf757c2899871dd158c71ef0c6ace2bc57051a5713c21021fc2107
MD5 45b07408a50512d13e56ee0a5bd2b69c
BLAKE2b-256 b88b860996fc267ba46599e3dc58672af98121f340f3231f09a508358272ee06
