Container to represent data from genomic experiments
SummarizedExperiment
This package provides containers to represent genomic experimental data as 2-dimensional matrices, following Bioconductor's SummarizedExperiment. In these matrices, the rows typically denote features or genomic regions of interest, while the columns represent samples or cells.
The package currently includes representations for both SummarizedExperiment and RangedSummarizedExperiment. The distinction is that a RangedSummarizedExperiment object provides an additional slot to store the genomic region for each feature, which is expected to be a GenomicRanges object.
Install
To get started, install the package from PyPI:
pip install summarizedexperiment
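To confirm that the install succeeded, one option is to query the installed version through the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper written for this example and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("summarizedexperiment"))
```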
Usage
A SummarizedExperiment contains three key attributes:
- assays: A dictionary of matrices with assay names as keys, e.g. counts, logcounts, etc.
- row_data: Feature information, e.g. genes, transcripts, exons, etc.
- column_data: Sample information about the columns of the matrices.
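The relationship between these three attributes can be sketched in plain Python. This is a conceptual illustration only, not the package's implementation: `check_dimensions` is a hypothetical helper, and nested lists stand in for the matrices and data frames. It assumes that every assay has one row per entry in row_data and one column per entry in column_data.

```python
def check_dimensions(assays, row_data, column_data):
    """Return the shared (n_features, n_samples) shape, or raise ValueError."""
    n_rows, n_cols = len(row_data), len(column_data)
    for name, matrix in assays.items():
        # Every assay must have one row per feature and one column per sample.
        if len(matrix) != n_rows or any(len(row) != n_cols for row in matrix):
            raise ValueError(f"assay '{name}' does not match ({n_rows}, {n_cols})")
    return n_rows, n_cols


# Hypothetical toy data: 3 features measured in 2 samples.
assays = {"counts": [[5, 0], [12, 7], [3, 9]]}
row_data = ["geneA", "geneB", "geneC"]
column_data = ["ChIP", "Input"]

shape = check_dimensions(assays, row_data, column_data)
print(shape)  # (3, 2)
```

Keeping the check in one place mirrors the idea that the assays, the feature annotations, and the sample annotations travel together as a single object.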
First, let's mock feature and sample data:
from random import random

import pandas as pd
import numpy as np
from biocframe import BiocFrame

nrows = 200
ncols = 6
counts = np.random.rand(nrows, ncols)
row_data = BiocFrame(
    {
        "seqnames": [
            "chr1",
            "chr2",
            "chr2",
            "chr2",
            "chr1",
            "chr1",
            "chr3",
            "chr3",
            "chr3",
            "chr3",
        ]
        * 20,
        "starts": range(100, 300),
        "ends": range(110, 310),
        "strand": ["-", "+", "+", "*", "*", "+", "+", "+", "-", "-"] * 20,
        "score": range(0, 200),
        "GC": [random() for _ in range(10)] * 20,
    }
)
col_data = pd.DataFrame(
    {
        "treatment": ["ChIP", "Input"] * 3,
    }
)
To create a SummarizedExperiment:
from summarizedexperiment import SummarizedExperiment
tse = SummarizedExperiment(
    assays={"counts": counts},
    row_data=row_data,
    column_data=col_data,
    metadata={"seq_platform": "Illumina NovaSeq 6000"},
)
## output
class: SummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(1): seq_platform
To create a RangedSummarizedExperiment:
from summarizedexperiment import RangedSummarizedExperiment
from genomicranges import GenomicRanges
trse = RangedSummarizedExperiment(
    assays={"counts": counts},
    row_data=row_data,
    row_ranges=GenomicRanges.from_pandas(row_data.to_pandas()),
    column_data=col_data,
)
## output
class: RangedSummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(0):
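To illustrate why storing a genomic region per feature is useful, the sketch below selects the rows of a matrix whose regions overlap a query interval. It is a conceptual, plain-Python illustration and does not use the GenomicRanges or RangedSummarizedExperiment APIs: `Region` and `overlapping_rows` are hypothetical names, the toy data is invented, and interval ends are assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    seqname: str
    start: int
    end: int  # treated as inclusive in this sketch


def overlapping_rows(row_ranges, query):
    """Return indices of regions on the same sequence that overlap `query`."""
    return [
        i
        for i, r in enumerate(row_ranges)
        if r.seqname == query.seqname and r.start <= query.end and query.start <= r.end
    ]


# Hypothetical toy data: one region per feature (row) of the counts matrix.
row_ranges = [
    Region("chr1", 100, 110),
    Region("chr2", 101, 111),
    Region("chr1", 104, 114),
    Region("chr1", 200, 210),
]
counts = [[1, 2], [3, 4], [5, 6], [7, 8]]

hits = overlapping_rows(row_ranges, Region("chr1", 105, 120))
subset = [counts[i] for i in hits]
print(hits)    # [0, 2]
print(subset)  # [[1, 2], [5, 6]]
```

Because the regions are aligned with the rows of the matrix, a query over genomic coordinates translates directly into a row subset of every assay.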
For more examples, check out the documentation.
Note
This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.