matrix market tar archive access utilities
Project description
Introduction
mex_gene_archive defines a minimal file format for archiving sparse gene matrices in a form compatible with the ENCODE 4 Data Coordination Center.
We were required to store each data result as a single file, but the matrix market exchange output that alignment programs commonly produce uses three files: one storing the coordinates and values of the non-zero sparse matrix elements, one for the row labels, and one for the column labels.
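To make the three-file layout concrete, here is a minimal sketch that parses a tiny matrix in Matrix Market coordinate format and attaches row and column labels to each non-zero element. The file names (matrix.mtx, features.tsv, barcodes.tsv), the gene names, and the barcodes are illustrative assumptions, and the parser is a standard-library stand-in rather than the code this package uses.

```python
import io

# Hypothetical contents of the three files for a 3-gene x 2-cell matrix.
MATRIX_MTX = """%%MatrixMarket matrix coordinate integer general
3 2 3
1 1 5
3 1 2
2 2 7
"""
FEATURES_TSV = "GENE_A\nGENE_B\nGENE_C\n"  # row labels (assumed names)
BARCODES_TSV = "AAAC\nAAAG\n"              # column labels (assumed names)


def read_coordinate_matrix(stream):
    """Parse a Matrix Market coordinate file into (shape, {(row, col): value})."""
    # Lines starting with "%" are the header or comments.
    lines = [line for line in stream if line.strip() and not line.startswith("%")]
    n_rows, n_cols, n_entries = (int(x) for x in lines[0].split())
    entries = {}
    for line in lines[1:]:
        row, col, value = line.split()
        # Matrix Market indices are 1-based; convert to 0-based.
        entries[(int(row) - 1, int(col) - 1)] = int(value)
    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return (n_rows, n_cols), entries


shape, entries = read_coordinate_matrix(io.StringIO(MATRIX_MTX))
features = FEATURES_TSV.split()
barcodes = BARCODES_TSV.split()

# Attach the labels from the other two files to each non-zero element.
labeled = {(features[r], barcodes[c]): v for (r, c), v in entries.items()}
```

Only the non-zero elements are stored, which is why the labels have to travel in separate files alongside the matrix.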
Usage
Reading an archive
The archive format is fairly simple: it began as an archive of the key matrix market files from a STAR Solo.out directory, plus a simple manifest.tsv file to help tell different files apart.
The most useful function is probably the one that reads an archive into an anndata structure, with the gene features across the columns and the cell barcode observations down the rows.
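As a rough illustration of the idea, the sketch below packs three matrix market files and a manifest.tsv into an in-memory tar.gz using only the standard library, then reads the manifest back. The member names, the manifest keys, and the two-column manifest layout are assumptions made for this example; the real archive layout is defined by the package itself.

```python
import io
import tarfile


def add_member(archive, name, text):
    """Add a text file to an open tar archive from an in-memory string."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))


# Hypothetical archive contents; names and manifest keys are illustrative only.
members = {
    "manifest.tsv": "name\tvalue\ntype\texample\n",
    "features.tsv": "GENE_A\nGENE_B\n",
    "barcodes.tsv": "AAAC\nAAAG\n",
    "matrix.mtx": "%%MatrixMarket matrix coordinate integer general\n2 2 1\n1 1 5\n",
}

# Write the archive into memory instead of onto disk.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    for name, text in members.items():
        add_member(archive, name, text)

# Read it back: list the members and parse the manifest as key/value pairs.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
    names = archive.getnames()
    manifest_text = archive.extractfile("manifest.tsv").read().decode("utf-8")

manifest = dict(line.split("\t") for line in manifest_text.splitlines())
```

Keeping the manifest as the first member means a reader can identify the archive before touching the larger matrix file, though whether the package relies on that ordering is not stated here.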
import requests
from mex_gene_archive.reader import read_mex_archive_as_anndata

# Read from a local archive file.
adata = read_mex_archive_as_anndata("archive.tar.gz")

# Read from a file object, here the raw body of a streamed HTTP response.
req = requests.get(
    "https://www.encodeproject.org/files/ENCFFexample/@@download/ENCFFexample.fastq.gz",
    stream=True)
adata = read_mex_archive_as_anndata(fileobj=req.raw)
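A streamed HTTP body can only be read forward, so it cannot be opened like a seekable file. The sketch below shows how the standard library handles that case with tarfile's stream mode ("r|gz"). The ForwardOnlyStream class and the sample archive are hypothetical stand-ins for a network response, and this is not necessarily how read_mex_archive_as_anndata works internally.

```python
import io
import tarfile


class ForwardOnlyStream:
    """Stand-in for a network response body: offers read() but no seek()."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)


def build_example_archive():
    """Create a small tar.gz in memory (hypothetical member names)."""
    files = {
        "manifest.tsv": b"name\tvalue\n",
        "matrix.mtx": b"%%MatrixMarket matrix coordinate integer general\n1 1 1\n1 1 5\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


stream = ForwardOnlyStream(build_example_archive())

# "r|gz" reads the gzip-compressed tar sequentially, never seeking backwards.
contents = {}
with tarfile.open(fileobj=stream, mode="r|gz") as archive:
    for member in archive:
        # In stream mode each member must be read as it is encountered.
        contents[member.name] = archive.extractfile(member).read()
```

The trade-off of stream mode is that members can only be visited once and in archive order, which suits a download that is processed as it arrives.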
The reader module can also convert archives to anndata directly from the command line, either from a local file or from a URL:
python -m mex_gene_archive.reader -o archive.h5ad archive.tar.gz
python -m mex_gene_archive.reader -o archive.h5ad \
    --url https://www.encodeproject.org/files/ENCFFexample/@@download/ENCFFexample.fastq.gz
Generating a STAR archive
You may also want to generate an archive file yourself. Currently only STAR is directly supported. See archive_star_solo for the full list of arguments.
from mex_gene_archive.starsolo import archive_star_solo

config = {
    "experiment_accession": "ENCSR724KET",
    "description": "snRNA on human adrenal gland.",
    "library_accession": "ENCLB002DZK",
}
archive_star_solo("experiment", config)
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file mex_gene_archive-0.2.3.tar.gz.
File metadata
- Download URL: mex_gene_archive-0.2.3.tar.gz
- Upload date:
- Size: 28.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | ecd509f4d28b44257296c377e135c00f55298ad5bd84950c79de8f56c3d27748
MD5 | f6c0fc65a0fac48d6c8996f0c765eabe
BLAKE2b-256 | bb8e63631eb26e65955050063442659eab161b6fcd3fcd9da1cbfd99d05c38b5
File details
Details for the file mex_gene_archive-0.2.3-py3-none-any.whl.
File metadata
- Download URL: mex_gene_archive-0.2.3-py3-none-any.whl
- Upload date:
- Size: 29.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 33489be5638faf239fc3d28abef9063c4babd440bbcc55e0c1be42b8575a5401
MD5 | bb399e2e324ca6bffe134582608d68d7
BLAKE2b-256 | 0f63f3429df0b28afb6bf3117f6b52e7031c34975af2033df9d71948cbb052e6