matrix market tar archive access utilities

Introduction

mex_gene_archive defines a minimal file format for archiving sparse gene matrices in a form compatible with the ENCODE 4 Data Coordination Center.

We were required to store each data result as a single file. Unfortunately, the matrix market exchange format that alignment programs commonly output uses three files: one to store the coordinates and values of the non-zero sparse matrix elements, one for the row labels, and one for the column labels.
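To make the three-file layout concrete, here is a minimal sketch using only the Python standard library. The data is made up, and the file names in the variable names (matrix.mtx, features.tsv, barcodes.tsv) follow common STAR Solo output naming as an assumption; the helper parse_coordinate_mtx is hypothetical and not part of this package.

```python
import io

# Toy contents for the three files (hypothetical data for illustration).
MATRIX_MTX = """%%MatrixMarket matrix coordinate integer general
3 2 3
1 1 5
3 1 2
2 2 7
"""
FEATURES_TSV = "GENE_A\nGENE_B\nGENE_C\n"  # row labels
BARCODES_TSV = "AAAC\nTTTG\n"              # column labels


def parse_coordinate_mtx(text):
    """Parse a Matrix Market coordinate file into (shape, {(row, col): value})."""
    # Drop the banner and any comment lines, which start with '%'.
    lines = [ln for ln in io.StringIO(text) if not ln.startswith("%")]
    n_rows, n_cols, n_entries = (int(x) for x in lines[0].split())
    entries = {}
    for ln in lines[1:1 + n_entries]:
        r, c, v = ln.split()
        # Matrix Market indices are 1-based; convert to 0-based.
        entries[(int(r) - 1, int(c) - 1)] = int(v)
    return (n_rows, n_cols), entries


shape, entries = parse_coordinate_mtx(MATRIX_MTX)
features = FEATURES_TSV.split()
barcodes = BARCODES_TSV.split()

# Each non-zero entry only becomes meaningful once joined with both label files.
labelled = {(features[r], barcodes[c]): v for (r, c), v in entries.items()}
```

The matrix file on its own holds only integer coordinates, which is why all three files have to travel together.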

Usage

Reading an archive

The archive format is fairly simple. It started as an archive of just the key matrix market files from a STAR Solo.out directory, with a simple manifest.tsv file included to help tell different files apart.
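The idea of bundling the matrix market files and a manifest into one file can be sketched with the standard library's tarfile module. This is not the package's implementation: the member names, the manifest keys, and the helper build_archive are all hypothetical, chosen only to illustrate the approach.

```python
import io
import tarfile


def build_archive(files, manifest):
    """Return gzip-compressed tar bytes holding manifest.tsv plus the given files.

    files: dict of member name -> bytes
    manifest: dict of key -> value, written as tab-separated lines
    (both layouts are assumptions for illustration).
    """
    manifest_bytes = "".join(f"{k}\t{v}\n" for k, v in manifest.items()).encode()
    members = {"manifest.tsv": manifest_bytes, **files}

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


archive_bytes = build_archive(
    files={
        "matrix.mtx": b"%%MatrixMarket matrix coordinate integer general\n1 1 1\n1 1 4\n",
        "features.tsv": b"GENE_A\n",
        "barcodes.tsv": b"AAAC\n",
    },
    manifest={"type": "example", "software": "hypothetical"},
)

# Reading it back: a single file outside, several members inside.
with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
    names = tar.getnames()
    manifest_text = tar.extractfile("manifest.tsv").read().decode()
```

Writing the manifest as the first member means a reader can learn what the archive contains before touching the larger matrix file, though whether this package orders members that way is not stated here.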

The most useful function is probably the one that reads an archive into an anndata structure, with the gene features running across the columns and the cell barcode observations running down the rows. It accepts either a file name or an open file object, such as a streamed HTTP response.

import requests

from mex_gene_archive.reader import read_mex_as_anndata

# Read an archive from a local file.
adata = read_mex_as_anndata("archive.tar.gz")

# Or stream an archive over HTTP and read it from the raw response.
req = requests.get(
    "https://www.encodeproject.org/files/ENCFFexample/@@download/ENCFFexample.fastq.gz",
    stream=True)
adata = read_mex_as_anndata(fileobj=req.raw)
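The orientation described for the anndata result (cell barcodes as rows, gene features as columns) is the transpose of how alignment tools commonly write the matrix file, where features index the rows. Assuming that input orientation, the swap can be sketched in plain Python as below. The helper to_obs_by_var and the toy values are hypothetical, and the real reader presumably works on sparse matrices rather than nested lists.

```python
def to_obs_by_var(entries, n_features, n_barcodes):
    """Build a dense barcodes x features table from {(feature, barcode): value}."""
    table = [[0] * n_features for _ in range(n_barcodes)]
    for (feature_idx, barcode_idx), value in entries.items():
        # Swap the indices: barcodes become rows, features become columns.
        table[barcode_idx][feature_idx] = value
    return table


# Toy input: 3 features x 2 barcodes, 0-based indices (made-up values).
entries = {(0, 0): 5, (2, 0): 2, (1, 1): 7}
table = to_obs_by_var(entries, n_features=3, n_barcodes=2)
```

Each inner list of table is one cell barcode, with one slot per gene feature.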

The reader module can also convert archives to anndata directly from the command line, reading either a local file or a URL:

python -m mex_gene_archive.reader -o archive.h5ad archive.tar.gz

python -m mex_gene_archive.reader -o archive.h5ad \
  --url https://www.encodeproject.org/files/ENCFFexample/@@download/ENCFFexample.fastq.gz

Generating a STAR archive

You might also want to generate an archive file yourself. Currently only STAR is directly supported. See archive_star_solo for the full list of arguments.

from mex_gene_archive.starsolo import archive_star_solo

config = {
   "experiment_accession": "ENCSR724KET",
   "description": "snRNA on human adrenal gland.",
   "library_accession": "ENCLB002DZK",
}
archive_star_solo("experiment", config)
