SC-Elephant (Single-Cell Extremely Large Data Analysis Platform)

SC-Elephant uses RamData, a novel single-cell data storage format, to support a wide range of single-cell bioinformatics applications at scale. It also provides a convenient interface for exporting any subset of the single-cell data in SCANPY's AnnData format, so that downstream analysis can focus on the cells of interest. Analysis results can then be made available to other researchers by updating the original RamData, which can be stored in cloud storage such as Amazon S3 (or any S3-compatible object storage).

SC-Elephant and RamData enable real-time sharing of extremely large single-cell data through a browser-based analysis platform, even while the data is being modified in the cloud by other researchers. They also enable convenient integration of a local single-cell dataset with multiple large remote datasets (RamData objects uploaded by other researchers), and remote (private) collaboration on extremely large-scale single-cell genomics datasets.

Tutorials can be found at doc/jn/

Tutorial 1) Processing and analysis of the 3k PBMCs dataset using SC-Elephant

Tutorial 2) Alignment of PBMC3k to the ELDB (320,000 cells subset) and cell type prediction using SC-Elephant

Tutorial 3) Combine 10x MEX count matrices memory-efficiently using SC-Elephant

Tutorial 4) Convert existing AnnData into RamData for collaborative data sharing

Briefly, a RamData object is composed of two RamDataAxis (Axis) objects and multiple RamDataLayer (Layer) objects.

The two RamDataAxis objects, 'Barcode' and 'Feature', each use a 'filter' to select cells (barcodes) and genes (features), respectively, before data is retrieved from the RamData object.
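The relationship between axes, filters, and layers can be pictured with a small toy model. This is only a sketch of the idea described above, not SC-Elephant's API: the names `ToyAxis`, `ToyRamData`, `selected`, and `get` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyAxis:
    """Hypothetical stand-in for an axis: entry names plus an optional filter."""
    names: list
    filter: Optional[list] = None  # one bool per entry; None keeps everything

    def selected(self):
        """Return the integer positions of the entries that pass the filter."""
        if self.filter is None:
            return list(range(len(self.names)))
        return [i for i, keep in enumerate(self.filter) if keep]


@dataclass
class ToyRamData:
    """Hypothetical container: two axes and any number of named layers."""
    barcodes: ToyAxis
    features: ToyAxis
    # layer name -> {(barcode index, feature index): count}
    layers: dict = field(default_factory=dict)

    def get(self, layer_name):
        """Return the filtered part of one layer as a dense list of rows."""
        layer = self.layers[layer_name]
        rows = self.barcodes.selected()
        cols = self.features.selected()
        return [[layer.get((r, c), 0) for c in cols] for r in rows]


ram = ToyRamData(
    barcodes=ToyAxis(["cell_A", "cell_B", "cell_C"]),
    features=ToyAxis(["CD3E", "MS4A1"]),
    layers={"raw": {(0, 0): 5, (1, 1): 2, (2, 0): 1}},
)

# Set the filters first, then retrieve: only cell_A, cell_C and CD3E remain.
ram.barcodes.filter = [True, False, True]
ram.features.filter = [True, False]
subset = ram.get("raw")
```

In this toy model the filters live on the axes rather than on the layers, so a single filter setting applies to every layer that is read afterwards.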

RamData employs RAMtx (Random-accessible matrix) objects to store count matrices in sparse or dense format.
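As a rough illustration of what random access into a sparse matrix means, the sketch below stores only non-zero counts and keeps a per-row offset index, so one row can be fetched without scanning the others. It assumes a CSR-like layout purely for illustration. It is not RAMtx's actual storage format, and `ToySparseMatrix` is a hypothetical name.

```python
class ToySparseMatrix:
    """Hypothetical CSR-like store; not RAMtx's real layout."""

    def __init__(self, dense_rows):
        self.n_cols = len(dense_rows[0]) if dense_rows else 0
        self.col_idx = []   # column index of every non-zero value
        self.values = []    # the non-zero values, stored row after row
        self.offsets = [0]  # offsets[i]:offsets[i + 1] delimits row i
        for row in dense_rows:
            for j, v in enumerate(row):
                if v != 0:
                    self.col_idx.append(j)
                    self.values.append(v)
            self.offsets.append(len(self.values))

    def row(self, i):
        """Fetch row i by reading only its own slice of the stored values."""
        start, end = self.offsets[i], self.offsets[i + 1]
        dense = [0] * self.n_cols
        for j, v in zip(self.col_idx[start:end], self.values[start:end]):
            dense[j] = v
        return dense


# Three cells by four genes; most entries are zero, as is typical for counts.
counts = [[0, 3, 0, 0], [0, 0, 0, 0], [7, 0, 0, 1]]
matrix = ToySparseMatrix(counts)
third_row = matrix.row(2)  # touches only the two values stored for that row
```

A dense format would instead store every entry, zeros included, which makes lookups trivial but costs more space when the matrix is mostly empty.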

RamData greatly simplifies sharing very large single-cell datasets on the Web. Once processed by SC-Elephant, a RamData object can be uploaded to GitHub, Amazon S3, or any static file server to share your single-cell datasets publicly with the research community or privately with your collaborators. The machine learning models, kNN graphs, cell-type annotations, and random-accessible expression count matrices (to name a few) of your single-cell datasets on the Web can then be explored in Python environments using SC-Elephant and in web browsers using SC-Elephant.js.

To explore RamData objects publicly available on the Web using a web browser, please visit our SC-Elephant DB Viewer.
