scelvis·PyPI

Single-cell RNA-seq data visualization

These details have not been verified by PyPI

Project links

Homepage

Project description

SCelVis: Easy Single-Cell Visualization

Installation

The only prerequisite is Python 3; everything else is installed together with the scelvis package.

You can install SCelVis and its dependencies using pip or through conda:

$ pip install scelvis
# OR
$ conda install scelvis

A Docker container is also available via Quay.io/Biocontainers.

$ docker run quay.io/biocontainers/scelvis:TAG scelvis --help
$ docker run -p 8050:8050 -v data:/data quay.io/biocontainers/scelvis:TAG scelvis run --data-source /data

Look up the latest TAG to use on Quay.io/Biocontainers.

Tutorial

Explore 1000 cells from a 1:1 mixture of fresh frozen human (HEK293T) and mouse (NIH3T3) cells (10X v3 chemistry) or a published dataset of ~14000 IFN-beta treated and control PBMCs from 8 donors (GSE96583; see Kang et al.):

$ scelvis run --data-source /path/to/scelvis/examples/hgmm_1k.h5ad
$ scelvis run --data-source https://files.figshare.com/18037739/pbmc.h5ad

Then point your browser to http://0.0.0.0:8050/.

Preparing Your Data

Data sets are provided as HDF5 files (anndata objects) that store gene expression (as a sparse CSR matrix) and metadata with very fast read access.

For the input you can either specify one HDF5 file or a directory containing multiple such files.

You can use scanpy to create this HDF5 file directly or use the scelvis convert command for converting your single-cell pipeline output.

HDF5 Input

For HDF5 input, you can do your analysis with scanpy to create an anndata object ad. SCelVis uses embedding coordinates from ad.obsm, cell annotation from ad.obs, and expression data directly from ad.X (which should contain normalized and log-transformed expression values for all genes). If present, information about the dataset is extracted from strings stored in ad.uns['about_title'], ad.uns['about_short_title'], and ad.uns['about_readme'] (assumed to be Markdown). Information about marker genes is taken either from the rank_genes_groups slot in ad.uns or from entries starting with marker_ in ad.uns: entries called marker_gene (required!), marker_cluster, marker_padj, and marker_LFC will create a table with the columns gene, cluster, padj, and LFC.
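As a rough illustration of the marker_ convention described above, the following sketch turns a plain dictionary (standing in for ad.uns) into table rows. It is not SCelVis's own code: the function name markers_from_uns and all sample values are hypothetical, and a real anndata object would hold arrays rather than Python lists.

```python
def markers_from_uns(uns):
    """Collect entries starting with 'marker_' into a list of row dicts.

    'uns' is a plain dict standing in for ad.uns (an assumption made to
    keep this sketch free of third-party imports).
    """
    prefix = "marker_"
    # Strip the prefix so that marker_gene becomes the column "gene", etc.
    columns = {
        key[len(prefix):]: list(values)
        for key, values in uns.items()
        if key.startswith(prefix)
    }
    if "gene" not in columns:
        raise ValueError("marker_gene is required")
    n_rows = len(columns["gene"])
    for name, values in columns.items():
        if len(values) != n_rows:
            raise ValueError(f"marker_{name} has {len(values)} entries, expected {n_rows}")
    # One dict per marker gene, with one key per column.
    return [
        {name: values[i] for name, values in columns.items()}
        for i in range(n_rows)
    ]


# Hypothetical content; non-marker entries such as about_title are ignored.
example_uns = {
    "about_title": "Hypothetical data set",
    "marker_gene": ["gene_1", "gene_2"],
    "marker_cluster": ["cluster_1", "cluster_2"],
    "marker_padj": [1.5e-6, 5.3e-9],
    "marker_LFC": [3.4, 2.1],
}
table = markers_from_uns(example_uns)
for row in table:
    print(row)
```

Each row carries the columns gene, cluster, padj, and LFC; leaving out marker_gene raises an error, mirroring the "required" note above.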

If you prepared your data with Seurat (v2), you can use Convert(from = sobj, to = "anndata", filename = "data.h5ad") to get an HDF5 file.

Text Input

For “raw” text input, you need to prepare at least three files in the input directory:

expression.tsv.gz, a tab-separated file with normalized expression values for each gene (rows) in each cell (columns), e.g., like this:

.       cell_1   cell_2   cell_3  ...
gene_1  0.13     0.0      1.5     ...
gene_2  0.0      3.1      0.3     ...
gene_3  0.0      0.0      0.0     ...

annotation.tsv, a tab-separated file with annotations for each cell, e.g., like this:

.         cluster     genotype  ...
cell_1    cluster_1   WT        ...
cell_2    cluster_2   KO        ...

coords.tsv, a tab-separated file with embedding coordinates for each cell, e.g., like this:

.         tSNE_1   tSNE_2   UMAP_1  UMAP_2  ...
cell_1    20.53    -10.05   3.9     2.4     ...
cell_2    -5.34    13.94    -1.3    3.4     ...

markers.tsv, an optional tab-separated file with marker genes; it must have a column named gene, e.g., like this:

gene    cluster     log2FC   adj_pval   ...
gene_1  cluster_1   3.4      1.5e-6     ...
gene_2  cluster_1   1.3      0.00004    ...
gene_3  cluster_2   2.1      5.3e-9     ...

a markdown file (e.g., text_input.md) with information about this dataset:

----
title: An Optional Long Data Set Title
short_title: optional short title
----

A verbose description of the data in Markdown format.

Then convert the input directory to an HDF5 file:

$ scelvis convert --input-dir text_input --output data/text_input.h5ad --about-md text_input.md

In examples/dummy_raw.zip and examples/dummy_about.md we provide raw data for a simulated dummy dataset.
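For orientation, here is a minimal sketch that writes the four tables described above using only the Python standard library. The helper names write_tsv and write_text_input and all values are made up for illustration; the leading "." in the header rows simply follows the layout shown in the examples.

```python
import csv
import gzip
import tempfile
from pathlib import Path


def write_tsv(path, header, rows, compress=False):
    """Write a tab-separated table, gzip-compressed if requested."""
    opener = gzip.open if compress else open
    with opener(path, "wt", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


def write_text_input(input_dir):
    """Create the four tables described above with made-up values."""
    input_dir = Path(input_dir)
    input_dir.mkdir(parents=True, exist_ok=True)
    # Genes in rows, cells in columns.
    write_tsv(
        input_dir / "expression.tsv.gz",
        [".", "cell_1", "cell_2"],
        [["gene_1", 0.13, 0.0], ["gene_2", 0.0, 3.1]],
        compress=True,
    )
    # One row of annotations per cell.
    write_tsv(
        input_dir / "annotation.tsv",
        [".", "cluster", "genotype"],
        [["cell_1", "cluster_1", "WT"], ["cell_2", "cluster_2", "KO"]],
    )
    # One row of embedding coordinates per cell.
    write_tsv(
        input_dir / "coords.tsv",
        [".", "tSNE_1", "tSNE_2"],
        [["cell_1", 20.53, -10.05], ["cell_2", -5.34, 13.94]],
    )
    # Optional marker table; the column named "gene" is mandatory.
    write_tsv(
        input_dir / "markers.tsv",
        ["gene", "cluster", "log2FC", "adj_pval"],
        [["gene_1", "cluster_1", 3.4, 1.5e-6], ["gene_2", "cluster_2", 2.1, 5.3e-9]],
    )
    return sorted(path.name for path in input_dir.iterdir())


# Write into a throwaway directory and read back the expression header.
with tempfile.TemporaryDirectory() as tmp:
    input_dir = Path(tmp) / "text_input"
    files = write_text_input(input_dir)
    with gzip.open(input_dir / "expression.tsv.gz", "rt") as handle:
        header_line = handle.readline().rstrip("\n")

print(files)
print(header_line)
```

A directory written this way could then be passed to scelvis convert via --input-dir; that step is not run here.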

Loom Input

For loompy or loomR input, you can convert your data like this:

$ scelvis convert --input-dir input.loom -m markers.tsv -a about.md -o loom_input.h5ad

If you prepared your data with Seurat (v3), you can use as.loom(sobj, filename="output.loom") to get a .loom file and then convert it to .h5ad with the above command.

CellRanger Input

Alternatively, the output directory of CellRanger can be used. This is the directory called outs containing either a file called filtered_gene_bc_matrices_h5.h5 (version 2) or a file called filtered_feature_bc_matrix.h5 (version 3), and a folder analysis with clustering, embedding, and differential expression results. The conversion does not do any further processing except log-normalization. Additionally, a markdown file provides meta information about the dataset (see above).

$ mkdir -p data
$ cat <<EOF > data/cellranger.md
----
title: My Project
short_title: my_project
----

This is my project data.
EOF
$ scelvis convert --input-dir cellranger-out --output data/cellranger_input.h5ad --about-md data/cellranger.md

In examples/hgmm_1k_raw we provide CellRanger output for the 1k 1:1 human mouse mix. Specifically, from the outs folder we selected

filtered_feature_bc_matrix.h5
tSNE and PCA projections from analysis/tsne and analysis/pca
clustering from analysis/clustering/graphclust and
markers from analysis/diffexp/graphclust

examples/hgmm_1k_about.md contains information about this dataset.
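Before converting, it can help to check that an outs directory has the layout described above. The sketch below is a hypothetical pre-flight check, not part of SCelVis; the function name describe_outs is an assumption, and the demonstration runs on empty placeholder files rather than real CellRanger output.

```python
import tempfile
from pathlib import Path

# Matrix file names and CellRanger versions as listed in the text above.
MATRIX_FILES = {
    "filtered_gene_bc_matrices_h5.h5": 2,
    "filtered_feature_bc_matrix.h5": 3,
}


def describe_outs(outs_dir):
    """Report which matrix file and whether an analysis folder is present."""
    outs_dir = Path(outs_dir)
    version = None
    for file_name, file_version in MATRIX_FILES.items():
        if (outs_dir / file_name).is_file():
            version = file_version
    return {
        "version": version,
        "has_analysis": (outs_dir / "analysis").is_dir(),
    }


# Demonstration on an empty placeholder layout (no real CellRanger data).
with tempfile.TemporaryDirectory() as tmp:
    outs = Path(tmp) / "outs"
    (outs / "analysis" / "tsne").mkdir(parents=True)
    (outs / "filtered_feature_bc_matrix.h5").touch()
    info = describe_outs(outs)

print(info)  # {'version': 3, 'has_analysis': True}
```

A version of None would indicate that neither matrix file was found in the directory.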

Visualizing Your Data

$ tree data
data
├── text_input.h5ad
└── cellranger_input.h5ad

$ scelvis run --data-source data/cellranger_input.h5ad
# OR
$ scelvis run --data-source data

Data Sources

Data sources can be:

paths, e.g., relative/paths or /absolute/paths or file://url/paths
SFTP URLs, e.g., sftp://user:password@host/path/to/data
FTP URLs, e.g., ftp://user:password@host/path/to/data (sadly, encryption is not supported by the underlying library PyFilesystem2).
iRODS URLs, e.g., irods://user:password@host/zoneName/path/to/data
- Enable SSL via irods+ssl
- Switch to PAM authentication with irods+pam (you can combine this with +ssl in any order)
- Enable ticket access by appending ?ticket=TICKET.
HTTP(S) URLs, e.g., https://user:password@host/path/to/data.
S3 URLs, e.g., s3://bucket/path, optionally s3://key:token@bucket/path.

Data sources can either point to HDF5 files directly or to directories containing multiple HDF5 files. The only exception is iRODS with ticket-based access. Because of technical restrictions, you have to assign a unique ticket for each data set and specify the data sets individually.
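To make the URL conventions above concrete, the following sketch splits a data source string into its protocol, the +ssl / +pam modifiers, and an optional ticket, using only urllib.parse. It illustrates the notation and is not how SCelVis itself parses data sources; describe_data_source is a hypothetical name.

```python
from urllib.parse import parse_qs, urlsplit


def describe_data_source(source):
    """Split a data source string into protocol, modifiers, and ticket."""
    parts = urlsplit(source)
    if not parts.scheme:
        # No scheme: a relative or absolute file system path.
        return {"kind": "path", "options": [], "ticket": None}
    # "irods+pam+ssl" -> kind "irods" with options ["pam", "ssl"].
    kind, *options = parts.scheme.split("+")
    ticket = parse_qs(parts.query).get("ticket", [None])[0]
    return {"kind": kind, "options": sorted(options), "ticket": ticket}


examples = [
    "relative/paths",
    "sftp://user:password@host/path/to/data",
    "irods+pam+ssl://user:password@host/zoneName/path/to/data?ticket=TICKET",
    "s3://bucket/path",
]
for example in examples:
    print(example, "->", describe_data_source(example))
```

Sorting the modifiers reflects the note above that +pam and +ssl can be combined in any order.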

Environment Variables

You can use the following environment variables to configure the server.

SCELVIS_DATA_SOURCES – semicolon-separated list of data sources
SCELVIS_HOST – host specification for web server to listen on
SCELVIS_PORT – port for web server to listen on
SCELVIS_CACHE_DIR – directory to use for the cache (default is to create a temporary directory)
SCELVIS_CACHE_REDIS_URL – enable caching with REDIS and provide connection URL
SCELVIS_CACHE_DEFAULT_TIMEOUT – default cache lifetime
SCELVIS_UPLOAD_DIR – the directory to store uploaded data sets in (default is to create a temporary directory)
SCELVIS_UPLOAD_DISABLED – set to “0” to disable upload feature
SCELVIS_CONVERSION_DISABLED – set to “0” to disable the conversion feature
SCELVIS_URL_PREFIX – set if you want to run scelvis below a non-root path (e.g., behind a reverse proxy)
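As a small sketch of how these variables might be assembled from Python, the helper below builds an environment mapping, joining several data sources with semicolons as described for SCELVIS_DATA_SOURCES. The function name scelvis_environment and the example values are assumptions; nothing is launched here.

```python
import os


def scelvis_environment(data_sources, host=None, port=None, url_prefix=None, base_env=None):
    """Return a copy of the environment with SCELVIS_* variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # Several data sources are joined into one semicolon-separated list.
    env["SCELVIS_DATA_SOURCES"] = ";".join(data_sources)
    if host is not None:
        env["SCELVIS_HOST"] = host
    if port is not None:
        # Environment values must be strings.
        env["SCELVIS_PORT"] = str(port)
    if url_prefix is not None:
        env["SCELVIS_URL_PREFIX"] = url_prefix
    return env


# Start from an empty base so that only the SCELVIS_* entries are printed.
env = scelvis_environment(
    ["data/text_input.h5ad", "data/cellranger_input.h5ad"],
    host="0.0.0.0",
    port=8050,
    base_env={},
)
for name in sorted(env):
    print(f"{name}={env[name]}")
```

Such a mapping could be passed as the env argument of subprocess.run when starting the server; whether scelvis run starts without an explicit --data-source in that case has not been verified here.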

Developer Setup

The prerequisites are:

Python 3, either
- system-wide installation with virtualenv, or
- installed with Conda.

For virtualenv, first create a virtual environment and activate it.

$ virtualenv -p python3 venv
$ source venv/bin/activate

For a Conda-based setup create a new environment and activate it.

$ conda create -y -n scelvis 'python>=3.6'
$ conda activate scelvis

Next, clone the repository and install the software as editable (-e). Also install the development requirements to get helpers such as black.

$ git clone git@github.com:bihealth/scelvis.git
$ cd scelvis
$ pip install -e .
$ pip install -r requirements/develop.txt

Afterwards, you can run the visualization web server as follows:

$ scelvis run --data-source path/to/data/dir

Releasing Packages

For the PyPI package:

$ python setup.py sdist
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/scelvis-*.tar.gz
$ twine upload dist/scelvis-*.tar.gz

For the Bioconda package, see the great documentation. The Docker image will automatically be created as a BioContainer when the Bioconda package is built.

History

v0.7.0

added conversion from .loom files
cell filtering also supports downsampling
added PBMC dataset hosted on figshare
added demo movie

v0.6.0

cell filtering
differential expression

v0.5.0

upgrades to Dash v1
fixes to UI, upload and conversion
avoid creation of dense matrices

v0.4.1

Fixing bug with specifying single .h5ad file as data source.
Adding Dockerfile for building Docker images from intermediate versions.

v0.4.0

Adding support for HTTP(S) data sources.
Embedding about.md information in Anndata file.
Adding support for passing

v0.3.0

Adding example data set.
Adding nice introduction to start page.
Adding functionality for creating simple fake data set.
Making import of ruamel_yaml more robust still.
Adding tests.
Adding Travis CI–based continuous integration tests.

v0.2.1

Fixing SFTP support.
Fixing import of ruamel_yaml.

v0.2.0

More refactorization.
Fixing dependency on ruamel-yaml to ruamel.yaml.
Adding conversion feature.
Adding upload feature.
Adding support to load from SSHFS, FTP through pyfilesystem (no FTPS support).
Adding support to load from iRODS, also works via tickets (pass ?ticket=TICKET to the query parameters).

v0.1.0

Initial release.

Everything is new!

Release history

0.8.9

Sep 6, 2022

0.8.8

Sep 2, 2022

0.8.7

Nov 8, 2021

0.8.5

Aug 18, 2021

0.8.4

Apr 2, 2020

0.8.3

Mar 24, 2020

0.8.2

Jan 28, 2020

0.8.1

Jan 9, 2020

0.8.0

Jan 8, 2020

0.7.3

Nov 25, 2019

0.7.2

Nov 11, 2019

0.7.1

Oct 29, 2019

This version

0.7.0

Oct 18, 2019

0.6.0

Oct 14, 2019

0.5.0

Sep 10, 2019

0.4.1

Jul 23, 2019

0.4.0

Jul 16, 2019

0.3.0

Jun 20, 2019

0.2.1

May 20, 2019

0.2.0

May 14, 2019

0.1.0

May 10, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scelvis-0.7.0.tar.gz (1.8 MB)

Uploaded Oct 18, 2019 Source

File details

Details for the file scelvis-0.7.0.tar.gz.

File metadata

Download URL: scelvis-0.7.0.tar.gz
Upload date: Oct 18, 2019
Size: 1.8 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for scelvis-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`8f3f9f6a9f5fb459d582c485cca5fd4cd99981a5ee564b1ba93f241daee9507c`
MD5	`1e62e32a203be62f46e8ca2303a27832`
BLAKE2b-256	`71d3298be0f9f1e305c0122c707132e1c16f0d628620c2016803c2922e912da0`
