Cellxgene Gateway

Overview

Cellxgene Gateway allows you to use the Cellxgene Server provided by the Chan Zuckerberg Initiative (https://github.com/chanzuckerberg/cellxgene) with multiple datasets. It displays an index of available h5ad (anndata) files. When a user clicks a file name, the gateway launches a Cellxgene Server instance that loads that data file and, once the server is available, proxies requests to it.
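
As a rough illustration of the indexing step, the sketch below walks a data directory and collects the .h5ad files it finds, including those in subdirectories. The function name list_h5ad_files is hypothetical and this is not the gateway's own implementation; it only shows the idea under the assumption that datasets are plain files on disk.

```python
import os
import tempfile


def list_h5ad_files(data_dir):
    """Return sorted paths of .h5ad files under data_dir, relative to it."""
    found = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(".h5ad"):
                full_path = os.path.join(root, name)
                found.append(os.path.relpath(full_path, data_dir))
    return sorted(found)


if __name__ == "__main__":
    # Build a throwaway directory tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "study_a"))
        for rel in ("pbmc3k.h5ad", os.path.join("study_a", "cells.h5ad"), "notes.txt"):
            open(os.path.join(tmp, rel), "w").close()
        # Only the two .h5ad entries are listed; notes.txt is skipped.
        print(list_h5ad_files(tmp))
```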

Running locally

Prerequisites

  1. This project requires Python 3.6 or higher. Please check your version with
$ python --version
  2. It is also a good idea to set up a venv
python -m venv .cellxgene-gateway
source .cellxgene-gateway/bin/activate # type `deactivate` to deactivate the venv
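
The same two checks can be made from inside Python. This is a minimal sketch: the helper names meets_minimum and in_virtualenv are made up for illustration, and the venv test assumes an environment created with python -m venv, where sys.prefix differs from sys.base_prefix.

```python
import sys

MIN_VERSION = (3, 6)  # minimum Python version stated in the prerequisites


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv():
    """Return True when running inside a venv (prefix differs from base prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", meets_minimum())
    print("Inside a venv:", in_virtualenv())
```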

Install cellxgene-gateway

Option 1: Pip install from GitHub

pip install git+https://github.com/Novartis/cellxgene-gateway

Note: you may need to downgrade h5py with pip install h5py==2.9.0 due to an issue in a dependency.

Option 2: Install from PyPI

pip install cellxgene-gateway

Running cellxgene gateway

  1. Prepare a folder with .h5ad files, for example
mkdir ../cellxgene_data
wget https://raw.githubusercontent.com/chanzuckerberg/cellxgene/master/example-dataset/pbmc3k.h5ad -O ../cellxgene_data/pbmc3k.h5ad
  2. Set your environment variables correctly:
export CELLXGENE_DATA=../cellxgene_data  # change this directory if you put data in a different place.
export CELLXGENE_LOCATION=`which cellxgene`
  3. Now, execute the cellxgene gateway:
cellxgene-gateway
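
If you would rather start the gateway from a Python script than from the shell, the steps above can be sketched as follows. The helpers build_gateway_env and launch_gateway are hypothetical names, not part of the package; the sketch only sets the two variables shown above and then runs the cellxgene-gateway command.

```python
import os
import shutil
import subprocess


def build_gateway_env(data_dir, cellxgene_path=None, base_env=None):
    """Return a copy of the environment with the two gateway variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # The CELLXGENE_DATA description asks for a path without a trailing slash.
    env["CELLXGENE_DATA"] = data_dir.rstrip("/") or "/"
    # Same lookup as `which cellxgene` in the shell steps.
    location = cellxgene_path or shutil.which("cellxgene")
    if location is None:
        raise FileNotFoundError("cellxgene executable not found on PATH")
    env["CELLXGENE_LOCATION"] = location
    return env


def launch_gateway(data_dir):
    """Run cellxgene-gateway in the foreground until it exits."""
    return subprocess.run(["cellxgene-gateway"], env=build_gateway_env(data_dir))


# Example (not executed here): launch_gateway("../cellxgene_data")
```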

Here's what the environment variables mean:

  • CELLXGENE_LOCATION - the location of the cellxgene executable, e.g. ~/anaconda2/envs/cellxgene/bin/cellxgene

At least one of the following is required:

  • CELLXGENE_DATA - a directory that can contain subdirectories with .h5ad data files, without trailing slash, e.g. /mnt/cellxgene_data
  • CELLXGENE_BUCKET - an S3 bucket that can contain keys with .h5ad data files, e.g. my-cellxgene-data-bucket

Cellxgene Gateway is designed to make it easy to add additional data sources; please see the source code for gateway.py and the ItemSource interface in items/item_source.py.
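
A small pre-flight check of these variables might look like the sketch below. It assumes CELLXGENE_LOCATION is always required, as the run steps suggest, and the function name check_required_env is hypothetical; the gateway performs its own validation, which may differ.

```python
import os


def check_required_env(env=None):
    """Return a list of problems; an empty list means the settings look usable."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("CELLXGENE_LOCATION"):
        problems.append("CELLXGENE_LOCATION is not set")
    if not (env.get("CELLXGENE_DATA") or env.get("CELLXGENE_BUCKET")):
        problems.append("set at least one of CELLXGENE_DATA or CELLXGENE_BUCKET")
    data_dir = env.get("CELLXGENE_DATA", "")
    if len(data_dir) > 1 and data_dir.endswith("/"):
        problems.append("CELLXGENE_DATA should not end with a trailing slash")
    return problems


if __name__ == "__main__":
    for problem in check_required_env():
        print("Configuration problem:", problem)
```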

Optional environment variables:

  • CELLXGENE_ARGS - catch-all variable that can be used to pass additional command line args to cellxgene server
  • EXTERNAL_HOST - the hostname and port from the perspective of the web browser, typically localhost:5005 if running locally. Defaults to "localhost:{GATEWAY_PORT}"
  • EXTERNAL_PROTOCOL - typically http when running locally, can be https when deployed if the gateway is behind a load balancer or reverse proxy that performs https termination. Default value "http"
  • GATEWAY_IP - IP address of the instance the gateway is running on, mostly used to display SSH instructions. Defaults to socket.gethostbyname(socket.gethostname())
  • GATEWAY_PORT - local port that the gateway should bind to, defaults to 5005
  • GATEWAY_EXPIRE_SECONDS - time in seconds that a cellxgene process will remain idle before being terminated. Defaults to 3600 (one hour)
  • GATEWAY_EXTRA_SCRIPTS - JSON array of script paths, will be embedded into each page and forwarded with --scripts to cellxgene server
  • GATEWAY_ENABLE_ANNOTATIONS - Set to true or to 1 to enable cellxgene annotations and gene sets.
  • GATEWAY_ENABLE_BACKED_MODE - Set to true or to 1 to load AnnData in file-backed mode. This saves memory and speeds up launch time but may reduce overall performance.
  • GATEWAY_LOG_LEVEL - Defaults to INFO. Set to DEBUG to increase logging or to WARNING to decrease it.
  • S3_ENABLE_LISTINGS_CACHE - Set to true or to 1 to cache listings of S3 folders for performance. If the cache becomes stale, add the refresh=true query parameter to the index page URL (filecrawl.html?refresh=true) to refresh it.
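
To make the defaults concrete, here is a minimal sketch of how these optional variables could be read. The names env_flag and load_optional_settings are hypothetical, and accepting "true" case-insensitively is an assumption of this sketch rather than documented gateway behavior.

```python
import json
import os


def env_flag(env, name):
    """Treat 'true' or '1' as enabled (case-insensitive, which is an assumption)."""
    return env.get(name, "").strip().lower() in ("true", "1")


def load_optional_settings(env=None):
    """Read optional gateway settings, applying the documented defaults."""
    env = os.environ if env is None else env
    port = int(env.get("GATEWAY_PORT", "5005"))
    return {
        "port": port,
        # EXTERNAL_HOST defaults to localhost:{GATEWAY_PORT}
        "external_host": env.get("EXTERNAL_HOST", "localhost:{}".format(port)),
        "external_protocol": env.get("EXTERNAL_PROTOCOL", "http"),
        "expire_seconds": int(env.get("GATEWAY_EXPIRE_SECONDS", "3600")),
        # GATEWAY_EXTRA_SCRIPTS holds a JSON array of script paths
        "extra_scripts": json.loads(env.get("GATEWAY_EXTRA_SCRIPTS", "[]")),
        "enable_annotations": env_flag(env, "GATEWAY_ENABLE_ANNOTATIONS"),
        "enable_backed_mode": env_flag(env, "GATEWAY_ENABLE_BACKED_MODE"),
        "log_level": env.get("GATEWAY_LOG_LEVEL", "INFO"),
    }
```

Passing a plain dict instead of os.environ keeps the function easy to try out without touching the real environment.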

If any of the following optional variables are set, ProxyFix will be used.

  • PROXY_FIX_FOR - Number of upstream proxies setting X-Forwarded-For
  • PROXY_FIX_PROTO - Number of upstream proxies setting X-Forwarded-Proto
  • PROXY_FIX_HOST - Number of upstream proxies setting X-Forwarded-Host
  • PROXY_FIX_PORT - Number of upstream proxies setting X-Forwarded-Port
  • PROXY_FIX_PREFIX - Number of upstream proxies setting X-Forwarded-Prefix
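
The sketch below shows one way these variables could map onto keyword arguments, assuming ProxyFix here is Werkzeug's werkzeug.middleware.proxy_fix.ProxyFix, whose parameters are x_for, x_proto, x_host, x_port and x_prefix. The helper name proxy_fix_kwargs is hypothetical, and treating unset variables as 0 is an assumption of this sketch.

```python
import os

# Environment variable -> ProxyFix keyword argument.
PROXY_FIX_VARS = {
    "PROXY_FIX_FOR": "x_for",
    "PROXY_FIX_PROTO": "x_proto",
    "PROXY_FIX_HOST": "x_host",
    "PROXY_FIX_PORT": "x_port",
    "PROXY_FIX_PREFIX": "x_prefix",
}


def proxy_fix_kwargs(env=None):
    """Return ProxyFix keyword arguments, or None when no variable is set."""
    env = os.environ if env is None else env
    if not any(var in env for var in PROXY_FIX_VARS):
        return None
    # Assumption: a variable that is not set means "trust zero proxies".
    return {kw: int(env.get(var, 0)) for var, kw in PROXY_FIX_VARS.items()}


# Usage with a Flask app (requires Werkzeug, so it is left as a comment):
# kwargs = proxy_fix_kwargs()
# if kwargs is not None:
#     from werkzeug.middleware.proxy_fix import ProxyFix
#     app.wsgi_app = ProxyFix(app.wsgi_app, **kwargs)
```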

The defaults should be fine if you set up a venv and cellxgene_data folder as above.

Running cellxgene-gateway with Docker

First, build the Docker image:

docker build -t cellxgene-gateway .

Then, cellxgene-gateway can be launched as follows:

docker run -it --rm \
-v <local_data_dir>:/cellxgene-data \
-p 5005:5005 \
cellxgene-gateway

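Additional environment variables can be provided with the -e parameter. The host directory is given as an absolute path here, which older Docker versions require for bind mounts:

docker run -it --rm \
-v "$(pwd)/../cellxgene_data":/cellxgene-data \
-e GATEWAY_PORT=8080 \
-p 8080:8080 \
cellxgene-gateway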
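
When the port changes, both the GATEWAY_PORT variable and the -p mapping have to change together. A small sketch of assembling the command so the two stay in sync is shown below; docker_run_args is a hypothetical helper, and it resolves the data directory to an absolute path before mounting it.

```python
import os

DEFAULT_PORT = 5005  # the gateway's documented default port


def docker_run_args(data_dir, port=DEFAULT_PORT, image="cellxgene-gateway"):
    """Build the argument list for `docker run`, keeping port settings consistent."""
    host_dir = os.path.abspath(data_dir)
    args = ["docker", "run", "-it", "--rm", "-v", "{}:/cellxgene-data".format(host_dir)]
    if port != DEFAULT_PORT:
        # Tell the gateway inside the container to bind to the non-default port.
        args += ["-e", "GATEWAY_PORT={}".format(port)]
    # Publish the same port on the host.
    args += ["-p", "{0}:{0}".format(port), image]
    return args


if __name__ == "__main__":
    print(" ".join(docker_run_args("../cellxgene_data", port=8080)))
```

The resulting list can be passed to subprocess.run if you want to launch the container from a script.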

Customization

The current paradigm for customization is to modify files during a build or deployment phase:

  • To modify CSS or JS on particular gateway pages, overwrite or append to the templates
  • To add script tags such as for user analytics to all pages, set GATEWAY_EXTRA_SCRIPTS
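
Since GATEWAY_EXTRA_SCRIPTS must be a JSON array, generating the value with a JSON library avoids quoting mistakes. The sketch below is illustrative only: extra_scripts_value is a hypothetical helper and the script paths are made-up examples.

```python
import json


def extra_scripts_value(script_paths):
    """Serialize script paths as the JSON array expected in GATEWAY_EXTRA_SCRIPTS."""
    paths = list(script_paths)
    if not all(isinstance(path, str) for path in paths):
        raise TypeError("script paths must be strings")
    return json.dumps(paths)


# Hypothetical script locations, shown only to illustrate the format.
value = extra_scripts_value(["/static/js/analytics.js", "/static/js/banner.js"])
# Print a shell line that could be pasted into a deployment script.
print("export GATEWAY_EXTRA_SCRIPTS='{}'".format(value))
```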

Currently we use a bash script that copies the gateway to a "build" directory before modifying templates with sed and the like. There is probably a better way.

Development

We’re actively developing. Please see the "future work" section of the wiki. If you’re interested in being a contributor please reach out to @alokito.

Developer Install

If you want to develop the code, you will need to clone the repo. Make sure you have the prerequisites listed above, then:

  1. Clone the repo
git clone https://github.com/Novartis/cellxgene-gateway.git
cd cellxgene-gateway
  2. Install requirements with
pip install -r requirements.txt
  3. Install the gateway in developer mode
python setup.py develop

For convenience, the code repo includes a run.sh.example shell script to run the gateway.

  4. Install pre-commit hooks
conda install -c conda-forge pre-commit
pre-commit install

Running Tests

    python -m unittest discover tests

Code Coverage

    coverage run -m unittest discover tests
    coverage html

Running Linters

pip install isort flake8 black

isort -rc . # -rc means recursive; it is deprecated in isort 5 and later, where recursion is the default and plain `isort .` suffices
black .

Getting Help

If you need help for any reason, please open a GitHub issue. One of the contributors should help you out.

Releasing New Versions

How to prepare for release

  • Update Changelog.md and version number in __init__.py
  • Cut a release on GitHub
    • Go to your project homepage on GitHub
    • On the right side, you will see the Releases link. Click on it.
    • Click on Draft a new release
    • Fill in all the details
      • Tag version should be the version number of your package release
      • Release Title can be anything you want, but we use v0.3.11 (the same as the tag to be created on publish)
      • Description should be changelog
    • Click Publish release at the bottom of the page
    • Now under Releases you can view all of your releases.
    • Copy the download link (tar.gz) and save it somewhere
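
Before publishing, it can help to confirm that the tag and the package version agree. The sketch below assumes tags follow the v0.3.11 pattern mentioned above and that the package version is the same string without the leading "v"; tag_matches_version is a hypothetical helper.

```python
def tag_matches_version(tag, package_version):
    """Return True when a release tag such as 'v0.3.11' matches version '0.3.11'."""
    # Accept the tag with or without the leading 'v'.
    bare_tag = tag[1:] if tag.startswith("v") else tag
    return bare_tag == package_version


if __name__ == "__main__":
    print(tag_matches_version("v0.3.11", "0.3.11"))
```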

How to publish to PyPI

Make sure your .pypirc is set up for testpypi and pypi index servers.

rm -rf dist
python setup.py sdist bdist_wheel
python -m twine upload --repository testpypi dist/*
python -m twine upload dist/*
