Wasserstein Auto-Encoder for expression reconstruction
Project description
DISCERN
DISCERN is a deep learning approach for reconstructing the expression information of single-cell RNA-seq data sets using a high-quality reference.
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. An interactive tutorial can be found in Tutorial.ipynb.
Prerequisites
We use poetry for dependency management. You can get poetry by
pip install poetry
or (the officially recommended way)
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
Installing
To get discern you can clone the repository by
git clone https://github.com/imsb-uke/discern.git
poetry can be used to install all further dependencies in a virtual environment.
cd discern
poetry install --no-dev
To finally run discern you can also directly use poetry with
poetry run commands
or spawn a new shell in the virtual environment
poetry shell
The following examples use the first approach.
Using discern
You can use the main function of discern for most use cases. Usually you have to preprocess your data by:
poetry run discern process <parameters.json>
An example parameters.json is provided together with a hyperparameter_search.json for hyperparameter optimization using ray[tune]. Training can be done with
poetry run discern train <parameters.json>
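The two steps above can also be chained from a script. The following is a minimal sketch, not part of discern itself: the helper names discern_command and run_pipeline are hypothetical, and the script only assembles the same command lines shown above. With dry_run=True (the default here) nothing is executed and the commands are just returned for inspection.

```python
import shlex
import subprocess
from typing import List, Sequence


def discern_command(subcommand: str, parameters: str,
                    options: Sequence[str] = ()) -> List[str]:
    """Build `poetry run discern <subcommand> [options] <parameters>` as an argv list."""
    return ["poetry", "run", "discern", subcommand, *options, parameters]


def run_pipeline(parameters: str, dry_run: bool = True) -> List[str]:
    """Preprocess, then train. Returns the command lines in execution order.

    With dry_run=False each command is executed and a failing step
    raises subprocess.CalledProcessError, so training never starts
    on data that failed preprocessing.
    """
    lines = []
    for step in ("process", "train"):
        cmd = discern_command(step, parameters)
        lines.append(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return lines


if __name__ == "__main__":
    for line in run_pipeline("parameters.json"):
        print(line)
```

Passing dry_run=False runs the commands for real, which assumes that the script is started from the cloned discern directory so that poetry finds the project.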
Hyperparameter optimization needs a ray server, which can be started with
poetry run ray start --head --port 57780 --redis-password='password'
The optimization itself can then be started with
poetry run discern optimize <parameters.json>
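Because the ray server keeps running after the optimization finishes, it can be convenient to wrap both commands so that the server is always shut down again. The sketch below is an illustration under assumptions: optimize_with_ray is a hypothetical helper, the port and password are the placeholder values from the command above, and the cleanup uses ray's own `ray stop` command, which is not mentioned in this README.

```python
import subprocess
from typing import Callable, List

# Same arguments as the `ray start` command above (placeholder password).
RAY_START = ["poetry", "run", "ray", "start", "--head",
             "--port", "57780", "--redis-password=password"]
# `ray stop` shuts down the ray processes started on this machine.
RAY_STOP = ["poetry", "run", "ray", "stop"]


def optimize_command(parameters: str) -> List[str]:
    """Build `poetry run discern optimize <parameters>` as an argv list."""
    return ["poetry", "run", "discern", "optimize", parameters]


def optimize_with_ray(parameters: str,
                      runner: Callable = subprocess.run) -> None:
    """Start ray, run the optimization, and stop ray even if it fails."""
    runner(RAY_START, check=True)
    try:
        runner(optimize_command(parameters), check=True)
    finally:
        # check=False: a failing cleanup should not hide the original error.
        runner(RAY_STOP, check=False)
```

The runner argument exists only so that the call order can be inspected without starting anything; in normal use it stays at its default, subprocess.run.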
For projection, two different modes are available. Eval mode is a more general approach and can save a lot of files:
poetry run discern project --all_batches <parameters.json>
Or projection mode, which offers more fine-grained control over what the data is projected to:
poetry run discern project --metadata="metadatacolumn:value" --metadata="metadatacolumn:" <parameters.json>
This creates two files. The first is projected to the average batch defined by a metadata column and a value contained in it. The second is projected to the average of each value in “metadatacolumn” individually.
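The two forms of the --metadata option differ only in whether a value follows the colon. As a small illustration, the following sketch builds such options programmatically. The function names are hypothetical, and the column name "batch" and value "healthy" in the usage example are made-up labels, not names from the discern data.

```python
from typing import List, Optional, Sequence, Tuple


def metadata_option(column: str, value: Optional[str] = None) -> str:
    """Format one --metadata option.

    With a value: project to the average batch for that column/value pair.
    Without a value: project to the average of each value in the column.
    """
    if ":" in column:
        raise ValueError("column name must not contain ':'")
    return f"--metadata={column}:{value if value is not None else ''}"


def project_command(parameters: str,
                    targets: Sequence[Tuple[str, Optional[str]]]) -> List[str]:
    """Build `poetry run discern project --metadata=... <parameters>` as an argv list."""
    options = [metadata_option(column, value) for column, value in targets]
    return ["poetry", "run", "discern", "project", *options, parameters]


if __name__ == "__main__":
    # Hypothetical column and value, mirroring the two options shown above.
    print(project_command("parameters.json",
                          [("batch", "healthy"), ("batch", None)]))
```

When the list is passed to subprocess.run no shell is involved, so the quotes used on the command line above are not needed.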
DISCERN also supports online training. You can add new batches to your dataset after the usual training with:
poetry run discern onlinetraining --freeze --filename=<new_not_preprocessed_batch[es].h5ad> <parameters.json>
The data is automatically preprocessed and added to the dataset. You can run project afterwards as usual (without the --filename flag). The --freeze flag is important because it freezes the non-conditional layers during training.
Testing
For critical parts of the model, several tests have been implemented. They can be run with:
poetry run pytest --cov=discern --cov-report=term
(Requires the development version of discern).
Some tests are slow and don’t run by default, but you can run them using:
poetry run pytest --runslow --cov=discern --cov-report=term
Coding style
To enforce code style guidelines, pylint and mypy are used. Example commands are shown below:
poetry run pylint discern ray_hyperpara.py
poetry run mypy discern ray_hyperpara.py
For automatic code formatting yapf was used:
yapf -i <filename.py>
These tools are included in the dev-dependencies.