Single-cell Cytometry Annotation Network
Scyan stands for Single-cell Cytometry Annotation Network. Based on prior biological knowledge, it provides fast cell population annotation without requiring any training labels. Scyan is an interpretable model that also corrects batch effects and can be used for debarcoding, cell sampling, and population discovery.
Documentation
The complete documentation can be found here. It contains installation guidelines, tutorials, a description of the API, etc.
Overview
Scyan is a Bayesian probabilistic model composed of a deep invertible neural network called a normalizing flow (the function $f_{\phi}$). The flow maps a latent distribution of cell expressions into the empirical distribution of cell expressions. This cell distribution is a mixture of Gaussian-like distributions representing the sum of a cell-specific and a population-specific term. Interpretability and batch-effect correction are both based on the model's latent space; see the article's Methods section for more details.
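To make the idea of an invertible network concrete, here is a minimal sketch of one affine coupling layer, the usual building block of a RealNVP-style normalizing flow. It is not Scyan's implementation: the function names are hypothetical, and the toy `scale_and_shift` function stands in for the small neural networks that a real flow would learn.

```python
import math


def scale_and_shift(x_a):
    """Toy stand-in for the learned networks s(.) and t(.) (an assumption)."""
    s = math.tanh(0.5 * x_a)  # log-scale applied to the second coordinate
    t = 0.3 * x_a + 0.1       # shift applied to the second coordinate
    return s, t


def coupling_forward(x):
    """Keep x_a unchanged and transform x_b; also return log|det Jacobian|."""
    x_a, x_b = x
    s, t = scale_and_shift(x_a)
    y = (x_a, x_b * math.exp(s) + t)
    return y, s  # the Jacobian is triangular, so its log-determinant is s


def coupling_inverse(y):
    """Exact inverse: s and t are recomputed from the untouched coordinate."""
    y_a, y_b = y
    s, t = scale_and_shift(y_a)
    return (y_a, (y_b - t) * math.exp(-s))


x = (0.8, -1.2)
y, log_det = coupling_forward(x)
x_back = coupling_inverse(y)
```

Because the first coordinate passes through unchanged, the inverse can recompute the same scale and shift, which is what makes the layer invertible with a cheap log-determinant. Stacking such layers while alternating which coordinates are kept fixed gives a flow that transforms every dimension.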
Getting started
Scyan can be installed on every OS with pip or poetry.
On macOS / Linux, python>=3.8,<3.11 is required, while python>=3.8,<3.10 is required on Windows. The preferred Python version is 3.9.
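If you are unsure whether your interpreter falls in the supported range, a small check along these lines may help. It is only a sketch of the version constraints stated above; the helper name `is_supported` is hypothetical and not part of Scyan.

```python
import platform
import sys


def is_supported(version, system):
    """Check a (major, minor, ...) version against the ranges stated above.

    Windows: >=3.8,<3.10. Other systems (macOS / Linux): >=3.8,<3.11.
    """
    upper = (3, 10) if system == "Windows" else (3, 11)
    return (3, 8) <= tuple(version[:2]) < upper


if __name__ == "__main__":
    ok = is_supported(sys.version_info, platform.system())
    print("supported" if ok else "unsupported")
```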
Install from PyPI (recommended)
pip install scyan
Install locally (if you want to contribute)
Advice (optional): We advise creating a new environment via a package manager (except if you use Poetry, which will automatically create the environment). For instance, you can create a new conda environment:
conda create --name scyan python=3.9
conda activate scyan
Clone the repository and move to its root:
git clone https://github.com/MICS-Lab/scyan.git
cd scyan
Choose one of the following, depending on your needs (it should take at most a few minutes):
pip install . # pip minimal installation (library only)
pip install -e . # pip installation in editable mode
pip install -e '.[dev,hydra,discovery]' # pip installation with all the extras
poetry install -E 'dev hydra discovery' # poetry installation with all the extras
Basic usage / Demo
import scyan
adata, table = scyan.data.load("aml") # Automatic loading
model = scyan.Scyan(adata, table)
model.fit()
model.predict()
This code should run in approximately 40 seconds (once the dataset is loaded). For more usage examples, read the tutorials or the complete documentation.
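The table passed to the model above carries the prior biological knowledge used in place of training labels. As a rough illustration of what such a table can look like, the sketch below parses a small population-by-marker table with the standard library only. The marker and population names, the encoding (1 for expressed, -1 for not expressed, empty for unknown), and the helper `parse_knowledge_table` are all assumptions made for this example; the format Scyan actually expects is described in its documentation.

```python
import csv
import io

# Hypothetical knowledge table: one row per population, one column per marker.
# Assumed encoding: 1 = expressed, -1 = not expressed, empty = unknown.
TABLE_CSV = """Population,CD3,CD4,CD8,CD19
CD4 T cells,1,1,-1,-1
CD8 T cells,1,-1,1,-1
B cells,-1,,,1
"""


def parse_knowledge_table(text):
    """Return {population: {marker: 1.0, -1.0, or None when unknown}}."""
    reader = csv.DictReader(io.StringIO(text))
    table = {}
    for row in reader:
        name = row.pop("Population")
        table[name] = {m: (float(v) if v else None) for m, v in row.items()}
    return table


table = parse_knowledge_table(TABLE_CSV)
```

Leaving a cell empty is a way to say that nothing is assumed about that marker for that population, so the table only needs to encode what is actually known.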
Technical description
Scyan is a Python library based on:
- PyTorch, a deep learning framework
- AnnData, a data library that works nicely with single-cell data
- PyTorch Lightning, for model training
- Hydra, for project configuration (optional)
- Weights & Biases, for model monitoring (optional)
Project layout
.github/ # GitHub CI and templates
config/ # Hydra configuration folder (optional use)
data/ # Data folder containing adata files and csv tables
docs/ # The folder used to build the documentation
scripts/ # Scripts to reproduce the results from the article
tests/ # Folder containing tests
scyan/ # Library source code
    data/ # Folder with data-related functions and classes
        datasets.py # Load and save datasets
        tensors.py # PyTorch data-related classes for training
    module/ # Folder containing neural network modules
        coupling_layer.py # Coupling layer
        distribution.py # Prior distribution (called U in the article)
        real_nvp.py # Normalizing Flow
        scyan_module # Core module
    plot/ # Plots
        ...
    tools/
        ... # Tools (umap, subclustering, ...)
    model.py # Scyan model class
    _io.py # Input / output functions
    preprocess.py # Preprocessing functions
    utils.py # Misc functions
.gitattributes
.gitignore
CONTRIBUTING.md # To read before contributing
LICENSE
mkdocs.yml # The docs configuration file
poetry.lock
pyproject.toml # Dependencies, project metadata, and more
README.md
setup.py # Setup file, see `pyproject.toml`
Cite us
Our paper is not published yet. Meanwhile, you can read our preprint on arXiv.