Machine learning library for single-cell data analysis

Cellarium ML: a machine learning framework for single-cell biology

Cellarium ML is a PyTorch Lightning-based library for distributed single-cell data analysis. It provides tools for training deep learning models on large-scale single-cell datasets, including distributed data loading, model training, and evaluation. Designed to be modular and extensible, Cellarium ML allows users to easily define custom models, data transformations, and training pipelines.


Code Organization

The code is organized as follows:

cellarium/
└── ml/
    ├── callbacks        # Custom PyTorch Lightning callbacks
    ├── core             # Essential components
    │   ├── CellariumModule              # PyTorch Lightning Module for model, training step, and optimizer
    │   ├── CellariumAnnDataDataModule   # DataModule for multi-GPU DataLoader for AnnData objects
    │   └── CellariumPipeline            # Pipeline for data transformations and model inference
    ├── data             # Distributed AnnData Collection and multi-GPU Iterable Datasets
    ├── lr_schedulers    # Custom learning rate schedulers
    ├── models           # Cellarium ML models
    ├── preprocessing    # Pre-processing functions
    ├── transforms       # Data transformation modules
    ├── utilities        # Utility functions for various submodules
    └── cli.py           # Implements the cellarium-ml CLI. Models must be registered here

Important Notes

cellarium/ml/models/*

  • Models must subclass CellariumModel and implement the following methods:

      • reset_parameters: Initializes model parameters.

      • forward: Returns a dictionary containing the computed loss under the "loss" key.

Optional hooks for training include:

  • on_train_start: Called at the start of training.

  • on_train_epoch_end: Triggered at the end of each epoch.

  • on_train_batch_end: Triggered at the end of each batch.
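
The contract above can be sketched without the library. The following is a plain-Python stand-in, not the Cellarium ML API: it imports neither cellarium nor torch, ToyModel and fit are hypothetical names, and the hook signatures are simplified (the real hooks may take additional arguments). It only illustrates the two required methods, the "loss" key, and the order in which the optional hooks would fire in a typical training loop.

```python
class ToyModel:
    """Plain-Python stand-in for a CellariumModel subclass (illustration only)."""

    def __init__(self):
        self.events = []  # records hook calls so their order can be inspected
        self.reset_parameters()

    def reset_parameters(self):
        # Required: (re)initialize the model parameters.
        self.weight = 0.0

    def forward(self, x, y):
        # Required: return a dictionary with the loss under the "loss" key.
        # Toy loss: mean squared error of the one-parameter fit y ~ weight * x.
        residuals = [self.weight * xi - yi for xi, yi in zip(x, y)]
        return {"loss": sum(r * r for r in residuals) / len(residuals)}

    # Optional hooks (simplified signatures, an assumption of this sketch).
    def on_train_start(self):
        self.events.append("train_start")

    def on_train_batch_end(self):
        self.events.append("batch_end")

    def on_train_epoch_end(self):
        self.events.append("epoch_end")


def fit(model, batches, epochs, lr=0.1):
    """Hypothetical training loop showing when each hook fires."""
    losses = []
    model.on_train_start()
    for _ in range(epochs):
        for x, y in batches:
            losses.append(model.forward(x, y)["loss"])
            # Analytic gradient of the toy loss with respect to weight.
            grad = 2 * sum((model.weight * xi - yi) * xi for xi, yi in zip(x, y)) / len(x)
            model.weight -= lr * grad
            model.on_train_batch_end()
        model.on_train_epoch_end()
    return losses
```

In the library itself, the training loop and optimizer are handled by CellariumModule and PyTorch Lightning, so a model only needs to supply the methods listed above.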

cellarium/ml/transforms/*

  • All transforms must subclass torch.nn.Module.

  • The forward method must output a dictionary where keys correspond to the input arguments for subsequent transforms or the model.
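
To make the dictionary contract concrete, here is a minimal sketch of how output keys can be matched to the arguments of the next stage. It is written in plain Python with no cellarium or torch import; run_pipeline, ScaleToTotal, MeanLossModel, and the key name x_ng are all illustrative assumptions and should not be read as the implementation of CellariumPipeline.

```python
import inspect


def run_pipeline(stages, batch):
    """Run stages in order, passing each one only the keys its forward() names."""
    batch = dict(batch)  # copy so the caller's dictionary is not modified
    for stage in stages:
        params = inspect.signature(stage.forward).parameters
        kwargs = {name: batch[name] for name in params if name in batch}
        # Each stage returns a dictionary; its keys overwrite or extend the batch.
        batch.update(stage.forward(**kwargs))
    return batch


class ScaleToTotal:
    """Toy transform: rescale each row so that it sums to `target`."""

    def __init__(self, target):
        self.target = target

    def forward(self, x_ng):
        scaled = [[v * self.target / sum(row) for v in row] for row in x_ng]
        return {"x_ng": scaled}  # key matches the next stage's argument name


class MeanLossModel:
    """Toy model stand-in: returns the mean of all entries as the loss."""

    def forward(self, x_ng):
        values = [v for row in x_ng for v in row]
        return {"loss": sum(values) / len(values)}
```

Because arguments are looked up by name, a key that no stage consumes simply passes through untouched, which is one reason a dictionary is a convenient interface between transforms and the model.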

cellarium/ml/cli.py

  • Models must be registered here to be accessible via the command-line interface (cellarium-ml CLI).


Installation

To install via pip:

pip install cellarium-ml

To install the developer version from source:

git clone https://github.com/cellarium-ai/cellarium-ml.git
cd cellarium-ml
make install  # runs pip install -e .[dev]

API Documentation and Tutorials

For detailed API documentation and tutorials, visit: Cellarium ML Documentation


For Developers

To run the tests:

make test-examples                   # runs single-device cli example tests
make test-dataloader                 # runs single-device dataloader related tests
TEST_DEVICES=2 make test-dataloader  # runs multi-device dataloader related tests
make test                            # runs single-device (all other) tests
TEST_DEVICES=2 make test             # runs multi-device (all other) tests

To format the code automatically:

make format                # runs ruff formatter and fixes linter errors

To run the linters:

make lint                  # runs ruff linter and checks for formatter errors

To build the documentation:

make docs                  # builds the documentation at docs/build/html
